Posts in the 'All Categories' category (Page 11)

ValueError: transformers.__spec__ is None

2022.02.21· Minor Tips . Troubleshooting

Fixed by reinstalling transformers. It looks like a mismatch with the PyTorch version. According to the issue below, the problem appears to be resolved from transformers 4.10.x onward, so I removed the existing 4.5.1 with uninstall and reinstalled. +) https://github.com/huggingface/transformers/issues/12904 transformers.__spec__ returning None. Causing downstream import errors · Issue #12904 · huggingface/transformers Environment info transformers version: Tried on 4.6.1(current default kaggle version)/4.8.1/4.8.2 and 4.9..
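To confirm which version is actually installed before and after reinstalling, a check along these lines may help. This is a minimal sketch using only the standard library; the helper names `version_tuple` and `is_at_least` are hypothetical, and the 4.10.0 threshold is taken from the note above rather than from any official statement.

```python
import re
from importlib import metadata


def version_tuple(version):
    """Return the leading numeric components of a version string as a tuple."""
    parts = []
    for piece in version.split("."):
        match = re.match(r"\d+", piece)
        if match is None:
            break  # stop at the first non-numeric component, e.g. "dev0"
        parts.append(int(match.group()))
    return tuple(parts)


def is_at_least(package, minimum):
    """Return True if `package` is installed with a version >= `minimum`."""
    try:
        installed = metadata.version(package)
    except metadata.PackageNotFoundError:
        return False
    return version_tuple(installed) >= version_tuple(minimum)


# Prints False if transformers is missing or older than 4.10.0.
print(is_at_least("transformers", "4.10.0"))
```

Comparing tuples rather than raw strings matters here, because "4.5.1" sorts after "4.10.0" as plain text.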

RTX 3090 PyTorch

2022.02.21· Minor Tips . Troubleshooting

https://ssaru.github.io/2021/05/05/20210505-til_install_rtx3090_supported_pytorch/ (TIL) Installing a PyTorch version that supports the RTX 3090 2021.05.05 Currently, the RTX 3090 can only be used with deep learning framework versions that support CUDA 11 or later. However, simply installing with pip install torch==1.7.1 torchvision==0.8.2 gives CUDA error: no kernel image is ava ssaru.github.io Installing with pip install torch==[version] like this results in CUDA error: no kernel image is available for execution on the dev..

AttributeError: module 'distutils' has no attribute 'version'

2022.02.21· Minor Tips . Troubleshooting

pip install setuptools==59.5.0 I don't know why, but this seems to help. +) Reference: https://stackoverflow.com/questions/70520120/attributeerror-module-setuptools-distutils-has-no-attribute-version AttributeError: module 'setuptools._distutils' has no attribute 'version' I was trying to train a model using tensorboard. While executing, I got this error: $ python train.py Traceback (most recent call last): File "t..

Read Like Humans: Autonomous, Bidirectional and Iterative Language Modeling for Scene Text Recognition

2022.02.17· Paper Reading/Scene Text Recognition(OCR)

https://arxiv.org/abs/2103.06495 Read Like Humans: Autonomous, Bidirectional and Iterative Language Modeling for Scene Text Recognition Linguistic knowledge is of great benefit to scene text recognition. However, how to effectively model linguistic rules in end-to-end deep networks remains a research challenge. In this paper, we argue that the limited capacity of language models comes from arxiv..

TGRNet: A Table Graph Reconstruction Network for Table Structure Recognition

2022.02.14· Paper Reading/Document Information Extraction

PAPER : https://arxiv.org/abs/2106.10598 TGRNet: A Table Graph Reconstruction Network for Table Structure Recognition A table arranging data in rows and columns is a very effective data structure, which has been widely used in business and scientific research. Considering large-scale tabular data in online and offline documents, automatic table recognition has attracted i arxiv.org GITHUB: https..

DocFormer: End-to-End Transformer for Document Understanding

2022.02.11· Paper Reading/Transformer based Embedding Model

PAPER DocFormer: End-to-End Transformer for Document Understanding We present DocFormer -- a multi-modal transformer based architecture for the task of Visual Document Understanding (VDU). VDU is a challenging problem which aims to understand documents in their varied formats (forms, receipts etc.) and layouts. In additio arxiv.org GitHub GitHub - shabie/docformer: Implementation of DocFormer: E..

PIL Image.open: PNG transparency lost (black image) (png pil open, non-transparent, black images)

2022.02.07· Machine Learning/Computer Vision

When I opened a PNG file with PIL's Image.open, the whole background turned black, as shown below. It turns out the image has to be converted to RGBA when opening it so that the transparency is applied and the image displays correctly. => Solution: Image.open($IMAGE_NAME).convert("RGBA")
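The fix above can be tried out with a short, self-contained sketch. The helper name `open_with_alpha` is hypothetical, and a tiny in-memory PNG stands in for $IMAGE_NAME so the snippet runs without any file on disk.

```python
import io

from PIL import Image


def open_with_alpha(source):
    """Open an image and convert it to RGBA so the alpha channel is available."""
    return Image.open(source).convert("RGBA")


# Build a small, fully transparent PNG in memory as a stand-in for $IMAGE_NAME.
buffer = io.BytesIO()
Image.new("RGBA", (4, 4), (255, 0, 0, 0)).save(buffer, format="PNG")
buffer.seek(0)

image = open_with_alpha(buffer)
print(image.mode)  # the converted image is in RGBA mode
```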

Generating Markdown / HTML tables

2022.02.04· Minor Tips . Troubleshooting

https://www.tablesgenerator.com/markdown_tables

PIL Filter sample

2022.02.03· Machine Learning/Computer Vision

from PIL import Image, ImageDraw, ImageFont, ImageOps, ImageFilter
def _random_filter(bg_image):
    filter_dict = {
        '0': ImageFilter.BLUR,
        '1': ImageFilter.CONTOUR,
        '2': ImageFilter.DETAIL,
        '3': ImageFilter.EDGE_ENHANCE,
        '4': ImageFilter.EDGE_ENHANCE_MORE,
        '5': ImageFilter.EMBOSS,
        '6': ImageFilter.FIND_EDGES,
        '7': ImageFilter.SMOOTH,
        '8': ImageFilter.SMOOTH_MORE,
        '9': ImageFilter.SHARPEN,
        '10': None
    }
    for num..
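The excerpt is cut off before the loop body, so the original selection logic is unknown. The sketch below shows one possible way to use the same set of filters, assuming a single filter is picked at random and that None means "leave the image unchanged". The names `FILTERS` and `random_filter` are hypothetical.

```python
import random

from PIL import Image, ImageFilter

# The same filters as in the excerpt; None is assumed to mean "no filter".
FILTERS = [
    ImageFilter.BLUR,
    ImageFilter.CONTOUR,
    ImageFilter.DETAIL,
    ImageFilter.EDGE_ENHANCE,
    ImageFilter.EDGE_ENHANCE_MORE,
    ImageFilter.EMBOSS,
    ImageFilter.FIND_EDGES,
    ImageFilter.SMOOTH,
    ImageFilter.SMOOTH_MORE,
    ImageFilter.SHARPEN,
    None,
]


def random_filter(image, rng=random):
    """Apply one randomly chosen filter, or return the image as-is for None."""
    chosen = rng.choice(FILTERS)
    if chosen is None:
        return image
    return image.filter(chosen)


sample = Image.new("RGB", (32, 32), "white")
result = random_filter(sample)
```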

[OpenCV] Drawing a bounding box / polygon with cv2 (cv2.polylines)

2022.01.26· Machine Learning/Computer Vision

file("gt_%d.txt"): contains the GT bounding box information; the coordinate values of each polygon are assumed to be separated by tabs ('\t') [x1 y1 x2 y2 x3 y3 x4 y4 label]
import cv2
import numpy as np
input_id = 1
# Reading polygon txt file
file = './gt_%d.txt'%input_id
f = open(file,'r')
lines = f.readlines()
# Reading Image
image_file = './images/0/%d.jpg'%input_id
image = cv2.imread(image_file)
# Draw
for line in lines:
    polygon = line.split('\t')..

PIL Image rotate: how to calculate the vertex coordinates (or polygon points) after image rotation

2022.01.25· Machine Learning/Computer Vision

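How to calculate the vertex coordinates (or polygon points) after image rotation -> I suspect cv2, albumentations, PIL, or some other simple open-source library or function can do this, but I could not find one. If anyone knows of one, I would appreciate a comment. Suppose there is a background image as in (1), and compositing several small boxes like A onto it produces a final image like (2). I then changed the pipeline so that this final image (2) is rotated by a random angle with PIL's rotate function. This produces an image like (3), and at this point the vertex coordinates (polygon) of the A boxes..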
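The excerpt is truncated before the author's own solution, so the following is only a sketch of the underlying geometry, not the post's method. It assumes the image is rotated with PIL's rotate using the default center and expand=False, in which case a positive angle turns the image counter-clockwise about (width / 2, height / 2). The function name `rotate_points` is hypothetical, and the result is worth checking against a real rotated image.

```python
import math


def rotate_points(points, angle_deg, image_size, center=None):
    """Map (x, y) points to their positions after a counter-clockwise rotation.

    Coordinates use the image convention (origin at top-left, y pointing down).
    """
    width, height = image_size
    cx, cy = center if center is not None else (width / 2.0, height / 2.0)
    theta = math.radians(angle_deg)
    cos_t, sin_t = math.cos(theta), math.sin(theta)
    rotated = []
    for x, y in points:
        dx, dy = x - cx, y - cy
        # Because y points down, the sign of the sine terms is flipped
        # relative to the usual mathematical rotation matrix.
        rotated.append((cx + dx * cos_t + dy * sin_t,
                        cy - dx * sin_t + dy * cos_t))
    return rotated


# A box to the right of the center of a 100x100 image, rotated by 90 degrees.
box_a = [(60, 40), (90, 40), (90, 60), (60, 60)]
rotated_box = rotate_points(box_a, 90, (100, 100))
```

If expand=True is used, the canvas grows and the points would additionally need to be shifted by the change in image size, which this sketch does not handle.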

[PIL] Shrinking and enlarging images (Image resize)

2022.01.24· Machine Learning/Computer Vision

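Simply resizing a PIL Image to a specific size is easy: pilimage.resize((x,y)) However, when you want to resize while keeping the aspect ratio, for example when you have a target y value but want to preserve the width-to-height ratio, use the following function instead: pilimage.thumbnail((x,y)) # x and y are the maximum width and height allowed after resizing. This resizes the image with its aspect ratio preserved. Also, thumbnail modifies the image in place and returns None, so if you want the result in a different variable, be sure to call copy first. Otherwise the new variable ends up as None even though thumbnail was called. # This leads to an error new_image = pilimage.thumbnail((x..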
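The copy-then-thumbnail pattern described above can be sketched as follows. The image sizes are made up for illustration. Note that thumbnail only shrinks an image; it never enlarges one.

```python
from PIL import Image

pilimage = Image.new("RGB", (400, 200), "white")

# thumbnail() works in place and returns None, so assigning its result is a bug.
wrong = pilimage.copy().thumbnail((100, 100))

# Copy first, then shrink the copy; the original keeps its size.
new_image = pilimage.copy()
new_image.thumbnail((100, 100))  # (100, 100) is the maximum (width, height)

print(pilimage.size, new_image.size)
```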

[PIL] convert RGBA Numpy to PIL Image (TypeError: Cannot handle this data type: (1, 1, 4), <f4)

2022.01.24· Machine Learning/Computer Vision

Use Image.fromarray: new_image = Image.fromarray(before_image.astype(np.uint8)) +) Make sure the NumPy dtype is uint8 at this point; otherwise the error below occurs. If it is not uint8, cast it with astype as shown above before converting to a PIL Image. TypeError: Cannot handle this data type: (1, 1, 4), <f4
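A small sketch of the cast described above, using a float32 RGBA array like the one in the error message. The helper name `to_pil_rgba` is hypothetical, and the clipping step is an added assumption: it presumes the values are already on a 0 to 255 scale. An array holding values in the 0 to 1 range would need to be multiplied by 255 first.

```python
import numpy as np
from PIL import Image


def to_pil_rgba(array):
    """Convert an H x W x 4 array to a PIL image, casting to uint8 if needed."""
    if array.dtype != np.uint8:
        # Assumes values are on a 0-255 scale; clip to avoid wrap-around.
        array = np.clip(array, 0, 255).astype(np.uint8)
    return Image.fromarray(array)


# A float32 array would raise the TypeError if passed to Image.fromarray directly.
before_image = np.full((2, 3, 4), 128.0, dtype=np.float32)
new_image = to_pil_rgba(before_image)
print(new_image.mode, new_image.size)
```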

[PIL] Function reference - continuously updated

2022.01.21· Machine Learning/Computer Vision

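PIL.Image.alpha_composite(im1, im2): alpha-composites image 2 over image 1. Return: Image object. Both images must be in RGBA mode and the same size. The in-place method form, Image.alpha_composite(im, dest=(0, 0), source=(0, 0)), takes these parameters: - im: the image to composite on top of this one - (optional) dest: coordinates of the upper-left corner in the destination image - (optional) source: coordinates of the upper-left corner in the source image. PIL.Image.blend(im1, im2, alpha): creates a new image by interpolating between the two input images using a constant alpha value. Parameters: - im1: the first image - im2: the second image (first and second, same..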
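The two functions listed above can be tried on small solid-color images. This is a minimal sketch; the colors and sizes are arbitrary choices for illustration.

```python
from PIL import Image

base = Image.new("RGBA", (4, 4), (255, 0, 0, 255))    # opaque red
overlay = Image.new("RGBA", (4, 4), (0, 0, 255, 0))   # fully transparent blue
blue = Image.new("RGBA", (4, 4), (0, 0, 255, 255))    # opaque blue

# Module-level function: returns a new image, inputs must match in mode and size.
composited = Image.alpha_composite(base, overlay)

# In-place method: pastes `overlay` onto a copy of `base` at the given offset.
canvas = base.copy()
canvas.alpha_composite(overlay, dest=(0, 0))

# blend: alpha=0.0 gives a copy of the first image, alpha=1.0 the second.
blended = Image.blend(base, blue, 0.5)
```

Because the overlay here is fully transparent, both compositing calls leave the red base visible, while blend mixes the two opaque images halfway.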

[Jupyter Notebook, HTML] How to read and display HTML contents in Jupyter Notebook

2022.01.14· Minor Tips . Troubleshooting

from IPython.display import IFrame IFrame(src='./nice.html', width=700, height=600) Load the HTML file, then display it with IFrame. Source: How to embed HTML into IPython output? Is it possible to embed rendered HTML output into IPython output? One way is to use from IPython.core.display import HTML HTML('link') or (IPython stackoverflow.com
