[CV] 0. Computer Vision OT & 1. Image Classification 1 & 2. Annotation data efficient learning

What is CV?

Computer Graphics is the more familiar field. Computer Vision is the opposite concept.

In other words, CV is Inverse Computer Graphics.

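What is AI (artificial intelligence)?

The best reference for artificial intelligence is the human.

In childhood, humans start building intelligence by developing perception through the five senses.

Computer Vision resembles perception through sight.

Real object -> Computer Vision -> Representation -> Rendering -> Computer Graphics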

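The biggest difference between ML and DL in CV

In ML, features are designed by hand before a classifier is trained. In DL, feature extraction is learned together with the classifier, so the model can go beyond human preconceptions and perceptual limits.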

Image Classification

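classifier : the problem of mapping input data to the class space.

-> If we knew all the data in the world, classification could be solved as a k-NN problem. This is the key intuition.

-> In practice this raises problems such as system complexity and how to define the similarity function.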
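
The k-NN intuition above can be sketched in a few lines. This is a minimal illustration, not a practical classifier: the 2-D feature points, the labels, and the choice of Euclidean distance as the similarity function are all assumptions made for the example.

```python
import math
from collections import Counter


def knn_classify(query, samples, labels, k=3):
    """Return the majority label among the k stored samples nearest to query."""
    # Euclidean distance is an assumed similarity function; choosing it is
    # exactly the "similarity function definition" problem noted above.
    order = sorted(range(len(samples)), key=lambda i: math.dist(query, samples[i]))
    votes = Counter(labels[i] for i in order[:k])
    return votes.most_common(1)[0][0]


# Hypothetical 2-D features for two classes.
samples = [(0.0, 0.0), (0.1, 0.2), (0.2, 0.1), (1.0, 1.0), (0.9, 1.1), (1.1, 0.9)]
labels = ["cat", "cat", "cat", "dog", "dog", "dog"]
prediction = knn_classify((0.15, 0.1), samples, labels, k=3)
```

Every query is compared against every stored sample, so memory and time grow with the dataset size. That is the system complexity problem.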

- The emergence of CNN

Fully connected layer -> Global template extraction -> its limitations stand out in the test phase, for example on cropped data

-> Need for locally connected layers and local feature extraction

-> Sliding window with a small number of shared parameters : very well suited to images
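
To make the parameter argument concrete, the sketch below counts the weights of a fully connected layer and of a convolution layer on the same input, then slides one small kernel over an image. The layer sizes (224x224x3 input, 64 outputs, 3x3 kernel) are assumptions chosen for illustration, and the sliding window is a plain pure-Python loop rather than any library's implementation.

```python
def fc_params(height, width, channels, out_units):
    """Weights plus biases of a fully connected layer on a flattened image."""
    return height * width * channels * out_units + out_units


def conv_params(kernel, in_channels, out_channels):
    """Weights plus biases of a conv layer; independent of the image size."""
    return kernel * kernel * in_channels * out_channels + out_channels


def conv2d_valid(image, kernel):
    """Slide one kernel over a 2-D image (no padding, stride 1).

    Like deep learning frameworks, this computes cross-correlation.
    """
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    return [
        [
            sum(image[r + i][c + j] * kernel[i][j] for i in range(kh) for j in range(kw))
            for c in range(out_w)
        ]
        for r in range(out_h)
    ]


fc_count = fc_params(224, 224, 3, 64)   # 9,633,856 parameters
conv_count = conv_params(3, 3, 64)      # 1,792 parameters
feature_map = conv2d_valid([[1, 2, 3], [4, 5, 6], [7, 8, 9]], [[1, 0], [0, 1]])
```

The same four kernel weights are reused at every position of the image, which is why the convolution needs so few parameters.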

CNN Architectures for Image Classification

0. CNN [Yann LeCun et al., Proceedings of the IEEE 1998]

1. AlexNet (Krizhevsky et al., NIPS 2012)

- Deeper layers and more parameters

- Uses ReLU and Dropout

- Local Response Normalization (LRN) is now deprecated.

2. VGGNet

- Deeper network than AlexNet : larger receptive field, more non-linearity

- 3x3 conv filters and 2x2 pooling

- Began to show high generalization performance

- Uses 224x224x3 RGB input with per-channel mean subtraction
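
The link between stacked 3x3 filters and a large receptive field can be checked with a little arithmetic. The helper names below are hypothetical, and the 64-channel width is an assumption used only to compare weight counts.

```python
def receptive_field(layers):
    """Receptive field of stacked layers given as (kernel_size, stride) pairs."""
    rf, jump = 1, 1
    for kernel, stride in layers:
        rf += (kernel - 1) * jump   # each layer widens the field by (k - 1) input steps
        jump *= stride              # striding spreads later layers further apart
    return rf


def stacked_conv_weights(kernel, channels, n_layers):
    """Weights (no biases) of n_layers conv layers with equal in/out channels."""
    return n_layers * kernel * kernel * channels * channels


rf_two = receptive_field([(3, 1)] * 2)      # same field as one 5x5 filter
rf_three = receptive_field([(3, 1)] * 3)    # same field as one 7x7 filter
weights_3x3 = stacked_conv_weights(3, 64, 3)
weights_7x7 = stacked_conv_weights(7, 64, 1)
```

Three stacked 3x3 layers cover the same 7x7 region as a single 7x7 layer with fewer weights, and they apply a non-linearity three times instead of once.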

Learning Representation of Dataset

Sample datasets are collected by humans -> they are biased toward the human point of view.

Augmentation such as Brightness, Rotate, and Crop fills the empty regions of the sample data space, which brings the samples closer to the real distribution.
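
As a rough sketch of these augmentations, the functions below operate on a grayscale image stored as nested lists. Real pipelines would use an image library. The function names, the 90-degree rotation steps, and the parameter ranges are assumptions for illustration.

```python
import random


def adjust_brightness(image, delta):
    """Add delta to every pixel and clip to the 0..255 range."""
    return [[min(255, max(0, p + delta)) for p in row] for row in image]


def rotate90(image):
    """Rotate the image 90 degrees clockwise."""
    return [list(row) for row in zip(*image[::-1])]


def crop(image, top, left, height, width):
    """Cut out a height x width window whose top-left corner is (top, left)."""
    return [row[left:left + width] for row in image[top:top + height]]


def random_augment(image, rng, crop_size=2, max_delta=40):
    """Apply random brightness, a random number of quarter turns, and a random crop."""
    out = adjust_brightness(image, rng.randint(-max_delta, max_delta))
    for _ in range(rng.randint(0, 3)):
        out = rotate90(out)
    top = rng.randint(0, len(out) - crop_size)
    left = rng.randint(0, len(out[0]) - crop_size)
    return crop(out, top, left, crop_size, crop_size)


image = [[10, 20, 30], [40, 50, 60], [70, 80, 250]]
augmented = random_augment(image, random.Random(0))
```

Each call produces a slightly different view of the same sample, which is how augmentation fills gaps around the collected data.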

Affine transformation

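A transformation that preserves straight lines, parallelism, and the ratio of lengths along a line. Lengths and angles themselves may change.

Shear is one example of an affine transformation, alongside translation, rotation, and scaling.

In OpenCV, the transform matrix can be obtained more intuitively from the coordinates of three points before and after the transformation.

M = cv2.getAffineTransform(pt1, pt2)  # pt1, pt2: 3x2 float32 arrays of source and destination points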
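
To show why three point pairs are enough, here is a pure-Python sketch that solves for a 2x3 affine matrix of the same shape that `cv2.getAffineTransform` returns. It is not OpenCV's implementation. The helper names and the use of Cramer's rule are assumptions for the example.

```python
def det3(m):
    """Determinant of a 3x3 matrix given as nested lists."""
    return (
        m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
        - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
        + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0])
    )


def affine_from_points(src, dst):
    """Solve for the 2x3 matrix M with M @ [x, y, 1] = [x', y'] for 3 point pairs."""
    a = [[x, y, 1.0] for x, y in src]
    d = det3(a)
    if abs(d) < 1e-12:
        raise ValueError("source points are collinear")
    rows = []
    for axis in range(2):               # first the x' row, then the y' row
        target = [p[axis] for p in dst]
        row = []
        for col in range(3):
            # Cramer's rule: replace one column of a with the target values.
            replaced = [r[:col] + [target[i]] + r[col + 1:] for i, r in enumerate(a)]
            row.append(det3(replaced) / d)
        rows.append(row)
    return rows


def apply_affine(matrix, point):
    """Map a point (x, y) through a 2x3 affine matrix."""
    x, y = point
    return tuple(r[0] * x + r[1] * y + r[2] for r in matrix)


src = [(0, 0), (1, 0), (0, 1)]
dst = [(2, 3), (3, 3), (2.5, 4)]      # a shear in x plus a translation
M = affine_from_points(src, dst)
mapped = apply_affine(M, (1, 1))
```

A 2x3 matrix has six unknowns and each point pair gives two equations, so three non-collinear pairs determine it exactly.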

Leveraging Pre-trained Information

Transfer Learning

- Approach 1

pre-trained body (frozen weights) + fully connected layer (updated weights) => new task

- Approach 2 [Oquab et al., CVPR 2015]

pre-trained body (low learning rate) + fully connected layer (high learning rate) => new task
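
The two approaches differ only in how strongly each part of the network is updated. The toy sketch below expresses that with per-group learning rates in a hand-written SGD step. The parameter dictionaries, the gradient values, and the learning rates are all made up for illustration, and a frozen body is modeled simply as a learning rate of 0.

```python
def sgd_step(param_groups):
    """One plain SGD step where each group carries its own learning rate."""
    for group in param_groups:
        for param in group["params"]:
            param["value"] -= group["lr"] * param["grad"]


def transfer_groups(body, head, body_lr, head_lr):
    """Build parameter groups for the pre-trained body and the new head."""
    return [{"params": body, "lr": body_lr}, {"params": head, "lr": head_lr}]


def toy_params():
    """A single hypothetical parameter with a fixed gradient."""
    return [{"value": 1.0, "grad": 0.5}]


# Approach 1: the body is frozen, only the new head learns.
body1, head1 = toy_params(), toy_params()
sgd_step(transfer_groups(body1, head1, body_lr=0.0, head_lr=0.1))

# Approach 2: the body is fine-tuned with a much lower rate than the head.
body2, head2 = toy_params(), toy_params()
sgd_step(transfer_groups(body2, head2, body_lr=0.001, head_lr=0.1))
```

Deep learning frameworks usually freeze weights by excluding them from gradient computation instead of using a zero learning rate, but the effect on the weights is the same.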

Knowledge Distillation (Teacher-student learning) [Hinton et al., NIPS deep learning workshop 2015]

- Approach 1

Backpropagate the KL divergence between the outputs of the teacher model (pre-trained, frozen) and the student model (untrained); only the student is updated.

- Approach 2

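Update the student with the gradients of two losses:

1. distillation loss (KL divergence) between the teacher's and the student's softmax outputs with temperature (soft predictions)

2. student loss (cross-entropy) between the ground truth and the student's normal softmax output (hard prediction)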
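
The two losses of Approach 2 can be written out for a single sample as below. This is a minimal sketch under assumptions: the weighting factor `alpha`, the temperature value 4.0, and the example logits are not from the lecture, and the scaling of the distillation term by the squared temperature that some implementations apply is left out.

```python
import math


def softmax(logits, temperature=1.0):
    """Softmax with temperature; a higher temperature gives a softer distribution."""
    scaled = [z / temperature for z in logits]
    peak = max(scaled)                       # subtract the max for numerical stability
    exps = [math.exp(z - peak) for z in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def kl_divergence(p, q):
    """KL(p || q) for two discrete distributions."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)


def cross_entropy(probs, target_index):
    """Cross-entropy against a one-hot ground-truth label."""
    return -math.log(probs[target_index])


def distillation_loss(student_logits, teacher_logits, target_index,
                      temperature=4.0, alpha=0.5):
    """Weighted sum of the distillation loss and the student loss."""
    soft_teacher = softmax(teacher_logits, temperature)
    soft_student = softmax(student_logits, temperature)
    distill = kl_divergence(soft_teacher, soft_student)              # soft predictions
    student = cross_entropy(softmax(student_logits), target_index)   # hard prediction
    return alpha * distill + (1 - alpha) * student


teacher_logits = [1.0, 2.0, 3.0]
loss_matching = distillation_loss([1.0, 2.0, 3.0], teacher_logits, target_index=2)
loss_mismatch = distillation_loss([3.0, 2.0, 1.0], teacher_logits, target_index=2)
```

When the student reproduces the teacher's logits, the distillation term is zero and only the cross-entropy term remains.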

Leveraging unlabeled dataset for training

Semi-supervised Learning with pseudo labeling [Lee, ICML workshop 2013]
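
The pseudo-labeling step itself is small: a model trained on labeled data predicts on unlabeled samples, and the predicted classes are then used as labels. The sketch below adds a confidence threshold, which is a common practical choice and an assumption here, not something stated in the notes. The probability values are hypothetical model outputs.

```python
def pseudo_label(unlabeled_probs, threshold=0.9):
    """Return (sample_index, predicted_class) pairs for confident predictions."""
    selected = []
    for index, probs in enumerate(unlabeled_probs):
        best = max(range(len(probs)), key=probs.__getitem__)   # arg max class
        if probs[best] >= threshold:
            selected.append((index, best))
    return selected


# Hypothetical softmax outputs of a trained model on three unlabeled images.
model_outputs = [
    [0.95, 0.03, 0.02],
    [0.40, 0.35, 0.25],
    [0.05, 0.05, 0.90],
]
pseudo = pseudo_label(model_outputs)
```

The selected pairs would then be merged with the labeled set before the next round of training.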

Self-training

self-training with noisy student [Xie et al., CVPR 2020]

- Iteratively training noisy student network using teacher network
