[CV] Object detection

Fundamental image recognition tasks [Kirillov et al., CVPR 2019]

- Semantic segmentation [instance recognition : X | semantic recognition : O]

- Instance segmentation [instance recognition : O | semantic recognition : X]

- Panoptic segmentation [instance recognition : O | semantic recognition : O]

Further topics

- Object detection [classification + Box localization]

- OCR

Traditional methods (hand-crafted techniques)

- Gradient method

- Average gradient: edge detection, gradient-based detectors (e.g., HOG)

- max(+ or -) SVM weight

- R-HOG description or R-HOG SVM weight

- Selective search [Uijlings et al., IJCV 2013]
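
As a rough illustration of the gradient-based features above, the sketch below computes per-pixel gradients and bins their orientations into a magnitude-weighted histogram for a single cell, which is the core idea behind HOG. The function name `cell_orientation_histogram`, the 9-bin unsigned setup, and the toy image are assumptions made for illustration, not details from the lecture; a full HOG descriptor also normalizes histograms over blocks of cells.

```python
import numpy as np

def cell_orientation_histogram(cell, n_bins=9):
    """Magnitude-weighted histogram of unsigned gradient orientations for one cell."""
    cell = cell.astype(np.float64)
    # np.gradient returns the derivative along axis 0 (rows, y) first, then axis 1 (columns, x).
    gy, gx = np.gradient(cell)
    magnitude = np.hypot(gx, gy)
    # Unsigned orientation in [0, 180) degrees.
    angle = np.degrees(np.arctan2(gy, gx)) % 180.0
    hist, _ = np.histogram(angle, bins=n_bins, range=(0.0, 180.0), weights=magnitude)
    return hist

# Toy 8x8 cell with a vertical edge: intensity changes only along x.
cell = np.zeros((8, 8))
cell[:, 4:] = 1.0
hist = cell_orientation_histogram(cell)
```

For this toy cell all gradient energy falls into the first orientation bin, because the intensity only changes horizontally.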

R-CNN [Girshick et al., CVPR 2014]

- Regions with CNN features

- Extract regions (e.g., with selective search) and warp them -> CNN features -> classifier

- Region proposals come from a traditional (hand-crafted) preprocessing method: performance limitation

- The model runs a prediction for every region proposal: heavy computation
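
The crop-and-warp step of the R-CNN pipeline above can be sketched as follows. This is a minimal illustration, assuming boxes are given as `(x1, y1, x2, y2)` in pixel coordinates and using nearest-neighbour sampling; the helper `warp_region`, the tiny 4x4 output size, and the proposals are hypothetical and chosen only to keep the example small.

```python
import numpy as np

def warp_region(image, box, out_size=(4, 4)):
    """Crop box (x1, y1, x2, y2) from an H x W (x C) image and resize it with
    nearest-neighbour sampling, ignoring the aspect ratio of the region."""
    x1, y1, x2, y2 = box
    crop = image[y1:y2, x1:x2]
    out_h, out_w = out_size
    # Map each output pixel back to a source row/column index in the crop.
    rows = np.arange(out_h) * crop.shape[0] // out_h
    cols = np.arange(out_w) * crop.shape[1] // out_w
    return crop[rows[:, None], cols]

image = np.arange(10 * 12).reshape(10, 12)
proposals = [(0, 0, 8, 8), (2, 1, 12, 5)]   # hypothetical region proposals
# Every proposal, whatever its shape, becomes a fixed-size CNN input.
warped = [warp_region(image, box) for box in proposals]
```

In R-CNN the CNN would then run once per warped crop, which is where the heavy computation noted above comes from.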

Fast R-CNN [Girshick, ICCV 2015]

- Conv feature map computed once for the whole image (independent of the original image size; no separate per-region feature extractor is needed)

- RoI (Region of Interest) feature extraction and resampling (RoI pooling)

- Region proposal is still a hand-crafted algorithm -> limited performance
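
A minimal sketch of the RoI feature extraction and resampling idea: every RoI, whatever its size, is max-pooled from the shared feature map into a fixed-size grid. It assumes a single-channel feature map, RoIs already expressed in feature-map coordinates as `(x1, y1, x2, y2)`, and RoIs at least as large as the output grid; `roi_max_pool` is a hypothetical helper, not a library function.

```python
import numpy as np

def roi_max_pool(feature_map, roi, out_size=(2, 2)):
    """Max-pool the RoI (x1, y1, x2, y2), given in feature-map coordinates,
    into a fixed out_size grid. feature_map has shape (H, W)."""
    x1, y1, x2, y2 = roi
    region = feature_map[y1:y2, x1:x2]
    out_h, out_w = out_size
    # Bin edges that split the region into out_h x out_w roughly equal cells.
    row_edges = np.linspace(0, region.shape[0], out_h + 1).astype(int)
    col_edges = np.linspace(0, region.shape[1], out_w + 1).astype(int)
    pooled = np.empty(out_size, dtype=feature_map.dtype)
    for i in range(out_h):
        for j in range(out_w):
            cell = region[row_edges[i]:row_edges[i + 1], col_edges[j]:col_edges[j + 1]]
            pooled[i, j] = cell.max()
    return pooled

feature_map = np.arange(36).reshape(6, 6)          # computed once per image
pooled_a = roi_max_pool(feature_map, (1, 1, 5, 5))  # 4x4 RoI
pooled_b = roi_max_pool(feature_map, (0, 0, 6, 3))  # 3x6 RoI
```

Both RoIs end up as 2x2 outputs, so the same classifier head can consume them without re-running the convolutional layers.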

Faster R-CNN [Ren et al., NeurIPS 2015]

- IoU = Intersection / Union (the higher, the better)

- Region proposal

- Anchor boxes (a set of pre-defined bounding boxes)

- The IoU between a ground-truth box and an anchor box is the criterion for labeling the anchor as positive or negative

- Region Proposal Network (RPN)

- Non-Maximum Suppression (NMS)
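
The three pieces listed above (IoU, IoU-based anchor labeling, and NMS) can be sketched in a few lines. This is an illustrative sketch rather than the Faster R-CNN implementation: boxes are assumed to be `(x1, y1, x2, y2)`, the 0.7 / 0.3 labeling thresholds and the 0.5 NMS threshold are assumed values, and the function names are hypothetical.

```python
import numpy as np

def iou(box, boxes):
    """IoU between one box and an array of boxes, all as (x1, y1, x2, y2)."""
    boxes = np.asarray(boxes, dtype=np.float64)
    x1 = np.maximum(box[0], boxes[:, 0])
    y1 = np.maximum(box[1], boxes[:, 1])
    x2 = np.minimum(box[2], boxes[:, 2])
    y2 = np.minimum(box[3], boxes[:, 3])
    inter = np.clip(x2 - x1, 0, None) * np.clip(y2 - y1, 0, None)
    area = (box[2] - box[0]) * (box[3] - box[1])
    areas = (boxes[:, 2] - boxes[:, 0]) * (boxes[:, 3] - boxes[:, 1])
    return inter / (area + areas - inter)

def label_anchors(anchors, gt_box, pos_thresh=0.7, neg_thresh=0.3):
    """Return 1 (positive), 0 (negative) or -1 (ignored) for each anchor."""
    overlaps = iou(gt_box, anchors)
    labels = np.full(len(anchors), -1)
    labels[overlaps >= pos_thresh] = 1
    labels[overlaps < neg_thresh] = 0
    return labels

def nms(boxes, scores, iou_thresh=0.5):
    """Greedy NMS: keep the best-scoring box, drop boxes overlapping it, repeat."""
    boxes = np.asarray(boxes, dtype=np.float64)
    order = np.argsort(scores)[::-1]
    keep = []
    while order.size > 0:
        best = order[0]
        keep.append(int(best))
        rest = order[1:]
        order = rest[iou(boxes[best], boxes[rest]) <= iou_thresh]
    return keep

anchors = [[0, 0, 10, 10], [1, 1, 11, 11], [20, 20, 30, 30]]
gt_box = [0, 0, 10, 10]
labels = label_anchors(anchors, gt_box)        # one label per anchor
keep = nms(anchors, scores=[0.9, 0.8, 0.7])    # indices of surviving boxes
```

Here the second anchor overlaps the ground truth with IoU 81/119 (about 0.68), so it is neither positive nor negative under the assumed thresholds, and NMS suppresses it in favour of the higher-scoring first box.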

One-stage (single-stage) detector

- [Ndonhon et al., Offshore Technology Conference 2019]

- No explicit RoI pooling

- You Only Look Once (YOLO) [Redmon et al., CVPR 2016]

- Single Shot MultiBox Detector (SSD) [Liu et al., ECCV 2016]

Two-stage vs. One-stage

- Focal loss

- Class imbalance problem in single-stage detectors (# of negative anchor boxes >> # of positive anchor boxes)

- Improved cross-entropy loss that down-weights easy, well-classified examples
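
To make the "improved cross-entropy" idea concrete, here is a minimal NumPy sketch of the binary focal loss, FL(p_t) = -alpha_t * (1 - p_t)^gamma * log(p_t), where p_t is the probability assigned to the true class. The function name and the defaults `gamma=2.0`, `alpha=0.25` are assumptions for illustration; setting `gamma=0` recovers an alpha-weighted cross entropy.

```python
import numpy as np

def binary_focal_loss(probs, targets, gamma=2.0, alpha=0.25, eps=1e-12):
    """Per-example focal loss for predicted foreground probabilities in (0, 1)
    and binary targets (1 = object, 0 = background)."""
    probs = np.clip(np.asarray(probs, dtype=np.float64), eps, 1.0 - eps)
    targets = np.asarray(targets)
    # p_t is the probability the model assigns to the true class.
    p_t = np.where(targets == 1, probs, 1.0 - probs)
    alpha_t = np.where(targets == 1, alpha, 1.0 - alpha)
    return -alpha_t * (1.0 - p_t) ** gamma * np.log(p_t)

# Two background anchors: one easy (p_t = 0.9) and one hard (p_t = 0.1).
probs = np.array([0.1, 0.9])
targets = np.array([0, 0])
weighted_ce = binary_focal_loss(probs, targets, gamma=0.0)
focal = binary_focal_loss(probs, targets, gamma=2.0)
ratio = focal / weighted_ce   # the modulating factor (1 - p_t)^gamma
```

The easy negative keeps only about 1% of its cross-entropy loss while the hard one keeps about 81%, so the many easy negatives no longer dominate training.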

Detectors with Transformers

- ViT by Google

- DeiT by Facebook

- DETR [Carion et al., ECCV 2020]

- Object queries: learned positional encodings used for querying
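
A heavily simplified sketch of how object queries are used: a fixed number of query vectors attend over the flattened image features, and each query yields one output slot that a prediction head would turn into a class and a box. This is not the DETR decoder; it assumes a single attention head with no learned projections, and the queries are random here, whereas in DETR they are learned parameters. All names and sizes are hypothetical.

```python
import numpy as np

def cross_attention(queries, features):
    """Single-head scaled dot-product attention: each query attends over all
    feature positions. queries: (N, d), features: (HW, d).
    Returns the attended outputs (N, d) and the attention weights (N, HW)."""
    d = queries.shape[1]
    scores = queries @ features.T / np.sqrt(d)
    scores -= scores.max(axis=1, keepdims=True)   # numerical stability
    weights = np.exp(scores)
    weights /= weights.sum(axis=1, keepdims=True)
    return weights @ features, weights

rng = np.random.default_rng(0)
num_queries, d_model = 5, 16
object_queries = rng.normal(size=(num_queries, d_model))  # learned in DETR; random here
features = rng.normal(size=(7 * 7, d_model))              # flattened 7x7 feature map
slots, weights = cross_attention(object_queries, features)
```

The number of queries fixes the maximum number of detections per image, independent of the feature-map size.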

###

Peer session

https://www.notion.so/Bilinear-resize-convolution-c893a921898f4987aded25f85674c730
