Fundamental image recognition tasks[Kirillov et al., CVPR 2019]
- Semantic segmentation [instance recognition : X | semantic recognition : O]
- Instance segmentation [instance recognition : O | semantic recognition : X]
- Panoptic segmentation [instance recognition : O | semantic recognition : O]
Further topics
- Object detection [classification + Box localization]
- OCR
Traditional method (hand crafted techniques)
- Gradient method
- Average gradient : edge detection, gradient-based detectors (e.g., HOG)
- max(+ or -) SVM weight
- R-HOG description or R-HOG SVM weight
- Selective search [Uijlings et al., IJCV 2013]
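
The gradient idea behind these hand-crafted detectors can be sketched in plain Python. This is only the gradient step that descriptors such as HOG build on, not HOG itself; the function name `gradient_magnitude` and the toy image are hypothetical, chosen for illustration.

```python
import math

def gradient_magnitude(img):
    """Central-difference gradient magnitude for a 2D list of gray values.

    Border pixels are left at 0.0 for simplicity (an assumption of this sketch).
    """
    h, w = len(img), len(img[0])
    out = [[0.0] * w for _ in range(h)]
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            gx = (img[y][x + 1] - img[y][x - 1]) / 2.0  # horizontal change
            gy = (img[y + 1][x] - img[y - 1][x]) / 2.0  # vertical change
            out[y][x] = math.hypot(gx, gy)
    return out

# Toy image: dark left half, bright right half, so the edge is vertical.
image = [[0, 0, 10, 10] for _ in range(4)]
edges = gradient_magnitude(image)
```

Interior pixels next to the dark/bright boundary get a large response, while a constant image gives zero everywhere. An edge detector would then threshold this map.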
R-CNN[Girshick et al., CVPR 2014]
- Region with CNN features
- Extract regions (e.g., with selective search) and warp them -> CNN features -> Classifier
- Traditional method for preprocessing : performance limitation
- Model prediction for every region proposal : heavy computation
Fast R-CNN[Girshick, ICCV 2015]
- Conv feature map (independent of original image size, extractor is not needed)
- RoI(Region of Interest) feature extraction and resample
- Region proposal is hand-crafted algorithm -> limited performance
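
The RoI feature extraction and resampling step can be illustrated with a small max-pooling sketch over a shared feature map. It assumes a single-channel feature map stored as a nested list, RoI coordinates given in feature-map cells with exclusive `x2`/`y2`, and integer bin boundaries; `roi_max_pool` is a hypothetical name, not a library function.

```python
def roi_max_pool(feature_map, roi, out_h, out_w):
    """Max-pool the region roi = (x1, y1, x2, y2) of a 2D feature map
    into a fixed out_h x out_w grid (x2 and y2 are exclusive)."""
    x1, y1, x2, y2 = roi
    roi_h, roi_w = y2 - y1, x2 - x1
    pooled = []
    for i in range(out_h):
        # Integer bin boundaries; every bin covers at least one cell.
        ys = y1 + (i * roi_h) // out_h
        ye = max(ys + 1, y1 + ((i + 1) * roi_h) // out_h)
        row = []
        for j in range(out_w):
            xs = x1 + (j * roi_w) // out_w
            xe = max(xs + 1, x1 + ((j + 1) * roi_w) // out_w)
            row.append(max(feature_map[y][x]
                           for y in range(ys, ye)
                           for x in range(xs, xe)))
        pooled.append(row)
    return pooled

# One shared feature map, two RoIs of different sizes.
fmap = [[0, 1, 2, 3],
        [4, 5, 6, 7],
        [8, 9, 10, 11],
        [12, 13, 14, 15]]
full = roi_max_pool(fmap, (0, 0, 4, 4), 2, 2)
part = roi_max_pool(fmap, (1, 1, 4, 3), 2, 2)
```

Both RoIs yield a 2 x 2 output even though their sizes differ, which is what lets a single forward pass over the image feed a fixed-size classifier head for every region.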
Faster R-CNN[Ren et al., NeurIPS 2015]
- IoU = Intersection / Union (the higher, the better)
- Region proposal
- Anchor boxes (a set of pre-defined bounding boxes)
- The IoU between the ground truth and each anchor box is the criterion for labeling the anchor positive or negative
- Region Proposal Network(RPN)
- Non-Maximum Suppression (NMS)
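
IoU, anchor labeling, and NMS from the list above can be sketched together in plain Python. Boxes are assumed to be `(x1, y1, x2, y2)` tuples; the thresholds `pos_thr=0.7`, `neg_thr=0.3`, and `iou_thr=0.5` are illustrative defaults, and the function names are hypothetical. The labeling rule is simplified: it only compares the best IoU against two thresholds.

```python
def iou(a, b):
    """Intersection over Union of two boxes (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0

def label_anchor(anchor, gt_boxes, pos_thr=0.7, neg_thr=0.3):
    """Return 1 (positive), 0 (negative), or -1 (ignored in training)."""
    best = max((iou(anchor, gt) for gt in gt_boxes), default=0.0)
    if best >= pos_thr:
        return 1
    if best < neg_thr:
        return 0
    return -1

def nms(boxes, scores, iou_thr=0.5):
    """Greedy NMS: keep the best-scoring box, drop boxes overlapping it."""
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    keep = []
    for i in order:
        if all(iou(boxes[i], boxes[k]) <= iou_thr for k in keep):
            keep.append(i)
    return keep

boxes = [(0, 0, 10, 10), (1, 1, 11, 11), (20, 20, 30, 30)]
scores = [0.9, 0.8, 0.7]
kept = nms(boxes, scores)
labels = [label_anchor(b, [(0, 0, 10, 10)]) for b in boxes]
```

Here the second box overlaps the first with IoU 81/119 (about 0.68), so NMS suppresses it, and as an anchor it falls between the two thresholds and is ignored rather than labeled.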
One-stage(single-stage) detector
- [Ndonhon et al., Offshore Technology Conference 2019]
- No explicit RoI pooling
- You only look once (YOLO)[Redmon et al., CVPR 2016]
- Single Shot MultiBox Detector (SSD)[Liu et al., ECCV 2016]
Two-stage vs. One-stage
- Focal loss
- Class imbalance problem on Single-stage detector (# of negative anchor boxes >> # of positive boxes)
- Improved cross entropy loss
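
How focal loss modifies cross entropy to counter the class imbalance can be sketched for a single binary prediction. The form used here is FL(p_t) = -alpha_t * (1 - p_t)^gamma * log(p_t); the defaults `gamma=2.0` and `alpha=0.25` are commonly used values and are assumptions of this sketch, as are the function names.

```python
import math

def cross_entropy(p, y):
    """Binary cross entropy for one prediction p in (0, 1), label y in {0, 1}."""
    p_t = p if y == 1 else 1.0 - p  # probability assigned to the true class
    return -math.log(p_t)

def focal_loss(p, y, gamma=2.0, alpha=0.25):
    """Binary focal loss: cross entropy scaled by alpha_t * (1 - p_t) ** gamma."""
    p_t = p if y == 1 else 1.0 - p
    alpha_t = alpha if y == 1 else 1.0 - alpha
    return -alpha_t * (1.0 - p_t) ** gamma * math.log(p_t)

# Easy negative (confidently background) vs. hard negative (wrongly confident).
easy_ce, easy_fl = cross_entropy(0.01, 0), focal_loss(0.01, 0)
hard_ce, hard_fl = cross_entropy(0.9, 0), focal_loss(0.9, 0)
```

The easy negative is down-weighted by several orders of magnitude, while the hard negative keeps most of its loss, so the many easy background anchors no longer dominate training. With `gamma=0` the modulating factor disappears and the loss is cross entropy scaled by `alpha_t`.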
Detector with Transformer
- ViT by Google
- DeiT by Facebook
- DETR[Carion et al., ECCV 2020]
- Object query : learned positional encodings used to query the decoder
###
Peer session
https://www.notion.so/Bilinear-resize-convolution-c893a921898f4987aded25f85674c730