1 Going deeper with convolutions
- The deeper model is the better model because of larger receptive fields.
- The deeper model is the better model because of larger information capacity and more non-linearity.
- Is it really true? -> Deeper networks are harder to optimize (gradient vanishing/exploding, degradation problem).
2.1 GoogLeNet[Szegedy et al., CVPR 2015]
- deeper and wider convolution architecture.
- channel-wise compression with 1x1 convolutions (bottleneck).
- Auxiliary classifiers are used in the training phase to avoid gradient vanishing.
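A 1x1 convolution is just a per-pixel linear map across channels, which is why it can compress channels so cheaply. A minimal NumPy sketch (function name and shapes are illustrative, not from the paper):

```python
import numpy as np

def conv1x1(x, w):
    """1x1 convolution: a linear map over channels, applied at every pixel.

    x: input feature map of shape (C_in, H, W)
    w: weights of shape (C_out, C_in)
    """
    c_in, h, w_sp = x.shape
    # Flatten spatial dims, mix channels with one matrix multiply, restore shape.
    return (w @ x.reshape(c_in, h * w_sp)).reshape(-1, h, w_sp)

x = np.random.randn(256, 28, 28)  # 256-channel feature map
w = np.random.randn(64, 256)      # compress 256 -> 64 channels
y = conv1x1(x, w)
print(y.shape)                    # (64, 28, 28)
```

Compressing channels this way before an expensive 3x3 or 5x5 convolution is what keeps the Inception modules affordable.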
2.2 ResNet[He et al., CVPR 2016]
- The deeper model is the better model. It is real.
- Degradation problem: as network depth increases, accuracy gets saturated and then degrades rapidly. The degradation problem is not caused by overfitting.
- Residual block: learning the residual function instead of the target function.
- Residual networks have $O(2^n)$ implicit paths.
- He initialization is used so that the values added through residual connections start properly small.
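The residual idea in one function: instead of learning the target mapping H(x) directly, learn F(x) = H(x) - x and add the input back through an identity shortcut. A minimal NumPy sketch (weights and shapes are illustrative):

```python
import numpy as np

def relu(z):
    return np.maximum(z, 0.0)

def residual_block(x, w1, w2):
    """Compute relu(F(x) + x), where F is a small two-layer residual branch.

    With the identity shortcut, the block only has to learn the *residual*;
    if the optimal mapping is close to identity, F can simply stay near zero.
    """
    f = relu(x @ w1) @ w2  # residual function F(x)
    return relu(f + x)     # identity shortcut: add the input back

x = np.ones((1, 8))
# With zero weights the residual branch outputs 0, so the block is the identity.
y = residual_block(x, np.zeros((8, 8)), np.zeros((8, 8)))
print(np.allclose(y, x))  # True
```

The shortcut also gives gradients a direct path through the addition, which is why much deeper networks remain trainable.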
2.3 Beyond ResNet
- DenseNet[Huang et al., CVPR 2017]
* In dense blocks, the output of every layer is concatenated along the channel axis.
- SENet[Hu et al., CVPR 2018]
* Attention across channels
* Squeeze and excitation operations
- EfficientNet[Tan and Le, ICML 2019]
* width scaling (like GoogLeNet)
* depth scaling (like ResNet)
* resolution scaling (high resolution input)
* combining all three => compound scaling (depth, width, and resolution scaled together with a fixed ratio)
- Deformable convolution
* 2D spatial offset prediction for irregular convolution
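SENet's squeeze-and-excitation can be sketched in a few lines: global-average-pool each channel (squeeze), pass the result through a small bottleneck MLP with a sigmoid (excitation), then rescale the channels with the resulting attention weights. A NumPy sketch with illustrative shapes:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def se_block(x, w1, w2):
    """Squeeze-and-excitation on a feature map x of shape (C, H, W).

    w1: (C, C // r) and w2: (C // r, C) form the excitation bottleneck MLP,
    where r is the reduction ratio.
    """
    s = x.mean(axis=(1, 2))                    # squeeze: (C,) channel descriptor
    a = sigmoid(np.maximum(s @ w1, 0.0) @ w2)  # excitation: per-channel gate in (0, 1)
    return x * a[:, None, None]                # recalibrate: scale each channel

x = np.random.randn(64, 14, 14)
w1 = np.random.randn(64, 16)  # reduction ratio r = 4
w2 = np.random.randn(16, 64)
y = se_block(x, w1, w2)
print(y.shape)                # (64, 14, 14)
```

The output keeps the input's shape; only the relative importance of channels changes, which is why SE blocks can be dropped into almost any existing architecture.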
3. Summary[Canziani et al., CVPR 2016]
- GoogLeNet is the most efficient architecture (as of 2016).
- VGG and ResNet are typically used as backbone models for many tasks.
###
Peer session
https://www.notion.so/ResNet-159ab28346904e6eb1758fd21a17beea