
[CV] Image Classification 2

by SteadyForDeep 2021. 9. 8.

1. Going deeper with convolutions
  - The deeper model is the better model because of larger receptive fields (a quick receptive-field calculation follows this list).

  - The deeper model is the better model because of larger information capacity and non-linearity.

  - Is it real? -> Deeper networks are harder to optimize (gradient vanishing/exploding, degradation problem).
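
A quick sketch of the receptive-field claim above: for a stack of plain convolutions (no dilation), each layer widens the receptive field by $(k-1)$ times the accumulated stride, so two 3x3 convs see 5x5 and three see 7x7. The helper below is illustrative, not from the lecture:

```python
def receptive_field(kernel_sizes, strides):
    """Receptive field of stacked conv layers (no dilation)."""
    rf, jump = 1, 1
    for k, s in zip(kernel_sizes, strides):
        rf += (k - 1) * jump  # each layer widens the field by (k-1) * current jump
        jump *= s             # stride enlarges the step between output positions
    return rf

print(receptive_field([3, 3], [1, 1]))        # two 3x3 convs -> 5
print(receptive_field([3, 3, 3], [1, 1, 1]))  # three 3x3 convs -> 7
```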

 

2.1 GoogLeNet[Szegedy et al., CVPR 2015]

  - Deeper and wider convolutional architecture.

  - Channel-wise compression with 1x1 convolutions (bottleneck), sketched after this list.

  - Auxiliary classifiers are used in the training phase (and removed at test time) to avoid gradient vanishing.
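
A minimal sketch of the 1x1 bottleneck idea; the channel sizes (256 -> 64 -> 128) are hypothetical, and this is a single branch rather than GoogLeNet's full Inception module:

```python
import torch.nn as nn

# direct 3x3, 256 -> 128:           256 * 128 * 9 ≈ 295K params
# bottlenecked, 1x1 then 3x3:       256 * 64 + 64 * 128 * 9 ≈ 90K params
bottleneck_branch = nn.Sequential(
    nn.Conv2d(256, 64, kernel_size=1),             # channel-wise compression
    nn.ReLU(inplace=True),
    nn.Conv2d(64, 128, kernel_size=3, padding=1),  # spatial conv on the cheap 64-ch tensor
)
```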

 

2.2 ResNet[He et al., CVPR 2016]

  - The deeper model is the better model. It is real.

Degradation problem: as network depth increases, accuracy gets saturated and then degrades rapidly. The degradation problem is not caused by overfitting (training error also rises), but by optimization difficulty.

  - Residual block: learning the residual function $F(x) = H(x) - x$ instead of the target function $H(x)$ (sketched after this list).

  - Every residual block offers two routes (skip or transform), so a network of $n$ blocks contains $O(2^n)$ implicit paths.

  - He initialization is used so that the residual branch initially produces appropriately small values to add to the identity path.
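
A minimal sketch of a basic residual block, assuming equal input/output channels (the strided/projection variants are omitted):

```python
import torch.nn as nn
import torch.nn.functional as F

class ResidualBlock(nn.Module):
    """Basic block: output = ReLU(F(x) + x), with F = two 3x3 convs."""
    def __init__(self, ch):
        super().__init__()
        self.conv1 = nn.Conv2d(ch, ch, 3, padding=1, bias=False)
        self.bn1 = nn.BatchNorm2d(ch)
        self.conv2 = nn.Conv2d(ch, ch, 3, padding=1, bias=False)
        self.bn2 = nn.BatchNorm2d(ch)

    def forward(self, x):
        out = F.relu(self.bn1(self.conv1(x)))
        out = self.bn2(self.conv2(out))
        return F.relu(out + x)  # skip connection: learn the residual, not the target
```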

 

2.3 Beyond ResNet

  - DenseNet[Huang et al., CVPR 2017]

    * In dense blocks, the output of every layer is concatenated along the channel axis (a concatenation sketch follows this list).

  - SENet[Hu et al., CVPR 2018]

    * Attention across channels

    * Squeeze (global average pooling) and excitation (channel-wise gating) operations, sketched after this list.

  - EfficientNet[Tan and Le, ICML 2019]

    * width scaling (like GoogLeNet)

    * depth scaling (like ResNet)

    * resolution scaling (high resolution input)

    * combination of all three => compound scaling (the scaling rule is given after this list)

  - Deformable convolution

    * 2D spatial offset prediction for irregular convolution (a minimal wrapper is sketched after this list)
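
A minimal sketch of a DenseNet-style layer; growth_rate=32 is a hypothetical value, and the essential difference from ResNet is torch.cat along the channel axis instead of addition:

```python
import torch
import torch.nn as nn

class DenseLayer(nn.Module):
    def __init__(self, in_ch, growth_rate=32):
        super().__init__()
        self.conv = nn.Sequential(
            nn.BatchNorm2d(in_ch), nn.ReLU(inplace=True),
            nn.Conv2d(in_ch, growth_rate, 3, padding=1, bias=False),
        )

    def forward(self, x):
        # concatenate, not add: later layers see all earlier feature maps
        return torch.cat([x, self.conv(x)], dim=1)
```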
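A minimal sketch of an SE block; reduction=16 is the hypothetical bottleneck ratio:

```python
import torch.nn as nn

class SEBlock(nn.Module):
    """Squeeze: global average pool; excitation: two FC layers -> channel gates."""
    def __init__(self, ch, reduction=16):
        super().__init__()
        self.fc = nn.Sequential(
            nn.Linear(ch, ch // reduction), nn.ReLU(inplace=True),
            nn.Linear(ch // reduction, ch), nn.Sigmoid(),
        )

    def forward(self, x):
        n, c, _, _ = x.shape
        s = x.mean(dim=(2, 3))           # squeeze: (N, C) channel descriptor
        w = self.fc(s).view(n, c, 1, 1)  # excitation: per-channel weights in (0, 1)
        return x * w                     # recalibrate channels (attention across channels)
```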
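For reference, the compound scaling rule from the EfficientNet paper: for a user-chosen coefficient $\phi$, depth, width, and resolution are scaled as $d = \alpha^{\phi}$, $w = \beta^{\phi}$, $r = \gamma^{\phi}$, where the constants $\alpha, \beta, \gamma \ge 1$ are found by a small grid search under the constraint $\alpha \cdot \beta^2 \cdot \gamma^2 \approx 2$, so total FLOPs grow roughly by $2^{\phi}$.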
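A minimal deformable-convolution wrapper, assuming torchvision's deform_conv2d op; a regular conv predicts the 2D offsets that bend the 3x3 sampling grid (all layer sizes here are hypothetical):

```python
import torch
import torch.nn as nn
from torchvision.ops import deform_conv2d

class DeformableConv2d(nn.Module):
    def __init__(self, in_ch, out_ch, k=3, padding=1):
        super().__init__()
        # a regular conv predicts a 2D offset (dy, dx) per kernel tap per location
        self.offset_pred = nn.Conv2d(in_ch, 2 * k * k, k, padding=padding)
        self.weight = nn.Parameter(torch.randn(out_ch, in_ch, k, k) * 0.01)
        self.padding = padding

    def forward(self, x):
        offsets = self.offset_pred(x)  # (N, 2*k*k, H, W)
        # sampling positions are shifted by the predicted offsets -> irregular grid
        return deform_conv2d(x, offsets, self.weight, padding=self.padding)
```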

 

3. Summary[Canziani et al., CVPR 2016]

  - GoogLeNet is the most efficient architecture (as of 2016).

  - VGG and ResNet are typically used as backbone models for many tasks.

 

###

 

Peer Session

https://www.notion.so/ResNet-159ab28346904e6eb1758fd21a17beea

 

