For large displacements and scale changes, however, fully connected layers are still needed: by training on many objects at different positions and aspect ratios, they allow the network to recognize objects across a wide range of positions and scales.
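To make this division of labor concrete, here is a minimal sketch (assuming PyTorch; the architecture and sizes are illustrative choices, not from the original post) in which convolution and pooling absorb small shifts, while the fully connected layer can only cover the larger position and scale variations it has actually seen during training.

```python
import torch.nn as nn

class TinyCNN(nn.Module):
    def __init__(self, num_classes=10):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(3, 16, kernel_size=3, padding=1),
            nn.ReLU(),
            nn.MaxPool2d(2),   # absorbs small local shifts
            nn.Conv2d(16, 32, kernel_size=3, padding=1),
            nn.ReLU(),
            nn.MaxPool2d(2),
        )
        # The FC layer sees the whole flattened feature map, so it must
        # learn position/scale variations from the training data itself.
        self.classifier = nn.Linear(32 * 8 * 8, num_classes)  # assumes 32x32 input

    def forward(self, x):
        x = self.features(x)
        return self.classifier(x.flatten(1))
```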
References:
https://www.quora.com/How-is-a-convolutional-neural-network-able-to-learn-invariant-features
https://stats.stackexchange.com/questions/208936/what-is-translation-invariance-in-computer-vision-and-convolutional-netral-netwo
1. Pooling makes the convolution process invariant to small translations, rotations, and shifts. The most widely used form is max-pooling: within each region of interest (the so-called receptive field), only the highest activation is propagated. Even when images are slightly shifted relative to one another, taking the highest activation lets the network capture their commonalities (see the first sketch after this list).
2. For scale invariance, to my knowledge, there is no way around either feeding the network images at different scales or applying the learned filters at different scales (see the second sketch below).
3. Other forms of invariance are built up artificially by rotating, mirroring, and rescaling the training examples. Seeing the training set from different points of view is important for better generalization (see the third sketch below).
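A small numerical check of point 1 (NumPy; the toy activation vector is made up for illustration): non-overlapping max-pooling produces the same output when the peak activation shifts by one position within the pooling window.

```python
import numpy as np

def max_pool_1d(x, window=4):
    # Non-overlapping max-pooling over a 1-D activation vector.
    return x.reshape(-1, window).max(axis=1)

a = np.array([0, 0, 9, 0, 0, 0, 0, 0], dtype=float)  # peak at index 2
b = np.array([0, 9, 0, 0, 0, 0, 0, 0], dtype=float)  # same peak shifted by 1
print(max_pool_1d(a))  # [9. 0.]
print(max_pool_1d(b))  # [9. 0.]  -- identical: the shift is absorbed
```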
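For point 2, a sketch of multi-scale evaluation, assuming PyTorch and a `model` that accepts variable input sizes (e.g. one that uses global average pooling before its classifier); the scale values here are arbitrary.

```python
import torch
import torch.nn.functional as F

def multi_scale_predict(model, image, scales=(0.75, 1.0, 1.25)):
    # image: (1, C, H, W) tensor; run the same network at several scales
    # and average the predictions.
    logits = []
    for s in scales:
        resized = F.interpolate(image, scale_factor=s, mode="bilinear",
                                align_corners=False)
        logits.append(model(resized))
    return torch.stack(logits).mean(dim=0)
```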
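And for point 3, a sketch of building these artificial invariances into the training pipeline, assuming torchvision; the rotation range, crop size, and scale range are illustrative choices.

```python
from torchvision import transforms

train_transform = transforms.Compose([
    transforms.RandomRotation(degrees=15),               # rotation
    transforms.RandomHorizontalFlip(p=0.5),              # mirroring
    transforms.RandomResizedCrop(32, scale=(0.8, 1.0)),  # scale jitter
    transforms.ToTensor(),
])
```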
In short, this post looked at how convolutional neural networks achieve translation, rotation, and scale invariance. Pooling layers make the network robust to small variations in the input, while handling larger variations requires fully connected layers that learn object features at different positions and scales.