EfficientNet
Develop a new baseline network and scale it up to obtain a family of models called EfficientNets.
It is critical to balance all dimensions of network width/depth/resolution.
Width, depth, and image resolution are scaled together with a fixed set of coefficients.
Factors that influence the model
Study model scaling and identify that carefully balancing network width, depth, and resolution leads to better accuracy and efficiency (a toy sketch of these three scaling knobs follows the list below).
- depth: layers
  - Pros: a deeper ConvNet can capture richer and more complex features
  - Cons: difficult to train due to the vanishing gradient problem, even with
    - skip connections
    - batch normalization
- width: channels
  - Scaling network width is commonly used for small-size models
  - Pros: wider networks tend to capture more fine-grained features and are easier to train
  - Cons: extremely wide but shallow networks tend to have difficulty capturing higher-level features
- image resolution
  - Pros: higher-resolution inputs let the network capture more fine-grained patterns
  - Cons: slower, since the computational cost grows roughly quadratically with resolution
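To make these three knobs concrete, here is a minimal Python sketch (not from the paper; `NetConfig` and the multiplier values are illustrative) that scales each dimension of a toy baseline independently:

```python
from dataclasses import dataclass, replace
from math import ceil

@dataclass(frozen=True)
class NetConfig:
    """Toy ConvNet description: layer count, channels per layer, input size."""
    depth: int        # number of layers
    width: int        # channels per layer
    resolution: int   # input image side length

def scale_depth(cfg: NetConfig, d: float) -> NetConfig:
    """Depth scaling: multiply the number of layers by d."""
    return replace(cfg, depth=ceil(cfg.depth * d))

def scale_width(cfg: NetConfig, w: float) -> NetConfig:
    """Width scaling: multiply the number of channels by w."""
    return replace(cfg, width=ceil(cfg.width * w))

def scale_resolution(cfg: NetConfig, r: float) -> NetConfig:
    """Resolution scaling: multiply the input size by r."""
    return replace(cfg, resolution=ceil(cfg.resolution * r))

baseline = NetConfig(depth=18, width=64, resolution=224)
print(scale_depth(baseline, 2.0))       # deeper: more layers
print(scale_width(baseline, 2.0))       # wider: more channels
print(scale_resolution(baseline, 1.5))  # larger input images
```

Scaling any single knob on its own runs into the drawbacks listed above; the compound scaling method below balances all three at once.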
Idea
Our target is to maximize the model accuracy for any given resource constraints, which can be formulated as an optimization problem
max_{d, w, r} Accuracy(N(d, w, r))
s.t. Memory(N) ≤ target_memory
     FLOPS(N) ≤ target_flops
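As a rough illustration of this constrained search (not the paper's actual procedure), the sketch below brute-forces scaling factors under FLOPS and memory budgets. The `flops`, `memory`, and `accuracy_estimate` functions are made-up placeholders; in practice the accuracy term means training the scaled model and measuring it.

```python
import itertools

def flops(d: float, w: float, r: float, base_flops: float = 0.4e9) -> float:
    # FLOPS grow roughly with d * w^2 * r^2 relative to the baseline network.
    return base_flops * d * (w ** 2) * (r ** 2)

def memory(d: float, w: float, r: float, base_mem: float = 5e6) -> float:
    # Crude stand-in: parameter count grows roughly with d * w^2.
    return base_mem * d * (w ** 2)

def accuracy_estimate(d: float, w: float, r: float) -> float:
    # Placeholder for "train the scaled model and measure accuracy".
    return 0.77 + 0.01 * (d + w + r)  # purely illustrative

def best_scaling(target_flops: float, target_memory: float):
    grid = [1.0, 1.1, 1.2, 1.3, 1.4]
    feasible = [
        (d, w, r) for d, w, r in itertools.product(grid, repeat=3)
        if flops(d, w, r) <= target_flops and memory(d, w, r) <= target_memory
    ]
    # Maximize (estimated) accuracy subject to the resource constraints.
    return max(feasible, key=lambda c: accuracy_estimate(*c))

print(best_scaling(target_flops=0.8e9, target_memory=8e6))
```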
Method
Compound scaling method
Uniformly scale the network's width, depth, and resolution with a single compound coefficient φ.
φ is a user-specified coefficient that controls how many extra resources are available for model scaling.
α, β, γ specify how to distribute those resources across depth (d = α^φ), width (w = β^φ), and resolution (r = γ^φ); width and resolution contribute quadratically (w^2, r^2) because FLOPS scale roughly with d · w^2 · r^2.
For example, constrain α · β^2 · γ^2 ≈ 2 (so total FLOPS grow by about 2^φ), fix φ = 1, and solve for α, β, γ with a small grid search.
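A short sketch of the compound scaling rule, using the α = 1.2, β = 1.1, γ = 1.15 values the paper reports from its grid search on the B0 baseline (everything else here is illustrative):

```python
ALPHA, BETA, GAMMA = 1.2, 1.1, 1.15   # grid-search result reported for the B0 baseline

def compound_scaling(phi: float):
    """Return (depth, width, resolution) multipliers for compound coefficient phi."""
    return ALPHA ** phi, BETA ** phi, GAMMA ** phi

# The constraint alpha * beta^2 * gamma^2 ~= 2 means FLOPS grow by about 2^phi.
print(ALPHA * BETA ** 2 * GAMMA ** 2)   # ~1.92, close to 2

for phi in range(4):                    # phi = 0 is the unscaled baseline
    d, w, r = compound_scaling(phi)
    flops_multiplier = d * w ** 2 * r ** 2   # FLOPS relative to the baseline
    print(f"phi={phi}: d={d:.2f}, w={w:.2f}, r={r:.2f}, ~{flops_multiplier:.2f}x FLOPS")
```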