[pytorch] yolov3.cfg Parameters Explained (per-layer outputs, plus the route, yolo, and shortcut layers)


Code link: pytorch yolov3
A walkthrough of the yolov3.cfg parameters, with every layer's output shape worked out, aimed at helping newcomers understand the file. If you spot a mistake, please point it out and I'll fix it right away!
(I'm new to Markdown; apologies if the layout is rough.)
Input image: width=416, height=416, channels=3
no.x: running index over convolutional layers only
[section]x: layer index, as used by route/shortcut references (a minimal parser sketch follows below)
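To make the layer indexing concrete, here is a minimal, hypothetical sketch of how a Darknet .cfg file is typically parsed into an ordered list of blocks (names are my own, not necessarily the linked repo's code):

```python
def parse_cfg(path):
    """Parse a Darknet .cfg file into an ordered list of {key: value} dicts.

    Minimal sketch. Trailing digits after ']' (this article's layer-index
    annotation) are ignored; a real yolov3.cfg has plain section headers.
    """
    blocks = []
    with open(path) as f:
        for raw in f:
            line = raw.strip()
            if not line or line.startswith('#'):      # skip blanks/comments
                continue
            if line.startswith('['):                  # e.g. [convolutional]
                blocks.append({'type': line[1:line.index(']')]})
            else:
                key, value = line.split('=', 1)
                blocks[-1][key.strip()] = value.strip()
    return blocks

# blocks[0] is usually the [net] hyperparameter block; the remaining blocks
# appear in file order, which is exactly the order that route and shortcut
# indices refer to.
```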

Backbone(Darknet53)

#(no.1)conv+bn+leakyrelu

#output_shape: $32\times416\times416$

[convolutional]0
batch_normalize=1
filters=32
size=3
stride=1
pad=1
activation=leaky
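In PyTorch, each [convolutional] block with batch_normalize=1 maps naturally onto Conv2d + BatchNorm2d + LeakyReLU (Darknet uses a 0.1 negative slope). A minimal sketch, with a helper name of my own:

```python
import torch.nn as nn

def conv_bn_leaky(in_ch, out_ch, size, stride):
    """One cfg [convolutional] block: conv + BN + LeakyReLU(0.1).

    In Darknet, pad=1 means 'same-style' padding, i.e. padding = size // 2.
    bias=False because BatchNorm already supplies the affine shift.
    """
    return nn.Sequential(
        nn.Conv2d(in_ch, out_ch, kernel_size=size, stride=stride,
                  padding=size // 2, bias=False),
        nn.BatchNorm2d(out_ch),
        nn.LeakyReLU(0.1, inplace=True),
    )

# no.1 above: conv_bn_leaky(3, 32, size=3, stride=1) -> (32, 416, 416)
```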

First downsample (to 208)

#Downsample(stride=2)

#(no.2)conv+bn+leakyrelu

#output_shape: $64\times208\times208$

[convolutional]1
batch_normalize=1
filters=64
size=3
stride=2
pad=1
activation=leaky
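The spatial sizes quoted throughout this article follow the standard convolution formula $\lfloor(in + 2\cdot pad - size)/stride\rfloor + 1$; a quick sanity check for this stride-2 layer:

```python
def conv_out(in_size, size, stride, pad):
    # Standard convolution output-size formula.
    return (in_size + 2 * pad - size) // stride + 1

print(conv_out(416, size=3, stride=2, pad=1))  # 208, matching the shape above
```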

#(no.3)conv+bn+leaky

#output_shape: $32\times208\times208$

[convolutional]2
batch_normalize=1
filters=32
size=1
stride=1
pad=1
activation=leaky

#(no.4)conv+bn+leaky

#output_shape: $64\times208\times208$

[convolutional]3
batch_normalize=1
filters=64
size=3
stride=1
pad=1
activation=leaky

#(shortcut_1) fuses the previous conv layer (no.4, $64\times208\times208$) with the output from three layers back (no.2, $64\times208\times208$)

#output_shape: $64\times208\times208$

[shortcut]4
from=-3
activation=linear
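In code, a [shortcut] with from=-3 and activation=linear is just an element-wise sum with no non-linearity afterwards. A sketch of the forward logic, assuming the model caches every layer's output in a list called outputs (my own convention, not necessarily the repo's):

```python
def shortcut_forward(outputs, i, frm=-3):
    """Output of [shortcut] layer i: previous layer plus the layer at i+frm.

    from=-3 -> add the output from three layers back; activation=linear
    means the sum is returned as-is.
    """
    return outputs[i - 1] + outputs[i + frm]
```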



Second downsample (to 104)

#Downsample

#(no.5)conv+bn+leaky

#output_shape: $128\times104\times104$

[convolutional]5
batch_normalize=1
filters=128
size=3
stride=2
pad=1
activation=leaky

#(no.6)conv+bn+leaky

#output_shape: $64\times104\times104$

[convolutional]6
batch_normalize=1
filters=64
size=1
stride=1
pad=1
activation=leaky

#(no.7)conv+bn+leaky

#output_shape: $128\times104\times104$

[convolutional]7
batch_normalize=1
filters=128
size=3
stride=1
pad=1
activation=leaky

#Same as before; output_shape: $128\times104\times104$

[shortcut]8
from=-3
activation=linear



#(no.8)conv+bn+leaky

#output_shape: $64\times104\times104$

[convolutional]9
batch_normalize=1
filters=64
size=1
stride=1
pad=1
activation=leaky

#(no.9)conv+bn+leaky

#output_shape: $128\times104\times104$

[convolutional]10
batch_normalize=1
filters=128
size=3
stride=1
pad=1
activation=leaky

#Same as before, but this time the connection is to the previous shortcut's output; output_shape: $128\times104\times104$

[shortcut]11
from=-3
activation=linear


At this point, the (128,104,104) feature map has gone through the (reduce → conv → residual) pattern twice; a sketch of this residual unit follows.
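This (reduce → conv → residual) trio is Darknet's residual unit: a 1×1 conv halves the channels, a 3×3 conv restores them, and the shortcut adds the block's input back. A sketch reusing the conv_bn_leaky helper from earlier:

```python
import torch.nn as nn

class DarknetResidual(nn.Module):
    """1x1 reduce -> 3x3 expand -> add input (the conv/conv/shortcut trio)."""

    def __init__(self, channels):
        super().__init__()
        self.reduce = conv_bn_leaky(channels, channels // 2, size=1, stride=1)
        self.expand = conv_bn_leaky(channels // 2, channels, size=3, stride=1)

    def forward(self, x):
        # The shortcut's activation=linear: plain addition, no non-linearity.
        return x + self.expand(self.reduce(x))
```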


Third downsample (to 52)

#Downsample

#(no.10)conv+bn+leaky

#output_shape: $256\times52\times52$

[convolutional]12
batch_normalize=1
filters=256
size=3
stride=2
pad=1
activation=leaky

#(no.11)conv+bn+leaky

#output_shape: $128\times52\times52$

[convolutional]13
batch_normalize=1
filters=128
size=1
stride=1
pad=1
activation=leaky

#(no.12)conv+bn+leaky

#output_shape: $256\times52\times52$

[convolutional]14
batch_normalize=1
filters=256
size=3
stride=1
pad=1
activation=leaky

#Same as before: conv no.12 is fused with conv no.10; output_shape: $256\times52\times52$

[shortcut]15
from=-3
activation=linear



#(no.13)conv+bn+leaky

#output_shape: $128\times52\times52$

[convolutional]16
batch_normalize=1
filters=128
size=1
stride=1
pad=1
activation=leaky

#(no.14)conv+bn+leaky

#output_shape: $256\times52\times52$

[convolutional]17
batch_normalize=1
filters=256
size=3
stride=1
pad=1
activation=leaky

#Same as before: conv no.14 connects to the previous shortcut; output_shape: $256\times52\times52$

[shortcut]18
from=-3
activation=linear



#(no.15)conv+bn+leaky

#output_shape: $128\times52\times52$

[convolutional]19
batch_normalize=1
filters=128
size=1
stride=1
pad=1
activation=leaky

#(no.16)conv+bn+leaky

#output_shape: $256\times52\times52$

[convolutional]20
batch_normalize=1
filters=256
size=3
stride=1
pad=1
activation=leaky

#Same as before: conv no.16 connects to the previous shortcut; output_shape: $256\times52\times52$

[shortcut]21
from=-3
activation=linear



#(no.17)conv+bn+leaky

#output_shape: $128\times52\times52$

[convolutional]22
batch_normalize=1
filters=128
size=1
stride=1
pad=1
activation=leaky

#(no.18)conv+bn+leaky

#output_shape: $256\times52\times52$

[convolutional]23
batch_normalize=1
filters=256
size=3
stride=1
pad=1
activation=leaky

#Same as before: conv no.18 connects to the previous shortcut; output_shape: $256\times52\times52$

[shortcut]24
from=-3
activation=linear



#(no.19)conv+bn+leaky

#output_shape: $128\times52\times52$

[convolutional]25
batch_normalize=1
filters=128
size=1
stride=1
pad=1
activation=leaky

#(no.20)conv+bn+leaky

#output_shape: $256\times52\times52$

[convolutional]26
batch_normalize=1
filters=256
size=3
stride=1
pad=1
activation=leaky

#Same as before: conv no.20 connects to the previous shortcut; output_shape: $256\times52\times52$

[shortcut]27
from=-3
activation=linear



#(no.21)conv+bn+leaky

#output_shape: $128\times52\times52$

[convolutional]28
batch_normalize=1
filters=128
size=1
stride=1
pad=1
activation=leaky

#(no.22)conv+bn+leaky

#output_shape: $256\times52\times52$

[convolutional]29
batch_normalize=1
filters=256
size=3
stride=1
pad=1
activation=leaky

#Same as before: conv no.22 connects to the previous shortcut; output_shape: $256\times52\times52$

[shortcut]30
from=-3
activation=linear



#(no.23)conv+bn+leaky

#output_shape: $128\times52\times52$

[convolutional]31
batch_normalize=1
filters=128
size=1
stride=1
pad=1
activation=leaky

#(no.24)conv+bn+leaky

#output_shape: $256\times52\times52$

[convolutional]32
batch_normalize=1
filters=256
size=3
stride=1
pad=1
activation=leaky

#Same as before: conv no.24 connects to the previous shortcut; output_shape: $256\times52\times52$

[shortcut]33
from=-3
activation=linear



#(no.25)conv+bn+leaky

#output_shape: $128\times52\times52$

[convolutional]34
batch_normalize=1
filters=128
size=1
stride=1
pad=1
activation=leaky

#(no.26)conv+bn+leaky

#output_shape: $256\times52\times52$

[convolutional]35
batch_normalize=1
filters=256
size=3
stride=1
pad=1
activation=leaky

#Same as before: conv no.26 connects to the previous shortcut; output_shape: $256\times52\times52$

[shortcut]36
from=-3
activation=linear


At this point, the (256,52,52) feature map has gone through the same (reduce → conv → residual) pattern 8 times; a sketch of the whole stage pattern follows.
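Each backbone stage is therefore one stride-2 downsampling conv followed by a run of residual units; across the five stages the repeat counts are 1, 2, 8, 8, 4. A sketch building on the helpers above:

```python
import torch.nn as nn

def make_stage(in_ch, out_ch, num_blocks):
    """One Darknet-53 stage: stride-2 downsample, then residual units."""
    layers = [conv_bn_leaky(in_ch, out_ch, size=3, stride=2)]
    layers += [DarknetResidual(out_ch) for _ in range(num_blocks)]
    return nn.Sequential(*layers)

# Darknet-53 stages for a 416x416 input:
#   make_stage(32, 64, 1)     -> (64, 208, 208)
#   make_stage(64, 128, 2)    -> (128, 104, 104)
#   make_stage(128, 256, 8)   -> (256, 52, 52)   <- the stage just finished
#   make_stage(256, 512, 8)   -> (512, 26, 26)
#   make_stage(512, 1024, 4)  -> (1024, 13, 13)
```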


Fourth downsample (to 26)

#Downsample

#(no.27)conv+bn+leaky

#output_shape: $512\times26\times26$

[convolutional]37
batch_normalize=1
filters=512
size=3
stride=2
pad=1
activation=leaky

#(no.28)conv+bn+leaky

#output_shape: $256\times26\times26$

[convolutional]38
batch_normalize=1
filters=256
size=1
stride=1
pad=1
activation=leaky

#(no.29)conv+bn+leaky

#output_shape: $512\times26\times26$

[convolutional]39
batch_normalize=1
filters=512
size=3
stride=1
pad=1
activation=leaky

#Same as before: conv no.29 is fused with conv no.27; output_shape: $512\times26\times26$

[shortcut]40
from=-3
activation=linear



#(no.30)conv+bn+leaky

#output_shape: $256\times26\times26$

[convolutional]41
batch_normalize=1
filters=256
size=1
stride=1
pad=1
activation=leaky

#(no.31)conv+bn+leaky

#output_shape: $512\times26\times26$

[convolutional]42
batch_normalize=1
filters=512
size=3
stride=1
pad=1
activation=leaky

#Same as before: conv no.31 connects to the previous shortcut layer; output_shape: $512\times26\times26$

[shortcut]43
from=-3
activation=linear



#(no.32)conv+bn+leaky

#output_shape: $256\times26\times26$

[convolutional]44
batch_normalize=1
filters=256
size=1
stride=1
pad=1
activation=leaky

#(no.33)conv+bn+leaky

#output_shape: $512\times26\times26$

[convolutional]45
batch_normalize=1
filters=512
size=3
stride=1
pad=1
activation=leaky

#Same as before: conv no.33 connects to the previous shortcut layer; output_shape: $512\times26\times26$

[shortcut]46
from=-3
activation=linear



#(no.34)conv+bn+leaky

#output_shape: $256\times26\times26$

[convolutional]47
batch_normalize=1
filters=256
size=1
stride=1
pad=1
activation=leaky

#(no.35)conv+bn+leaky

#output_shape: $512\times26\times26$

[convolutional]48
batch_normalize=1
filters=512
size=3
stride=1
pad=1
activation=leaky

#Same as before: conv no.35 connects to the previous shortcut layer; output_shape: $512\times26\times26$

[shortcut]49
from=-3
activation=linear



#(no.36)conv+bn+leaky

#output_shape: $256\times26\times26$

[convolutional]50
batch_normalize=1
filters=256
size=1
stride=1
pad=1
activation=leaky

#(no.37)conv+bn+leaky

#output_shape: $512\times26\times26$

[convolutional]51
batch_normalize=1
filters=512
size=3
stride=1
pad=1
activation=leaky

#Same as before: conv no.37 connects to the previous shortcut layer; output_shape: $512\times26\times26$

[shortcut]52
from=-3
activation=linear



#(no.38)conv+bn+leaky

#output_shape: $256\times26\times26$

[convolutional]53
batch_normalize=1
filters=256
size=1
stride=1
pad=1
activation=leaky

#(no.39)conv+bn+leaky

#output_shape: $512\times26\times26$

[convolutional]54
batch_normalize=1
filters=512
size=3
stride=1
pad=1
activation=leaky

#Same as before: conv no.39 connects to the previous shortcut layer; output_shape: $512\times26\times26$

[shortcut]55
from=-3
activation=linear



#(no.40)conv+bn+leaky

#output_shape: $256\times26\times26$

[convolutional]56
batch_normalize=1
filters=256
size=1
stride=1
pad=1
activation=leaky

#(no.41)conv+bn+leaky

#output_shape: $512\times26\times26$

[convolutional]57
batch_normalize=1
filters=512
size=3
stride=1
pad=1
activation=leaky

#Same as before: conv no.41 connects to the previous shortcut layer; output_shape: $512\times26\times26$

[shortcut]58
from=-3
activation=linear



#(no.42)conv+bn+leaky

#output_shape: $256\times26\times26$

[convolutional]59
batch_normalize=1
filters=256
size=1
stride=1
pad=1
activation=leaky

#(no.43)conv+bn+leaky

#output_shape: $512\times26\times26$

[convolutional]60
batch_normalize=1
filters=512
size=3
stride=1
pad=1
activation=leaky

#Same as before: conv no.43 connects to the previous shortcut layer; output_shape: $512\times26\times26$

[shortcut]61
from=-3
activation=linear


At this point, the (512,26,26) feature map has gone through the same (reduce → conv → residual) pattern 8 times.


Fifth downsample (to 13)

#Downsample

#(no.44)conv+bn+leaky

#output_shape: $1024\times13\times13$

[convolutional]
batch_normalize=1
filters=1024
size=3
stride=2
pad=1
activation=leaky

#(no.45)conv+bn+leaky

#output_shape: $512\times13\times13$

[convolutional]
batch_normalize=1
filters=512
size=1
stride=1
pad=1
activation=leaky

#(no.46)conv+bn+leaky

#output_shape: $1024\times13\times13$

[convolutional]
batch_normalize=1
filters=1024
size=3
stride=1
pad=1
activation=leaky

#Same as before: conv no.46 is fused with conv no.44; output_shape: $1024\times13\times13$

[shortcut]
from=-3
activation=linear



#(no.47)conv+bn+leaky

#output_shape: $512\times13\times13$

[convolutional]
batch_normalize=1
filters=512
size=1
stride=1
pad=1
activation=leaky

#(no.48)conv+bn+leaky

#output_shape: $1024\times13\times13$

[convolutional]
batch_normalize=1
filters=1024
size=3
stride=1
pad=1
activation=leaky

#Same as before: conv no.48 connects to the previous shortcut layer; output_shape: $1024\times13\times13$

[shortcut]
from=-3
activation=linear



#(no.49)conv+bn+leaky

#output_shape: $512\times13\times13$

[convolutional]
batch_normalize=1
filters=512
size=1
stride=1
pad=1
activation=leaky

#(no.50)conv+bn+leaky

#output_shape: $1024\times13\times13$

[convolutional]
batch_normalize=1
filters=1024
size=3
stride=1
pad=1
activation=leaky

#Same as before: conv no.50 connects to the previous shortcut layer; output_shape: $1024\times13\times13$

[shortcut]
from=-3
activation=linear



#(no.51)conv+bn+leaky

#output_shape: $512\times13\times13$

[convolutional]
batch_normalize=1
filters=512
size=1
stride=1
pad=1
activation=leaky

#(no.52)conv+bn+leaky

#output_shape: $1024\times13\times13$

[convolutional]
batch_normalize=1
filters=1024
size=3
stride=1
pad=1
activation=leaky

#Same as before: conv no.52 connects to the previous shortcut layer; output_shape: $1024\times13\times13$

[shortcut]
from=-3
activation=linear


Finally, after the (1024,13,13) map has been through the pattern 4 times, the wall of hash marks below signals the end of the backbone.


######################

YOLOLayer

First yolo layer

#(no.53)conv+bn+leaky

#output_shape: $512\times13\times13$

[convolutional]
batch_normalize=1
filters=512
size=1
stride=1
pad=1
activation=leaky

#(no.54)conv+bn+leaky

#output_shape: $1024\times13\times13$

[convolutional]
batch_normalize=1
size=3
stride=1
pad=1
filters=1024
activation=leaky

#(no.55)conv+bn+leaky

#output_shape: $512\times13\times13$

[convolutional]
batch_normalize=1
filters=512
size=1
stride=1
pad=1
activation=leaky

#(no.56)conv+bn+leaky

#output_shape: $1024\times13\times13$

[convolutional]
batch_normalize=1
size=3
stride=1
pad=1
filters=1024
activation=leaky

#(no.57)conv+bn+leaky

#output_shape: $512\times13\times13$

[convolutional]
batch_normalize=1
filters=512
size=1
stride=1
pad=1
activation=leaky

#(no.58)conv+bn+leaky

#output_shape: $1024\times13\times13$

[convolutional]
batch_normalize=1
size=3
stride=1
pad=1
filters=1024
activation=leaky

#(no.59)conv+bn+leaky

#This is the conv layer right before the yolo layer; it must project the feature map to ($anchors\times(classes+5)$, 13, 13), i.e. (255, 13, 13) here, before feeding it to the yolo layer

#output_shape: $255\times13\times13$

[convolutional]
size=1
stride=1
pad=1
filters=255
activation=linear
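The 255 is not magic: it is anchors-per-scale × (4 box offsets + 1 objectness + classes). If you retrain with a different class count, this filters value must be recomputed:

```python
def yolo_head_filters(num_anchors=3, num_classes=80):
    # 4 box coordinates + 1 objectness score + one score per class.
    return num_anchors * (5 + num_classes)

print(yolo_head_filters())                # 255 (COCO, 80 classes)
print(yolo_head_filters(num_classes=20))  # 75  (e.g. Pascal VOC)
```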

#First yolo layer.
#mask = 6,7,8 means this layer uses the last three (largest) anchors; num=9 is the total anchor count across all scales, of which each yolo layer uses the 3 picked by its mask. jitter randomly perturbs the input during training for data augmentation. ignore_thresh is the IoU threshold above which a prediction's no-objectness loss is ignored (0.7 in this cfg; some configs use 0.5). random=1 enables multi-scale training.
#The output is (3,13,13,85): the 255 channels split into 3 anchors, each carrying 4 box coordinates + 1 objectness score + 80 class scores.

[yolo]
mask = 6,7,8
anchors = 10,13, 16,30, 33,23, 30,61, 62,45, 59,119, 116,90, 156,198, 373,326
classes=80
num=9
jitter=.3
ignore_thresh = .7
truth_thresh = 1
random=1
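A rough sketch of what the yolo layer does with its (255, 13, 13) input at inference time (simplified: the anchor/stride scaling of box sizes is omitted, and the function name is my own):

```python
import torch

def yolo_reshape(x, num_anchors=3, num_classes=80):
    """(N, 255, 13, 13) -> (N, 3, 13, 13, 85), with predictions squashed.

    Sketch only: centre offsets, objectness and class scores get a sigmoid;
    w/h stay raw (they are later exponentiated and scaled by the masked
    anchors, which is omitted here).
    """
    n, _, h, w = x.shape
    x = (x.view(n, num_anchors, 5 + num_classes, h, w)
          .permute(0, 1, 3, 4, 2).contiguous())
    x[..., 0:2] = torch.sigmoid(x[..., 0:2])  # cx, cy offsets within a cell
    x[..., 4:] = torch.sigmoid(x[..., 4:])    # objectness + class scores
    return x
```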

Second yolo layer

#route layer: takes the output of the layer 4 positions back from here, i.e. conv no.57 (512,13,13).

[route]
layers = -4

#(no.60)conv+bn+leaky

#output_shape: $256\times13\times13$

[convolutional]
batch_normalize=1
filters=256
size=1
stride=1
pad=1
activation=leaky

#Upsample layer (2x); output is (256,26,26)

[upsample]
stride=2

#route layer (two arguments): concatenates the previous layer, i.e. the upsample output (256,26,26), with layer 61 (512,26,26), the final result of the fourth downsample's residual stage, along the channel dimension. (For absolute indices, look at the number after each [section] header and count forward from the start.) A sketch follows the block below.
#So this layer's output is (768,26,26)

[route]
layers = -1, 61
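So a single-argument [route] simply forwards a cached output, while a two-argument [route] concatenates the referenced outputs along the channel dimension (256 + 512 = 768 here). A sketch, using the same outputs cache as the shortcut example:

```python
import torch

def route_forward(outputs, i, layer_ids):
    """[route] layer i: gather referenced outputs, concat if more than one.

    Negative ids are relative to the current layer; non-negative ids are
    absolute positions in the layer list.
    """
    feats = [outputs[i + l] if l < 0 else outputs[l] for l in layer_ids]
    return torch.cat(feats, dim=1) if len(feats) > 1 else feats[0]

# Here: cat([(N, 256, 26, 26) from the upsample, (N, 512, 26, 26) from
# layer 61], dim=1) -> (N, 768, 26, 26)
```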

#(no.61)conv+bn+leaky

#output_shape: $256\times26\times26$

[convolutional]
batch_normalize=1
filters=256
size=1
stride=1
pad=1
activation=leaky

#(no.62)conv+bn+leaky

#output_shape: $512\times26\times26$

[convolutional]
batch_normalize=1
size=3
stride=1
pad=1
filters=512
activation=leaky

#(no.63)conv+bn+leaky

#output_shape: $256\times26\times26$

[convolutional]
batch_normalize=1
filters=256
size=1
stride=1
pad=1
activation=leaky

#(no.64)conv+bn+leaky

#output_shape: $512\times26\times26$

[convolutional]
batch_normalize=1
size=3
stride=1
pad=1
filters=512
activation=leaky

#(no.65)conv+bn+leaky

#output_shape: $256\times26\times26$

[convolutional]
batch_normalize=1
filters=256
size=1
stride=1
pad=1
activation=leaky

#(no.66)conv+bn+leaky

#output_shape: $512\times26\times26$

[convolutional]
batch_normalize=1
size=3
stride=1
pad=1
filters=512
activation=leaky

#(no.67)conv+bn+leaky

#output_shape: $255\times26\times26$

[convolutional]
size=1
stride=1
pad=1
filters=255
activation=linear

#yolo layer, explained above. Output is (3,26,26,85)

[yolo]
mask = 3,4,5
anchors = 10,13, 16,30, 33,23, 30,61, 62,45, 59,119, 116,90, 156,198, 373,326
classes=80
num=9
jitter=.3
ignore_thresh = .7
truth_thresh = 1
random=1

Third yolo layer

#route layer: takes the output of the layer 4 positions back, i.e. conv no.65 (256,26,26)

[route]
layers = -4

#(no.68)conv+bn+leaky

#output_shape: $128\times26\times26$

[convolutional]
batch_normalize=1
filters=128
size=1
stride=1
pad=1
activation=leaky

#Upsample layer (2x); output is (128,52,52)

[upsample]
stride=2

#route layer (two arguments): concatenates the previous layer, i.e. the upsample output (128,52,52), with layer 36 (256,52,52), the final result of the third downsample's residual stage, along the channel dimension.
#So this layer's output is (384,52,52)

[route]
layers = -1, 36

#(no.69)conv+bn+leaky

#output_shape: $128\times52\times52$

[convolutional]
batch_normalize=1
filters=128
size=1
stride=1
pad=1
activation=leaky

#(no.70)conv+bn+leaky

#output_shape: $256\times52\times52$

[convolutional]
batch_normalize=1
size=3
stride=1
pad=1
filters=256
activation=leaky

#(no.71)conv+bn+leaky

#output_shape: $128\times52\times52$

[convolutional]
batch_normalize=1
filters=128
size=1
stride=1
pad=1
activation=leaky

#(no.72)conv+bn+leaky

#output_shape: $256\times52\times52$

[convolutional]
batch_normalize=1
size=3
stride=1
pad=1
filters=256
activation=leaky

#(no.73)conv+bn+leaky

#output_shape: $128\times52\times52$

[convolutional]
batch_normalize=1
filters=128
size=1
stride=1
pad=1
activation=leaky

#(no.74)conv+bn+leaky

#output_shape: $256\times52\times52$

[convolutional]
batch_normalize=1
size=3
stride=1
pad=1
filters=256
activation=leaky

#(no.75)conv+bn+leaky

#output_shape: $255\times52\times52$

[convolutional]
size=1
stride=1
pad=1
filters=255
activation=linear

#yolo layer, explained above. Output is (3,52,52,85)

[yolo]
mask = 0,1,2
anchors = 10,13, 16,30, 33,23, 30,61, 62,45, 59,119, 116,90, 156,198, 373,326
classes=80
num=9
jitter=.3
ignore_thresh = .7
truth_thresh = 1
random=1
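Across the three scales, the network therefore emits 3 × (13² + 26² + 52²) = 10647 raw boxes per 416×416 image, which are then pruned by confidence filtering and NMS:

```python
num_anchors = 3
grid_sizes = [13, 26, 52]  # one per yolo layer
total = sum(num_anchors * g * g for g in grid_sizes)
print(total)  # 10647 raw predictions per image
```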

Wrapping up

And with that, yolov3.cfg is fully parsed!!!
