import torch.nn as nn
from torchvision import models

class SVHN_Model2(nn.Module):
    def __init__(self):
        super(SVHN_Model2, self).__init__()
        # pretrained ResNet18 backbone with the final fc layer removed
        model_conv = models.resnet18(pretrained=True)
        model_conv.avgpool = nn.AdaptiveAvgPool2d(1)
        model_conv = nn.Sequential(*list(model_conv.children())[:-1])
        self.cnn = model_conv
        # one 11-class classifier head per character position
        self.fc1 = nn.Linear(512, 11)
        self.fc2 = nn.Linear(512, 11)
        self.fc3 = nn.Linear(512, 11)
        self.fc4 = nn.Linear(512, 11)
        self.fc5 = nn.Linear(512, 11)

    def forward(self, img):
        feat = self.cnn(img)
        # print(feat.shape)
        feat = feat.view(feat.shape[0], -1)
        c1 = self.fc1(feat)
        c2 = self.fc2(feat)
        c3 = self.fc3(feat)
        c4 = self.fc4(feat)
        c5 = self.fc5(feat)
        return c1, c2, c3, c4, c5
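As a quick sanity check, the model can be run on a dummy batch to confirm that it returns five sets of 11-class logits, one per character position. This is only a minimal sketch; the batch size and image size below are placeholders, not values from the original training setup.

import torch

model = SVHN_Model2()
dummy = torch.zeros(4, 3, 64, 128)   # a fake batch: 4 RGB images of 64x128 pixels
c1, c2, c3, c4, c5 = model(dummy)
print(c1.shape)                      # torch.Size([4, 11]); the other four heads have the same shape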
Here we use a pretrained model. Let's talk about how to use pretrained models.
Transfer Learning
Transfer learning means transferring a trained model and its parameters to a new model, so that we do not need to retrain a new model from scratch.
For example, we can train a CNN on ImageNet and then apply the trained model to other image classification data, or even use it as a feature extractor feeding a traditional method such as an SVM.
In short, the main purpose of transfer learning is to cope with the difficulty of labeling and acquiring enough data for the target task.
Why is it called transfer learning? Because we transfer the parameters, architecture and other information learned by another model to the current target problem, to help us solve it better.
What training set is this other model built on? As one would expect, if that training set is very different from the training set of our target task, the transfer effect may not be very good.
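Here is a minimal sketch of the "CNN as feature extractor + traditional classifier" idea mentioned above, using a pretrained ResNet18 to produce 512-dimensional features and feeding them to an SVM from scikit-learn. The variables train_images and train_labels are placeholders for your own labelled data, not something defined in this post.

import torch
import torch.nn as nn
from torchvision import models
from sklearn.svm import SVC

# pretrained backbone with the classification layer removed
backbone = nn.Sequential(*list(models.resnet18(pretrained=True).children())[:-1])
backbone.eval()

def extract_features(images):
    # images: tensor of shape (N, 3, H, W), normalized the same way as ImageNet
    with torch.no_grad():
        feat = backbone(images)          # (N, 512, 1, 1)
    return feat.view(feat.shape[0], -1).numpy()

# train_images / train_labels are placeholders for your own labelled data
X_train = extract_features(train_images)
svm = SVC(kernel='rbf')
svm.fit(X_train, train_labels)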
Layer Transfer
So, we can reuse just some of the layers: we freeze them and use them to extract features, and train the remaining layers on our labelled data.
In speech processing, it is common to fix the parameters of the last few layers and fine-tune the first few layers. This is because different individuals produce different sounds due to differences in vocal tract structure, so the same way of articulating yields different acoustics. The first few layers extract the articulation from the raw sound, so their parameters differ from person to person; the later layers map the articulation to the recognition result, and this part is universal and does not change from individual to individual.
In contrast, for image recognition we usually fix the parameters of the first few layers and tune the parameters of the later layers. The first few layers of a network used for image recognition extract simple features such as lines and contours, which are useful for almost any kind of image; the later layers combine these low-level features into high-level features, and the way they are combined differs from one recognition task to another.
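In PyTorch, freezing layers amounts to turning off gradients for their parameters and handing only the trainable parameters to the optimizer. Below is a minimal sketch for the image case (freeze the pretrained convolutional layers, train only a new classification head); the class count and learning rate are placeholders.

import torch.nn as nn
import torch.optim as optim
from torchvision import models

model = models.resnet18(pretrained=True)

# freeze all pretrained layers
for param in model.parameters():
    param.requires_grad = False

# replace the final layer; a freshly created layer is trainable by default
model.fc = nn.Linear(512, 11)   # 11 is a placeholder for the number of classes

# pass only the unfrozen parameters to the optimizer
optimizer = optim.SGD(filter(lambda p: p.requires_grad, model.parameters()), lr=1e-3)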
Multitask Transfer
The idea of multitask learning is to let multiple tasks train the first few layers of the network jointly, while the last few layers are trained separately for each task. If Task A and Task B are similar, as in the figure above, they can be trained at the same time. For example, if Task A and Task B are both image recognition tasks, one for cat vs. dog classification and the other for elephant vs. tiger classification, then we can let them share the parameters of the first few layers.
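A minimal sketch of this shared-backbone idea is shown below; the two heads and their class counts (cat vs. dog, elephant vs. tiger) are just illustrative. Note that SVHN_Model2 at the top of this section follows the same pattern: one shared CNN and five per-character heads.

import torch.nn as nn
from torchvision import models

class MultiTaskModel(nn.Module):
    def __init__(self):
        super(MultiTaskModel, self).__init__()
        # shared feature extractor (the "first few layers")
        self.backbone = nn.Sequential(*list(models.resnet18(pretrained=True).children())[:-1])
        # task-specific heads (the "last few layers")
        self.head_a = nn.Linear(512, 2)   # e.g. cat vs. dog
        self.head_b = nn.Linear(512, 2)   # e.g. elephant vs. tiger

    def forward(self, img):
        feat = self.backbone(img).view(img.shape[0], -1)
        return self.head_a(feat), self.head_b(feat)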
What are the benefits of doing this? It can prevent overfitting to a certain extent: multi-task training forces the feature extraction in the first few layers to work for several tasks at once, which improves generalization. It can also be understood this way: when training Task A, the training data of Task B can be regarded as noise, and noise can improve the robustness of the network and thus its generalization; the same holds when training Task B.
In speech recognition, recognition tasks in different languages can help each other. According to Professor Li Hongyi, researchers have tried pairwise combinations of dozens of languages and found that performance improved on all of these recognition tasks.
The following figure shows an example of the performance improvement on Chinese (Mandarin) when training is assisted by European languages: