Keras transfer learning, fine-tuning, and defining the model predict function

See also: the Keras cats-vs-dogs example

from keras.layers import Dense, GlobalAveragePooling2D
from keras.models import Model

FC_SIZE = 1024  # width of the new fully connected layer

def add_new_last_layer(base_model, nb_classes):
  """Add a new classification head to the convnet
  Args:
    base_model: keras model excluding top
    nb_classes: # of classes
  Returns:
    new keras model with last layer
  """
  x = base_model.output
  x = GlobalAveragePooling2D()(x)           # pool conv features to a vector
  x = Dense(FC_SIZE, activation='relu')(x)  # new fully connected layer
  predictions = Dense(nb_classes, activation='softmax')(x)  # new softmax layer
  # Keras 2 uses inputs=/outputs= (Keras 1 used input=/output=)
  model = Model(inputs=base_model.input, outputs=predictions)
  return model
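
As a usage sketch (assuming an InceptionV3 base and two target classes; both values are illustrative):

from keras.applications.inception_v3 import InceptionV3

# ImageNet-pretrained convolutional base, without its classifier head
base_model = InceptionV3(weights='imagenet', include_top=False)
model = add_new_last_layer(base_model, nb_classes=2)  # e.g. cats vs. dogs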

To load a pretrained model as the front of the network and fine-tune it on your own dataset, it is best to proceed in the following two steps:

  1. Transfer learning: freeze all but the penultimate layer and re-train the new last Dense layer
  2. Fine-tuning: unfreeze the top convolutional layers and retrain them together with the new head

Doing both, in that order, ensures more stable and consistent training. This is because the large gradient updates triggered by the randomly initialized weights of the new layers could wreck the learned weights in the convolutional base if it were not frozen. Once the last layer has stabilized (transfer learning), we move on to retraining more layers (fine-tuning).

Transfer learning
def setup_to_transfer_learn(model, base_model):
  """Freeze all base_model layers (the new head stays trainable) and compile"""
  for layer in base_model.layers:
    layer.trainable = False
  model.compile(optimizer='rmsprop',
                loss='categorical_crossentropy',
                metrics=['accuracy'])
Fine-tune

from keras.optimizers import SGD

# Assumed value: in InceptionV3 this leaves roughly the top 2 inception
# blocks trainable; adjust for your architecture.
NB_IV3_LAYERS_TO_FREEZE = 172

def setup_to_finetune(model):
  """Freeze the bottom NB_IV3_LAYERS_TO_FREEZE layers and retrain the
     remaining top layers.
  note: the unfrozen layers correspond to the top 2 inception blocks in
        the InceptionV3 architecture
  Args:
    model: keras model
  """
  for layer in model.layers[:NB_IV3_LAYERS_TO_FREEZE]:
    layer.trainable = False
  for layer in model.layers[NB_IV3_LAYERS_TO_FREEZE:]:
    layer.trainable = True
  model.compile(optimizer=SGD(lr=0.0001, momentum=0.9),
                loss='categorical_crossentropy')
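
To pick a freeze boundary for a different architecture, it helps to inspect the layer indices first; a quick sketch:

# print the index and name of every layer so the freeze boundary can be chosen
for i, layer in enumerate(model.layers):
  print(i, layer.name)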

When fine-tuning, it’s important to use a lower learning rate than the one used for training from scratch (here lr=0.0001); otherwise the optimization can destabilize and the loss diverge.

Training

Now we’re all set for training. Use fit_generator for both transfer learning and fine-tuning, running the two stages one after the other (a combined driver sketch follows the call below).

# Keras 1.x argument names; the Keras 2 equivalents are shown below
history = model.fit_generator(
  train_generator,
  samples_per_epoch=nb_train_samples,
  nb_epoch=nb_epoch,
  validation_data=validation_generator,
  nb_val_samples=nb_val_samples,
  class_weight='auto')
model.save(args.output_model_file)
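
Putting the pieces together, the two stages can be run back to back. A minimal driver sketch, assuming the generators and sample counts above are already defined (nb_epoch_transfer and nb_epoch_finetune are hypothetical per-stage epoch counts):

from keras.applications.inception_v3 import InceptionV3

base_model = InceptionV3(weights='imagenet', include_top=False)
model = add_new_last_layer(base_model, nb_classes)

# Stage 1: transfer learning -- only the new head is trainable
setup_to_transfer_learn(model, base_model)
model.fit_generator(train_generator,
                    samples_per_epoch=nb_train_samples,
                    nb_epoch=nb_epoch_transfer,
                    validation_data=validation_generator,
                    nb_val_samples=nb_val_samples)

# Stage 2: fine-tuning -- the top convolutional layers are unfrozen too
setup_to_finetune(model)
model.fit_generator(train_generator,
                    samples_per_epoch=nb_train_samples,
                    nb_epoch=nb_epoch_finetune,
                    validation_data=validation_generator,
                    nb_val_samples=nb_val_samples)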


In Keras 2.0 and above, the argument names of these functions changed:

    datagen = ImageDataGenerator(
        featurewise_center=False,  # set input mean to 0 over the dataset
        samplewise_center=False,  # set each sample mean to 0
        featurewise_std_normalization=False,  # divide inputs by std of the dataset
        samplewise_std_normalization=False,  # divide each input by its std
        zca_whitening=False,  # apply ZCA whitening
        rotation_range=0,  # randomly rotate images in the range (degrees, 0 to 180)
        width_shift_range=0.1,  # randomly shift images horizontally (fraction of total width)
        height_shift_range=0.1,  # randomly shift images vertically (fraction of total height)
        horizontal_flip=True,  # randomly flip images
        vertical_flip=False)  # randomly flip images

    # Compute quantities required for feature-wise normalization
    # (std, mean, and principal components if ZCA whitening is applied).
    datagen.fit(x_train)

    # Fit the model on the batches generated by datagen.flow().
    model.fit_generator(datagen.flow(x_train, y_train,
                                     batch_size=batch_size),
                        steps_per_epoch=x_train.shape[0] // batch_size,  
                        epochs=epochs,
                        validation_data=(x_test, y_test))
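
For reference, the earlier two-stage call maps onto the Keras 2 names like this (batch_size is assumed to match the generator’s batch size):

history = model.fit_generator(
    train_generator,
    steps_per_epoch=nb_train_samples // batch_size,  # replaces samples_per_epoch
    epochs=nb_epoch,                                 # replaces nb_epoch
    validation_data=validation_generator,
    validation_steps=nb_val_samples // batch_size)   # replaces nb_val_samples
# class_weight='auto' was removed in Keras 2; pass an explicit dict such as
# {0: 1.0, 1: 2.0} if the classes are imbalanced.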

Prediction function:

import numpy as np
from keras.preprocessing import image
from keras.applications.inception_v3 import preprocess_input, decode_predictions

def predict(model, img, target_size, top_n=3):
  """Run model prediction on an image
  Args:
    model: keras model
    img: PIL format image
    target_size: (width, height) tuple
    top_n: # of top predictions to return
  Returns:
    list of predicted labels and their probabilities
  """
  if img.size != target_size:
    img = img.resize(target_size)
  x = image.img_to_array(img)
  # Inserting this axis is the key step: with the TensorFlow backend, a Keras
  # model expects tensors of shape (batch_size, h, w, c), so a single image
  # must become a batch of one.
  x = np.expand_dims(x, axis=0)
  x = preprocess_input(x)
  preds = model.predict(x)
  # decode_predictions assumes the 1000 ImageNet classes; for a fine-tuned
  # model with custom classes, return preds[0] and map indices yourself.
  return decode_predictions(preds, top=top_n)[0]
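
A minimal usage sketch (the image path is a placeholder; target_size is (299, 299) for InceptionV3):

from PIL import Image
from keras.applications.inception_v3 import InceptionV3

model = InceptionV3(weights='imagenet')  # full model including the ImageNet head
img = Image.open('example.jpg')          # placeholder image path
print(predict(model, img, target_size=(299, 299)))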
