A walkthrough of the Keras example file mnist_net2net.py

This example shows how to take a shallow convolutional neural network and make it deeper or wider.

We start by building a simple network with the following structure:

_________________________________________________________________
Layer (type)                 Output Shape              Param #
=================================================================
conv1 (Conv2D)               (None, 28, 28, 64)        640
_________________________________________________________________
pool1 (MaxPooling2D)         (None, 14, 14, 64)        0
_________________________________________________________________
conv2 (Conv2D)               (None, 14, 14, 64)        36928
_________________________________________________________________
pool2 (MaxPooling2D)         (None, 7, 7, 64)          0
_________________________________________________________________
flatten (Flatten)            (None, 3136)              0
_________________________________________________________________
fc1 (Dense)                  (None, 64)                200768
_________________________________________________________________
fc2 (Dense)                  (None, 10)                650
=================================================================
Total params: 238,986
Trainable params: 238,986
Non-trainable params: 0
_________________________________________________________________
None

Once it has been trained, we widen it so that it looks like this:

_________________________________________________________________
Layer (type)                 Output Shape              Param #
=================================================================
conv1 (Conv2D)               (None, 28, 28, 128)       1280
_________________________________________________________________
pool1 (MaxPooling2D)         (None, 14, 14, 128)       0
_________________________________________________________________
conv2 (Conv2D)               (None, 14, 14, 64)        73792
_________________________________________________________________
pool2 (MaxPooling2D)         (None, 7, 7, 64)          0
_________________________________________________________________
flatten (Flatten)            (None, 3136)              0
_________________________________________________________________
fc1 (Dense)                  (None, 128)               401536
_________________________________________________________________
fc2 (Dense)                  (None, 10)                1290
=================================================================
Total params: 477,898
Trainable params: 477,898
Non-trainable params: 0
_________________________________________________________________
None

Or we deepen it so that it looks like this:

_________________________________________________________________
Layer (type)                 Output Shape              Param #
=================================================================
conv1 (Conv2D)               (None, 28, 28, 64)        640
_________________________________________________________________
pool1 (MaxPooling2D)         (None, 14, 14, 64)        0
_________________________________________________________________
conv2 (Conv2D)               (None, 14, 14, 64)        36928
_________________________________________________________________
conv2-deeper (Conv2D)        (None, 14, 14, 64)        36928
_________________________________________________________________
pool2 (MaxPooling2D)         (None, 7, 7, 64)          0
_________________________________________________________________
flatten (Flatten)            (None, 3136)              0
_________________________________________________________________
fc1 (Dense)                  (None, 64)                200768
_________________________________________________________________
fc1-deeper (Dense)           (None, 64)                4160
_________________________________________________________________
fc2 (Dense)                  (None, 10)                650
=================================================================
Total params: 280,074
Trainable params: 280,074
Non-trainable params: 0
_________________________________________________________________
None
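The parameter counts in the three summaries above can be checked by hand. The sketch below assumes 3x3 convolution kernels (this is inferred from the 640 parameters of conv1, not stated in the summaries); the helper names `conv2d_params` and `dense_params` are made up for this illustration.

```python
def conv2d_params(kh, kw, c_in, c_out):
    """Kernel weights plus one bias per output channel."""
    return kh * kw * c_in * c_out + c_out


def dense_params(n_in, n_out):
    """Weight matrix plus one bias per output unit."""
    return n_in * n_out + n_out


flat = 7 * 7 * 64  # size of the flatten output: 3136

# Original network: conv1, conv2, fc1, fc2
base = [conv2d_params(3, 3, 1, 64), conv2d_params(3, 3, 64, 64),
        dense_params(flat, 64), dense_params(64, 10)]

# Wider network: conv1 and fc1 go from 64 to 128 outputs, so conv2 and
# fc2 see 128 inputs and grow as well.
wider = [conv2d_params(3, 3, 1, 128), conv2d_params(3, 3, 128, 64),
         dense_params(flat, 128), dense_params(128, 10)]

# Deeper network: the original layers plus conv2-deeper and fc1-deeper.
deeper = base + [conv2d_params(3, 3, 64, 64), dense_params(64, 64)]

print(sum(base), sum(wider), sum(deeper))  # 238986 477898 280074
```

Note that widening a layer also changes the layer that follows it, which matters when the weights are replaced later on.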

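In other words, the example shows how to add, modify, and read the parameters of a neural network.

First, reading the parameters. The following two lines fetch the weights of a convolutional layer and of a fully connected layer: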

    w_conv1, b_conv1 = teacher_model.get_layer('conv1').get_weights()
    w_fc1, b_fc1 = teacher_model.get_layer('fc1').get_weights()

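To widen the network, the weights of the convolutional layer and the fully connected layer are overwritten with the following two lines. The layer that follows each widened layer needs new weights as well, because its input dimension changes; this is why the parameter counts of conv2 and fc2 also grow in the wider summary above.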

    model.get_layer('conv1').set_weights([new_w_conv1, new_b_conv1])
    model.get_layer('fc1').set_weights([new_w_fc1, new_b_fc1])

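As for what values to put in the new weights, that is up to you: either concatenate some randomly initialized units onto the original ones, or duplicate some of the original units and add a little noise.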
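Both options can be sketched with NumPy for a pair of dense layers. This is a minimal illustration, not the code from the example file: the function `widen_dense`, its `init` argument, and the 0.1 standard deviation are assumptions made here. The duplicate-based variant leaves out the noise so that the widened network computes exactly the same output, which the last lines check; in practice a little noise can be added to the copies so they do not stay identical during training.

```python
import numpy as np


def widen_dense(w1, b1, w2, new_width, rng, init="net2wider"):
    """Widen a dense layer (w1, b1) and adapt the next layer's kernel w2.

    w1: (n_in, n_hidden), b1: (n_hidden,), w2: (n_hidden, n_out)
    """
    n_hidden = w1.shape[1]
    n_extra = new_width - n_hidden
    assert n_extra > 0, "new_width must exceed the current width"

    if init == "random-pad":
        # Option 1: append small random units; the output changes slightly.
        extra_w1 = rng.normal(0.0, 0.1, size=(w1.shape[0], n_extra))
        extra_b1 = np.zeros(n_extra)
        extra_w2 = rng.normal(0.0, 0.1, size=(n_extra, w2.shape[1]))
        return (np.concatenate([w1, extra_w1], axis=1),
                np.concatenate([b1, extra_b1]),
                np.concatenate([w2, extra_w2], axis=0))

    # Option 2: duplicate randomly chosen existing units.
    idx = rng.integers(0, n_hidden, size=n_extra)
    new_w1 = np.concatenate([w1, w1[:, idx]], axis=1)
    new_b1 = np.concatenate([b1, b1[idx]])
    # A unit that now exists k times must split its outgoing weights k ways,
    # otherwise the next layer would receive its contribution k times.
    counts = 1.0 + np.bincount(idx, minlength=n_hidden)
    scaled_w2 = w2 / counts[:, None]
    new_w2 = np.concatenate([scaled_w2, scaled_w2[idx]], axis=0)
    return new_w1, new_b1, new_w2


def forward(x, w1, b1, w2):
    """Two dense layers with a ReLU in between (second bias omitted)."""
    return np.maximum(x @ w1 + b1, 0.0) @ w2


rng = np.random.default_rng(0)
w1, b1, w2 = rng.normal(size=(8, 4)), rng.normal(size=4), rng.normal(size=(4, 3))
new_w1, new_b1, new_w2 = widen_dense(w1, b1, w2, new_width=6, rng=rng)

x = rng.normal(size=(5, 8))
same = np.allclose(forward(x, w1, b1, w2), forward(x, new_w1, new_b1, new_w2))
print(new_w1.shape, new_b1.shape, new_w2.shape, same)
```

For a convolutional layer the idea is the same, except that the units live on the last axis of the (height, width, in_channels, out_channels) kernel and on the in_channels axis of the next layer's kernel.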

 

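To deepen the network, build a new network and copy the weights of the existing layers over to it. How the newly added layers are initialized is, again, up to you.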
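One common choice for the new layers is to initialize them as an identity mapping, so that the deeper network starts out computing the same function as the original. The sketch below is an illustration under assumptions, not the code from the example file: it assumes ReLU activations, 'same' padding, and odd kernel sizes, and the helper names `identity_dense` and `identity_conv2d` are made up here.

```python
import numpy as np


def identity_dense(width):
    """Kernel and bias for a dense layer that copies its input."""
    return np.eye(width), np.zeros(width)


def identity_conv2d(channels, kh=3, kw=3):
    """Kernel and bias for a 'same'-padded conv layer that copies its input.

    Kernel layout is (kh, kw, in_channels, out_channels), as in Keras.
    """
    kernel = np.zeros((kh, kw, channels, channels))
    # Only the centre tap is non-zero, and it maps channel i to channel i.
    kernel[kh // 2, kw // 2] = np.eye(channels)
    return kernel, np.zeros(channels)


def relu(z):
    return np.maximum(z, 0.0)


rng = np.random.default_rng(0)
h = relu(rng.normal(size=(5, 64)))   # stand-in for the ReLU output of fc1

w, b = identity_dense(64)            # candidate weights for fc1-deeper
# The input is already non-negative, so applying ReLU again changes nothing.
unchanged = np.allclose(relu(h @ w + b), h)

k, kb = identity_conv2d(64)          # candidate weights for conv2-deeper
print(w.shape, k.shape, unchanged)
```

The shapes agree with the deeper summary: 64 * 64 + 64 = 4160 parameters for fc1-deeper and 3 * 3 * 64 * 64 + 64 = 36928 for conv2-deeper. These arrays would be passed to `set_weights` on the new layers, in the same way as the widened weights.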

 

The modified network is then trained again.
