- Dataset: MNIST; compare with the version here.
- For background on tf.nn.depthwise_conv2d, see here; the main thing to understand is the layout of the kernel parameter, i.e. (height, width, in_channels, channel_multiplier), where the last dimension is the number of output channels produced per input channel. A minimal shape check follows this list.
- Training is slow and convergence is slow; at the start it looks as if no training is happening at all. Changing just one convolution layer to a depthwise separable convolution added 12 extra training iterations.
- 20211213: after changing the first convolution layer to a depthwise separable convolution, training never converged, so I switched to the code below; it took 16 more training iterations and the model shrank by 6.85%. Using sp_conv also fails to converge, possibly because of batch_normalization.
- 20211214: case closed: ReLU was the culprit. After commenting it out, training took 28 more iterations. Posts online suggest the learning rate should also be reduced.
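To make the kernel layout concrete, here is a minimal sketch (the batch size, channel counts, and variable name are illustrative assumptions, not taken from the original script) showing how the channel_multiplier dimension controls the depthwise output:

```python
import tensorflow as tf  # TensorFlow 1.x API

# Dummy NHWC batch: 4 MNIST-sized feature maps with 8 channels (illustrative).
x = tf.zeros([4, 28, 28, 8])
# Kernel layout: (height, width, in_channels, channel_multiplier);
# channel_multiplier=2 means each input channel yields 2 output channels.
dw_kernel = tf.get_variable('dw_kernel', shape=[3, 3, 8, 2], initializer=tf.random_normal_initializer(stddev=0.01))
y = tf.nn.depthwise_conv2d(x, dw_kernel, strides=[1, 1, 1, 1], padding='SAME')
print(y.shape)  # (4, 28, 28, 16): 8 input channels x multiplier 2
```

With that layout in mind, the depthwise separable layer from the notes above: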
```python
import tensorflow as tf
import numpy as np
import random
import cv2, sys, os
import MyData

def sp_conv(name, data, kernel_size, input_num, output_num, padding, data_format='NHWC'):
    with tf.variable_scope(name):
        # Depthwise stage: kernel shape (height, width, in_channels, channel_multiplier=1).
        weight = tf.get_variable(name='weight', dtype=tf.float32, trainable=True, shape=[kernel_size, kernel_size, input_num, 1], initializer=tf.random_normal_initializer(stddev=0.01))
        conv = tf.nn.depthwise_conv2d(data, weight, [1, 1, 1, 1], padding, data_format=data_format)
        # Note: tf.layers.batch_normalization defaults to training=False here.
        conv = tf.layers.batch_normalization(conv, momentum=0.9)
        # Pointwise stage: 1x1 convolution mixing channels, input_num -> output_num.
        point_weight = tf.get_variable(name='point_weight', dtype=tf.float32, trainable=True, shape=[1, 1, input_num, output_num], initializer=tf.random_normal_initializer(stddev=0.01))
        conv = tf.nn.conv2d(conv, point_weight, [1, 1, 1, 1], padding, data_format=data_format)
        conv = tf.layers.batch_normalization(conv, momentum=0.9)
        # No ReLU here: per the 20211214 note, adding one kept training from converging.
        return conv
```
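A hypothetical call site (the scope name, placeholder, and channel counts are assumptions for illustration, not the original network definition):

```python
# Hypothetical usage: sp_conv as the first layer on 1-channel MNIST input.
images = tf.placeholder(tf.float32, [None, 28, 28, 1])
net = sp_conv('sp_conv1', images, kernel_size=3, input_num=1, output_num=32, padding='SAME')
print(net.shape)  # (?, 28, 28, 32): depthwise keeps 1 channel, pointwise maps 1 -> 32
```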