Transfer Learning Between the USPS and MNIST Datasets with TensorFlow

This post describes how to implement the unsupervised domain adaptation model from the paper "Deep Transfer Network: Unsupervised Domain Adaptation". The model is trained on the MNIST and USPS handwritten-digit datasets; the data preprocessing steps and the model construction are covered in detail.


Environment: TensorFlow + Python. Libraries used: numpy, os, Image (from PIL), random.

Based on the paper: "Deep Transfer Network: Unsupervised Domain Adaptation"

I have previously written a short guided read of this paper: [Deep Learning] Paper guide: Unsupervised Domain Adaptation (Deep Transfer Network: Unsupervised Domain Adaptation).

First we need the two datasets, USPS and MNIST. Both are handwritten-digit recognition datasets; download links for both are given below.

MNIST dataset download

USPS dataset download

If you download from the two links above, you will find that MNIST is stored as images while USPS is stored as numeric text. This choice keeps the code simple.

1. Data Preparation

First we read in the USPS dataset. Since USPS is stored as plain text, loading it is straightforward. The function below returns the traindata matrix and the trainlabel matrix.

import numpy as np

def read_usps_dataset():
    filename = 'dataset file path'                   # set this to the USPS data file
    with open(filename) as fr:
        lines = fr.readlines()
    numberOfLines = len(lines)
    traindataMat = np.zeros((numberOfLines, 256), dtype=np.float32)  # one 16x16 image per row
    trainlabelMat = np.zeros(numberOfLines, dtype=np.int32)
    for index, line in enumerate(lines):
        listFromLine = line.strip().split()          # strip the \r\n, then split on whitespace
        trainlabelMat[index] = int(listFromLine[0])  # first field is the digit label
        traindataMat[index, :] = np.float32(listFromLine[1:])  # remaining 256 fields are pixels
    print("the size of source dataset:", numberOfLines)
    traindataMat = traindataMat / np.float32(2)      # pixel values are stored in [0, 2]; scale to [0, 1]
    return traindataMat, trainlabelMat
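A quick sanity check of the loader (hypothetical usage; the exact sample count depends on your copy of the file):

traindata, trainlabel = read_usps_dataset()
print(traindata.shape)     # (number_of_samples, 256): one flattened 16x16 image per row
print(trainlabel[:5])      # integer labels in 0..9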
Next we handle MNIST, which is slightly more involved. We use MNIST as images because that makes them easy to shrink: MNIST and USPS samples are fed into the same network, so their dimensions must match. USPS images are 16×16 while MNIST images are 28×28, so every MNIST image has to be resized to 16×16.

import os
from PIL import Image

def preprocess_mnist():
    image = []
    label = []
    i = 0
    for labels in range(10):                         # one sub-directory per digit class
        pathDir = os.listdir('MNIST/trainimage/pic2/' + str(labels) + '/')
        for allDir in pathDir:
            child = os.path.join('%s%s' % ('MNIST/trainimage/pic2/' + str(labels) + '/', allDir))
            img = Image.open(child)
            img = img.resize((16, 16))               # shrink 28x28 MNIST down to USPS's 16x16
            img_array = np.array(img)
            img_array = img_array[:, :, 0]           # the images are RGB; keep a single channel
            img_array = np.reshape(img_array, -1)    # flatten to a 256-dimensional vector
            image.append(img_array)
            label.append(labels)
            i = i + 1
    image = np.array(image) / np.float32(256)        # scale pixel values into [0, 1)
    label = np.array(label)
    print("the size of target dataset:", i)
    return image, label
Data preprocessing is not finished yet: don't forget to convert the integer labels into one-hot form.

def dense_to_one_hot(labels_dense, num_classes):
    num_labels = labels_dense.shape[0]
    index_offset = np.arange(num_labels) * num_classes
    labels_one_hot = np.zeros((num_labels, num_classes))
    labels_one_hot.flat[index_offset + labels_dense.ravel()] = 1   # set each row's class position to 1
    return labels_one_hot
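Putting the three helpers together, the preprocessing stage can be driven like this (a usage sketch; the variable names are mine, not from the original program):

source_data, source_labels = read_usps_dataset()    # USPS is the source domain
target_data, target_labels = preprocess_mnist()     # MNIST is the target domain
source_onehot = dense_to_one_hot(source_labels, 10)
target_onehot = dense_to_one_hot(target_labels, 10)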

2. Creating the M Matrix

The M matrix is used to compute the empirical Maximum Mean Discrepancy (MMD), and we initialize it once at the start. Each training batch is built from BATCHSIZE/2 source samples stacked on top of BATCHSIZE/2 target samples, so M is a constant block matrix: +1/(BATCHSIZE/2)² in the two diagonal blocks and -1/(BATCHSIZE/2)² in the two off-diagonal blocks.

import tensorflow as tf

def createMmetrix():
    half = BATCHSIZE // 2                       # integer division: half source, half target
    mat1 = tf.constant(np.float32(1) / (half ** 2), shape=[half, half], dtype=tf.float32)
    mat2 = -mat1
    mat3 = tf.concat([mat1, mat2], axis=1)      # top block row:    [ +  - ]
    mat4 = tf.concat([mat2, mat1], axis=1)      # bottom block row: [ -  + ]
    mat5 = tf.concat([mat3, mat4], axis=0)
    return mat5
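To sanity-check this block structure, the short numpy snippet below (an illustration with a toy BATCHSIZE of 4, not part of the original program) verifies that tr(HᵀMH) equals the squared Euclidean distance between the source-half mean and the target-half mean of H, i.e. the empirical linear-kernel MMD:

b = 4
half = b // 2
block = np.full((half, half), 1.0 / half ** 2)
M_check = np.block([[block, -block], [-block, block]])   # same layout as createMmetrix()

H = np.random.rand(b, 3)                                 # toy feature matrix: b samples, 3 features
mmd_trace = np.trace(H.T @ M_check @ H)
mean_diff = H[:half].mean(axis=0) - H[half:].mean(axis=0)
assert np.isclose(mmd_trace, mean_diff @ mean_diff)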

3. Building the Model

The network is two convolutional layers followed by fully connected layers. First define the weight/bias initializers and the convolution/pooling helpers.

# weight initialization: small truncated-normal values
def weight_variable(shape):
    initial = tf.truncated_normal(shape, stddev=0.1)
    return tf.Variable(initial)

# bias initialization: small positive constant
def bias_variable(shape):
    initial = tf.constant(0.1, shape=shape)
    return tf.Variable(initial)

# convolution: stride 1, SAME padding preserves the spatial size
def conv2d(x, W):
    return tf.nn.conv2d(x, W, strides=[1, 1, 1, 1], padding='SAME')

# pooling: 2x2 max-pooling halves the spatial size
def max_pool_2x2(x):
    return tf.nn.max_pool(x, ksize=[1, 2, 2, 1], strides=[1, 2, 2, 1], padding='SAME')
Now build the model. Two rounds of 2×2 max-pooling shrink the 16×16 input to 4×4, which is why the fully connected layer expects 4*4*64 inputs. Also note the form of obj_func: as in the paper, the cross-entropy loss is augmented with MMD penalties on both the hidden features and the softmax predictions.
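The code below references a few names defined elsewhere in the full program: the input placeholders X and Y, the M matrix, and the hyperparameters lamda, miu and learningrate. A minimal setup sketch (the concrete values here are my assumptions, not taken from the paper):

BATCHSIZE = 200                               # must be even: half source, half target
lamda = 1.0                                   # weight of the feature-level MMD term (assumed)
miu = 1.0                                     # weight of the prediction-level MMD term (assumed)
learningrate = 0.01                           # assumed value

X = tf.placeholder(tf.float32, [None, 256])   # flattened 16x16 images
Y = tf.placeholder(tf.float32, [None, 10])    # one-hot labels
M = createMmetrix()                           # BATCHSIZE must be set before this call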

# first convolutional layer
w_conv1 = weight_variable([3, 3, 1, 32])
b_conv1 = bias_variable([32])
x_image = tf.reshape(X, [-1, 16, 16, 1])   # flat 256-vectors back to 16x16 images
h_conv1 = tf.nn.relu(conv2d(x_image, w_conv1) + b_conv1)
h_pool1 = max_pool_2x2(h_conv1)            # 16x16 -> 8x8

# second convolutional layer
w_conv2 = weight_variable([3, 3, 32, 64])
b_conv2 = bias_variable([64])
h_conv2 = tf.nn.relu(conv2d(h_pool1, w_conv2) + b_conv2)
h_pool2 = max_pool_2x2(h_conv2)            # 8x8 -> 4x4

# densely connected layer
w_fc1 = weight_variable([4*4*64, 256])
b_fc1 = bias_variable([256])
h_pool2_flat = tf.reshape(h_pool2, [-1, 4*4*64])
h_fc1 = tf.nn.relu(tf.matmul(h_pool2_flat, w_fc1) + b_fc1)

# softmax layer
w_fc2 = weight_variable([256, 10])
b_fc2 = bias_variable([10])
y_conv = tf.nn.softmax(tf.matmul(h_fc1, w_fc2) + b_fc2)

# objective from the paper: cross-entropy + lamda * MMD(features) + miu * MMD(predictions)
cross_entropy = -tf.reduce_sum(Y * tf.log(y_conv))
mmd_features = tf.trace(tf.matmul(tf.matmul(h_fc1, M, transpose_a=True), h_fc1))
mmd_predictions = tf.trace(tf.matmul(tf.matmul(y_conv, M, transpose_a=True), y_conv))
obj_func = cross_entropy + tf.constant(lamda, dtype=tf.float32) * mmd_features + tf.constant(miu, dtype=tf.float32) * mmd_predictions
optimizer = tf.train.GradientDescentOptimizer(learningrate).minimize(obj_func)

correct_prediction = tf.equal(tf.argmax(y_conv, 1), tf.argmax(Y, 1))
accuracy = tf.reduce_mean(tf.cast(correct_prediction, "float"))
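The post stops before the session code; here is a minimal training-loop sketch (my own, under the assumptions above) that reuses the arrays from the preprocessing sketch. The target half of each batch gets all-zero label rows, so it contributes nothing to the cross-entropy term and the adaptation stays unsupervised; target labels are touched only to monitor accuracy:

half = BATCHSIZE // 2
with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    for step in range(10000):
        s_idx = np.random.choice(len(source_data), half, replace=False)
        t_idx = np.random.choice(len(target_data), half, replace=False)
        # stack source samples on top of target samples, matching M's block layout
        batch_x = np.vstack([source_data[s_idx], target_data[t_idx]])
        batch_y = np.vstack([source_onehot[s_idx], np.zeros((half, 10), np.float32)])
        sess.run(optimizer, feed_dict={X: batch_x, Y: batch_y})
        if step % 500 == 0:
            # evaluation only: fill in the true target labels to measure batch accuracy
            eval_y = np.vstack([source_onehot[s_idx], target_onehot[t_idx]])
            acc = sess.run(accuracy, feed_dict={X: batch_x, Y: eval_y})
            print("step %d, batch accuracy %g" % (step, acc))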

