1. Preface
Having previously written my own fully-connected (FC) layer, batch normalization layer, ReLU layer, pooling layer, and the softmax and SVM loss layers (the dropout and conv layers are still unwritten), I now have a much clearer understanding of deep learning.
It is finally time to learn a deep learning framework. Since my future work will probably not be academic research (for research-leaning work the experts recommend PyTorch), I chose TensorFlow for its good cross-platform support.
This post is mainly a study log of working through the tensorflow.ipynb assignment, along with a lot of the references I looked up.
On the virtual Python environment and TensorFlow installation
Create a virtual Python environment:
virtualenv -p python3 .env
source .env/bin/activate
The new environment contains only a bare Python interpreter (even if TensorFlow is installed system-wide, the environment will not see it), so you can experiment freely and install whatever libraries you need.
To install TensorFlow, refer to the TensorFlow installation guide.
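Inside the activated environment, a pip install is usually all that is needed; a typical install looks like the following (the exact package and version depend on your system, and this assignment targets the TensorFlow 1.x API):
pip install tensorflow        # CPU-only build
pip install tensorflow-gpu    # GPU build, if you have CUDA set up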
2. Main text
import tensorflow as tf
import numpy as np
import math
import timeit
import matplotlib.pyplot as plt
%matplotlib inline
from cs231n.data_utils import load_CIFAR10
def get_CIFAR10_data(num_training=49000, num_validation=1000, num_test=10000):
    """
    Load the CIFAR-10 dataset from disk and perform preprocessing to prepare
    it for the two-layer neural net classifier. These are the same steps as
    we used for the SVM, but condensed to a single function.
    """
    # Load the raw CIFAR-10 data
    cifar10_dir = 'cs231n/datasets/cifar-10-batches-py'
    X_train, y_train, X_test, y_test = load_CIFAR10(cifar10_dir)

    # Subsample the data
    mask = range(num_training, num_training + num_validation)
    X_val = X_train[mask]
    y_val = y_train[mask]
    mask = range(num_training)
    X_train = X_train[mask]
    y_train = y_train[mask]
    mask = range(num_test)
    X_test = X_test[mask]
    y_test = y_test[mask]

    # Normalize the data: subtract the mean image
    mean_image = np.mean(X_train, axis=0)
    X_train -= mean_image
    X_val -= mean_image
    X_test -= mean_image

    return X_train, y_train, X_val, y_val, X_test, y_test
# Invoke the above function to get our data.
X_train, y_train, X_val, y_val, X_test, y_test = get_CIFAR10_data()
print('Train data shape: ', X_train.shape)
print('Train labels shape: ', y_train.shape)
print('Validation data shape: ', X_val.shape)
print('Validation labels shape: ', y_val.shape)
print('Test data shape: ', X_test.shape)
print('Test labels shape: ', y_test.shape)
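With the default arguments this prints the following: 49,000 training images, 1,000 validation images, and 10,000 test images, each 32x32 with 3 color channels:

Train data shape:  (49000, 32, 32, 3)
Train labels shape:  (49000,)
Validation data shape:  (1000, 32, 32, 3)
Validation labels shape:  (1000,)
Test data shape:  (10000, 32, 32, 3)
Test labels shape:  (10000,)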
# clear old variables
tf.reset_default_graph()
# setup input (e.g. the data that changes every batch)
# The first dim is None, and gets set automatically based on the batch size fed in
X = tf.placeholder(tf.float32, [None, 32, 32, 3])
y = tf.placeholder(tf.int64, [None])
is_training = tf.placeholder(tf.bool)
def simple_model(X, y):
    # define our weights (e.g. init_two_layer_convnet)
    # setup variables
    Wconv1 = tf.get_variable("Wconv1", shape=[7, 7, 3, 32])
    bconv1 = tf.get_variable("bconv1", shape=[32])
    W1 = tf.get_variable("W1", shape=[5408, 10])
    b1 = tf.get_variable("b1", shape=[10])

    # define our graph (e.g. two_layer_convnet)
    a1 = tf.nn.conv2d(X, Wconv1, strides=[1, 2, 2, 1], padding='VALID') + bconv1
    h1 = tf.nn.relu(a1)
    h1_flat = tf.reshape(h1, [-1, 5408])  # 13 * 13 * 32 = 5408
    y_out = tf.matmul(h1_flat, W1) + b1
    return y_out
y_out = simple_model(X,y)
# define our loss
total_loss = tf.losses.hinge_loss(tf.one_hot(y,10),logits=y_out)
mean_loss = tf.reduce_mean(total_loss)
# define our optimizer
optimizer = tf.train.AdamOptimizer(5e-4) # select optimizer and set learning rate
train_step = optimizer.minimize(mean_loss)
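To actually train, you run the graph in a session and feed one mini-batch per step; the notebook wraps this up in a run_model helper, but the minimal idea looks like the sketch below (batch_size and the iteration count are arbitrary choices of mine, not from the assignment):

batch_size = 64
with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    for i in range(100):
        # sample a random mini-batch of training images and labels
        idx = np.random.choice(X_train.shape[0], batch_size, replace=False)
        feed = {X: X_train[idx], y: y_train[idx], is_training: True}
        # one optimization step; also fetch the loss to monitor progress
        loss, _ = sess.run([mean_loss, train_step], feed_dict=feed)
        if i % 20 == 0:
            print('iteration %d: loss %f' % (i, loss))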
Comments
tf.nn.conv2d()
- X and Wconv1 are Tensors; for example, Wconv1's shape is [filter_height, filter_width, in_channels, out_channels].
- strides has the layout [batch, height, width, channels], so strides[0] and strides[3] must be 1.
- padding has two choices: 'VALID' or 'SAME'.
- Output width is calculated as follows (height works the same way, with strides[1]):
  'VALID': output_width = ceil((input_width - filter_width + 1) / strides[2])
  'SAME': output_width = ceil(input_width / strides[2])
tf.one_hot(y, 10)
- Transforms an (N,) label vector into (N, 10). Rule: entry (i, y[i]) = 1, all others = 0.
tf.losses.hinge_loss()
- Just the SVM loss? Not quite: it applies a binary hinge loss elementwise against the one-hot labels, which is similar in spirit to, but not identical to, the multiclass SVM loss we implemented by hand.
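As a quick sanity check of these formulas for the model above (a throwaway snippet of my own, not part of the assignment):

import math

input_width, filter_width, stride = 32, 7, 2
out_valid = math.ceil((input_width - filter_width + 1) / stride)  # ceil(26 / 2) = 13
out_same = math.ceil(input_width / stride)                        # ceil(32 / 2) = 16
print(out_valid, out_same)  # 13 16 -> with 'VALID', 13 * 13 * 32 channels = 5408, matching h1_flat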
