TensorFlow for Beginners: SGD

This post walks through training a neural network in TensorFlow with stochastic gradient descent (SGD) and, by comparing against plain gradient descent (GD), shows SGD's advantage in training speed. Starting from checking packages, loading the data, and preprocessing it, we build up a simple multilayer neural network, train it with SGD, and ultimately reach 86.8% accuracy on the test set.


Before starting this task, make sure you have completed the earlier notMNIST steps; see the notMNIST post.

Hint: training with stochastic gradient descent (SGD) should take noticeably less time than plain gradient descent (GD).
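The hint follows from how much data each update touches: plain GD computes the gradient over the entire training set for every step, while SGD estimates it from a small random minibatch. A NumPy sketch with a simple least-squares model (the array sizes and the 128-example batch are assumptions for illustration only):

```python
import numpy as np

rng = np.random.default_rng(0)
n, d = 10000, 784
X = rng.standard_normal((n, d)).astype(np.float32)  # stand-in for flattened images
y = rng.standard_normal(n).astype(np.float32)
w = np.zeros(d, dtype=np.float32)

def grad(X_part, y_part, w):
    """Mean-squared-error gradient over whatever slice of data we are given."""
    residual = X_part @ w - y_part
    return X_part.T @ residual / len(y_part)

# Plain GD: every single step processes all n rows.
full_step = grad(X, y, w)

# SGD: every step processes only a small random minibatch.
batch = rng.choice(n, size=128, replace=False)
mini_step = grad(X[batch], y[batch], w)

# Both produce an update of the same shape, but the minibatch
# gradient costs roughly n/128 times less arithmetic per step.
print(full_step.shape, mini_step.shape)
```

The minibatch gradient is a noisy estimate of the full one, but because each step is so much cheaper, SGD can take many more steps in the same wall-clock time.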

1. Check packages

First, check that the packages used in this lesson are all importable. Enter the following code and click "run cell"; if it runs without errors, you are good to go.

# Make sure the following packages import correctly before proceeding
from __future__ import print_function
import numpy as np
import tensorflow as tf
from six.moves import cPickle as pickle
from six.moves import range

2. Load the pickle file

Load the notMNIST.pickle file generated previously (linked at the top of this post).

pickle_file = 'notMNIST.pickle'

with open(pickle_file, 'rb') as f:
  save = pickle.load(f)
  train_dataset = save['train_dataset']
  train_labels = save['train_labels']
  valid_dataset = save['valid_dataset']
  valid_labels = save['valid_labels']
  test_dataset = save['test_dataset']
  test_labels = save['test_labels']
  del save  # hint to help gc free up memory
  print('Training set', train_dataset.shape, train_labels.shape)
  print('Validation set', valid_dataset.shape, valid_labels.shape)
  print('Test set', test_dataset.shape, test_labels.shape)

The output should look like:

Training set (200000, 28, 28) (200000,)
Validation set (10000, 28, 28) (10000,)
Test set (18724, 28, 28) (18724,)

3. Data preprocessing

Reformat the data into a shape better suited to the models we are about to train:

Flatten each image into a row of a flat matrix; one-hot encode the labels.

image_size = 28
num_labels = 10

def reformat(dataset, labels):
  dataset = dataset.reshape((-1, image_size * image_size)).astype(np.float32)
  # Map 0 to [1.0, 0.0, 0.0 ...], 1 to [0.0, 1.0, 0.0 ...]
  labels = (np.arange(num_labels) == labels[:,None]).astype(np.float32)
  return dataset, labels
train_dataset, train_labels = reformat(train_dataset, train_labels)
valid_dataset, valid_labels = reformat(valid_dataset, valid_labels)
test_dataset, test_labels = reformat(test_dataset, test_labels)
print('Training set', train_dataset.shape, train_labels.shape)
print('Validation set', valid_dataset.shape, valid_labels.shape)
print('Test set', test_dataset.shape, test_labels.shape)

Output:

Training set (200000, 784) (200000, 10)
Validation set (10000, 784) (10000, 10)
Test set (10000, 784) (10000, 10)
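The one-hot line in `reformat` works by NumPy broadcasting: comparing a column vector of labels against the row `np.arange(num_labels)` yields a boolean matrix with exactly one `True` per row. A small stand-alone demonstration (the three sample labels are made up for illustration):

```python
import numpy as np

num_labels = 10
labels = np.array([0, 3, 9])

# labels[:, None] has shape (3, 1); np.arange(num_labels) has shape (10,).
# Broadcasting compares every label against every class index at once,
# producing a (3, 10) boolean matrix with one True per row.
one_hot = (np.arange(num_labels) == labels[:, None]).astype(np.float32)

print(one_hot)
```

Each row has a single 1.0 at the position of its label, which is exactly the format the cross-entropy loss expects.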

4. Plain gradient descent (GD)

We first train multinomial logistic regression using plain gradient descent.
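"Multinomial logistic regression" here means a single linear layer followed by a softmax, trained to minimize cross-entropy against the one-hot labels. A NumPy sketch of the forward pass and loss (random placeholder weights and a tiny batch of fake data, for illustration only):

```python
import numpy as np

rng = np.random.default_rng(42)
image_size, num_labels, batch = 28, 10, 4

# Fake inputs shaped like the reformatted data.
X = rng.standard_normal((batch, image_size * image_size)).astype(np.float32)
labels_one_hot = np.eye(num_labels, dtype=np.float32)[rng.integers(0, num_labels, batch)]

# Model parameters: a weight matrix and a bias vector.
W = (rng.standard_normal((image_size * image_size, num_labels)) * 0.01).astype(np.float32)
b = np.zeros(num_labels, dtype=np.float32)

# Linear scores, then softmax to turn them into class probabilities.
logits = X @ W + b
logits -= logits.max(axis=1, keepdims=True)  # stabilize exp
probs = np.exp(logits) / np.exp(logits).sum(axis=1, keepdims=True)

# Cross-entropy: average negative log-probability assigned to the true class.
loss = -np.mean(np.sum(labels_one_hot * np.log(probs), axis=1))
print(probs.shape, float(loss))
```

Gradient descent then repeatedly nudges `W` and `b` in the direction that decreases this loss; TensorFlow computes the same quantities, but as nodes in a graph with the gradients derived automatically.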

The TensorFlow workflow:

First, describe the computation you want to run on your inputs, variables, and operations; each becomes a node in a computation graph. This description lives entirely inside the following block:

with graph.as_default():

Then, via session.run(), you can execute the operations you want on that graph as many times as you like, fetching the outputs it returns. This runtime part lives entirely inside the following block:

with tf.Session(graph=graph) as session:
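The two phases can be seen in a minimal stand-alone example. The sketch below uses the `tf.compat.v1` shim so it also runs under TensorFlow 2.x; on TensorFlow 1.x (which this post targets), `tf.Graph` and `tf.Session` are used directly:

```python
import tensorflow as tf

# v1 compatibility shim: on TensorFlow 1.x, plain tf.Graph/tf.Session work as-is.
tf1 = tf.compat.v1

graph = tf1.Graph()
with graph.as_default():
    # Phase 1: describe the computation. Nothing executes here;
    # these lines only add nodes to the graph.
    a = tf1.constant(2.0)
    b = tf1.constant(3.0)
    total = a + b

with tf1.Session(graph=graph) as session:
    # Phase 2: actually run the graph and fetch the output.
    result = session.run(total)

print(result)
```

The same split applies below: the model definition goes inside `graph.as_default()`, and the training loop goes inside the session.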

We load all the data into TensorFlow and build the computation graph that corresponds to our training:

# With gradient descent training, even this much data is prohibitive.
# Subset the training data for faster turnaround.
train_subset = 10000

graph = tf.Graph()
with graph.as_default():

  # Input data.
  # Load the training, validation and test data into constants that are
  # attached to the graph.
  tf_train_dataset = tf.constant(train_dataset[:train_subset, :])
  tf_train_labels = tf.constant(train_labels[:train_subset])