Deep Learning Udacity course: Assignment 1, Problem 2 (notMNIST)


License: CC 4.0 BY-SA


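This post looks at how to use matplotlib to pick random samples from the notMNIST dataset and display them, in order to verify that the data is still intact and correct.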


After reading this and taking the course, I am struggling with the second problem in Assignment 1 (notMNIST):

Let’s verify that the data still looks good. Displaying a sample of the labels and images from the ndarray. Hint: you can use matplotlib.pyplot.

This is what I tried:

    import random
    rand_smpl = [ train_datasets[i] for i in sorted(random.sample(xrange(len(train_datasets)), 1)) ]
    print(rand_smpl)
    filename = rand_smpl[0]

    import pickle
    loaded_pickle = pickle.load( open( filename, "r" ) )
    image_size = 28 # Pixel width and height.

    import numpy as np
    dataset = np.ndarray(shape=(len(loaded_pickle), image_size, image_size),
                         dtype=np.float32)

    import matplotlib.pyplot as plt
    plt.plot(dataset[2])
    plt.ylabel('some numbers')
    plt.show()

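But this is what I got:

It doesn't make much sense. To be honest, my code may not make much sense either, because I'm not sure how to approach this problem!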
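For comparison, here is a minimal sketch of one way to display samples. It differs from the attempt above in three ways: the pickle is opened in binary mode ("rb"), the loaded array is used directly instead of allocating a fresh, uninitialized np.ndarray, and plt.imshow is used because plt.plot draws a 2D array as line curves rather than as an image. The helper name show_samples, the label argument, and the stand-in data are hypothetical; the sketch assumes each pickle holds an (N, 28, 28) float array, as the creation code in this post suggests.

```python
import os
import pickle
import tempfile

import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs anywhere; omit in a notebook
import matplotlib.pyplot as plt
import numpy as np


def show_samples(pickle_path, label, n_samples=5, seed=0):
    """Load one per-letter pickle and draw a few randomly chosen images from it.

    `show_samples` is a hypothetical helper, not part of the course notebook.
    """
    with open(pickle_path, "rb") as f:   # pickles must be read in binary mode
        letter_set = pickle.load(f)      # assumed shape: (N, 28, 28)
    rng = np.random.default_rng(seed)
    indices = rng.choice(len(letter_set), size=n_samples, replace=False)
    fig, axes = plt.subplots(1, n_samples, squeeze=False,
                             figsize=(2 * n_samples, 2))
    for ax, idx in zip(axes[0], indices):
        ax.imshow(letter_set[idx], cmap="gray")  # render the 2D array as an image
        ax.set_title("%s #%d" % (label, idx))
        ax.axis("off")
    return fig, indices


# Stand-in data so the sketch runs without the real notMNIST files.
demo_dir = tempfile.mkdtemp()
demo_path = os.path.join(demo_dir, "A.pickle")
demo_set = np.random.default_rng(1).uniform(
    -0.5, 0.5, size=(20, 28, 28)).astype(np.float32)
with open(demo_path, "wb") as f:
    pickle.dump(demo_set, f, pickle.HIGHEST_PROTOCOL)

fig, indices = show_samples(demo_path, label="A")
out_path = os.path.join(demo_dir, "samples.png")
fig.savefig(out_path)  # in a notebook, plt.show() would display the figure instead
```

With the real data, pickle_path would be one entry of train_datasets and label the letter that file corresponds to.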

The pickles were created like this:

    from __future__ import print_function  # keeps print(...) consistent on Python 2
    import os
    import numpy as np
    from scipy import ndimage  # note: ndimage.imread was removed in SciPy 1.2

    image_size = 28  # Pixel width and height.
    pixel_depth = 255.0  # Number of levels per pixel.

    def load_letter(folder, min_num_images):
        """Load the data for a single letter label."""
        image_files = os.listdir(folder)
        # Preallocate for every file; trimmed below to the images actually read.
        dataset = np.ndarray(shape=(len(image_files), image_size, image_size),
                             dtype=np.float32)
        print(folder)
        num_images = 0
        for image in image_files:
            image_file = os.path.join(folder, image)
            try:
                # Rescale pixel values from [0, 255] to [-0.5, 0.5].
                image_data = (ndimage.imread(image_file).astype(float) -
                              pixel_depth / 2) / pixel_depth
                if image_data.shape != (image_size, image_size):
                    raise Exception('Unexpected image shape: %s' % str(image_data.shape))
                dataset[num_images, :, :] = image_data
                num_images = num_images + 1
            except IOError as e:
                print('Could not read:', image_file, ':', e, '- it\'s ok, skipping.')

        # Drop the unused tail left by unreadable files.
        dataset = dataset[0:num_images, :, :]
        if num_images < min_num_images:
            raise Exception('Many fewer images than expected: %d < %d' %
                            (num_images, min_num_images))

        print('Full dataset tensor:', dataset.shape)
        print('Mean:', np.mean(dataset))
        print('Standard deviation:', np.std(dataset))
        return dataset

The function is called like this:

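    dataset = load_letter(folder, min_num_images_per_class)
    try:
        # Write the per-letter array to disk in binary mode.
        with open(set_filename, 'wb') as f:
            pickle.dump(dataset, f, pickle.HIGHEST_PROTOCOL)
    except Exception as e:
        print('Unable to save data to', set_filename, ':', e)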
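Since each letter ends up in its own pickle, any display or training code has to read those files back and attach a label to each image. The following is a rough sketch of that step, not the course's own merge code: merge_letter_pickles, per_class, and the constant-valued stand-in arrays are all hypothetical, and the label is simply assumed to be the position of the file in the list.

```python
import os
import pickle
import tempfile

import numpy as np

image_size = 28  # Pixel width and height, as in the post.


def merge_letter_pickles(pickle_paths, per_class, seed=0):
    """Stack `per_class` random images from each pickle and build integer labels.

    Hypothetical helper: label k is assigned to the k-th path in `pickle_paths`.
    """
    total = per_class * len(pickle_paths)
    data = np.empty((total, image_size, image_size), dtype=np.float32)
    labels = np.empty(total, dtype=np.int32)
    rng = np.random.default_rng(seed)
    for label, path in enumerate(pickle_paths):
        with open(path, "rb") as f:  # binary mode, matching the 'wb' used when saving
            letter_set = pickle.load(f)
        if len(letter_set) < per_class:
            raise ValueError("%s holds only %d images" % (path, len(letter_set)))
        chosen = rng.permutation(len(letter_set))[:per_class]
        start = label * per_class
        data[start:start + per_class] = letter_set[chosen]
        labels[start:start + per_class] = label
    return data, labels


# Stand-in pickles: class k is filled with the constant 0.1 * k so it is easy to check.
demo_dir = tempfile.mkdtemp()
demo_paths = []
for k, letter in enumerate("ABC"):
    path = os.path.join(demo_dir, letter + ".pickle")
    with open(path, "wb") as f:
        pickle.dump(np.full((6, image_size, image_size), 0.1 * k, dtype=np.float32),
                    f, pickle.HIGHEST_PROTOCOL)
    demo_paths.append(path)

data, labels = merge_letter_pickles(demo_paths, per_class=4)
```

Keeping the labels in a separate array indexed in step with the images makes it straightforward to print a label next to each displayed sample.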

The idea here is:

Now let’s load the data in a more manageable format. Since, depending on your computer setup you might not be able to fit it all in memory, we’ll load each class into a separate dataset, store them on disk and curate them independently. Later we’ll merge them into a single dataset of manageable size.

We’ll convert the entire dataset into a 3D array (image index, x, y) of floating point values, normalized to have approximately zero mean and standard deviation ~0.5 to make training easier down the road.
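To make the normalization concrete, here is a small worked example using the same formula as load_letter, (pixel - pixel_depth / 2) / pixel_depth. It maps raw values in [0, 255] to [-0.5, 0.5]; the actual mean and standard deviation depend on the images themselves, so the figures quoted above are approximate. The normalize helper and the four sample pixel values are made up for illustration.

```python
import numpy as np

pixel_depth = 255.0  # Number of levels per pixel, as in the post.


def normalize(raw_pixels):
    """Apply the same rescaling used in load_letter (hypothetical wrapper)."""
    return (raw_pixels.astype(float) - pixel_depth / 2) / pixel_depth


# Four illustrative raw pixel values: black, white, and two greys.
raw = np.array([[0, 255], [51, 204]], dtype=np.uint8)
scaled = normalize(raw)

print(scaled)         # values lie between -0.5 and 0.5
print(scaled.mean())  # mean of this symmetric example is zero (up to rounding)
```

A practical consequence for plotting: the stored arrays are no longer in the 0 to 255 range, which plt.imshow handles by rescaling a 2D float array to the colormap by default.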