[cs231n] Assignment1-KNN

Latest recommended article published on 2022-08-08 12:14:09

License: CC 4.0 BY-SA

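The idea behind the algorithm

Training phase: the classifier simply stores every (image, label) training pair.
Prediction phase: the classifier computes the distance between the test image and every training image, picks the K nearest training images, and predicts the test image's class by majority vote among their labels.
Cross-validation: the best value of K is chosen by cross-validation.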
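The three phases above can be sketched in a few lines of NumPy. This is a minimal illustration, not the assignment's reference solution: the class name `KNearestNeighbor`, the helper `cross_validate_k`, the choice of Euclidean (L2) distance, and the toy two-cluster data are all assumptions made for the example.

```python
import numpy as np


class KNearestNeighbor:
    """Hypothetical minimal k-NN classifier using L2 distance."""

    def train(self, X, y):
        # "Training" only memorizes the data.
        self.X_train = np.asarray(X, dtype=np.float64)
        self.y_train = np.asarray(y)

    def compute_distances(self, X):
        # Vectorized L2 distance via ||a - b||^2 = ||a||^2 - 2 a.b + ||b||^2.
        X = np.asarray(X, dtype=np.float64)
        test_sq = np.sum(X ** 2, axis=1, keepdims=True)   # (num_test, 1)
        train_sq = np.sum(self.X_train ** 2, axis=1)      # (num_train,)
        cross = X @ self.X_train.T                        # (num_test, num_train)
        d2 = test_sq - 2.0 * cross + train_sq
        # Clamp tiny negative values caused by floating-point rounding.
        return np.sqrt(np.maximum(d2, 0.0))

    def predict(self, X, k=1):
        dists = self.compute_distances(X)
        # Indices of the k closest training points for every test point.
        nearest = np.argsort(dists, axis=1)[:, :k]
        votes = self.y_train[nearest]                     # (num_test, k)
        # Majority vote; argmax breaks ties in favor of the smaller label.
        return np.array([np.bincount(row).argmax() for row in votes])


def cross_validate_k(X, y, k_choices, num_folds=5):
    """Return {k: mean validation accuracy} over num_folds folds."""
    X_folds = np.array_split(X, num_folds)
    y_folds = np.array_split(y, num_folds)
    mean_acc = {}
    for k in k_choices:
        accs = []
        for i in range(num_folds):
            # Fold i is held out; the remaining folds form the training set.
            X_tr = np.concatenate(X_folds[:i] + X_folds[i + 1:])
            y_tr = np.concatenate(y_folds[:i] + y_folds[i + 1:])
            clf = KNearestNeighbor()
            clf.train(X_tr, y_tr)
            pred = clf.predict(X_folds[i], k=k)
            accs.append(np.mean(pred == y_folds[i]))
        mean_acc[k] = float(np.mean(accs))
    return mean_acc


# Toy data: two well-separated 2-D clusters (illustrative only, not CIFAR-10).
rng = np.random.default_rng(0)
X_toy = np.vstack([rng.normal(0.0, 0.1, (20, 2)), rng.normal(5.0, 0.1, (20, 2))])
y_toy = np.array([0] * 20 + [1] * 20)
perm = rng.permutation(40)
X_toy, y_toy = X_toy[perm], y_toy[perm]

scores = cross_validate_k(X_toy, y_toy, k_choices=[1, 3, 5])
best_k = max(scores, key=scores.get)

clf = KNearestNeighbor()
clf.train(X_toy, y_toy)
toy_pred = clf.predict(np.array([[0.0, 0.0], [5.0, 5.0]]), k=3)
```

On real image data the same code would expect each image flattened into one row, and the fully vectorized distance computation matters because a naive double loop over 10,000 test and 50,000 training images is very slow.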

Loading the dataset

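# Imports needed by the snippets below. load_CIFAR10 is the loader shipped
# with the cs231n assignment code.
import numpy as np
import matplotlib.pyplot as plt
from cs231n.data_utils import load_CIFAR10

# Load the raw CIFAR-10 data.
cifar10_dir = 'cs231n/datasets/cifar-10-batches-py'

# Clean up variables to prevent loading data multiple times (which may cause memory issues).
try:
    del X_train, y_train
    del X_test, y_test
    print('Clear previously loaded data.')
except NameError:
    # Nothing was loaded yet, so there is nothing to delete.
    pass

X_train, y_train, X_test, y_test = load_CIFAR10(cifar10_dir)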

# As a sanity check, we print out the size of the training and test data.
print('Training data shape: ', X_train.shape)
print('Training labels shape: ', y_train.shape)
print('Test data shape: ', X_test.shape)
print('Test labels shape: ', y_test.shape)

Output:

Training data shape: (50000, 32, 32, 3)
Training labels shape: (50000,)
Test data shape: (10000, 32, 32, 3)
Test labels shape: (10000,)

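The training set contains 50,000 color images of 32×32 pixels with 3 channels each; the test set contains 10,000 images of the same size.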
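Each image therefore holds 32 × 32 × 3 = 3072 values. A distance-based classifier usually works on flat vectors, so a common next step is to reshape every image into a single row. That step is an assumption here, since this excerpt does not show it. The sketch below uses a small stand-in array (`X_demo`, a hypothetical name) instead of the real data to keep it lightweight.

```python
import numpy as np

# Stand-in array with the same per-image layout as CIFAR-10 (only 5 images).
X_demo = np.zeros((5, 32, 32, 3), dtype=np.uint8)

# Flatten each image into one row; -1 lets NumPy infer 32 * 32 * 3.
X_rows = X_demo.reshape(X_demo.shape[0], -1)

pixels_per_image = 32 * 32 * 3
print(X_rows.shape, pixels_per_image)
```

Applied to the full training set, the same reshape would produce an array of shape (50000, 3072).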

Visualizing the data

# Visualize some examples from the dataset.
# We show a few examples of training images from each class.
classes = ['plane', 'car', 'bird', 'cat', 'deer', 'dog', 'frog', 'horse', 'ship', 'truck']
num_classes = len(classes)
samples_per_class = 7
for y, cls in enumerate(classes):
    idxs = np.flatnonzero(y_train == y) # indices of all training samples whose label is y
    idxs = np.random.choice(idxs, samples_per_class, replace=False) # pick samples_per_class of them at random, without replacement
    for i, idx in enumerate(idxs):
        plt_idx = i * num_classes + y + 1
        plt.subplot(samples_per_class, num_classes, plt_idx)
        plt.imshow(X_train[idx].astype(
