CS231n Lecture 2: Image Classification pipeline

夏七岁

License: CC 4.0 BY-SA

Tags: python, cs231n, computer vision, deep learning

Article link: https://blog.youkuaiyun.com/weixin_42291486/article/details/90698274

Image Classification pipeline

Assignment

Image Classification pipeline

Nearest Neighbor Classification
K-Nearest Neighbors (KNN)
Linear Classification

Code Achievement

Working locally

Installing Anaconda
Anaconda Virtual environment

Once you have Anaconda installed, it makes sense to create a virtual environment for the course. If you choose not to use a virtual environment, it is up to you to make sure that all dependencies for the code are installed globally on your machine. To set up a virtual environment, run the following in a terminal:

conda create -n cs231n python=3.7 anaconda

This creates an environment called cs231n.
You can then find Anaconda Prompt (cs231n) and Jupyter Notebook (cs231n) in your Anaconda folder.
You may refer to this page for more detailed instructions on managing virtual environments with Anaconda.
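As a quick check that the environment is set up, the following minimal sketch can be run from the Anaconda Prompt (cs231n) or a notebook cell. The helper name check_packages and the list of package names are assumptions made for illustration; they are not part of the course code.

```python
import importlib.util
import sys


def check_packages(names):
    """Return a dict mapping each package name to True if it can be imported."""
    return {name: importlib.util.find_spec(name) is not None for name in names}


# Show which interpreter is running; inside the cs231n environment the
# prefix path would normally end with the environment name.
print("Python version:", sys.version.split()[0])
print("Environment prefix:", sys.prefix)

# Packages the notebook code relies on (an assumed, non-exhaustive list).
for name, found in check_packages(["numpy", "matplotlib"]).items():
    print(f"{name}: {'found' if found else 'MISSING'}")
```

If a package is reported as MISSING, the interpreter is probably not the one from the cs231n environment.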

Download Dataset

Once you have the starter code, you will need to download the CIFAR-10 dataset. Run the following from the assignment1 directory:

cd cs231n/datasets
./get_datasets.sh

On Windows, download the dataset from http://www.cs.toronto.edu/~kriz/cifar-10-python.tar.gz and extract it into the datasets directory.
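For a platform-independent alternative to the shell script, the download and extraction can be sketched in Python using only the standard library. The function names download_cifar10 and extract_archive are hypothetical helpers, not part of the starter code, and the sketch assumes the archive unpacks into a cifar-10-batches-py folder, which is the directory name used later in this post.

```python
import os
import tarfile
import urllib.request

CIFAR10_URL = "http://www.cs.toronto.edu/~kriz/cifar-10-python.tar.gz"


def extract_archive(archive_path, dest_dir):
    """Extract a .tar.gz archive into dest_dir, creating the directory if needed."""
    os.makedirs(dest_dir, exist_ok=True)
    with tarfile.open(archive_path, "r:gz") as tar:
        tar.extractall(dest_dir)


def download_cifar10(dest_dir, url=CIFAR10_URL):
    """Download the CIFAR-10 archive into dest_dir (if absent) and extract it."""
    os.makedirs(dest_dir, exist_ok=True)
    archive_path = os.path.join(dest_dir, "cifar-10-python.tar.gz")
    if not os.path.exists(archive_path):
        urllib.request.urlretrieve(url, archive_path)
    extract_archive(archive_path, dest_dir)
    return os.path.join(dest_dir, "cifar-10-batches-py")


# Example call, run from the assignment1 directory:
# download_cifar10(os.path.join("cs231n", "datasets"))
```

The example call is left commented out so that nothing is downloaded until you choose to run it.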

Start IPython

After you have the CIFAR-10 data, you should start the IPython notebook server from the assignment1 directory, with the jupyter notebook command.

Code Achievement

import modules

# Run some setup code for this notebook.

import random
import numpy as np
from cs231n.data_utils import load_CIFAR10
import matplotlib.pyplot as plt

# This is a bit of magic to make matplotlib figures appear inline in the notebook
# rather than in a new window.
%matplotlib inline
plt.rcParams['figure.figsize'] = (10.0, 8.0) # set default size of plots
plt.rcParams['image.interpolation'] = 'nearest'
plt.rcParams['image.cmap'] = 'gray'

# Some more magic so that the notebook will reload external python modules;
# see http://stackoverflow.com/questions/1907993/autoreload-of-modules-in-ipython
%load_ext autoreload
%autoreload 2

load_CIFAR10 in data_utils.py

import os
import numpy as np

# Note: load_pickle is assumed to be defined elsewhere in data_utils.py
# (it is not shown in this excerpt).

def load_CIFAR10(ROOT):
    """ load all of cifar """
    xs = []
    ys = []
    for b in range(1,6):
        f = os.path.join(ROOT, 'data_batch_%d' % (b, )) # path of training batch b (1..5)
        X, Y = load_CIFAR_batch(f) # load the images and labels of this batch
        xs.append(X)
        ys.append(Y)
    Xtr = np.concatenate(xs)
    Ytr = np.concatenate(ys)
    del X, Y
    Xte, Yte = load_CIFAR_batch(os.path.join(ROOT, 'test_batch'))
    return Xtr, Ytr, Xte, Yte

def load_CIFAR_batch(filename):
    """ load single batch of cifar """
    with open(filename, 'rb') as f:
        datadict = load_pickle(f) # unpickle the batch file into a dict
        X = datadict['data']
        Y = datadict['labels']
        # Each row holds 3072 values. Reshape to (10000, 3, 32, 32), i.e.
        # (image, channel, row, column), then move the channel axis last to
        # get (10000, 32, 32, 3), and convert to float.
        X = X.reshape(10000, 3, 32, 32).transpose(0,2,3,1).astype("float")
        Y = np.array(Y)
        return X, Y
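To see what the reshape and transpose in load_CIFAR_batch do, here is a small sketch on synthetic data rather than the real dataset. It assumes the CIFAR-10 row layout of 1024 red values, then 1024 green, then 1024 blue; the constant values 10, 20 and 30 are made up so the channels are easy to tell apart.

```python
import numpy as np

# Build two fake "images" in the CIFAR-10 row layout: 3072 values per row,
# where the first 1024 are the red plane, then green, then blue.
num_images = 2
red = np.full((num_images, 1024), 10)
green = np.full((num_images, 1024), 20)
blue = np.full((num_images, 1024), 30)
flat = np.concatenate([red, green, blue], axis=1)  # shape (2, 3072)

# Same steps as load_CIFAR_batch, with the batch size left to NumPy via -1.
images = flat.reshape(-1, 3, 32, 32).transpose(0, 2, 3, 1).astype("float")

print(images.shape)     # (2, 32, 32, 3)
print(images[0, 0, 0])  # [10. 20. 30.]
```

After the transpose, the last axis of each pixel holds its red, green and blue values, which is the layout matplotlib expects for displaying an image.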

Load data

# Load the raw CIFAR-10 data.
# Note: if this path causes an error on Windows, replace '/' with '\\' or build
# the path with os.path.join as in the commented lines below.
cifar10_dir = 'cs231n/datasets/cifar-10-batches-py'
#import os
#cifar10_dir = os.path.join("cs231n","datasets","cifar-10-batches-py")


# Cleaning up variables to prevent loading data multiple times (which may cause memory issue)
try:
   del X_train, y_train
   del X_test, y_test
   print('Clear previously loaded data.')
except NameError:
   # The variables do not exist yet on the first run.
   pass

X_train, y_train, X_test, y_test = load_CIFAR10(cifar10_dir)

# As a sanity check, we print out the size of the training and test data.
print('Training data shape: ', X_train.shape)
print('Training labels shape: ', y_train.shape)
print('Test data shape: ', X_test.shape)
print('Test labels shape: ', y_test.shape)

out:
Training data shape:  (50000, 32, 32, 3)
Training labels shape:  (50000,)
Test data shape:  (10000, 32, 32, 3)
Test labels shape:  (10000,)
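The cleanup step before loading matters because these arrays are large. A rough estimate, assuming the data is stored as 64-bit floats (which is what astype("float") produces in NumPy), can be sketched as follows. The helper array_megabytes is a hypothetical name introduced for this illustration.

```python
import numpy as np


def array_megabytes(shape, dtype=np.float64):
    """Size in MB (10**6 bytes) of a dense array with the given shape and dtype."""
    num_bytes = int(np.prod(shape, dtype=np.int64)) * np.dtype(dtype).itemsize
    return num_bytes / 1e6


print(array_megabytes((50000, 32, 32, 3)))  # 1228.8
print(array_megabytes((10000, 32, 32, 3)))  # 245.76
```

The training images alone take roughly 1.2 GB, so keeping a second copy around after re-running the cell can noticeably increase memory use.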

# Visualize some examples from the dataset.
# We show a few examples of training images from each class.
classes = ['plane', 'car', 'bird', 'cat', 'deer', 'dog', 'frog', 'horse', 'ship', 'truck']
num_classes = len(classes)
samples_per_class = 7
for y, cls in enumerate(classes):
    idxs = np.flatnonzero(y_train == y)
    idxs