2 Image Classification Methods
Assignment: https://cs231n.github.io/assignments2018/assignment1/
Tools: Python + NumPy
NumPy reference: the tutorial 《Python之Numpy详细教程》
2.1 The Data-Driven Approach
Image classification algorithms require large amounts of data; they are data-driven methods.
Dataset: CIFAR-10, with 50,000 training images and 10,000 test images, each of size 32×32×3.
First classifier: Nearest Neighbor (NN)
Training just memorizes all the data and labels. At prediction time, find the training image most similar to the query image; its label is the prediction.
**Similarity criterion:** the L1 distance or the L2 distance.
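Concretely, comparing two images $I_1$ and $I_2$ pixel by pixel (with $p$ indexing pixels), the two metrics are:

$$
d_1(I_1, I_2) = \sum_p \left| I_1^p - I_2^p \right|
\qquad
d_2(I_1, I_2) = \sqrt{\sum_p \left( I_1^p - I_2^p \right)^2}
$$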
The code is as follows:
import numpy as np

class NearestNeighbor:
    def __init__(self):
        pass

    def train(self, X, y):
        # Memorize the training data and labels
        self.Xtr = X
        self.ytr = y

    def predict(self, X):
        num_test = X.shape[0]
        Ypred = np.zeros(num_test, dtype=self.ytr.dtype)
        for i in range(num_test):
            # L1 distance from the i-th test image to every training image
            distances = np.sum(np.abs(self.Xtr - X[i, :]), axis=1)
            min_index = np.argmin(distances)  # index of the nearest training image
            Ypred[i] = self.ytr[min_index]    # copy its label
        return Ypred
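A minimal usage sketch, assuming arrays shaped like flattened CIFAR-10 images; the random data here is only a stand-in for a real loader:

# Illustrative only: random data shaped like flattened CIFAR-10 images
Xtr = np.random.rand(500, 3072)           # 3072 = 32 * 32 * 3
ytr = np.random.randint(0, 10, size=500)  # 10 class labels
Xte = np.random.rand(10, 3072)

nn = NearestNeighbor()
nn.train(Xtr, ytr)
print(nn.predict(Xte))                    # 10 labels copied from ytr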
2.2 KNN
The NN algorithm is not robust to noise, which motivates KNN: take the K nearest training points and let them vote on the label of the query point. The larger K is, the smoother the decision boundary.
Hyperparameters: K and the choice of distance metric, typically tuned on held-out data (see the cross-validation sketch below).
KNN summary: KNN is not well suited to image classification, because pixel-wise distances carry little information about perceptual similarity, and the method suffers from the curse of dimensionality.
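A minimal cross-validation sketch for tuning K, assuming the KNearestNeighbor class implemented in section 2.4 below and the Xtr/ytr arrays from the sketch above; the fold count and candidate K values are arbitrary choices:

num_folds = 5
k_choices = [1, 3, 5, 8, 10, 12, 15, 20]

X_folds = np.array_split(Xtr, num_folds)
y_folds = np.array_split(ytr, num_folds)

accuracies = {}
for k in k_choices:
    accs = []
    for f in range(num_folds):
        # Hold out fold f for validation, train on the remaining folds
        X_val, y_val = X_folds[f], y_folds[f]
        X_tr = np.concatenate(X_folds[:f] + X_folds[f + 1:])
        y_tr = np.concatenate(y_folds[:f] + y_folds[f + 1:])
        knn = KNearestNeighbor()
        knn.train(X_tr, y_tr)
        accs.append(np.mean(knn.predict(X_val, k=k) == y_val))
    accuracies[k] = np.mean(accs)

best_k = max(accuracies, key=accuracies.get)  # k with the highest mean accuracy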
2.3 Linear Classifiers
Building neural networks is like playing with Lego: components of different kinds can be combined into large convolutional networks, and the linear classifier is the most basic building block.
The linear classifier computes class scores as f(x, W) = Wx + b, where x is a long (flattened) image vector and b is the bias term. The bias does not interact with the training data; it only encodes data-independent preferences for certain classes. For example, if the dataset contains more cats than dogs, the bias element for the cat class will be larger than the others.
One view of the linear classifier is template matching: each class is allowed to learn exactly one template (one row of W). If a class contains several visual variants, the classifier is forced to average over all of them and recognize every variant with that single template. Neural networks and other more complex models do not have this limitation and can therefore reach higher accuracy.
Another view goes back to images as points in a high-dimensional space: the linear classifier tries to draw a linear decision boundary that separates each class from the rest.
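A minimal sketch of the score computation f(x, W) = Wx + b on a CIFAR-10-shaped input; the random W and b are placeholders, not trained parameters:

num_classes, dim = 10, 3072                    # CIFAR-10: 10 classes, 32*32*3 pixels
W = np.random.randn(num_classes, dim) * 0.01   # one template per row
b = np.zeros(num_classes)                      # data-independent class preferences

x = np.random.rand(dim)       # one flattened image
scores = W.dot(x) + b         # 10 class scores
print(scores.argmax())        # predicted class index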
Cases a linear classifier cannot handle (a small demonstration follows this list):
- Parity problems, e.g., separating odd from even
- Multi-class problems where the classes are not linearly separable
- Multimodal data, e.g., a single class that occupies several separate regions of the space
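A small demonstration (my own sketch, not from the lecture) that no linear classifier can solve XOR, the simplest parity problem; even brute-force search over the weights cannot beat 3/4 accuracy:

import numpy as np
import itertools

# XOR: the label is 1 when exactly one input is 1 -- the simplest parity problem
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
y = np.array([0, 1, 1, 0])

best = 0.0
# Brute-force a grid of linear classifiers of the form sign(w.x + b)
for w1, w2, b in itertools.product(np.linspace(-2, 2, 21), repeat=3):
    pred = (X.dot([w1, w2]) + b > 0).astype(int)
    best = max(best, np.mean(pred == y))
print(best)  # 0.75: no line separates XOR perfectly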
2.4 Assignment
KNN
import numpy as np

class KNearestNeighbor(object):
    """ a kNN classifier with L2 distance """

    def __init__(self):
        pass

    def train(self, X, y):
        """
        Train the classifier. For k-nearest neighbors this is just
        memorizing the training data.

        Inputs:
        - X: A numpy array of shape (num_train, D) containing the training data
          consisting of num_train samples each of dimension D.
        - y: A numpy array of shape (num_train,) containing the training labels,
          where y[i] is the label for X[i].
        """
        self.X_train = X
        self.y_train = y

    def predict(self, X, k=1, num_loops=0):
        """
        Predict labels for test data using this classifier.

        Inputs:
        - X: A numpy array of shape (num_test, D) containing test data consisting
          of num_test samples each of dimension D.
        - k: The number of nearest neighbors that vote for the predicted labels.
        - num_loops: Determines which implementation to use to compute distances
          between training points and testing points.

        Returns:
        - y: A numpy array of shape (num_test,) containing predicted labels for the
          test data, where y[i] is the predicted label for the test point X[i].
        """
        if num_loops == 0:
            dists = self.compute_distances_no_loops(X)
        elif num_loops == 1:
            dists = self.compute_distances_one_loop(X)
        elif num_loops == 2:
            dists = self.compute_distances_two_loops(X)
        else:
            raise ValueError('Invalid value %d for num_loops' % num_loops)
        return self.predict_labels(dists, k=k)
    def compute_distances_two_loops(self, X):
        """
        Compute the distance between each test point in X and each training point
        in self.X_train using a nested loop over both the training data and the
        test data.

        Inputs:
        - X: A numpy array of shape (num_test, D) containing test data.

        Returns:
        - dists: A numpy array of shape (num_test, num_train) where dists[i, j]
          is the Euclidean distance between the ith test point and the jth training
          point.
        """
        num_test = X.shape[0]
        num_train = self.X_train.shape[0]
        dists = np.zeros((num_test, num_train))
        for i in range(num_test):
            for j in range(num_train):
                #################################################################
                # TODO:                                                         #
                # Compute the l2 distance between the ith test point and the    #
                # jth training point, and store the result in dists[i, j]. You  #
                # should not use a loop over dimension.                         #
                #################################################################
                # Subtract, square, sum, then take the square root
                dis = X[i, :] - self.X_train[j, :]
                dists[i, j] = np.sqrt(np.sum(dis ** 2))
                #################################################################
                #                        END OF YOUR CODE                       #
                #################################################################
        return dists
    def compute_distances_one_loop(self, X):
        """
        Compute the distance between each test point in X and each training point
        in self.X_train using a single loop over the test data.

        Input / Output: Same as compute_distances_two_loops
        """
        num_test = X.shape[0]
        num_train = self.X_train.shape[0]
        dists = np.zeros((num_test, num_train))
        for i in range(num_test):
            #####################################################################
            # TODO:                                                             #
            # Compute the l2 distance between the ith test point and all        #
            # training points, and store the result in dists[i, :].             #
            #####################################################################
            # Broadcasting subtracts the ith test point from every training row
            dists[i, :] = np.sqrt(np.sum(np.square(self.X_train - X[i, :]), axis=1))
            #####################################################################
            #                         END OF YOUR CODE                          #
            #####################################################################
        return dists
    def compute_distances_no_loops(self, X):
        """
        Compute the distance between each test point in X and each training point
        in self.X_train using no explicit loops.

        Input / Output: Same as compute_distances_two_loops
        """
        num_test = X.shape[0]
        num_train = self.X_train.shape[0]
        dists = np.zeros((num_test, num_train))
        #########################################################################
        # TODO:                                                                 #
        # Compute the l2 distance between all test points and all training      #
        # points without using any explicit loops, and store the result in      #
        # dists.                                                                #
        #                                                                       #
        # You should implement this function using only basic array operations; #
        # in particular you should not use functions from scipy.                #
        #                                                                       #
        # HINT: Try to formulate the l2 distance using matrix multiplication    #
        # and two broadcast sums.                                               #
        #########################################################################
        # Expand (x - t)^2 = x^2 - 2xt + t^2 and broadcast the squared norms
        sq_train = np.sum(np.square(self.X_train), axis=1)
        sq_test = np.sum(np.square(X), axis=1)
        mul = np.multiply(np.matmul(X, self.X_train.T), -2)
        sq_train = np.reshape(sq_train, (1, sq_train.shape[0]))
        sq_test = np.reshape(sq_test, (sq_test.shape[0], 1))
        dists = np.sqrt(mul + sq_train + sq_test)
        #########################################################################
        #                           END OF YOUR CODE                            #
        #########################################################################
        return dists
    def predict_labels(self, dists, k=1):
        """
        Given a matrix of distances between test points and training points,
        predict a label for each test point.

        Inputs:
        - dists: A numpy array of shape (num_test, num_train) where dists[i, j]
          gives the distance between the ith test point and the jth training point.

        Returns:
        - y: A numpy array of shape (num_test,) containing predicted labels for the
          test data, where y[i] is the predicted label for the test point X[i].
        """
        num_test = dists.shape[0]
        y_pred = np.zeros(num_test)
        for i in range(num_test):
            # A list of length k storing the labels of the k nearest neighbors to
            # the ith test point.
            closest_y = []
            #####################################################################
            # TODO:                                                             #
            # Use the distance matrix to find the k nearest neighbors of the    #
            # ith testing point, and use self.y_train to find the labels of     #
            # these neighbors. Store these labels in closest_y.                 #
            # Hint: Look up the function numpy.argsort.                         #
            #####################################################################
            # Sort the distances to all training points in ascending order
            idx = np.argsort(dists[i, :])
            #####################################################################
            # TODO:                                                             #
            # Now that you have found the labels of the k nearest neighbors,    #
            # you need to find the most common label in the list closest_y of   #
            # labels. Store this label in y_pred[i]. Break ties by choosing     #
            # the smaller label.                                                #
            #####################################################################
            # Take the labels of the k closest training points
            closest_y = self.y_train[idx[:k]]
            # Vote: np.bincount counts each label; np.argmax returns the most
            # common one and breaks ties by choosing the smaller label
            y_pred[i] = np.argmax(np.bincount(closest_y))
            #####################################################################
            #                         END OF YOUR CODE                          #
            #####################################################################
        return y_pred
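A quick sanity-check sketch (the shapes and random data are made up) verifying that the three distance implementations agree:

knn = KNearestNeighbor()
X_train = np.random.rand(50, 3072)
y_train = np.random.randint(0, 10, size=50)
X_test = np.random.rand(5, 3072)

knn.train(X_train, y_train)
d2 = knn.compute_distances_two_loops(X_test)
d1 = knn.compute_distances_one_loop(X_test)
d0 = knn.compute_distances_no_loops(X_test)
print(np.allclose(d2, d1), np.allclose(d2, d0))  # True True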
The assignment also covers SVM, softmax, and a neural network; it requires the linear SVM loss, the softmax loss, and backpropagation through the neural network.
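For reference, a minimal vectorized sketch of the two losses on a batch of scores (regularization omitted; the (N, C) score layout is an assumption):

def svm_loss(scores, y, delta=1.0):
    """Multiclass SVM (hinge) loss. scores: (N, C), y: (N,) int labels."""
    N = scores.shape[0]
    correct = scores[np.arange(N), y][:, None]         # (N, 1) correct-class scores
    margins = np.maximum(0, scores - correct + delta)  # hinge margins
    margins[np.arange(N), y] = 0                       # ignore the correct class
    return margins.sum() / N

def softmax_loss(scores, y):
    """Softmax cross-entropy loss. scores: (N, C), y: (N,) int labels."""
    shifted = scores - scores.max(axis=1, keepdims=True)  # numerical stability
    log_probs = shifted - np.log(np.exp(shifted).sum(axis=1, keepdims=True))
    N = scores.shape[0]
    return -log_probs[np.arange(N), y].mean()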
Two-layer neural network: the forward pass computes the class scores, and the backward pass (backpropagation) computes the gradients; a sketch follows.
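A minimal sketch of the forward and backward passes of a two-layer network with a ReLU hidden layer and softmax loss; the layer sizes, random data, and initialization scale are arbitrary choices:

N, D, H, C = 64, 3072, 100, 10           # batch, input dim, hidden dim, classes
X = np.random.randn(N, D)
y = np.random.randint(0, C, size=N)
W1, b1 = 0.01 * np.random.randn(D, H), np.zeros(H)
W2, b2 = 0.01 * np.random.randn(H, C), np.zeros(C)

# Forward pass
h = np.maximum(0, X.dot(W1) + b1)        # ReLU hidden layer, (N, H)
scores = h.dot(W2) + b2                  # class scores, (N, C)
shifted = scores - scores.max(axis=1, keepdims=True)
probs = np.exp(shifted) / np.exp(shifted).sum(axis=1, keepdims=True)
loss = -np.log(probs[np.arange(N), y]).mean()

# Backward pass (backpropagation)
dscores = probs.copy()
dscores[np.arange(N), y] -= 1
dscores /= N                             # gradient of the loss w.r.t. scores
dW2 = h.T.dot(dscores); db2 = dscores.sum(axis=0)
dh = dscores.dot(W2.T)
dh[h <= 0] = 0                           # backprop through the ReLU
dW1 = X.T.dot(dh); db1 = dh.sum(axis=0)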
Code download link: https://download.youkuaiyun.com/download/qq_35494379/12329653