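Hand-Writing a BP Neural Network in numpy: A Classification Problem

Originally published 2024-01-05 20:32:10 · 2.1k views

Licensed under CC 4.0 BY-SA

Tags:

#numpy #neural-network #classification #pandas #machine-learning

Preface

Many begin well; few carry through to the end.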

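1. Problem description

In "Hand-Writing a BP Neural Network in numpy" we built a BP neural network with the shape 5*10*10*5. That model has two hidden layers and uses one-hot encoding for multi-class classification, but it classifies poorly: its prediction accuracy is only 27%.

1.1 Why the prediction accuracy was low

The resource files have been uploaded and are available for download.
We want to measure each subject's programming ability with five indicators, A to E. After normalizing the data, we computed the indicator weights with the entropy weight method and the maximizing-deviation method, combined them by combination weighting to obtain a composite score for each subject, and now assign each subject a programming-ability level based on that score. The levels are described in the figure:
[Figure: description of the ability levels]
The network built in "Hand-Writing a BP Neural Network in numpy" is shown in the figure. We wanted to feed in the five indicators and get the subject's ability level back directly, but that 5*10*10*5 model is flawed: it turns continuous data into discrete data.
[Figure: structure of the 5*10*10*5 network]

Take the level "weak": a subject whose composite score lies between 0 and 0.2 is considered weak, but the 5*10*10*5 model only labels a subject weak when its output is [1,0,0,0,0].
Take the level "average": a subject whose composite score lies between 0.4 and 0.6 is considered average, but the 5*10*10*5 model only labels a subject average when its output is [0,0,1,0,0].

In short, the model predicts poorly because the network was built the wrong way: its one-hot output throws away the continuous score on which the levels are defined.

1.2 Solution

For classifying continuous data, the output layer of the BP network needs only one neuron: the network first outputs the subject's composite score, and the score is then mapped to a level.
In other words, we turn the classification of continuous data into a regression step followed by a classification step. Starting from the network in 1.1, we reduce the output layer to a single neuron and build the network shown in the figure:
[Figure: structure of the 5*10*10*1 network]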
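
The "regression, then classification" idea can be sketched as follows. This is only an illustration: the helper name score_to_level is hypothetical, the levels are returned as integer indices 0 to 4 rather than names, and the bin edges are an assumption copied from the thresholds used in this article's accuracy check (0.2, 0.4, 0.6, 0.7).

```python
import numpy

# Assumed bin edges, taken from the thresholds in the article's accuracy code:
# [0, 0.2), [0.2, 0.4), [0.4, 0.6), [0.6, 0.7), [0.7, 1]
LEVEL_EDGES = numpy.array([0.2, 0.4, 0.6, 0.7])

def score_to_level(scores):
    """Map continuous scores to integer level indices 0..4 (hypothetical helper)."""
    scores = numpy.asarray(scores, dtype=float)
    # numpy.digitize returns i such that edges[i-1] <= score < edges[i]
    return numpy.digitize(scores, LEVEL_EDGES)

levels = score_to_level([0.05, 0.25, 0.5, 0.65, 0.9])
print(levels)  # [0 1 2 3 4]
```

With this split of responsibilities the network only has to learn the score, and the level boundaries can be changed later without retraining.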

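2. Python code

2.1 How a BP neural network works

A BP neural network works in four main steps; see "Hand-Writing a BP Neural Network in numpy" for details.

forward propagation -> compute the error -> backpropagation -> update the weights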

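2.2 Initializing the parameters

    def __init__(self,input,hidden,output):
        '''
            input, hidden, output: number of neurons in the input layer,
            in each hidden layer, and in the output layer
        '''
        self.weight1 = numpy.random.randn(input,hidden)
        self.weight2 = numpy.random.randn(hidden,hidden)
        self.weight3 = numpy.random.randn(hidden,output)
        #accuracy: correctly predicted samples divided by all samples
        self.accuracy = []
        #precision: of the samples predicted as a class, the share that truly belong to it
        self.precision = []
        #recall: of the samples that truly belong to a class, the share predicted correctly
        self.recall = []
        #loss values
        self.loss = []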

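2.3 Forward propagation

2.3.1 Activation function: sigmoid

Both the hidden layers and the output layer use the sigmoid activation function (named sigmod in the code). It is applied during forward propagation and is defined as:
$$\sigma(x) = \frac{1}{1 + e^{-x}}$$

    #sigmoid activation function
    def sigmod(self,x):
        return 1/(1 + numpy.exp(-x))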
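
For inputs with a large negative value, numpy.exp(-x) overflows and numpy emits a RuntimeWarning, even though the final result is still 0. A minimal sketch of a numerically stable variant is shown below; the name stable_sigmoid is hypothetical and this is not part of the article's class. It assumes the input is an array, as it is in the network above.

```python
import numpy

def stable_sigmoid(x):
    """Sigmoid that never exponentiates a large positive number (illustrative)."""
    x = numpy.asarray(x, dtype=float)
    out = numpy.empty_like(x)
    pos = x >= 0
    # For x >= 0, exp(-x) is at most 1, so the usual formula is safe
    out[pos] = 1.0 / (1.0 + numpy.exp(-x[pos]))
    # For x < 0, use the equivalent form exp(x) / (1 + exp(x))
    exp_x = numpy.exp(x[~pos])
    out[~pos] = exp_x / (1.0 + exp_x)
    return out

values = stable_sigmoid(numpy.array([-1000.0, 0.0, 1000.0]))
print(values)
```

Both branches compute the same function; only the form of the expression changes, so the forward and backward passes are unaffected.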

2.3.2 Forward propagation code

Note: in forward propagation, both the hidden layers and the output layer use the sigmoid activation function.

    #forward propagation
    def forward(self, data):
        #store the input and output of every layer for use in backpropagation
        self.hidden1_input = numpy.dot(data, self.weight1)
        self.hidden1_output = self.sigmod(self.hidden1_input)

        self.hidden2_input = numpy.dot(self.hidden1_output,self.weight2)
        self.hidden2_output = self.sigmod(self.hidden2_input)

        self.output_input = numpy.dot(self.hidden2_output,self.weight3)
        self.output_output = self.sigmod(self.output_input)
        return self.output_output

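2.4 Backpropagation (the most important step)

2.4.1 Derivative of the sigmoid activation function

Backpropagation covers the three operations "compute the error -> backpropagation -> update the weights" from 2.1. The derivative of the sigmoid is used in the backward pass:

$$\sigma'(x) = \sigma(x)\,(1 - \sigma(x))$$

    #derivative of the sigmoid; x must already be a sigmoid output, i.e. x = sigmod(z)
    def sigmod_derivative(self, x):
        return x * (1 - x)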
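
Because sigmod_derivative receives the activation value rather than the raw input, it is easy to call it with the wrong argument. The sketch below, written as free functions with hypothetical names, checks the formula against a central finite difference.

```python
import numpy

def sigmoid(z):
    return 1.0 / (1.0 + numpy.exp(-z))

def sigmoid_derivative_from_output(a):
    # Same formula as sigmod_derivative: expects a = sigmoid(z), not z itself
    return a * (1.0 - a)

z = numpy.linspace(-3.0, 3.0, 7)
eps = 1e-6
# Central difference approximation of d sigmoid / dz
numeric = (sigmoid(z + eps) - sigmoid(z - eps)) / (2 * eps)
analytic = sigmoid_derivative_from_output(sigmoid(z))
max_gap = numpy.max(numpy.abs(numeric - analytic))
print(max_gap)
```

The gap should be tiny; passing z directly instead of sigmoid(z) would produce a visibly different result.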

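2.4.2 Loss function: squared error

The loss function measures the gap between the model's outputs and the true values. Based on experience, we use the squared error (half the sum of squared differences) as the loss for the BP network from 1.2:
$$L = \frac{1}{2} \sum_i (y_i - p_i)^2$$

y is the true value (label) of a record;
p is the model's output (predicted value) for that record.

    #half the sum of squared errors, used as the loss
    def loss_mse(self,x,y):
        return 1/2*numpy.sum((x-y)*(x-y))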
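
The factor 1/2 is there so that the derivative comes out clean. As a short worked derivation, assume (notation introduced here, not in the original) that z_i is the weighted input of the output neuron for record i and p_i = sigma(z_i) is its output. Then the chain rule gives the output-layer error term:

```latex
\begin{aligned}
\frac{\partial L}{\partial p_i} &= p_i - y_i \\
\delta_i = \frac{\partial L}{\partial z_i}
  &= \frac{\partial L}{\partial p_i}\,\sigma'(z_i)
   = (p_i - y_i)\,p_i\,(1 - p_i)
\end{aligned}
```

The first line is the raw output error and the second is that error multiplied by the sigmoid derivative.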

2.4.3 Backpropagation code

    #backpropagation
    def backward(self, data, label, learning_ration):
        #derivative of the squared-error loss with respect to the network output
        output_error = self.output_output - label
        #output-layer error term (error multiplied by the activation derivative)
        output_delta = output_error * self.sigmod_derivative(self.output_output)
        #propagate the output-layer error term back to hidden layer 2
        hidden2_error = numpy.dot(output_delta,self.weight3.T) * self.sigmod_derivative(self.hidden2_output)
        #propagate the hidden layer 2 error term back to hidden layer 1
        hidden1_error = numpy.dot(hidden2_error,self.weight2.T) * self.sigmod_derivative(self.hidden1_output)

        #all three error terms are known, so the weights can be updated
        self.weight1 -= numpy.dot(data.T,hidden1_error) * learning_ration
        self.weight2 -= numpy.dot(self.hidden1_output.T, hidden2_error) * learning_ration
        #the gradient for weight3 uses output_delta, consistent with the squared-error loss
        self.weight3 -= numpy.dot(self.hidden2_output.T, output_delta) * learning_ration
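
One way to confirm that hand-written backpropagation formulas match the chosen loss is a finite-difference gradient check. The sketch below is a standalone illustration, not the article's class: it rebuilds a small bias-free three-weight network with free functions (all names hypothetical, sizes chosen arbitrarily) and compares the analytic gradients of the squared-error loss with numerical ones.

```python
import numpy

def sigmoid(z):
    return 1.0 / (1.0 + numpy.exp(-z))

def forward(x, w1, w2, w3):
    h1 = sigmoid(x @ w1)
    h2 = sigmoid(h1 @ w2)
    out = sigmoid(h2 @ w3)
    return h1, h2, out

def loss(x, y, w1, w2, w3):
    out = forward(x, w1, w2, w3)[2]
    return 0.5 * numpy.sum((out - y) ** 2)

def analytic_grads(x, y, w1, w2, w3):
    h1, h2, out = forward(x, w1, w2, w3)
    d3 = (out - y) * out * (1 - out)      # output-layer error term
    d2 = (d3 @ w3.T) * h2 * (1 - h2)      # hidden layer 2 error term
    d1 = (d2 @ w2.T) * h1 * (1 - h1)      # hidden layer 1 error term
    return x.T @ d1, h1.T @ d2, h2.T @ d3

def numeric_grad(x, y, weights, index, eps=1e-6):
    w = weights[index]                    # same array object, perturbed in place
    grad = numpy.zeros_like(w)
    for pos in numpy.ndindex(w.shape):
        old = w[pos]
        w[pos] = old + eps
        plus = loss(x, y, *weights)
        w[pos] = old - eps
        minus = loss(x, y, *weights)
        w[pos] = old                      # restore the original weight
        grad[pos] = (plus - minus) / (2 * eps)
    return grad

rng = numpy.random.default_rng(0)
x = rng.random((6, 5))
y = rng.random((6, 1))
weights = [rng.standard_normal((5, 4)),
           rng.standard_normal((4, 4)),
           rng.standard_normal((4, 1))]
grads = analytic_grads(x, y, *weights)
max_gaps = [float(numpy.max(numpy.abs(g - numeric_grad(x, y, weights, i))))
            for i, g in enumerate(grads)]
print(max_gaps)
```

If any of the three gaps is not close to zero, the corresponding weight update does not follow the gradient of the loss being plotted.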

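2.5 Training the model

Training the model simply means repeating forward propagation and backpropagation to find the best weights (this model has no bias terms). After each "forward + backward" pass we record the model's current loss (computed by the loss function) and its prediction accuracy.

    #train on the data set
    def train(self,data,label,learning_ration,epoch):
        for i in range(epoch):
            output = self.forward(data)
            self.backward(data, label, learning_ration)

            #loss and accuracy of this epoch, based on the output before the weight update
            loss = self.loss_mse(label,output)
            # loss = self.loss_cross_entropy(label, output)
            self.loss.append(loss)

            accuary = self.caculate_accuracy_primal(output,label)
            self.accuracy.append(accuary)
            # print("accuary:",accuary)
            # self.show_weights()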

2.6 Predicting

Once training is finished, all model parameters are fixed, so "predicting" is just one forward pass.

    def predict(self,data):
        return self.forward(data)

2.7 Loss curve and accuracy curve

The loss and accuracy values are recorded during training (2.5); plotting them with matplotlib is all that is left.

    def caculate_accuracy_primal(self,actual_label,label):
        #actual_label: model outputs, label: true scores; both have shape (n, 1)
        actual_label = actual_label.tolist()
        label = label.tolist()
        true_count = 0
        size = len(label)
        for i in range(size):
            al = float(actual_label[i][0])
            l = float(label[i][0])
            # print(f"al is {al} l is {l}")
            #a prediction counts as correct when output and label fall in the same score range
            if al>=0.7 and l>=0.7:
                true_count+=1
            if al>=0.6 and al<0.7 and l>=0.6 and l<0.7:
                true_count+=1
            if al>=0.4 and al<0.6 and l>=0.4 and l<0.6:
                true_count+=1
            if al>=0.2 and al<0.4 and l>=0.2 and l<0.4:
                true_count+=1
            if al>=0.0 and al<0.2 and l>=0.0 and l<0.2:
                true_count+=1
            # print(f"correct: {true_count}, total: {size}")
        return true_count / size

    def show_loss(self):
        # print(self.loss)
        pyplot.title("LOSS")
        pyplot.xlabel("epoch")
        pyplot.ylabel("loss")
        pyplot.plot(self.loss)
        pyplot.show()

    def show_accuracy(self):
        # print(self.accuracy)
        pyplot.title("Accuracy")
        pyplot.xlabel("epoch")
        pyplot.ylabel("accuracy")
        pyplot.plot(self.accuracy)
        pyplot.show()

Note: the accuracy code above compares the model output (predicted value) with the true value (label) using the programming-ability levels from 1.1. Adapt it to your own problem!
[Figure]
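
The class also declares precision and recall lists that are never filled. A minimal sketch of how per-level precision and recall could be derived from the same score ranges is given below. It is an illustration under assumptions, not the author's method: the function names are hypothetical and the bin edges are copied from the thresholds in caculate_accuracy_primal.

```python
import numpy

# Assumed bin edges, copied from the thresholds in caculate_accuracy_primal
LEVEL_EDGES = numpy.array([0.2, 0.4, 0.6, 0.7])
N_LEVELS = len(LEVEL_EDGES) + 1

def confusion_matrix(predicted, target):
    """Rows are true levels, columns are predicted levels."""
    pred_level = numpy.digitize(numpy.ravel(predicted), LEVEL_EDGES)
    true_level = numpy.digitize(numpy.ravel(target), LEVEL_EDGES)
    matrix = numpy.zeros((N_LEVELS, N_LEVELS), dtype=int)
    # add.at accumulates correctly even when an index pair repeats
    numpy.add.at(matrix, (true_level, pred_level), 1)
    return matrix

def precision_recall(matrix):
    correct = numpy.diag(matrix).astype(float)
    predicted_total = matrix.sum(axis=0)
    true_total = matrix.sum(axis=1)
    # Levels that never occur get NaN instead of a division by zero
    precision = numpy.divide(correct, predicted_total,
                             out=numpy.full(N_LEVELS, numpy.nan),
                             where=predicted_total > 0)
    recall = numpy.divide(correct, true_total,
                          out=numpy.full(N_LEVELS, numpy.nan),
                          where=true_total > 0)
    return precision, recall

predicted = numpy.array([[0.15], [0.35], [0.55], [0.72]])
target = numpy.array([[0.10], [0.45], [0.50], [0.90]])
matrix = confusion_matrix(predicted, target)
precision, recall = precision_recall(matrix)
accuracy = numpy.trace(matrix) / matrix.sum()
print(accuracy)  # 0.75
```

The diagonal of the matrix holds the correct predictions, so its sum divided by the sample count reproduces the accuracy used in this article.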

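3. Testing the program

3.1 Loading the data

The first 10 records of the data set are shown in the figure:
[Figure: first 10 records of the data set]

def load_data_primal():
    df = pandas.read_excel("data2.xlsx")
    # five indicator columns are the inputs
    data_temp  = df[["A","B","C","D","E"]]
    # "最终得分" is the composite score column, used as the label
    label_temp = df["最终得分"]
    data = []
    label = []
    for i in range(df.shape[0]):
        data.append(data_temp.iloc[i].to_list())
        # wrap each label in a list so the label array has shape (n, 1)
        temp = []
        temp.append(label_temp[i])
        label.append(temp)
    data = numpy.array(data)
    label = numpy.array(label)
    return data,label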
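
If data2.xlsx is not available, a synthetic stand-in with the same array shapes is enough to run the program. The sketch below is an assumption-laden illustration: the function name, the sample count and especially the indicator weights are made up here and are not the weights the author obtained from combination weighting.

```python
import numpy

def make_synthetic_data(n_samples=48, seed=0):
    """Random stand-in for data2.xlsx (hypothetical helper)."""
    rng = numpy.random.default_rng(seed)
    # five normalized indicators A-E, each in [0, 1)
    data = rng.random((n_samples, 5))
    # made-up weights that sum to 1, so the score also stays in [0, 1)
    weights = numpy.array([0.3, 0.25, 0.2, 0.15, 0.1])
    label = (data @ weights).reshape(-1, 1)
    return data, label

data, label = make_synthetic_data()
print(data.shape, label.shape)  # (48, 5) (48, 1)
```

The returned arrays have the same layout as the output of load_data_primal: one row per record, five input columns and a single label column.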

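We can look more directly at how the table data is represented in Python. The figure below shows the inputs of the first 10 records, stored as a two-dimensional array.
[Figure: inputs of the first 10 records]
The figure below shows the expected outputs of the first 10 records, also stored as a two-dimensional array.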

3.2 Splitting the data set and running the model

The split follows "Hand-Writing a BP Neural Network in numpy". This example has few samples, so the first 75% is used as the training set and the last 25% as the test set.

def mytest():
    data, label = load_data_primal()
    # print(data[0:10],"\n",label[0:10])
    # split into training set (first 75%) and test set (last 25%)
    split = int(len(data) * 3 / 4)
    data_train, data_test = data[:split], data[split:]
    label_train, label_test = label[:split], label[split:]
    # further split of the training set (not used below)
    sub_split = int(len(data_train) * 3 / 4)
    data_train_train, data_train_test = data_train[:sub_split], data_train[sub_split:]
    label_train_train, label_train_test = label_train[:sub_split], label_train[sub_split:]

    # build a BP neural network with two hidden layers
    network = BPNet_one_output(5, 10, 1)
    # train the model
    network.train(data_train, label_train, 0.01, 10000)
    network.show_loss()
    network.show_accuracy()
    # predict
    result = network.predict(data_test)
    print(result)
    print(label_test)
    acc = network.caculate_accuracy_primal(result,label_test)
    print("Accuracy: {:.2f}%".format(acc*100))
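
The split above takes the rows in file order. If the rows of the spreadsheet happen to be sorted (for example by score), shuffling before splitting gives a more representative test set. A minimal sketch, with a hypothetical function name and a fixed seed chosen only for illustration:

```python
import numpy

def shuffled_split(data, label, train_fraction=0.75, seed=0):
    """Shuffle rows, then split into training and test parts (illustrative)."""
    rng = numpy.random.default_rng(seed)
    order = rng.permutation(len(data))   # one shared order keeps data and label aligned
    cut = int(len(data) * train_fraction)
    train_idx, test_idx = order[:cut], order[cut:]
    return data[train_idx], label[train_idx], data[test_idx], label[test_idx]

# toy arrays: row i of data starts with 5*i, and its label is i
data = numpy.arange(40, dtype=float).reshape(8, 5)
label = numpy.arange(8, dtype=float).reshape(8, 1)
data_train, label_train, data_test, label_test = shuffled_split(data, label)
print(len(data_train), len(data_test))  # 6 2
```

Every row ends up in exactly one of the two parts, and each input row stays paired with its own label.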

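3.3 Evaluating the model

3.3.1 Loss curve

The loss decreases steadily as training proceeds.
[Figure: loss curve]

3.3.2 Accuracy curve

As training proceeds, the model's prediction accuracy on the training set rises and eventually approaches 100%.
[Figure: accuracy curve]

3.3.3 Prediction accuracy

In this run the model reached a prediction accuracy of 90.91% on the held-out test set. Because the initial weights are set randomly, repeated runs will not necessarily give the same result.
[Figure: program output showing the prediction accuracy]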
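
To make runs repeatable, the random number generator can be seeded before the weights are drawn. The sketch below mirrors the weight shapes created in __init__ but is a standalone illustration; the function name and the seed value are arbitrary choices.

```python
import numpy

def init_weights(n_input, n_hidden, n_output, seed):
    # The article's __init__ draws from numpy.random.randn, so seeding the
    # global generator right before construction fixes the initial weights
    numpy.random.seed(seed)
    return (numpy.random.randn(n_input, n_hidden),
            numpy.random.randn(n_hidden, n_hidden),
            numpy.random.randn(n_hidden, n_output))

first = init_weights(5, 10, 1, seed=42)
second = init_weights(5, 10, 1, seed=42)
same = all(numpy.array_equal(a, b) for a, b in zip(first, second))
print(same)  # True
```

With identical initial weights and identical data, training is deterministic, so the loss curve and the final accuracy can be compared across code changes.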

4. Complete code

import numpy
import pandas
from matplotlib import pyplot
'''
    A BP neural network with two hidden layers
'''
class BPNet_one_output:
    def __init__(self,input,hidden,output):
        '''
            input, hidden, output: number of neurons in the input layer,
            in each hidden layer, and in the output layer
        '''
        self.weight1 = numpy.random.randn(input,hidden)
        self.weight2 = numpy.random.randn(hidden,hidden)
        self.weight3 = numpy.random.randn(hidden,output)
        #accuracy: correctly predicted samples divided by all samples
        self.accuracy = []
        #precision: of the samples predicted as a class, the share that truly belong to it
        self.precision = []
        #recall: of the samples that truly belong to a class, the share predicted correctly
        self.recall = []
        #loss values
        self.loss = []

    #check whether two lists are identical (1 if so, 0 otherwise)
    def compare(self,list1:list,list2:list):
        if len(list1)!= len(list2):
            return 0
        for i in range(len(list1)):
            if list1[i]!=list2[i]:
                return 0
        return 1

    #accuracy for one-hot outputs (kept from the earlier 5*10*10*5 model)
    def caculate_accuracy(self,actual_label,label):
        true_count = 0
        false_count = 0
        result = []
        for i in range(len(actual_label)):
            # one-hot encode each predicted row
            temp = self.one_hot_encoding(actual_label[i])
            result.append(temp)
        actual_label = result
        size = len(label)
        for i in range(size):
            rs = self.compare(actual_label[i],label[i])
            if rs==1:
                true_count += 1
            else:
                false_count += 1
        # print(f"correct: {true_count}, total: {size}")
        return true_count/size

    def caculate_accuracy_primal(self,actual_label,label):
        #actual_label: model outputs, label: true scores; both have shape (n, 1)
        actual_label = actual_label.tolist()
        label = label.tolist()
        true_count = 0
        size = len(label)
        for i in range(size):
            al = float(actual_label[i][0])
            l = float(label[i][0])
            # print(f"al is {al} l is {l}")
            #a prediction counts as correct when output and label fall in the same score range
            if al>=0.7 and l>=0.7:
                true_count+=1
            if al>=0.6 and al<0.7 and l>=0.6 and l<0.7:
                true_count+=1
            if al>=0.4 and al<0.6 and l>=0.4 and l<0.6:
                true_count+=1
            if al>=0.2 and al<0.4 and l>=0.2 and l<0.4:
                true_count+=1
            if al>=0.0 and al<0.2 and l>=0.0 and l<0.2:
                true_count+=1
            # print(f"correct: {true_count}, total: {size}")
        return true_count / size

    #set the largest entry to 1 and all others to 0 (modifies data in place)
    def one_hot_encoding(self,data:list):
        max = data[0]
        max_index = 0
        for i in range(len(data)):
            if data[i]>max:
                max = data[i]
                max_index = i
        for i in range(len(data)):
            if i==max_index:
                data[i]=1
            else:
                data[i]=0
        return data

    #sigmoid activation function
    def sigmod(self,x):
        return 1/(1 + numpy.exp(-x))

    #derivative of the sigmoid; x must already be a sigmoid output, i.e. x = sigmod(z)
    def sigmod_derivative(self, x):
        return x * (1 - x)

    #softmax activation function
    def softmax(self,x):
        #computed row by row, one sample per row;
        #the row maximum is subtracted first so the exponentials do not overflow
        exps = numpy.exp(x - numpy.max(x,axis=1,keepdims=True))
        return exps/numpy.sum(exps,axis=1,keepdims=True)

    def loss_cross_entropy(self,y,p):
        '''
        :param y: true labels
        :param p: predicted labels
        :return: cross-entropy
        '''
        #add a tiny value to avoid log(0)
        min_data = 1e-60
        # return -1 * numpy.sum(y*numpy.log(p+min_data))
        return -numpy.mean(y*numpy.log(p+min_data))

    def loss_cross_entropy_derivative(self,label_true,label_predict):
        return label_true - label_predict

    #half the sum of squared errors, used as the loss
    def loss_mse(self,x,y):
        return 1/2*numpy.sum((x-y)*(x-y))

    #forward propagation
    def forward(self, data):
        #store the input and output of every layer for use in backpropagation
        self.hidden1_input = numpy.dot(data, self.weight1)
        self.hidden1_output = self.sigmod(self.hidden1_input)

        self.hidden2_input = numpy.dot(self.hidden1_output,self.weight2)
        self.hidden2_output = self.sigmod(self.hidden2_input)

        self.output_input = numpy.dot(self.hidden2_output,self.weight3)
        self.output_output = self.sigmod(self.output_input)
        return self.output_output

    #backpropagation
    def backward(self, data, label, learning_ration):
        #derivative of the squared-error loss with respect to the network output
        output_error = self.output_output - label
        #output-layer error term (error multiplied by the activation derivative)
        output_delta = output_error * self.sigmod_derivative(self.output_output)
        #propagate the output-layer error term back to hidden layer 2
        hidden2_error = numpy.dot(output_delta,self.weight3.T) * self.sigmod_derivative(self.hidden2_output)
        #propagate the hidden layer 2 error term back to hidden layer 1
        hidden1_error = numpy.dot(hidden2_error,self.weight2.T) * self.sigmod_derivative(self.hidden1_output)

        #all three error terms are known, so the weights can be updated
        self.weight1 -= numpy.dot(data.T,hidden1_error) * learning_ration
        self.weight2 -= numpy.dot(self.hidden1_output.T, hidden2_error) * learning_ration
        #the gradient for weight3 uses output_delta, consistent with the squared-error loss
        self.weight3 -= numpy.dot(self.hidden2_output.T, output_delta) * learning_ration

    #train on the data set
    def train(self,data,label,learning_ration,epoch):
        for i in range(epoch):
            output = self.forward(data)
            self.backward(data, label, learning_ration)

            #loss and accuracy of this epoch, based on the output before the weight update
            loss = self.loss_mse(label, output)
            # loss = self.loss_cross_entropy(label, output)
            self.loss.append(loss)
            accuary = self.caculate_accuracy_primal(output, label)
            self.accuracy.append(accuary)
            # print("accuary:",accuary)
            # self.show_weights()

    #predict with the trained model and one-hot encode each output row
    def predict_ont_hot(self,data:list):
        data = self.predict(data)
        result = []
        for i in range(len(data)):
            #convert the numpy.ndarray row to a plain list
            temp = self.one_hot_encoding(data[i].tolist())
            result.append(temp)
        return result

    def predict(self,data):
        return self.forward(data)

    def show_weights(self):
        print(f"{self.weight1}\n{self.weight2}\n{self.weight3}")

    def show_loss(self):
        # print(self.loss)
        pyplot.title("LOSS")
        pyplot.xlabel("epoch")
        pyplot.ylabel("loss")
        pyplot.plot(self.loss)
        pyplot.show()

    def show_accuracy(self):
        # print(self.accuracy)
        pyplot.title("Accuracy")
        pyplot.xlabel("epoch")
        pyplot.ylabel("accuracy")
        pyplot.plot(self.accuracy)
        pyplot.show()

# load the data with one-hot labels (used by the earlier 5*10*10*5 model)
def load_data():
    df = pandas.read_excel("data2.xlsx")
    data_temp  = df[["A","B","C","D","E"]]
    # "评级分" is the level column (1 to 5)
    label_temp = df["评级分"]
    data = []
    label = []
    DIMENSION = len(data_temp.columns)
    for i in range(df.shape[0]):
        data.append(data_temp.iloc[i].to_list())
        # one-hot encode the label
        temp = []
        for j in range(DIMENSION):
            temp.append(0)
        index = label_temp[i]-1
        temp[index] = 1
        # temp.append(label_temp[i])
        label.append(temp)
    data = numpy.array(data)
    label = numpy.array(label)
    return data,label

# load the data with the continuous composite score as the label
def load_data_primal():
    df = pandas.read_excel("data2.xlsx")
    data_temp  = df[["A","B","C","D","E"]]
    # "最终得分" is the composite score column
    label_temp = df["最终得分"]
    data = []
    label = []
    for i in range(df.shape[0]):
        data.append(data_temp.iloc[i].to_list())
        # wrap each label in a list so the label array has shape (n, 1)
        temp = []
        temp.append(label_temp[i])
        label.append(temp)
    data = numpy.array(data)
    label = numpy.array(label)
    return data,label

def test():
    # training data set
    X_train = numpy.array([[0, 0],
                           [0, 1],
                           [1, 0],
                           [1, 1]])
    y_train = numpy.array([[0],
                           [1],
                           [1],
                           [0]])

    # test data set
    X_test = numpy.array([[0, 0],
                          [0, 1],
                          [1, 0],
                          [1, 1]])
    y_test = numpy.array([[0],
                          [1],
                          [1],
                          [0]])
    learning_ration = 0.01
    network = BPNet_one_output(2, 10, 1)
    network.train(X_train, y_train, learning_ration, 50000)
    print(network.predict(X_test))
    network.show_loss()

def mytest():
    data, label = load_data_primal()
    # print(data[0:10],"\n",label[0:10])
    # split into training set (first 75%) and test set (last 25%)
    split = int(len(data) * 3 / 4)
    data_train, data_test = data[:split], data[split:]
    label_train, label_test = label[:split], label[split:]
    # further split of the training set (not used below)
    sub_split = int(len(data_train) * 3 / 4)
    data_train_train, data_train_test = data_train[:sub_split], data_train[sub_split:]
    label_train_train, label_train_test = label_train[:sub_split], label_train[sub_split:]

    # build a BP neural network with two hidden layers
    network = BPNet_one_output(5, 10, 1)
    # train the model
    network.train(data_train, label_train, 0.01, 10000)
    network.show_loss()
    network.show_accuracy()
    # predict
    result = network.predict(data_test)
    print(result)
    print(label_test)
    acc = network.caculate_accuracy_primal(result,label_test)
    print("Accuracy: {:.2f}%".format(acc*100))

if __name__ == '__main__':
    mytest()

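5. Other issues

During training you may run into the two situations below, in which the loss curve does not fall as training proceeds the way we expect. There are two causes:

① An unsuitable loss function (cross-entropy was used).
② The training steps run in the wrong order (in the train method).

5.1 The loss curve is a horizontal line

[Figure: loss curve that stays flat]

5.2 The loss curve rises instead of falling

[Figure: loss curve that increases]

5.3 How to fix it

① Reorder the code in the train method so that each epoch runs forward propagation, then backpropagation, and then records the loss;
② Change the loss function in the train method to the squared error.

    #train on the data set
    def train(self,data,label,learning_ration,epoch):
        for i in range(epoch):
            output = self.forward(data)
            self.backward(data, label, learning_ration)

            loss = self.loss_mse(label, output)
            # loss = self.loss_cross_entropy(label, output)
            self.loss.append(loss)
            accuary = self.caculate_accuracy_primal(output, label)
            self.accuracy.append(accuary)
            # print("accuary:",accuary)
            # self.show_weights()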

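Comments

wrz12345678 2024.08.05
Could you share the data from that xlsx file?
- 进击的墨菲特 replying to wrz12345678 2024.08.05
  Sorry, it has been so long that I no longer remember where that file is. Following the "Loading the data" section, you can generate random data in Excel; as long as the columns match, the program will run normally.
- wrz12345678 replying to 进击的墨菲特 2024.08.06
  It worked, thank you!

秃头的张张 2024.06.13
Hello, for a plain BP regression (8 independent variables, 1 dependent variable), how should the code above be modified?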
