Implementing the Bug Classifier from Section 1.4 of 《Python神经网络编程》

License: CC 4.0 BY-SA

Tags: #neural networks #Python #machine learning #classifier


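This article walks through building the bug classifier described in 《Python神经网络编程》. Modeled on the book's 3-layer neural network implementation, it builds a runnable bug classifier and examines how training and querying work and how the training data is used to adjust the core parameter.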

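I have been reading this book recently, and its bug classifier seemed worth implementing, because the method already contains the core idea behind neural networks.

The implementation process follows.

Following the bug-classifier training process in Chapter 1 of 《Python神经网络编程》 (published by 异步图书), and modeled on the 3-layer neural network implementation in Chapter 2, we build a runnable bug classifier.

First, set up the skeleton of the classifier, which consists of training and querying.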

In [ ]:

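class BugClassifier:
    def __init__(self):
        pass  # parameters are added in the next step

    def train(self, target_bug_size, target_bug):
        pass  # learn from one labeled sample: [width, length] plus the bug name

    def query(self, input_bug_size):
        pass  # classify one sample given as [width, length]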

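At its core, the bug classifier is a linear function. Its main parameter is the slope of that function; the learning rate used during training is a second parameter. Because the book already fixes the relationship between each bug type and the linear boundary (ladybirds lie below the line, caterpillars above it), the matching correction applied to the training target has to be preset as well. Queries use the same relationship to reach a classification. A corollary follows: the classifier does not need to learn which side of the line each bug belongs on, because that is given as prior knowledge.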
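To make the preset correction concrete, here is a minimal sketch of how it shifts a training target. The names `CORRECTIONS` and `training_target` are hypothetical and used only for illustration; the values 0.1 and -0.1 are the ones used in the test code of this article.

```python
# Hypothetical illustration: the preset correction moves the target off the sample,
# so that the trained line passes beside the bug instead of straight through it.
CORRECTIONS = {'ladybird': 0.1, 'caterpillar': -0.1}

def training_target(length, bug):
    """Return the target length the line should hit for this sample."""
    return length + CORRECTIONS[bug]

# A ladybird sits below the line, so its target lies slightly above the sample.
print(training_target(1.0, 'ladybird'))
# A caterpillar sits above the line, so its target lies slightly below the sample.
print(training_target(3.0, 'caterpillar'))
```

With these values the ladybird sample of length 1.0 is trained toward 1.1, and the caterpillar sample of length 3.0 toward 2.9.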

In [ ]:

class BugClassifier:
    def __init__(self, coefficient, learning_rate, corrections):
        self.coefficient = coefficient      # initial slope of the linear function
        self.learning_rate = learning_rate  # learning rate
        self.corrections = corrections      # preset per-bug correction and relation

    def train(self, target_bug_size, target_bug):
        pass

    def query(self, input_bug_size):
        pass

# Test code
bc = BugClassifier(0.25, 0.5,
                   {'ladybird': {'correction': 0.1, 'relation_holds': lambda c, t: c > t},
                    'caterpillar': {'correction': -0.1, 'relation_holds': lambda c, t: c < t}})

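Training works by running a training sample through the classifier and measuring the error against the target. That error is then fed back to adjust the core parameter (the slope of the classifier's linear function), with the step size set by the learning rate chosen at initialization.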
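The update rule behind this can be written out as a short derivation. This is a sketch under assumed notation: $A$ is the slope, $x$ the sample width, $y$ the computed length, $t$ the corrected target length, $E$ the error and $L$ the learning rate. These symbols are chosen here for illustration and do not appear in the code.

```latex
\begin{align}
y &= A x \\
E &= t - y \\
t &= (A + \Delta A)\,x \;\Rightarrow\; E = \Delta A \, x \;\Rightarrow\; \Delta A = \frac{E}{x} \\
A &\leftarrow A + L \cdot \frac{E}{x}
\end{align}
```

The last line is the moderated update: only a fraction $L$ of the full correction $\Delta A$ is applied per sample, so a single sample cannot drag the line all the way to itself.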

In [6]:

class BugClassifier:
    def __init__(self, coefficient, learning_rate, corrections):
        self.coefficient = coefficient      # initial slope of the linear function
        self.learning_rate = learning_rate  # learning rate
        self.corrections = corrections      # preset per-bug correction and relation

    def feedforward(self, width):
        ''' Feedforward: multiply the input width by the slope to get the output length '''
        return width * self.coefficient

    def feedback(self, width, error):
        ''' Feedback: use the input width and the output error to adjust the slope '''
        self.coefficient += self.learning_rate * (error / width)

    def train(self, target_bug_size, target_bug):
        target_bug_width = target_bug_size[0]
        target_bug_length = target_bug_size[1]
        computed_bug_length = self.feedforward(target_bug_width)
        # Error: the target is first shifted by the preset correction for this bug type
        error = (target_bug_length + self.corrections[target_bug]['correction']) - computed_bug_length
        # Feed the error back into the slope
        self.feedback(target_bug_width, error)

    def query(self, input_bug_size):
        pass

# Test code
bc = BugClassifier(0.25, 0.5,
                   {'ladybird': {'correction': 0.1, 'relation_holds': lambda c, t: c > t},
                    'caterpillar': {'correction': -0.1, 'relation_holds': lambda c, t: c < t}})
bc.train([3.0, 1.0], 'ladybird')
print('Slope after the first training sample:', bc.coefficient)
bc.train([1.0, 3.0], 'caterpillar')
print('Slope after the second training sample:', bc.coefficient)

Slope after the first training sample: 0.30833333333333335
Slope after the second training sample: 1.6041666666666667

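As shown, the slopes after both training steps match the results in the book's example, which indicates that the training step reproduces the book's calculation.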
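As a cross-check, the same two updates can be redone with plain arithmetic, independent of the class. This is a minimal sketch; `update_slope` is a hypothetical helper, and the targets 1.1 and 2.9 are the sample lengths 1.0 and 3.0 after the preset corrections of +0.1 and -0.1.

```python
def update_slope(slope, learning_rate, width, target_length):
    """Apply one moderated update: slope += learning_rate * (error / width)."""
    predicted_length = slope * width
    error = target_length - predicted_length
    return slope + learning_rate * (error / width)

# Ladybird sample: width 3.0, corrected target 1.1
slope = update_slope(0.25, 0.5, 3.0, 1.1)
print(round(slope, 4))  # 0.3083

# Caterpillar sample: width 1.0, corrected target 2.9
slope = update_slope(slope, 0.5, 1.0, 2.9)
print(round(slope, 4))  # 1.6042
```

Rounded to four decimals, both values agree with the slopes printed by the test code above.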

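Querying uses the trained classifier. It compares the boundary value computed by the classifier with the observed length of the input and applies the preset relation predicates to produce the result: if the boundary value is greater than the observed length, the bug is a ladybird; if the boundary value is smaller, it is a caterpillar.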
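The decision rule can be sketched as a standalone function before it is wired into the class. The function name `classify` is hypothetical, and returning `None` for a point that lies exactly on the line is an assumption made here, since the rule as stated covers only the two strict inequalities.

```python
def classify(slope, width, length):
    """Classify a bug by comparing its length with the boundary value slope * width."""
    boundary_length = slope * width
    if boundary_length > length:
        return 'ladybird'     # the sample lies below the line
    if boundary_length < length:
        return 'caterpillar'  # the sample lies above the line
    return None               # exactly on the line: neither relation holds

trained_slope = 1.6041666666666667  # slope printed after the two training samples
print(classify(trained_slope, 2.8, 0.9))  # ladybird
print(classify(trained_slope, 1.0, 3.0))  # caterpillar
```

Hard-coding the two comparisons keeps the sketch short; the class in this article instead looks the predicates up in the `corrections` dictionary, so that the bug types stay configurable.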

In [13]:

class BugClassifier:
    def __init__(self, coefficient, learning_rate, corrections):
        self.coefficient = coefficient      # initial slope of the linear function
        self.learning_rate = learning_rate  # learning rate
        self.corrections = corrections      # preset per-bug correction and relation

    def feedforward(self, width):
        ''' Feedforward: multiply the input width by the slope to get the output length '''
        return width * self.coefficient

    def feedback(self, width, error):
        ''' Feedback: use the input width and the output error to adjust the slope '''
        self.coefficient += self.learning_rate * (error / width)

    def train(self, target_bug_size, target_bug):
        target_bug_width = target_bug_size[0]
        target_bug_length = target_bug_size[1]
        computed_bug_length = self.feedforward(target_bug_width)
        # Error: the target is first shifted by the preset correction for this bug type
        error = (target_bug_length + self.corrections[target_bug]['correction']) - computed_bug_length
        # Feed the error back into the slope
        self.feedback(target_bug_width, error)

    def query(self, input_bug_size):
        input_bug_width = input_bug_size[0]
        input_bug_length = input_bug_size[1]
        computed_bug_length = self.feedforward(input_bug_width)
        for bug, correction in self.corrections.items():
            if correction['relation_holds'](computed_bug_length, input_bug_length):
                return bug  # a sample can belong to only one class, so return immediately
        return None  # the sample lies exactly on the line, so no relation holds

# Test code
bc = BugClassifier(0.25, 0.5,
                   {'ladybird': {'correction': 0.1, 'relation_holds': lambda c, t: c > t},
                    'caterpillar': {'correction': -0.1, 'relation_holds': lambda c, t: c < t}})
bc.train([3.0, 1.0], 'ladybird')
print('Slope after the first training sample:', bc.coefficient)
bc.train([1.0, 3.0], 'caterpillar')
print('Slope after the second training sample:', bc.coefficient)
test_case_1 = [2.8, 0.9]
print('Input:', test_case_1, 'Result:', bc.query(test_case_1))

Slope after the first training sample: 0.30833333333333335
Slope after the second training sample: 1.6041666666666667
Input: [2.8, 0.9] Result: ladybird

As shown, the classification succeeded!