Embedded Neural Networks: Beginner's Notes (优快云 blog)

Article link: https://blog.youkuaiyun.com/weixin_42992743/article/details/139006252

Starting with the simplest multilayer perceptron

Preface
1. Dataset
2. Building and training the model
Prediction

Preface

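A multilayer perceptron is a basic neural network structure made up of several fully connected layers. A simple fully connected network looks roughly like the figure below, where w1/w2/w3/w4/w5/w6 are called weights and b1/b2/b3 are called biases. In plain terms, the final output is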

h1 = x1 * w1 + x2 * w3 + b1
h2 = x1 * w2 + x2 * w4 + b2
o1 = h1 * w5 + h2 * w6 + b3

Simple enough, right? It is just a pile of y = kx + b.
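The three equations above can be sketched directly in Python. The weight and bias values here are hypothetical, picked only so the arithmetic is easy to follow; they are not trained values.

```python
# Hypothetical weights and biases, chosen only for illustration
w1, w2, w3, w4, w5, w6 = 0.5, -1.0, 1.5, 2.0, 1.0, 0.5
b1, b2, b3 = 0.1, -0.2, 0.3

def forward(x1, x2):
    """Forward pass of the 2-2-1 fully connected network, no activation."""
    h1 = x1 * w1 + x2 * w3 + b1   # first hidden neuron
    h2 = x1 * w2 + x2 * w4 + b2   # second hidden neuron
    o1 = h1 * w5 + h2 * w6 + b3   # output neuron
    return o1

print(forward(1.0, 2.0))
```

Every neuron is one weighted sum plus a bias, which is why the whole network reduces to a handful of multiply-adds.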

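Figure: fully connected network
Neural networks also have a concept called the activation function. It is usually applied to the output of a layer to introduce nonlinearity. One of the most commonly used, ReLU, is about as blunt as it gets:
ReLU(x)=max(0,x)
In other words, positive values pass through unchanged and negative values become 0.
How is it used? Taking the model above as an example, if we put a ReLU activation after the hidden layer, the output becomes: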
h1 = x1 * w1 + x2 * w3 + b1

h1 = h1 > 0 ? h1 : 0

h2 = x1 * w2 + x2 * w4 + b2

h2 = h2 > 0 ? h2 : 0

o1 = h1 * w5 + h2 * w6 + b3
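Here is a minimal sketch of the same network with ReLU after the hidden layer. The function name `forward_relu`, the way the parameters are packed into tuples, and the numeric values are all assumptions made for illustration.

```python
def relu(x):
    """ReLU: keep positive values, clamp everything else to 0."""
    return x if x > 0 else 0.0

def forward_relu(x1, x2, w, b):
    """w = (w1..w6), b = (b1, b2, b3); hypothetical parameter packing."""
    w1, w2, w3, w4, w5, w6 = w
    b1, b2, b3 = b
    h1 = relu(x1 * w1 + x2 * w3 + b1)
    h2 = relu(x1 * w2 + x2 * w4 + b2)
    return h1 * w5 + h2 * w6 + b3

# Hypothetical parameters: h2 goes negative here, so ReLU zeroes it out
w = (0.5, -1.0, 1.5, -2.0, 1.0, 0.5)
b = (0.1, -0.2, 0.3)
print(forward_relu(1.0, 2.0, w, b))
```

With these values the second hidden neuron is negative before the activation, so it contributes nothing to the output.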

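It is nothing more than basic arithmetic. If a PC can run it, a microcontroller certainly can too; it is only a question of resources and compute.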

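Setting the embedded implementation aside for now, let's do a quick sanity check of the above from first principles.
Time for some ChatGPT-driven programming.

We will implement a neural network model for y = x1 + x2, train it on a dataset, and then compute a prediction by hand.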

1. Dataset

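Before this, I tried building a project with NNoM, an open-source neural network framework (NNoM is a high-level inference neural network library designed specifically for microcontrollers). As an embedded software engineer, I only need a working understanding of neural networks and how to apply them. As far as I can tell so far, the hardest and most important parts of applying a neural network are feature selection, choosing the network model, building the dataset, and quantization.

OK, let's have ChatGPT write a Python script to generate the training dataset.

The generated dataset looks like this (three columns: in0, in1, out0)

The dataset generation script: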

import pandas as pd

# Build every (in0, in1) pair of integers from -100 to 100 inclusive;
# out0 is simply the sum of in0 and in1
rows = [(in0, in1, in0 + in1)
        for in0 in range(-100, 101)
        for in1 in range(-100, 101)]

# Collect the rows into a DataFrame in one go (much faster than appending row by row)
df = pd.DataFrame(rows, columns=['in0', 'in1', 'out0'])

# Write the DataFrame to an Excel file
df.to_excel('output.xlsx', index=False)

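Besides that, we also need to read this Excel file and split the data into a training set and a test set.
I had the big guy (ChatGPT) write that for me too. The function is as follows: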

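import pandas as pd
from sklearn.model_selection import train_test_split

def prepare_data_from_file(file_name, features, target, test_size=0.1, random_state=None):
    """
    Load data from a file and split it into a training set and a test set.
    Parameters:
    file_name: str, name of the file containing the data
    features: list, names of the feature columns
    target: str or list, name(s) of the target column
    test_size: float, fraction of the data used for the test set, default 0.1
    random_state: int or None, random seed, default None
    Returns:
    X_train: array, training set features
    X_test: array, test set features
    y_train: array, training set targets
    y_test: array, test set targets
    """
    df = pd.read_excel(file_name)
    X = df[features].values.astype(int)
    y = df[target].values.astype(int)
    X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=test_size, random_state=random_state)
    return X_train, X_test, y_train, y_test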

2. Building and training the model

import tensorflow as tf
from tensorflow.keras.layers import Input, Dense, Activation
from tensorflow.keras.models import Model

def create_model(input_dim):
    inputs = Input(shape=(input_dim,))

    # Hidden layer: 4 neurons followed by ReLU
    x = Dense(4)(inputs)
    x = Activation('relu')(x)

    # Output layer: a single linear neuron
    x = Dense(1)(x)
    output = x

    # Build the model
    model = Model(inputs=inputs, outputs=output)
    return model

x_train, x_test, y_train, y_test = prepare_data_from_file(r'output.xlsx', ['in0', 'in1'], ['out0'], test_size=0.2)

model = create_model(2)

model.compile(optimizer=tf.keras.optimizers.Adam(),
              loss='mean_absolute_error')
model.summary()
history = model.fit(x_train, y_train,
                    batch_size=100,
                    epochs=50,
                    verbose=1,
                    validation_data=(x_test, y_test),
                    shuffle=True)
model.save('my_model.h5')
print("Model saved.")

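Oh right, there is one more concept here: the choice of loss function. For this model, which predicts the sum of two inputs, the loss function is the mean absolute error. Training a neural network is fairly involved, but the goal is roughly this: find a set of weights and biases that minimizes the loss between the model's outputs and the target outputs in the training set.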
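To make the loss concrete, here is a minimal sketch of mean absolute error computed with NumPy. The helper name and the sample values are made up for illustration; during training Keras computes this internally.

```python
import numpy as np

def mean_absolute_error(y_true, y_pred):
    """Average of |target - prediction| over all samples."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    return np.mean(np.abs(y_true - y_pred))

# Made-up targets and predictions
y_true = [3, -5, 10]
y_pred = [2.5, -4.0, 10.5]
print(mean_absolute_error(y_true, y_pred))
```

The absolute errors here are 0.5, 1.0 and 0.5, so the loss is their average. Training nudges the weights and biases to push this number toward 0.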

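Running the script above produces a my_model.h5 file. It stores the model's information in a defined format, including the weights and biases.
We can inspect the h5 file with Python's h5py library, or simply open it in HDFView.

Figure: model information
Here we use h5py to print out the weights and biases. The script: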

import h5py

# Open the H5 file
h5_file = h5py.File('my_model.h5', 'r')

# Print the file structure (optional)
def print_structure(name, obj):
    print(name)

h5_file.visititems(print_structure)

# Get the weights and biases of the dense layer (hidden layer)
dense_weights = h5_file['model_weights/dense/dense/kernel:0'][:]
dense_biases = h5_file['model_weights/dense/dense/bias:0'][:]

# Get the weights and biases of the dense_1 layer (output layer)
dense1_weights = h5_file['model_weights/dense_1/dense_1/kernel:0'][:]
dense1_biases = h5_file['model_weights/dense_1/dense_1/bias:0'][:]
print("Dense layer weights:")
print(dense_weights)

print("Dense layer biases:")
print(dense_biases)

print("Dense1 layer weights:")
print(dense1_weights)

print("Dense1 layer biases:")
print(dense1_biases)

h5_file.close()

Dense layer weights:

[[ 0.7100434 0.874805 0.6065492 -0.8424878 ]
[ 0.7099347 0.87465626 0.60641664 -0.8427926 ]]

Dense layer biases:

[-0.00402281 -0.0086698 0.21640915 -0.15554184]

Dense1 layer weights:

[[ 0.3898223 ]
[ 0.22344294]
[ 0.8708818 ]
[-1.1870881 ]]

Dense1 layer biases:

[-0.18469849]

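OK, next let's verify that the value computed directly from these weights matches the value predicted by the predict function.

Prediction

For convenience I verify directly in Python; doing it on an embedded platform would work just as well.

First we hard-code the weights and biases of each layer of the trained model. The main thing to watch here is the shape of the matrices. Linear algebra finally comes in handy, doesn't it?

As an aside, on an embedded platform this many floating-point operations would eat up resources fast, so fixed-point arithmetic is an option, at the cost of some precision. Of course, converting and quantizing the weight file sounds complicated just thinking about it.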
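To give a rough feel for the fixed-point idea, here is a minimal sketch that evaluates the first hidden neuron using integers only. The Q16.16 format (16 fractional bits) and all helper names are assumptions made for illustration. This is not how NNoM or any particular library quantizes weights, and a real MCU implementation would also need to check accumulator width and overflow.

```python
FRAC_BITS = 16            # assumed Q16.16 format: 16 fractional bits
SCALE = 1 << FRAC_BITS

def to_fixed(value):
    """Convert a float to a fixed-point integer."""
    return int(round(value * SCALE))

def to_float(fixed):
    """Convert a fixed-point integer back to a float."""
    return fixed / SCALE

def fixed_mul(a, b):
    """Multiply two fixed-point numbers. The raw product carries
    2 * FRAC_BITS fractional bits, so shift back by FRAC_BITS."""
    return (a * b) >> FRAC_BITS

def fixed_neuron(inputs, weights, bias):
    """Integer-only weighted sum: sum(x_i * w_i) + bias."""
    acc = bias
    for x, w in zip(inputs, weights):
        acc += fixed_mul(x, w)
    return acc

# First hidden neuron, using the weights and bias printed above
inputs = [to_fixed(-3.2), to_fixed(9.4)]
weights = [to_fixed(0.7100434), to_fixed(0.7099347)]
bias = to_fixed(-0.00402281)

fixed_result = to_float(fixed_neuron(inputs, weights, bias))
float_result = -3.2 * 0.7100434 + 9.4 * 0.7099347 - 0.00402281
print(fixed_result, float_result)
```

The two printed values should agree to a few decimal places. The small gap is the precision given up in exchange for integer-only arithmetic.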

import numpy as np
# Weights and biases of the hidden layer (dense).
# Note: the extra outer brackets give this array the shape (1, 2, 4).
weights = np.array([
    [
        [ 0.7100434, 0.874805, 0.6065492, -0.8424878 ],
        [ 0.7099347, 0.87465626, 0.60641664, -0.8427926]
    ]
])

biases = np.array([-0.00402281, -0.0086698, 0.21640915, -0.15554184])

# Weights and biases of the output layer (dense_1); shape (1, 4, 1)
weights1 = np.array([
    [
        [ 0.3898223 ],
        [ 0.22344294],
        [ 0.8708818 ],
        [-1.1870881 ]
    ]
])

biases1 = np.array([-0.18469849])

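Next, implement the activation function

# Activation function (ReLU)
def relu(x):
    return np.maximum(0, x)

Pick an arbitrary pair of inputs to check

# Input data
input_data = np.array([-3.2, 9.4])

# Compute the hidden layer output
hidden_output = np.dot(input_data, weights) + biases

# Apply the activation function
hidden_output = relu(hidden_output)

# Compute the output layer
output = np.dot(hidden_output, weights1) + biases1

print("output:", output)

Output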
output: [[[6.20151997]]]
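The triple brackets in that result come from the extra outer bracket on the hard-coded arrays, which gives them shapes (1, 2, 4) and (1, 4, 1). As a sketch of the same computation with plain 2-D matrices (the names `W0`, `b0`, `W1`, `b1` are mine, the values are the ones printed from the h5 file), the shapes line up the way Keras stores them:

```python
import numpy as np

# Hidden layer kernel, shape (2, 4): one row per input, one column per neuron
W0 = np.array([[0.7100434, 0.874805,   0.6065492,  -0.8424878],
               [0.7099347, 0.87465626, 0.60641664, -0.8427926]])
b0 = np.array([-0.00402281, -0.0086698, 0.21640915, -0.15554184])

# Output layer kernel, shape (4, 1)
W1 = np.array([[0.3898223], [0.22344294], [0.8708818], [-1.1870881]])
b1 = np.array([-0.18469849])

x = np.array([[-3.2, 9.4]])             # batch of one sample, shape (1, 2)
hidden = np.maximum(0, x @ W0 + b0)     # shape (1, 4), ReLU applied
out = hidden @ W1 + b1                  # shape (1, 1)
print(out.shape, out)
```

With a (batch, features) input, each layer is one matrix product plus a bias, and the result comes out with the same (1, 1) shape that model.predict returns for a single sample.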

Then let's try the prediction function that comes with tf

import numpy as np
import tensorflow as tf
# Load the model
model = tf.keras.models.load_model('my_model.h5')
# Print the model structure (optional)
model.summary()

# Input data
input_data = np.array([[-3.2, 9.4]])
# Run the prediction
output_data = model.predict(input_data)

# Print the prediction result
print("Prediction output:", output_data)

Output
Prediction output: [[6.2015195]]

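Hmm, the numbers differ slightly. Most likely it comes down to data types: Keras computes in float32 by default, while the NumPy arrays above are float64. I won't dig deeper; the experiment checks out.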
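As a quick check on the size of that gap, the sketch below compares it with the spacing between adjacent float32 values near 6.2. The two numbers are copied from the outputs above; treating float32 rounding as the cause is an assumption, not something verified against Keras internals.

```python
import numpy as np

manual = 6.20151997   # NumPy result printed above (float64 arithmetic)
keras = 6.2015195     # model.predict result printed above

# Gap between neighbouring float32 values around 6.2
ulp = float(np.spacing(np.float32(6.2)))

print(abs(manual - keras), ulp)
```

Both printed values are on the order of 1e-7, so the discrepancy is about one float32 step. That is consistent with single-precision rounding rather than a mistake in the manual calculation.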

Now let's try it once more with a calculator, using x1 = -3.2 and x2 = 9.4

D0 = x1 * 0.7100434 + x2 * 0.7099347 - 0.00402281 = 4.39722449
D1 = x1 * 0.874805 + x2 * 0.87465626 - 0.0086698 = 5.413723044
D2 = x1 * 0.6065492 + x2 * 0.60641664 + 0.21640915 = 3.975768126
D3 = x1 * -0.8424878 + x2 * -0.8427926 - 0.15554184 = -5.38183132

After the ReLU activation function:

D0 = 4.39722449
D1 = 5.413723044
D2 = 3.975768126
D3 = 0

OUT = D0 * 0.3898223 + D1 * 0.22344294 + D2 * 0.8708818 + D3 * -1.1870881 - 0.18469849
= 1.714136164308127 + 1.20965819329710936 + 3.4624241019535068 - 0.18469849
= 6.20151996955874316

Turns out humans are still a bit smarter than AI, which manages to make even 9.4 - 3.2 = 6.2 this complicated~~