import os
import csv
import numpy as np
import string
import pandas as pd
import operator
import re as re
import time
import datetime
def perception_train(train_root, threshold=0.9, lr=1):
    """Train a binary perceptron in its dual form on a CSV dataset.

    The CSV (no header) is expected to hold the label in column 1 (values
    0/1; 0 is remapped to -1) and the feature vector in columns 2 onward.
    Training repeats full passes over the data until the fraction of
    correctly classified samples reaches ``threshold``.

    Dual form (《统计学习方法》 p.34): instead of storing the weight vector,
    ``alpha[0, i]`` accumulates ``lr`` once per update triggered by sample i.
    Because the learning rate is constant, the primal weights are recoverable
    as ``w = sum_i alpha_i * y_i * x_i``, so primal and dual training yield
    identical classifiers.

    Args:
        train_root: path to the header-less training CSV file.
        threshold: target training accuracy in [0, 1]; the loop stops once
            reached. NOTE: never terminates if the data is not separable
            to this accuracy.
        lr: learning rate (step size of each dual update).

    Returns:
        (alpha, bias): dual coefficients as a 1 x n_samples row matrix,
        and the scalar bias term.
    """
    train = pd.read_csv(train_root, header=None)
    train_label = np.mat(train)[:, 1]  # labels as a column vector
    # Remap class 0 to -1 so labels live in {-1, +1}.
    for i in range(np.shape(train_label)[0]):
        if train_label[i, 0] == 0:
            train_label[i, 0] = -1
    train_mat = np.mat(train)[:, 2:]  # feature rows
    # Gram matrix of pairwise inner products <x_j, x_i> — the only place
    # the features enter the dual-form algorithm.
    Gram = np.matmul(train_mat, np.transpose(train_mat))
    sample_amount = np.shape(train_label)[0]
    # Dual coefficients, one per training sample (row vector).
    alpha = np.mat(np.zeros(np.shape(train_mat)[0]))
    bias = np.float64(0)
    print(sample_amount)
    precision = 0
    epoch = 1
    while precision < threshold:
        now = datetime.datetime.now()
        print("epoch " + str(epoch) + " launch: " + now.strftime('%Y-%m-%d_%H_%M_%S'))
        positive_amount = 0
        for i in range(sample_amount):
            # Dual-form prediction: f(x_i) = sum_j alpha_j * y_j * <x_j, x_i> + b.
            # np.multiply is element-wise (with broadcasting); note that
            # np.mat values are always 2-D, hence the [0, 0] scalar extraction.
            predict_label = np.matmul(
                alpha,
                np.transpose(np.multiply(Gram[i], np.transpose(train_label)))
            )[0, 0] + bias
            if predict_label * train_label[i, 0] > 0:
                positive_amount += 1
                continue
            # Misclassified (or on the decision boundary): dual update.
            alpha[0, i] += lr
            bias += lr * train_label[i, 0]
        precision = np.float64(positive_amount) / sample_amount
        print("epoch " + str(epoch) + " finished, precision: " + str(precision))
        epoch += 1
    return alpha, bias
# 《统计学习方法》感知机学习算法原始形式和对偶形式的python实现
# (Python implementation of the perceptron learning algorithm, primal and
#  dual form, following "Statistical Learning Methods" / 《统计学习方法》;
#  source page last updated 2022-07-22 22:45:54)