Naive Bayes from Chapter 4 of 《统计学习方法》 (Statistical Learning Methods). The test data is the same as in the book.
# Each record is [X1, X2, y]
trainingData = [
    [1, 'S', -1], [1, 'M', -1], [1, 'M', 1], [1, 'S', 1], [1, 'S', -1],
    [2, 'S', -1], [2, 'M', -1], [2, 'M', 1], [2, 'L', 1], [2, 'L', 1],
    [3, 'L', 1], [3, 'M', 1], [3, 'M', 1], [3, 'L', 1], [3, 'L', -1]
]
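# Compute the prior probabilities
yP = 0  # number of records with y = 1
for record in trainingData:
    if record[2] == 1:
        yP += 1
yN = len(trainingData) - yP  # number of records with y = -1
yPositive = yP / len(trainingData)
yNegative = 1 - yPositive
precede = [yPositive, yNegative]  # priors, ordered as [P(y=1), P(y=-1)]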
# Collect the distinct values in each column (both feature columns and the label column)
attrSet = []
for i in range(len(trainingData[0])):
    attrs = []
    for j in trainingData:
        attrs.append(j[i])
    attrSet.append(list(set(attrs)))
# print(attrSet)
# Helper: count the records whose columns `indexes` hold the values `attrs`
def countFun(indexes, attrs):
    count = 0
    for i in trainingData:
        flag = True
        for j in range(len(indexes)):
            if i[indexes[j]] != attrs[j]:
                flag = False
        if flag:
            count += 1
    return count
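# Compute the conditional probabilities P(X_i = value | y) for the feature columns only.
# Keys are (column index, value), so equal values in different columns
# (for example 1 in column 0 and the label 1) cannot overwrite each other.
conditional = {}
for i in range(len(attrSet) - 1):
    for j in attrSet[i]:
        conditional[(i, j)] = [countFun([i, 2], [j, 1]) / yP, countFun([i, 2], [j, -1]) / yN]
# print(conditional)
# Given the instance x = (2, 'S'), compute prior * conditional probabilities for each class
x = [2, 'S']
labels = [1, -1]  # same order as precede and as each entry of conditional
res = {}
for i in range(2):
    temp = 1
    temp *= precede[i]
    for j in range(len(attrSet) - 1):
        temp *= conditional[(j, x[j])][i]
    res[labels[i]] = temp
print(res)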
Output:
{1: 0.02222222222222222, -1: 0.06666666666666667}
Since the value for y = -1 is the larger of the two, x = (2, 'S') is assigned to class y = -1.
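The two numbers above are the products of the prior and the conditional probabilities, not normalized posteriors. As a cross-check, here is a minimal sketch that redoes the same computation with exact fractions. The names `training_data` and `joint_scores` are introduced here for illustration only and are not part of the original listing; no smoothing is applied, matching the code above.

```python
from collections import Counter
from fractions import Fraction

# Same training data as in the listing above: each record is [X1, X2, y].
training_data = [
    [1, 'S', -1], [1, 'M', -1], [1, 'M', 1], [1, 'S', 1], [1, 'S', -1],
    [2, 'S', -1], [2, 'M', -1], [2, 'M', 1], [2, 'L', 1], [2, 'L', 1],
    [3, 'L', 1], [3, 'M', 1], [3, 'M', 1], [3, 'L', 1], [3, 'L', -1],
]


def joint_scores(data, x):
    """Return {label: P(y) * prod_i P(X_i = x_i | y)} as exact fractions.

    Hypothetical helper for illustration; plain frequency estimates, no smoothing.
    """
    label_counts = Counter(record[-1] for record in data)
    scores = {}
    for label, n_label in label_counts.items():
        score = Fraction(n_label, len(data))  # prior P(y)
        for i, value in enumerate(x):
            matches = sum(1 for r in data if r[-1] == label and r[i] == value)
            score *= Fraction(matches, n_label)  # P(X_i = value | y)
        scores[label] = score
    return scores


scores = joint_scores(training_data, [2, 'S'])
total = sum(scores.values())
# Dividing by the total turns the scores into posterior probabilities.
posterior = {label: score / total for label, score in scores.items()}

print(scores[1], scores[-1])          # 1/45 1/15
print(posterior[1], posterior[-1])    # 1/4 3/4
print(max(scores, key=scores.get))    # -1
```

The fractions 1/45 and 1/15 correspond to the floating-point values 0.0222... and 0.0666... printed above. After normalization the instance belongs to y = -1 with probability 3/4, so the predicted class is unchanged.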
Reference:
https://blog.youkuaiyun.com/weixin_42363997/article/details/85060134
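This post walks through the naive Bayes classifier from 《统计学习方法》, computing the prior and conditional probabilities from concrete training data and classifying a given instance. In the example, the product of the prior and the conditional probabilities for the instance (2, 'S') is 0.06666666666666667 for class y = -1, which is higher than the corresponding value for y = 1, so the instance is assigned to y = -1.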



