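Using a Decision Tree to Predict Contact Lens Type
I've been studying decision trees lately, and along the way I found two books that are really useful for learning machine learning, so I'm recommending them here: Machine Learning in Action and Zhou Zhihua's Machine Learning (the links are just ones I picked at random from an online marketplace).
Here is the contact lens dataset, which took me a little while to track down online: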
young myope no reduced no lenses
young myope no normal soft
young myope yes reduced no lenses
young myope yes normal hard
young hyper no reduced no lenses
young hyper no normal soft
young hyper yes reduced no lenses
young hyper yes normal hard
pre myope no reduced no lenses
pre myope no normal soft
pre myope yes reduced no lenses
pre myope yes normal hard
pre hyper no reduced no lenses
pre hyper no normal soft
pre hyper yes reduced no lenses
pre hyper yes normal no lenses
presbyopic myope no reduced no lenses
presbyopic myope no normal no lenses
presbyopic myope yes reduced no lenses
presbyopic myope yes normal hard
presbyopic hyper no reduced no lenses
presbyopic hyper no normal soft
presbyopic hyper yes reduced no lenses
presbyopic hyper yes normal no lenses
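Just paste these 24 lines into a text file of your own. The code below expects the file to be named lenses.txt and splits each line on tab characters.
The decision tree prediction code is as follows: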
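import importlib

import trees
import treePlotter

# Reload both modules so that edits made during an interactive session take effect
importlib.reload(trees)
importlib.reload(treePlotter)

# Use a decision tree to predict the contact lens type
with open('lenses.txt', 'r') as fr:
    lenses = [inst.strip().split('\t') for inst in fr.readlines()]

# Tidy up the rows read from lenses.txt: remove the spaces inside every field
for value in lenses:
    for n in range(len(value)):
        value[n] = value[n].replace(' ', '')
print(lenses)

lensesLabels = ['age', 'prescript', 'astigmatic', 'tearRate']
lensesTree = trees.createTree(lenses, lensesLabels)
treePlotter.createPlot(lensesTree)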
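To make the parsing step concrete, here is a minimal sketch of what it does to a single line. It assumes the columns are separated by tabs and that a field may contain spaces. The helper name parse_lens_line is made up for this illustration and is not part of trees.py.

```python
def parse_lens_line(line):
    """Split one tab-separated record and remove every space inside each field."""
    fields = line.strip().split('\t')
    return [field.replace(' ', '') for field in fields]


# Assumed sample line: tab-separated, with a space inside the class label.
sample = "young\tmyope\tno\treduced\tno lenses\n"
print(parse_lens_line(sample))
# -> ['young', 'myope', 'no', 'reduced', 'nolenses']
```

Note that the class label "no lenses" turns into "nolenses" after this step, so that is the spelling that ends up on the leaves of the tree.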
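Because lenses.txt contains spaces and each line has to be split into columns, the code uses a for loop together with strip() and split(). The final result is shown in the figure below: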
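Drawing the tree with Matplotlib annotations makes the classification process clear at a glance and shows off one of the strengths of decision trees. Of course, trees.createTree() and treePlotter.createPlot() were written beforehand. Since Machine Learning in Action uses Python 2 (as far as I can tell), running the book's code under Python 3 raises errors in a few places, so I'm also attaching my modified trees.py and treePlotter.py below. Note: I'm using Python 3.5.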
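One typical example of such an error, shown here as an illustration and not necessarily the exact change made in the attached files: the book's code reads the root feature of a tree with myTree.keys()[0]. That works in Python 2 but raises a TypeError in Python 3, because dict.keys() now returns a view that cannot be indexed. The sketch below walks a nested-dict tree of the kind createTree() builds. The tree here is a hand-written toy, not the real output for the lens data, and the helper name classify_sample is hypothetical.

```python
def classify_sample(tree, feature_names, sample):
    """Walk a nested-dict decision tree until a leaf (a plain label) is reached."""
    # Python 2 allowed tree.keys()[0]; in Python 3 wrap the view in list() first.
    feature = list(tree.keys())[0]
    branches = tree[feature]
    value = sample[feature_names.index(feature)]
    subtree = branches[value]
    if isinstance(subtree, dict):
        return classify_sample(subtree, feature_names, sample)
    return subtree


# Toy tree written by hand for illustration only (not the tree learned from lenses.txt).
toy_tree = {'tearRate': {'reduced': 'nolenses',
                         'normal': {'astigmatic': {'yes': 'hard', 'no': 'soft'}}}}
names = ['age', 'prescript', 'astigmatic', 'tearRate']
print(classify_sample(toy_tree, names, ['young', 'myope', 'yes', 'normal']))
# -> hard
```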
trees.py:
# Compute the Shannon entropy
from math import log
def calcShannonEnt(dataSet):