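Lately I've wanted to write a bit about machine learning. I'm a freshman when it comes to ML, so this blog is just a record of my learning process.
Let me start with classifiers.
Defining the data format
The data format I use is .arff. I seem to remember seeing Weka use this format during my internship at Tencent. An example of the format: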
@RELATION TENNIS
@ATTRIBUTE outlook {Sunny, Overcast, Rain}
@ATTRIBUTE temperature {Hot, Mild, Cool}
@ATTRIBUTE humidity {High, Normal, Low}
@ATTRIBUTE wind {Weak, Strong}
@ATTRIBUTE play {Yes, No}
@DATA
Sunny,Hot,High,Weak,No
Sunny,Hot,High,Strong,No
Overcast,Hot,High,Weak,Yes
Rain,Mild,High,Weak,Yes
Rain,Cool,Normal,Weak,Yes
Rain,Cool,Normal,Strong,No
Overcast,Cool,Normal,Strong,Yes
Sunny,Mild,High,Weak,No
Sunny,Cool,Normal,Weak,Yes
Rain,Mild,Normal,Weak,Yes
Sunny,Mild,Normal,Strong,Yes
Overcast,Mild,High,Strong,Yes
Overcast,Hot,Normal,Weak,Yes
Rain,Mild,High,Strong,No
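
To get a feel for the format, here is a rough sketch of how a file like this could be read into memory. This is not Weka's loader, just a minimal reader I would expect to work for this kind of file. The names `parse_arff` and `SAMPLE` are made up for illustration. It assumes every attribute is nominal (a `{...}` list of values), ignores missing values and quoting, and lowercases all values so that `Sunny` and `sunny` are treated as the same value.

```python
from collections import Counter

# A shortened sample in the same format (four data rows only).
SAMPLE = """\
@RELATION TENNIS
@ATTRIBUTE outlook {sunny, overcast, rain}
@ATTRIBUTE temperature {hot, mild, cool}
@ATTRIBUTE humidity {high, normal, low}
@ATTRIBUTE wind {weak, strong}
@ATTRIBUTE play {yes, no}
@DATA
Sunny,Hot,High,Weak,No
Sunny,Hot,High,Strong,No
Overcast,Hot,High,Weak,Yes
Rain,Mild,High,Weak,Yes
"""


def parse_arff(text):
    """Parse a nominal-only ARFF string into (relation, attributes, rows).

    Hypothetical helper: handles only @RELATION, nominal @ATTRIBUTE
    declarations, and comma-separated @DATA rows.
    """
    relation = None
    attributes = []  # list of (name, [allowed values]) in declaration order
    rows = []        # list of dicts mapping attribute name -> value
    in_data = False
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("%"):  # skip blank lines and comments
            continue
        if in_data:
            # Assumption: normalize case so 'Sunny' matches 'sunny'.
            values = [v.strip().lower() for v in line.split(",")]
            if len(values) != len(attributes):
                raise ValueError(f"expected {len(attributes)} values: {line!r}")
            row = {}
            for (name, allowed), value in zip(attributes, values):
                if value not in allowed:
                    raise ValueError(f"{value!r} is not declared for {name!r}")
                row[name] = value
            rows.append(row)
            continue
        keyword = line.split(None, 1)[0].lower()  # keywords are case-insensitive
        if keyword == "@relation":
            relation = line.split(None, 1)[1].strip()
        elif keyword == "@attribute":
            _, name, spec = line.split(None, 2)
            if not (spec.startswith("{") and spec.endswith("}")):
                raise ValueError(f"only nominal attributes are handled: {line!r}")
            allowed = [v.strip().lower() for v in spec[1:-1].split(",")]
            attributes.append((name, allowed))
        elif keyword == "@data":
            in_data = True
    return relation, attributes, rows


relation, attributes, rows = parse_arff(SAMPLE)
print(relation, len(attributes), len(rows))
print(Counter(row["play"] for row in rows))  # class distribution of the label
```

Each row comes back as a dict keyed by attribute name, and the last attribute (`play`) is the class label a classifier would try to predict.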