The sample set is as follows:
[table]
|Day|Outlook|Temperature|Humidity|Wind|PlayTennis
|D1|Sunny|Hot|High|Weak|No
|D2|Sunny|Hot|High|Strong|No
|D3|Overcast|Hot|High|Weak|Yes
|D4|Rain|Mild|High|Weak|Yes
|D5|Rain|Cool|Normal|Weak|Yes
|D6|Rain|Cool|Normal|Strong|No
|D7|Overcast|Cool|Normal|Strong|Yes
|D8|Sunny|Mild|High|Weak|No
|D9|Sunny|Cool|Normal|Weak|Yes
|D10|Rain|Mild|Normal|Weak|Yes
|D11|Sunny|Mild|Normal|Strong|Yes
|D12|Overcast|Mild|High|Strong|Yes
|D13|Overcast|Hot|Normal|Weak|Yes
|D14|Rain|Mild|High|Strong|No
[/table]
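The sample data set provides 14 training samples. We will use the data in this table with a naive Bayes classifier to classify the following new instance: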
x = (Outlook=Sunny,Temperature=Cool,Humidity=High,Wind=Strong)
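In this example the attribute vector is X=(Outlook,Temperature,Humidity,Wind) and the class set is Y={Yes,No}. We need to use the training data to compute the posterior probabilities P(Yes|x) and P(No|x): if P(Yes|x)>P(No|x), the new instance is classified as Yes, otherwise as No.
To compute the posterior probabilities, we need the prior probabilities P(Yes) and P(No) and the class-conditional probabilities P(xi|Y).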
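The reason only the priors and the class-conditional probabilities are needed can be written out as a short formula. This is the standard naive Bayes rule, which assumes the attributes are conditionally independent given the class. The symbols x_i (the i-th attribute value of x) and \hat{y} (the predicted class) are notation introduced here for illustration.

```latex
P(Y \mid x) = \frac{P(Y)\,\prod_{i=1}^{4} P(x_i \mid Y)}{P(x)}
\quad\Longrightarrow\quad
\hat{y} = \arg\max_{Y \in \{\mathrm{Yes},\,\mathrm{No}\}} \; P(Y)\prod_{i=1}^{4} P(x_i \mid Y)
```

Because P(x) is the same for both classes, it can be dropped when the two posteriors are only being compared.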
The prior probabilities are computed as follows:
Since 9 samples belong to Yes and 5 samples belong to No, P(Yes)=9/14 and P(No)=5/14.
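The prior calculation can be checked with a few lines of Python. This is a minimal sketch: the list `labels` is just the PlayTennis column of the table typed in by hand, and the variable names are illustrative.

```python
from collections import Counter
from fractions import Fraction

# PlayTennis labels for D1..D14, copied by hand from the table.
labels = ["No", "No", "Yes", "Yes", "Yes", "No", "Yes",
          "No", "Yes", "Yes", "Yes", "Yes", "Yes", "No"]

# Prior of each class = (number of samples in the class) / (total samples).
counts = Counter(labels)
priors = {label: Fraction(n, len(labels)) for label, n in counts.items()}

print(priors["Yes"], priors["No"])  # 9/14 5/14
```

Using `Fraction` keeps the results exact, so they can be compared directly with the hand calculation.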
The class-conditional probabilities are computed as follows:
[code]
P(Outlook=Sunny|Yes)=2/9; P(Outlook=Sunny|No)=3/5;
P(Temperature=Cool|Yes)=3/9; P(Temperature=Cool|No)=1/5;
P(Humidity=High|Yes)=3/9; P(Humidity=High|No)=4/5;
P(Wind=Strong|Yes)=3/9; P(Wind=Strong|No)=3/5;
[/code]
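The class-conditional probabilities are plain relative frequencies within each class, so they can be counted directly from the table. The sketch below assumes the table is typed in as a list of tuples. `DATA`, `ATTRS` and `cond_prob` are hypothetical names chosen for this example, and no smoothing is applied, to match the hand calculation.

```python
from fractions import Fraction

# Rows D1..D14 as (Outlook, Temperature, Humidity, Wind, PlayTennis).
DATA = [
    ("Sunny", "Hot", "High", "Weak", "No"),
    ("Sunny", "Hot", "High", "Strong", "No"),
    ("Overcast", "Hot", "High", "Weak", "Yes"),
    ("Rain", "Mild", "High", "Weak", "Yes"),
    ("Rain", "Cool", "Normal", "Weak", "Yes"),
    ("Rain", "Cool", "Normal", "Strong", "No"),
    ("Overcast", "Cool", "Normal", "Strong", "Yes"),
    ("Sunny", "Mild", "High", "Weak", "No"),
    ("Sunny", "Cool", "Normal", "Weak", "Yes"),
    ("Rain", "Mild", "Normal", "Weak", "Yes"),
    ("Sunny", "Mild", "Normal", "Strong", "Yes"),
    ("Overcast", "Mild", "High", "Strong", "Yes"),
    ("Overcast", "Hot", "Normal", "Weak", "Yes"),
    ("Rain", "Mild", "High", "Strong", "No"),
]
ATTRS = ("Outlook", "Temperature", "Humidity", "Wind")


def cond_prob(attr, value, label):
    """Fraction of rows with class `label` whose `attr` equals `value`."""
    col = ATTRS.index(attr)
    rows = [row for row in DATA if row[-1] == label]
    matches = sum(1 for row in rows if row[col] == value)
    return Fraction(matches, len(rows))


# The new instance x from the text.
x = {"Outlook": "Sunny", "Temperature": "Cool",
     "Humidity": "High", "Wind": "Strong"}

for attr, value in x.items():
    print(attr, value, cond_prob(attr, value, "Yes"), cond_prob(attr, value, "No"))
```

Note that `Fraction` reduces its result, so 3/9 is printed as 1/3.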
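The posterior probabilities are computed as follows (the common denominator P(x) is omitted because it is the same for both classes and does not affect the comparison):
[code]
P(Yes|x) ∝ P(Outlook=Sunny|Yes)×P(Temperature=Cool|Yes)×P(Humidity=High|Yes)
×P(Wind=Strong|Yes)×P(Yes) = 2/9×3/9×3/9×3/9×9/14=2/243×9/14=1/189≈0.00529
P(No|x) ∝ P(Outlook=Sunny|No)×P(Temperature=Cool|No)×P(Humidity=High|No)
×P(Wind=Strong|No)×P(No) = 3/5×1/5×4/5×3/5×5/14=18/875≈0.02057
[/code]
The calculation gives P(No|x)>P(Yes|x), so the sample is classified as No.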
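The final arithmetic can be reproduced exactly with Python's `fractions` module. This sketch takes the priors and class-conditional probabilities from the text as given. The names `priors`, `likelihoods`, `scores` and `posteriors` are illustrative, and the normalization step at the end is an addition not carried out in the hand calculation.

```python
from fractions import Fraction as F
from math import prod

# Values taken from the worked example.
priors = {"Yes": F(9, 14), "No": F(5, 14)}
likelihoods = {
    # Sunny, Cool, High, Strong
    "Yes": [F(2, 9), F(3, 9), F(3, 9), F(3, 9)],
    "No": [F(3, 5), F(1, 5), F(4, 5), F(3, 5)],
}

# Unnormalized score per class: P(Y) * product of P(x_i | Y).
scores = {label: priors[label] * prod(likelihoods[label]) for label in priors}

# Dividing by the sum of the scores gives posteriors that add up to 1.
total = sum(scores.values())
posteriors = {label: score / total for label, score in scores.items()}

prediction = max(scores, key=scores.get)

print(scores["Yes"], scores["No"])  # 1/189 18/875
print(prediction)                   # No
```

After normalization the posteriors are 125/611 ≈ 0.205 for Yes and 486/611 ≈ 0.795 for No, which leads to the same decision.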