Naive Bayes Algorithm Study Notes

These notes are for my own study and understanding only.

Naive Bayes classifier (a classification method)

Bayes' formula

$$P(A \mid B)=\frac{P(A, B)}{P(B)}=\frac{P(B \mid A) P(A)}{P(B)}$$
where:
$P(A)$: the prior probability
$P(A \mid B)$: the posterior probability
$P(B \mid A)$: the probability of B given that event A has occurred, i.e. the likelihood
$P(B)$: the evidence factor; it is the same for every class label, so it does not depend on the class
Expanding $P(B)$ with the law of total probability over the events $A_{j}$ gives:
$$P\left(A_{i} \mid B\right)=\frac{P\left(B \mid A_{i}\right) P\left(A_{i}\right)}{\sum_{j} P\left(B \mid A_{j}\right) P\left(A_{j}\right)}$$
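
As a quick numerical check of the expanded formula, the sketch below computes the posterior over a two-event partition. The priors and likelihoods are made-up illustrative numbers, not values from these notes, and `posterior` is a hypothetical helper name.

```python
def posterior(priors, likelihoods):
    """Return P(A_i | B) for every i, given P(A_i) and P(B | A_i)."""
    # Numerators P(B | A_i) * P(A_i)
    joint = [l * p for l, p in zip(likelihoods, priors)]
    # Evidence P(B), obtained by summing the numerators over all A_j
    evidence = sum(joint)
    return [j / evidence for j in joint]


# Assumed numbers for illustration: P(A_1) = 0.3, P(A_2) = 0.7
priors = [0.3, 0.7]
# Assumed likelihoods: P(B | A_1) = 0.8, P(B | A_2) = 0.1
likelihoods = [0.8, 0.1]

post = posterior(priors, likelihoods)
print(post)
```

Because the denominator is the sum of the numerators, the returned posteriors always add up to 1.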

Classification problem

See the referenced post "朴素贝叶斯算法(带例题解释)" (Naive Bayes algorithm, explained with worked examples).
Let a sample have $m$ features $x=(x^{1}, \ldots, x^{m})$ with observed values $X=(a_{1}, \ldots, a_{m})$, and let the class label $y$ take one of the values $c_{1}, \ldots, c_{K}$.
Bayes' formula can then be written as:
$$P\left(y=c_{n} \mid x=X\right)=\frac{P\left(x=X, y=c_{n}\right)}{P(x=X)}=\frac{P\left(x=X \mid y=c_{n}\right) P\left(y=c_{n}\right)}{P(x=X)}$$
Assuming the attributes are mutually independent given the class:
$$P\left(x=X \mid y=c_{n}\right)=\prod_{i=1}^{m} P\left(x^{i}=a_{i} \mid y=c_{n}\right)$$
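
The factorization can be sketched in a couple of lines. The per-feature conditional probabilities below are assumed values for one fixed class, chosen only for illustration.

```python
from math import prod

# Assumed per-feature conditionals P(x^i = a_i | y = c_n) for one fixed class c_n
feature_conditionals = [0.5, 0.5]

# Under the independence assumption the joint likelihood is simply their product
likelihood = prod(feature_conditionals)
print(likelihood)  # 0.25
```

The practical benefit is that each feature only needs its own small table of conditional probabilities, rather than one table over every combination of feature values.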
Naive Bayes formula: the probability that $y$ takes each value, given the feature set $x$.
$$P\left(y=c_{n} \mid x=X\right)=\frac{P\left(y=c_{n}\right) \prod_{i=1}^{m} P\left(x^{i}=a_{i} \mid y=c_{n}\right)}{P(x=X)}$$
The value of $y$ that maximizes this conditional probability is taken as the prediction:
$$f(x)=\arg \max _{c_{n}}\left(\frac{P\left(y=c_{n}\right) \prod_{i=1}^{m} P\left(x^{i}=a_{i} \mid y=c_{n}\right)}{P(x=X)}\right)$$
The evidence factor $P(x=X)$ does not depend on the class label, so it can be dropped:
$$f(x)=\arg \max _{c_{n}}\left(P\left(y=c_{n}\right) \prod_{i=1}^{m} P\left(x^{i}=a_{i} \mid y=c_{n}\right)\right)$$
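
A small sketch of why dropping the evidence factor is harmless: dividing every class score by the same positive constant cannot change which class is largest. The scores and class names below are assumed values for illustration.

```python
# Assumed unnormalized scores P(y=c_n) * prod_i P(x^i=a_i | y=c_n), one per class
scores = {"c1": 0.02, "c2": 0.05, "c3": 0.01}

# Under the model, the evidence P(x=X) is the sum of these scores over all classes
evidence = sum(scores.values())
posteriors = {c: s / evidence for c, s in scores.items()}

# The maximizing class is the same with or without the division
best_unnormalized = max(scores, key=scores.get)
best_normalized = max(posteriors, key=posteriors.get)
print(best_unnormalized, best_normalized)  # c2 c2
```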
The probabilities are estimated from the $N$ training samples by counting, where $I(\cdot)$ is the indicator function, $i$ indexes the samples and $j$ indexes the features:
$$P\left(y=c_{n}\right)=\frac{\sum_{i=1}^{N} I\left(y_{i}=c_{n}\right)}{N}, \quad n=1,2, \ldots, K$$
$$P\left(x^{j}=a_{j} \mid y=c_{n}\right)=\frac{\sum_{i=1}^{N} I\left(x_{i}^{j}=a_{j},\ y_{i}=c_{n}\right)}{\sum_{i=1}^{N} I\left(y_{i}=c_{n}\right)}$$
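
The two counting estimates can be sketched with `collections.Counter`. The helper names `estimate_prior` and `estimate_conditional` are hypothetical, and the seven encoded samples are the price and sales columns of the worked example in these notes.

```python
from collections import Counter


def estimate_prior(labels):
    """P(y = c) as the fraction of samples whose label is c."""
    counts = Counter(labels)
    n = len(labels)
    return {c: k / n for c, k in counts.items()}


def estimate_conditional(feature, labels):
    """P(x^j = a | y = c) as count(x^j = a and y = c) / count(y = c)."""
    class_counts = Counter(labels)
    joint_counts = Counter(zip(feature, labels))
    return {(a, c): k / class_counts[c] for (a, c), k in joint_counts.items()}


labels = [2, 2, 2, 0, 1, 2, 1]   # sales level of each sample
feature = [0, 2, 0, 0, 1, 2, 0]  # price level of each sample

prior = estimate_prior(labels)
cond = estimate_conditional(feature, labels)
print(prior[2], cond[(2, 2)])
```

Value and class pairs that never occur together in the training data have no entry in `cond`, so a lookup such as `cond.get((2, 0), 0.0)` treats their estimated probability as zero.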

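Worked example 1

See the referenced post "朴素贝叶斯算法的代码实例实现(python)" (A code example implementing the naive Bayes algorithm in Python).

| Price A | Class hours B | Sales C |
| --- | --- | --- |
| 0 | 2 | 2 |
| 2 | 1 | 2 |
| 0 | 0 | 2 |
| 0 | 1 | 0 |
| 1 | 1 | 1 |
| 2 | 2 | 2 |
| 0 | 0 | 1 |

Encoding: for price and sales, 0 = low, 1 = medium, 2 = high; for class hours, 0 = few, 1 = medium, 2 = many.

Predict the sales level when price A = 2 (high) and class hours B = 2 (many).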

# Map the textual levels to the integer codes 0, 1, 2
PRICE_CODES = {"low": 0, "medium": 1, "high": 2}
TIME_CODES = {"few": 0, "medium": 1, "many": 2}
SALE_CODES = {"low": 0, "medium": 1, "high": 2}


def set_data(price, time, sale):
    """Encode the three label lists as lists of integers.

    An unknown label raises KeyError instead of being silently dropped.
    """
    price_number = [PRICE_CODES[i] for i in price]
    time_number = [TIME_CODES[j] for j in time]
    sale_number = [SALE_CODES[k] for k in sale]
    return price_number, time_number, sale_number


price = ["low", "high", "low", "low", "medium", "high", "low"]
time = ["many", "medium", "few", "medium", "medium", "many", "few"]
sale = ["high", "high", "high", "low", "medium", "high", "medium"]

price_number, time_number, sale_number = set_data(price, time, sale)
print(price_number, time_number, sale_number)

$$P(C=0 \mid x=X) \propto P(C=0) P(A=2 \mid C=0) P(B=2 \mid C=0)$$
$$P(C=1 \mid x=X) \propto P(C=1) P(A=2 \mid C=1) P(B=2 \mid C=1)$$
$$P(C=2 \mid x=X) \propto P(C=2) P(A=2 \mid C=2) P(B=2 \mid C=2)$$
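
Plugging in the counts from the table makes the comparison concrete. This is worked arithmetic under the plain counting estimates, assuming the seven rows listed in the example and no smoothing.

```latex
\begin{aligned}
P(C=0)\,P(A=2 \mid C=0)\,P(B=2 \mid C=0) &= \tfrac{1}{7}\cdot\tfrac{0}{1}\cdot\tfrac{0}{1} = 0 \\
P(C=1)\,P(A=2 \mid C=1)\,P(B=2 \mid C=1) &= \tfrac{2}{7}\cdot\tfrac{0}{2}\cdot\tfrac{0}{2} = 0 \\
P(C=2)\,P(A=2 \mid C=2)\,P(B=2 \mid C=2) &= \tfrac{4}{7}\cdot\tfrac{2}{4}\cdot\tfrac{2}{4} = \tfrac{1}{7} \approx 0.143
\end{aligned}
```

The largest value belongs to $C=2$, so the predicted sales level is high.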

price_number = [0, 2, 0, 0, 1, 2, 0]
time_number = [2, 1, 0, 1, 1, 2, 0]
sale_number = [2, 2, 2, 0, 1, 2, 1]

# Query sample: price A = 2 (high), class hours B = 2 (many)
exprice_number = 2
extime_number = 2

n_samples = len(sale_number)
scores = []
for c in (0, 1, 2):
    # Number of training samples whose sales level is c
    count_c = sale_number.count(c)
    # Samples of class c whose price matches the query
    count_a = sum(1 for p, s in zip(price_number, sale_number)
                  if s == c and p == exprice_number)
    # Samples of class c whose class hours match the query
    count_b = sum(1 for t, s in zip(time_number, sale_number)
                  if s == c and t == extime_number)

    pc = count_c / n_samples  # prior P(C=c)
    pa = count_a / count_c    # P(A=2 | C=c)
    pb = count_b / count_c    # P(B=2 | C=c)
    scores.append(pc * pa * pb)

indf = tuple(scores)
print(indf)

# The class with the largest score is the prediction
max_indf = indf.index(max(indf))
sale_labels = ('low sales', 'medium sales', 'high sales')
print(sale_labels[max_indf])
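
The script above hard-codes two features and one query. As a minimal sketch of the same counting procedure for any number of categorical features, the hypothetical helper `naive_bayes_scores` below takes the feature columns, the labels, and a query tuple. It applies no smoothing, and it assumes every class in `labels` occurs at least once, which holds by construction.

```python
def naive_bayes_scores(feature_columns, labels, query):
    """Return {c: P(y=c) * prod_j P(x^j = query[j] | y=c)} from raw counts."""
    n = len(labels)
    scores = {}
    for c in sorted(set(labels)):
        # Indices of the training samples that belong to class c
        rows = [i for i, y in enumerate(labels) if y == c]
        score = len(rows) / n  # prior P(y=c)
        for column, value in zip(feature_columns, query):
            matches = sum(1 for i in rows if column[i] == value)
            score *= matches / len(rows)  # P(x^j = value | y=c)
        scores[c] = score
    return scores


price_number = [0, 2, 0, 0, 1, 2, 0]
time_number = [2, 1, 0, 1, 1, 2, 0]
sale_number = [2, 2, 2, 0, 1, 2, 1]

scores = naive_bayes_scores([price_number, time_number], sale_number, (2, 2))
prediction = max(scores, key=scores.get)
print(scores, prediction)
```

For the query (2, 2) only class 2 receives a nonzero score, so the prediction is 2 (high sales), matching the script.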

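Referenced posts

朴素贝叶斯分类算法简单实例 (A simple example of the naive Bayes classification algorithm)
朴素贝叶斯算法(带例题解释) (Naive Bayes algorithm, explained with worked examples)
朴素贝叶斯算法的代码实例实现(python) (A code example implementing the naive Bayes algorithm in Python)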
