Let the training sample set be $x_i,\ i=1,2,\cdots,n$.
Case 1
Suppose the available training samples all belong to a single class, while the test set may also contain samples of a second class. How can the test samples be classified correctly when only one class is present during training?
This is where the one-class SVM comes in. Its basic idea is to compute, from the training samples, a hypersphere of minimal radius that encloses all of the training samples. When this hypersphere is then used to classify the test set, samples falling inside it are assigned to the first class and samples falling outside it to the second class.
In this ideal setting, the hypersphere is obtained by solving:
$$\begin{cases} \min\ F(R,a)=R^2 \\ \text{s.t.}\ ||x_i-a||^2\leq R^2 \end{cases}\tag{1}$$
where
$$||x_i-a||^2=\sum_{j=1}^m|x_i^{(j)}-a^{(j)}|^2\tag{2}$$
Here $n$ is the number of sample points and $m$ is the dimension of each sample point.
The decision function is:
$$f(x)=\mathrm{sgn}(R^2-||x-a||^2)\tag{3}$$
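As a quick illustration of decision rule (3), here is a minimal NumPy sketch; the center `a` and radius `R` are hypothetical values standing in for the solution of (1), not computed from data.

```python
import numpy as np

# Hypothetical solution of (1): center a and radius R of the minimal enclosing hypersphere.
a = np.array([0.0, 0.0])
R = 1.5

def classify(x, a, R):
    """Decision rule (3): +1 if x falls inside the hypersphere, -1 if outside."""
    return np.sign(R**2 - np.sum((x - a)**2))

print(classify(np.array([0.5, 0.5]), a, R))   # inside  -> 1.0
print(classify(np.array([2.0, 2.0]), a, R))   # outside -> -1.0
```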
Case 2
The situation described above is rather extreme: every sample in the training set belongs to the first class. In practice the training set usually contains a small number of second-class samples, and very likely some noisy data as well. If the classification hypersphere were computed exactly as above, these second-class samples and noise points would be enclosed too, shifting the center of the hypersphere and inflating its radius; a hypersphere obtained this way would classify the test set with lower accuracy. To address this, slack variables $\xi_i$ and a penalty parameter $C$ are introduced: the slack variables allow some training points to lie outside the hypersphere, while the penalty parameter controls this trade-off and gives the model its noise-suppression capability.
In this case, the hypersphere is obtained by solving:
$$\begin{cases} \min\ F(R,a,\xi_i)=R^2+C\sum_{i=1}^n\xi_i \\ \text{s.t.}\ ||x_i-a||^2\leq R^2+\xi_i,\quad \xi_i\geq 0 \end{cases}\tag{4}$$
The decision function is again:
$$f(x)=\mathrm{sgn}(R^2-||x-a||^2)\tag{5}$$
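Problem (4) is convex if $R^2$ is treated as a single decision variable, so it can be handed directly to a generic convex solver. Below is a minimal sketch using CVXPY; the choice of CVXPY, the toy data `X`, and the value of `C` are all illustrative assumptions rather than part of the derivation.

```python
import cvxpy as cp
import numpy as np

# Toy one-class training data: n samples of dimension m.
rng = np.random.default_rng(0)
X = rng.normal(size=(30, 2))
n, m = X.shape
C = 0.1                               # penalty parameter

R2 = cp.Variable(nonneg=True)         # stands for R^2
a = cp.Variable(m)                    # center of the hypersphere
xi = cp.Variable(n, nonneg=True)      # slack variables xi_i

# Primal problem (4): minimize R^2 + C * sum(xi) subject to ||x_i - a||^2 <= R^2 + xi_i.
constraints = [cp.sum_squares(X[i] - a) <= R2 + xi[i] for i in range(n)]
problem = cp.Problem(cp.Minimize(R2 + C * cp.sum(xi)), constraints)
problem.solve()

print("center a =", a.value, " R^2 =", R2.value)
```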
Solution
Case 2 generalizes Case 1, so we solve Case 2 directly.
The Lagrangian of the optimization problem above is:
$$L(R,a,\xi)=R^2+C\sum_{i=1}^n\xi_i+\sum_{i=1}^n\alpha_i(||x_i-a||^2-R^2-\xi_i)-\sum_{i=1}^n\beta_i\xi_i\tag{6}$$
where $\alpha_i\geq 0$ and $\beta_i\geq 0$ are Lagrange multipliers.
Setting the partial derivatives with respect to $R$, $a$, and $\xi$ to zero gives:
• Setting $\frac{\partial L}{\partial R}=0$ gives $\sum_{i=1}^n\alpha_i=1$. The detailed derivation:
$$\frac{\partial L}{\partial R}=2R-2R\sum_{i=1}^n\alpha_i=0$$
• Setting $\frac{\partial L}{\partial a}=0$ gives $a=\sum_{i=1}^n\alpha_ix_i$. The detailed derivation:
$$\frac{\partial L}{\partial a}=2\sum_{i=1}^n\alpha_i(x_i-a)\frac{\partial (x_i-a)}{\partial a}=2\sum_{i=1}^n\alpha_i(x_i-a)(-1)=0$$
which, combined with $\sum_{i=1}^n\alpha_i=1$ from the previous step, yields $a=\sum_{i=1}^n\alpha_ix_i$.
• Setting $\frac{\partial L}{\partial \xi_i}=0$ gives $0\leq\alpha_i\leq C$. The detailed derivation: differentiating with respect to each $\xi_i$,
$$\frac{\partial L}{\partial \xi_i}=C-\alpha_i-\beta_i=0$$
so $\alpha_i=C-\beta_i$. Since $\alpha_i\geq0$ and $\beta_i\geq0$, the multiplier $\beta_i$ can be eliminated, leaving $0\leq\alpha_i\leq C$.
Collecting these conditions:
$$\begin{cases} \sum_{i=1}^n\alpha_i=1 \\ a=\sum_{i=1}^n\alpha_ix_i \\ 0\leq\alpha_i\leq C \end{cases}\tag{7}$$
Substituting the second equality of (7) into (6) and maximizing (6) with respect to $\alpha_i$ yields:
$$\begin{cases} \max\ \sum_{i=1}^n\alpha_i(x_i\cdot x_i)-\sum_{i=1}^n\sum_{j=1}^n\alpha_i\alpha_j(x_i\cdot x_j) \\ \text{s.t.}\ \sum_{i=1}^n\alpha_i=1,\quad 0\leq\alpha_i\leq C \end{cases}\tag{8}$$
Solving this problem yields the $\alpha_i$; substituting them into the second equality of (7) then gives the center $a$. Typically only a small number of the $\alpha_i$ are strictly positive, and the $x_i$ corresponding to these $\alpha_i>0$ are the support vectors.
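The dual (8) is a quadratic program in the $\alpha_i$ alone, so it can also be solved with an off-the-shelf convex solver. The sketch below again uses CVXPY on toy data (both are illustrative assumptions); the quadratic term $\alpha^\top K\alpha$ is written as $||X^\top\alpha||^2$ to keep the formulation solver-friendly.

```python
import cvxpy as cp
import numpy as np

# Toy one-class training data (same illustrative setup as before).
rng = np.random.default_rng(0)
X = rng.normal(size=(30, 2))
n = X.shape[0]
C = 0.1

K = X @ X.T                        # Gram matrix of inner products (x_i . x_j)
alpha = cp.Variable(n)

# Dual (8): maximize sum_i alpha_i (x_i . x_i) - sum_ij alpha_i alpha_j (x_i . x_j)
# subject to sum_i alpha_i = 1 and 0 <= alpha_i <= C.
objective = cp.Maximize(np.diag(K) @ alpha - cp.sum_squares(X.T @ alpha))
constraints = [cp.sum(alpha) == 1, alpha >= 0, alpha <= C]
cp.Problem(objective, constraints).solve()

alpha_val = alpha.value
support = np.where(alpha_val > 1e-6)[0]   # indices with alpha_i > 0: the support vectors
print("support vector indices:", support)
```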
Once the center is known, the distance from a sample point $x$ to the center can be computed:
$$D^2=||x-a||^2\tag{9}$$
Substituting $a=\sum_{i=1}^n\alpha_ix_i$ into (9) gives $D$:
$$D^2=(x\cdot x)-2\sum_{i=1}^n\alpha_i(x\cdot x_i)+\sum_{i=1}^n\sum_{j=1}^n\alpha_i\alpha_j(x_i\cdot x_j)\tag{10}$$
Next, substituting any support vector $x_m$ lying on the hypersphere (i.e. one with $0<\alpha_m<C$) into (10) gives the radius $R$ of the hypersphere:
$$R^2=(x_m\cdot x_m)-2\sum_{i=1}^n\alpha_i(x_m\cdot x_i)+\sum_{i=1}^n\sum_{j=1}^n\alpha_i\alpha_j(x_i\cdot x_j)\tag{11}$$
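Given the $\alpha_i$ from a dual solver such as the sketch above, the center and radius follow directly from (7) and (11). The helper functions below are a sketch of that step; `alpha` is assumed to be a NumPy array holding the dual solution.

```python
import numpy as np

def svdd_center(alpha, X):
    """Center a = sum_i alpha_i x_i, the second equality in (7)."""
    return alpha @ X

def svdd_radius_sq(alpha, X, C, tol=1e-6):
    """Radius R^2 via (11), evaluated at a support vector on the boundary.

    Assumes at least one alpha_i satisfies 0 < alpha_i < C; those points lie
    exactly on the hypersphere.
    """
    K = X @ X.T
    s = np.where((alpha > tol) & (alpha < C - tol))[0][0]
    return K[s, s] - 2 * (alpha @ K[:, s]) + alpha @ K @ alpha
```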
For the nonlinear case, one simply replaces the inner products in (10) and (11) (and in the dual (8)) with a suitable kernel function $k(x,y)$.
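As an illustration of this kernel substitution, the sketch below computes the kernelized distance (10) with an RBF kernel; the kernel choice and the gamma value are arbitrary assumptions.

```python
import numpy as np

def rbf(x, y, gamma=0.5):
    """Illustrative RBF kernel k(x, y) = exp(-gamma * ||x - y||^2)."""
    return np.exp(-gamma * np.sum((x - y) ** 2))

def dist_sq_kernel(z, alpha, X, gamma=0.5):
    """Kernelized (10): D^2 = k(z,z) - 2 sum_i alpha_i k(z,x_i) + sum_ij alpha_i alpha_j k(x_i,x_j)."""
    k_zx = np.array([rbf(z, x_i, gamma) for x_i in X])
    K = np.array([[rbf(x_i, x_j, gamma) for x_j in X] for x_i in X])
    return rbf(z, z, gamma) - 2 * (alpha @ k_zx) + alpha @ K @ alpha
```

The kernelized $R^2$ is obtained the same way, by evaluating this expression at a boundary support vector as in (11).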
The final decision function is:
$$f(x)=\mathrm{sgn}(R^2-D^2)\tag{12}$$
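In practice one rarely solves the QP by hand. As a usage illustration, scikit-learn's `OneClassSVM` implements the closely related ν-SVM formulation of one-class classification, which is equivalent to the hypersphere model above when an RBF kernel is used; the parameter values below are arbitrary.

```python
import numpy as np
from sklearn.svm import OneClassSVM

rng = np.random.default_rng(0)
X_train = rng.normal(size=(100, 2))           # one-class training data
X_test = np.array([[0.1, 0.2], [5.0, 5.0]])   # one typical point, one outlier

# nu upper-bounds the fraction of training points allowed outside the boundary,
# playing a role analogous to the penalty parameter C above.
clf = OneClassSVM(kernel="rbf", gamma=0.5, nu=0.1).fit(X_train)
print(clf.predict(X_test))   # +1 = first class (inside), -1 = second class (outside)
```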
For the derivatives of the norms used above, see 常用范数求导 (derivatives of common norms).
