1. Proof of the theorem from the PPT slides
Problem: prove that, starting from $w_0 = 0$, after $T$ updates (mistake corrections) the following holds, where $w_f$ is a weight vector that classifies every training point correctly:

$$\frac{w_f^T w_T}{\|w_f\|\,\|w_T\|} \ge \sqrt{T}\cdot \text{constant}$$
- Figure 1 below takes the inner product of the two vectors $w_f$ and $w_t$ to show that $w_t$ gets closer and closer to $w_f$ as the updates proceed. See Figure 1:

- However, a growing inner product could simply come from the components of $w_t$ growing in magnitude. To show that the two vectors truly become more aligned, we also need to control the length of $w_t$. See Figure 2:
To prove $\frac{w_f^T w_T}{\|w_f\|\,\|w_T\|} \ge \sqrt{T}\cdot \text{constant}$ and to find the constant, proceed in two steps.

First, from Figure 2: an update is made only on a misclassified point, i.e. when $y_n w_t^T x_n \le 0$, so

$$\|w_{t+1}\|^2 = \|w_t + y_n x_n\|^2 = \|w_t\|^2 + 2\,y_n w_t^T x_n + \|y_n x_n\|^2 \le \|w_t\|^2 + \max_n \|y_n x_n\|^2.$$

Applying this $T$ times gives $\|w_{t+T}\|^2 \le \|w_t\|^2 + T\cdot \max_n \|y_n x_n\|^2$. Taking $t = 0$ with $w_0 = 0$, this becomes $\|w_T\|^2 \le T\cdot \max_n \|y_n x_n\|^2$, and therefore

$$\frac{1}{\|w_T\|} \ge \frac{1}{\sqrt{T}\cdot \max_n \|y_n x_n\|}.$$

Second, from Figure 1: each update adds $y_n w_f^T x_n$ to the inner product with $w_f$, so

$$w_f^T w_{t+1} = w_f^T w_t + y_n w_f^T x_n \ge w_f^T w_t + \min_n y_n w_f^T x_n.$$

Applying this $T$ times gives $w_f^T w_{t+T} \ge w_f^T w_t + T\cdot \min_n y_n w_f^T x_n$. Taking $t = 0$ with $w_0 = 0$, we get $w_f^T w_T \ge T\cdot \min_n y_n w_f^T x_n$, and therefore

$$\frac{w_f^T w_T}{\|w_f\|} \ge \frac{T\cdot \min_n y_n w_f^T x_n}{\|w_f\|}.$$

Multiplying the two inequalities (all quantities involved are positive, since $w_f$ separates every point correctly and hence $\min_n y_n w_f^T x_n > 0$),

$$\frac{w_f^T w_T}{\|w_f\|\,\|w_T\|} \ge \sqrt{T}\cdot \frac{\min_n y_n w_f^T x_n}{\|w_f\|\,\max_n \|y_n x_n\|},$$

which also gives the expression for the constant:

$$\text{constant} = \frac{\min_n y_n w_f^T x_n}{\|w_f\|\,\max_n \|y_n x_n\|}.$$
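The bound can also be checked numerically. The sketch below is a minimal illustration (not from the slides): it builds a small linearly separable 2-D dataset, uses the generating weight vector as a stand-in for $w_f$, runs PLA from $w_0 = 0$, and asserts after every update $t$ that $\frac{w_f^T w_t}{\|w_f\|\,\|w_t\|} \ge \sqrt{t}\cdot \text{constant}$ with the constant derived above. The helper names (`make_separable`, `check_bound`) and the toy data are made up for this check.

```python
import numpy as np

def make_separable(n=50, seed=0):
    """Toy 2-D data (plus a constant feature) separable by w_f; illustrative only."""
    rng = np.random.RandomState(seed)
    w_f = np.array([0.5, 2.0, -1.0])                  # plays the role of the perfect separator w_f
    X = np.hstack([np.ones((n, 1)), rng.uniform(-1, 1, size=(n, 2))])
    margin = X.dot(w_f)
    keep = np.abs(margin) > 0.1                       # keep a clear margin so PLA converges quickly
    return X[keep], np.sign(margin[keep]), w_f

def check_bound(X, y, w_f):
    """Run PLA from w_0 = 0 and check cos(w_f, w_t) >= sqrt(t) * constant after every update."""
    rho = np.min(y * X.dot(w_f))                      # min_n y_n w_f^T x_n  (> 0 since separable)
    R = np.max(np.linalg.norm(X, axis=1))             # max_n ||y_n x_n|| = max_n ||x_n||
    constant = rho / (np.linalg.norm(w_f) * R)
    w, t = np.zeros(X.shape[1]), 0
    while True:
        pred = np.where(X.dot(w) < 0, -1, 1)          # same sign convention as the code below
        mistakes = np.flatnonzero(pred != y)
        if len(mistakes) == 0:
            break
        n = mistakes[0]                               # correct the first mistake found
        w, t = w + y[n] * X[n], t + 1
        cos = w_f.dot(w) / (np.linalg.norm(w_f) * np.linalg.norm(w))
        assert cos >= np.sqrt(t) * constant - 1e-12
    print("PLA converged after %d updates; the bound held at every update." % t)

X, y, w_f = make_separable()
check_bound(X, y, w_f)
```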
2. PLA code implementation
- Download the data and place it in the same directory as the code; on Linux, open a terminal and run:
wget https://raw.githubusercontent.com/lxrobot/lxrobot-s-code/master/train_data.txt
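Before running the script, it can help to confirm the file's layout. The quick check below is not part of the original post; it only assumes what the script itself relies on, namely whitespace-delimited rows with four feature columns followed by a +1/-1 label and a 300/100 train/test split:

```python
import pandas as pd

# Quick look at the downloaded file (assumed layout: 4 features + a +/-1 label per row).
df = pd.read_csv('train_data.txt', delim_whitespace=True,
                 names=['x0', 'x1', 'x2', 'x3', 'y'])
print(df.shape)           # the script below assumes 400 rows (300 train / 100 test)
print(df['y'].unique())   # labels should be +1 and -1
```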
"""
Created on Sat Jul 14 17:59:47 2018
@author: lx
"""
import pandas as pd
import numpy as np
def getData(filename):
    # Whitespace-delimited file: four feature columns followed by a +1/-1 label.
    df = pd.read_csv(filename, delim_whitespace=True,
                     names=['x0', 'x1', 'x2', 'x3', 'y'])
    x = np.asarray(df)
    np.random.shuffle(x)               # shuffles whole rows; random.shuffle corrupts 2-D arrays
    x_train = x[:300, :-1]
    y_train = x[:300, -1]
    x_test = x[300:400, :-1]
    y_test = x[300:400, -1]
    return x_train, y_train, x_test, y_test
def sign(x):
    # Perceptron output: map a negative score to -1, everything else to +1.
    return -1 if x < 0 else 1
def naive_PLA(X, Y, w, b, alpha, max_steps):
    # Cyclic PLA: scan the samples, correct the first mistake found, repeat.
    num = len(X)
    step = max_steps                   # reported as-is if the loop never converges
    for i in range(max_steps):
        mistake = False
        for j in range(num):
            y_ = w.dot(X[j]) + b
            if sign(y_) != Y[j]:
                print("The loss at step %d is %5.5f." % (i, y_ * Y[j]))
                mistake = True
                w += alpha * Y[j] * X[j]   # move w towards the misclassified point's label
                b += alpha * Y[j]
                break
        if not mistake:                # a full pass without mistakes: converged
            step = i
            break
    return w, b, step
def getAccuracy(X, Y, w, b):
    # Predict every test sample and compare against the ground-truth labels.
    y_ = []
    for i in range(len(Y)):
        y_.append(sign(X[i].dot(w) + b))
    y_ = np.array(y_, dtype=float)
    wrong = np.flatnonzero(y_ - Y)     # indices where prediction != label
    acc = 1.0 - len(wrong) / len(Y)
    return y_, acc
if __name__ == '__main__':
    filename = 'train_data.txt'
    x_train, y_train, x_test, y_test = getData(filename)
    w = np.zeros((4,))
    b = 0.0
    alpha = 0.00001
    max_step = 100000
    w, b, step = naive_PLA(x_train, y_train, w, b, alpha, max_step)
    print("The actual number of training steps is {}".format(step))
    y_, acc = getAccuracy(x_test, y_test, w, b)
    print(y_[:15])
    print(y_test[:15])
    print("The accuracy of PLA is %2.4f%%." % (acc * 100))