A Simple Application of SVM

License: CC 4.0 BY-SA

Go to https://www.python.org/downloads/windows/ and find the Python version that matches your system.

I downloaded Python 3.5 for 64-bit Windows.

Then go to http://www.lfd.uci.edu/~gohlke/pythonlibs/#numpy and download the matching packages.

My machine runs Python 3.5, so I downloaded the following:

numpy:

numpy-1.13.1+mkl-cp35-cp35m-win_amd64.whl (here cp35 means the wheel is built for Python 3.5, and the trailing win_amd64 means 64-bit Windows)

scipy:

scipy-0.19.1-cp35-cp35m-win_amd64.whl
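If you are not sure which wheel matches your interpreter, a small script can work out the tags for you. This is only a sketch: the helper names `expected_wheel_tags` and `wheel_matches` are made up for illustration, and it assumes CPython on Windows (the `cp` prefix and the `win_amd64` / `win32` platform tags).

```python
import struct
import sys


def expected_wheel_tags():
    """Return the (python_tag, platform_tag) a matching Windows wheel should carry."""
    # Python 3.5 gives "cp35", Python 3.6 gives "cp36", and so on
    python_tag = "cp{}{}".format(sys.version_info.major, sys.version_info.minor)
    # The pointer size shows whether the interpreter itself is 32- or 64-bit
    bits = struct.calcsize("P") * 8
    platform_tag = "win_amd64" if bits == 64 else "win32"
    return python_tag, platform_tag


def wheel_matches(filename):
    """Check a name such as numpy-1.13.1+mkl-cp35-cp35m-win_amd64.whl."""
    python_tag, platform_tag = expected_wheel_tags()
    # Wheel names end with <python tag>-<abi tag>-<platform tag>.whl
    parts = filename[:-len(".whl")].split("-")
    return parts[-3] == python_tag and parts[-1] == platform_tag


print(expected_wheel_tags())
print(wheel_matches("numpy-1.13.1+mkl-cp35-cp35m-win_amd64.whl"))
```

On a 64-bit Python 3.5 the second line should report a match for the numpy wheel named above; on any other interpreter it tells you that a different file is needed.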

On the command line:

pip install numpy-1.13.1+mkl-cp35-cp35m-win_amd64.whl
pip install scipy-0.19.1-cp35-cp35m-win_amd64.whl

pip install -U scikit-learn

This installs scikit-learn, which provides the svm module.

A simple example:
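Before moving on to real data, it may be worth checking that the installation works. The snippet below is a minimal sanity check, not part of the original walkthrough: the four toy points and their labels are invented purely for illustration.

```python
import sklearn
from sklearn import svm

print("scikit-learn version:", sklearn.__version__)

# Four 2-D toy points in two well-separated classes (made-up data)
X = [[0.0, 0.0], [0.0, 1.0], [3.0, 3.0], [4.0, 3.0]]
y = [0, 0, 1, 1]

# A linear kernel is enough for data this simple
clf = svm.SVC(kernel="linear")
clf.fit(X, y)

# One query point near each cluster
predictions = clf.predict([[0.0, 0.5], [3.5, 3.0]])
print(predictions)
```

If the import fails or `fit` raises an error, the numpy or scipy wheel most likely does not match the interpreter.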

import time
import pandas as pd
import numpy as np
# scikit-learn >= 0.20; older releases used sklearn.preprocessing.Imputer instead
from sklearn.impute import SimpleImputer
from sklearn import svm

def main():
    print(time.strftime('%Y-%m-%d %H:%M:%S', time.localtime(time.time())))
    print("Loading the data ...")
    # Load the data from the CSV files
    train = pd.read_csv('train.csv', header=0)  # load the training data
    test = pd.read_csv('test.csv', header=0)    # load the test data
    Y = np.array(train)[:, 118]        # target column (index 118) of the training set
    X_train = np.array(train)[:, :59]  # first 59 columns as features
    ID = test['ID']
    X_test = np.array(test)[:, 1:60]   # columns 1..59 of the test set as features

    print("Missing values imputation ...")
    # Replace each NaN with the mean of its column, learned from the training set
    imp = SimpleImputer(missing_values=np.nan, strategy='mean')
    imp.fit(X_train)
    X_train = imp.transform(X_train)
    X_test = imp.transform(X_test)
    
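    print("Training the SVM model ...")
    # probability=True is required for predict_proba and must be set before fitting
    clf = svm.SVC(probability=True)
    # Other kernels to try:
    # clf = svm.SVC(kernel='linear', probability=True)
    # clf = svm.SVC(kernel='poly', max_iter=200, probability=True)
    # clf = svm.SVC(kernel='poly', probability=True)
    clf.fit(X_train, Y)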

    print("Predicting the Competition Data...")
    tt = clf.predict_proba(X_test)
    pred = tt[:, 1]  # Get the probability of being 1.
    pred_df = pd.DataFrame(data={'Target': pred})
    submissions = pd.DataFrame(ID).join(pred_df)

    # Write the ID and predicted probability of each test row to result.csv
    submissions.to_csv("result.csv", index=False)
    print(time.strftime('%Y-%m-%d %H:%M:%S', time.localtime(time.time())))


# Entry point of the program.
if __name__ == '__main__':
    main()

The data can be downloaded from GitHub.
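If the CSV files are not at hand, the same flow (impute missing values, train an SVC, read class-1 probabilities) can be tried on synthetic data. The sketch below rests on several assumptions that are not in the original: the two Gaussian blobs, the five features, and the sample sizes are all invented, and `SimpleImputer` needs scikit-learn 0.20 or later.

```python
import numpy as np
from sklearn import svm
from sklearn.impute import SimpleImputer  # scikit-learn >= 0.20

rng = np.random.RandomState(0)

# Synthetic stand-in for train.csv / test.csv: two Gaussian blobs, 5 features
n_per_class, n_features = 50, 5
X_train = np.vstack([
    rng.normal(loc=-2.0, size=(n_per_class, n_features)),
    rng.normal(loc=2.0, size=(n_per_class, n_features)),
])
y_train = np.array([0] * n_per_class + [1] * n_per_class)
X_test = np.vstack([
    rng.normal(loc=-2.0, size=(10, n_features)),
    rng.normal(loc=2.0, size=(10, n_features)),
])

# Knock out a few entries to mimic missing values in the real data
rows = rng.randint(0, X_train.shape[0], 20)
cols = rng.randint(0, n_features, 20)
X_train[rows, cols] = np.nan
X_test[0, 0] = np.nan

# Fill each NaN with the column mean learned from the training set
imp = SimpleImputer(missing_values=np.nan, strategy="mean")
X_train = imp.fit_transform(X_train)
X_test = imp.transform(X_test)

# probability=True enables predict_proba
clf = svm.SVC(probability=True, random_state=0)
clf.fit(X_train, y_train)

proba = clf.predict_proba(X_test)
pred = proba[:, 1]  # probability of class 1 for each test row
print(pred.round(3))
```

The first ten test rows are drawn from the class-0 blob and the last ten from the class-1 blob, so the printed probabilities should be low for the first half and high for the second.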