Model prototype
sklearn.svm.LinearSVC(penalty='l2', loss='squared_hinge', dual=True, tol=0.0001, C=1.0, multi_class='ovr', fit_intercept=True, intercept_scaling=1, class_weight=None, verbose=0, random_state=None, max_iter=1000)
Parameters
- penalty: the norm used in the penalty term ('l1' or 'l2')
- loss: the loss function
  - 'hinge': the hinge loss (the loss function of the standard SVM)
  - 'squared_hinge': the square of the hinge loss
- dual:
  - True: solve the dual problem
  - False: solve the primal problem
- tol: tolerance for the stopping criterion
- C: the penalty (regularization) coefficient
- multi_class: the strategy for multiclass problems
  - 'ovr': one-vs-rest
  - 'crammer_singer': a joint multiclass objective; rarely used (it is computationally expensive and usually no more accurate)
- fit_intercept: whether to fit the intercept (constant term)
- intercept_scaling: appends a synthetic feature whose value is constant for every instance (this synthetic feature is also subject to the penalty term)
- class_weight: per-class weights applied to C (see the construction sketch after this list)
- verbose: whether to enable verbose output
- random_state: seed of the pseudo-random number generator
- max_iter: the maximum number of iterations
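As a quick reading aid, here is a minimal construction sketch (not from the original text) that puts several of the parameters above together; the concrete values, in particular class_weight='balanced', are illustrative choices rather than recommendations.

from sklearn.svm import LinearSVC

clf = LinearSVC(
    penalty='l2',             # norm of the penalty term
    loss='squared_hinge',     # squared hinge loss (the default)
    dual=True,                # solve the dual problem
    tol=0.0001,               # stopping tolerance
    C=1.0,                    # penalty coefficient
    multi_class='ovr',        # one-vs-rest for multiclass problems
    class_weight='balanced',  # illustrative: weight classes by inverse frequency
    random_state=0,
    max_iter=1000)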
Attributes
- coef_: the weight vector of each class
- intercept_: the intercept (constant term)
Methods
- fit(X,y): train the model
- predict(X): predict class labels with the trained model (a usage sketch follows this list)
- score(X,y[,sample_weight]): return the mean accuracy on the given test data and labels
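The examples below only exercise fit and score; as a small complement, here is a minimal sketch (not from the original text) that also calls predict on the iris data:

from sklearn import datasets, svm

iris = datasets.load_iris()
cls = svm.LinearSVC(max_iter=10000)   # larger max_iter to avoid convergence warnings
cls.fit(iris.data, iris.target)
print(cls.predict(iris.data[:5]))     # predicted class labels for the first five samples
print('Score: %.2f' % cls.score(iris.data, iris.target))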
import matplotlib.pyplot as plt
import numpy as np
from sklearn import datasets, linear_model, model_selection, svm
Load the data
def load_data_classfication():
    iris = datasets.load_iris()
    X_train = iris.data
    y_train = iris.target
    # stratified 75/25 split so that each class keeps its proportion in both sets
    return model_selection.train_test_split(X_train, y_train, test_size=0.25, random_state=0, stratify=y_train)
Using the LinearSVC class
def test_LinearSVC(*data):
    X_train, X_test, y_train, y_test = data
    cls = svm.LinearSVC()
    cls.fit(X_train, y_train)
    print('Coefficients:%s, intercept %s' % (cls.coef_, cls.intercept_))
    print('Score: %.2f' % cls.score(X_test, y_test))

X_train, X_test, y_train, y_test = load_data_classfication()
test_LinearSVC(X_train, X_test, y_train, y_test)
Effect of the loss function
def test_LinearSVC_loss(*data):
    X_train, X_test, y_train, y_test = data
    losses = ['hinge', 'squared_hinge']
    for loss in losses:
        cls = svm.LinearSVC(loss=loss)
        cls.fit(X_train, y_train)
        print('Loss: %s' % loss)
        print('Coefficients:%s, intercept %s' % (cls.coef_, cls.intercept_))
        print('Score: %.2f' % cls.score(X_test, y_test))

test_LinearSVC_loss(X_train, X_test, y_train, y_test)
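Note that liblinear does not accept every penalty/loss/dual combination; penalty='l1' together with loss='hinge', for example, raises a ValueError. A minimal self-contained sketch (not from the original text) of the combinations that are accepted; exact behaviour may vary slightly across sklearn versions:

from sklearn import datasets, svm

iris = datasets.load_iris()
supported = [
    dict(penalty='l2', loss='squared_hinge', dual=True),
    dict(penalty='l2', loss='squared_hinge', dual=False),
    dict(penalty='l2', loss='hinge', dual=True),
    dict(penalty='l1', loss='squared_hinge', dual=False),
]
for kw in supported:
    cls = svm.LinearSVC(max_iter=10000, **kw)
    cls.fit(iris.data, iris.target)
    print(kw, 'score: %.2f' % cls.score(iris.data, iris.target))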
Effect of the penalty form
def test_LinearSVC_L12(*data):
    X_train, X_test, y_train, y_test = data
    L12 = ['l1', 'l2']
    for p in L12:
        # the 'l1' penalty (with the default squared hinge loss) is only supported
        # when solving the primal problem, hence dual=False
        cls = svm.LinearSVC(penalty=p, dual=False)
        cls.fit(X_train, y_train)
        print('penalty: %s' % p)
        print('Coefficients:%s, intercept %s' % (cls.coef_, cls.intercept_))
        print('Score: %.2f' % cls.score(X_test, y_test))

test_LinearSVC_L12(X_train, X_test, y_train, y_test)
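One practical difference between the two penalty forms is sparsity: with penalty='l1' some entries of coef_ are driven exactly to zero. A minimal self-contained check (not from the original text), run on the full iris data rather than the split above:

import numpy as np
from sklearn import datasets, svm

iris = datasets.load_iris()
for p in ['l1', 'l2']:
    cls = svm.LinearSVC(penalty=p, dual=False, max_iter=10000)
    cls.fit(iris.data, iris.target)
    n_zero = int(np.sum(cls.coef_ == 0))
    print('penalty=%s, zero coefficients: %d of %d' % (p, n_zero, cls.coef_.size))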
Effect of the penalty coefficient C
def test_LinearSVC_C(*data):
    X_train, X_test, y_train, y_test = data
    Cs = np.logspace(-2, 1)
    train_scores = []
    test_scores = []
    for C in Cs:
        cls = svm.LinearSVC(C=C)
        cls.fit(X_train, y_train)
        train_scores.append(cls.score(X_train, y_train))
        test_scores.append(cls.score(X_test, y_test))
    fig = plt.figure()
    ax = fig.add_subplot(1, 1, 1)
    ax.plot(Cs, train_scores, label='Training score')
    ax.plot(Cs, test_scores, label='Testing score')
    ax.set_xlabel(r'C')
    ax.set_ylabel(r'score')
    ax.set_xscale('log')
    ax.set_title('LinearSVC')
    ax.legend(loc='best')
    plt.show()

test_LinearSVC_C(X_train, X_test, y_train, y_test)
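Instead of sweeping C by hand as above, C can also be chosen by cross-validated grid search. A minimal sketch (not from the original text) using GridSearchCV from sklearn.model_selection, with the same np.logspace(-2,1) grid:

import numpy as np
from sklearn import datasets, svm
from sklearn.model_selection import GridSearchCV

iris = datasets.load_iris()
search = GridSearchCV(svm.LinearSVC(max_iter=10000),
                      param_grid={'C': np.logspace(-2, 1)},
                      cv=5)
search.fit(iris.data, iris.target)
print('best C: %s, best CV score: %.2f' % (search.best_params_['C'], search.best_score_))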