TensorFlow from Scratch (1): Support Vector Machines (1)
All of the content in this article follows the book "TensorFlow Machine Learning Cookbook".
The linear support vector machine model
The support vector machine is a binary classification method. Its key difference from logistic regression is that only the points near the decision boundary actually determine the model. Logistic regression tries to find a regression line that minimizes the distance from the points to the line, while the SVM tries to maximize the margin between the two classes.
First, the equation of the SVM's separating hyperplane:
$$Ax - b = 0$$
where $A$ and $x$ are vectors. The choice of plus or minus sign here does not really matter; I use a minus sign to match the figure below.
[Figure: two linearly separable classes (circles and crosses) with the separating hyperplane $Ax - b = 0$ and its margin]
To separate the circles and crosses in the figure as widely as possible, that is, to maximize the margin between the two classes, we can equivalently minimize $\|A\|^2$, subject to the following constraint:
$$y_i(Ax_i - b) \geq 1 \quad \forall i$$
This constraint says that every point must be classified correctly (in practice the requirement is not enforced this strictly). For a point with $y_i = -1$, the inequality is satisfied if its predicted value $Ax_i - b$ is at most $-1$; the same reasoning applies when $y_i = 1$. From this we derive the SVM loss function:
$$\frac{1}{n}\sum_{i=1}^{n}\max\left(0,\ 1 - y_i(Ax_i - b)\right) + \|A\|^2$$
When every data point is classified correctly with a margin of at least 1, the first term is identically 0; it becomes nonzero only for points that are misclassified or fall inside the margin. Minimizing this expression therefore maximizes the margin while also avoiding misclassified points. You can, of course, put weights on the two terms to express a preference between avoiding errors and maximizing the margin.
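Before moving to TensorFlow, here is a minimal NumPy sketch (not from the book) of the decision rule $\mathrm{sign}(Ax - b)$ and the loss above, evaluated on a few hypothetical 2-D points; the values of A, b, X, and y are made up purely for illustration.
import numpy as np

# Made-up parameters, for illustration only
A = np.array([1.0, -2.0])  # weight vector
b = 0.5                    # offset

# Hypothetical points with labels +1 / -1
X = np.array([[3.0, 0.5], [0.2, 1.5], [2.0, 1.0]])
y = np.array([1.0, -1.0, 1.0])

scores = X @ A - b                         # Ax - b for each point
predictions = np.sign(scores)              # decision rule: sign(Ax - b)
hinge = np.maximum(0.0, 1.0 - y * scores)  # max(0, 1 - y(Ax - b))
total_loss = hinge.mean() + np.sum(A**2)   # hinge term + ||A||^2
print(predictions, total_loss)
Only the third point lands on the wrong side of the hyperplane, so it is the only one contributing to the hinge term.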
Now for the TensorFlow code:
# -*- coding: utf-8 -*-
# @Time : 2018/12/7 0:00
# @Author : chaucerhou
import matplotlib.pyplot as plt
import numpy as np
import tensorflow as tf
from sklearn import datasets
sess = tf.Session()
iris = datasets.load_iris()
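# Use sepal length (column 0) and petal width (column 3) as the two features;
# label I. setosa (target == 0) as +1 and the other species as -1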
x_vals = np.array([[x[0], x[3]] for x in iris.data])
y_vals = np.array([1 if y == 0 else -1 for y in iris.target])
# Split the data into training and test sets
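# (80% of the samples go to training; replace=False keeps the drawn indices
# distinct, and the test set is the complement)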
train_indices = np.random.choice(len(x_vals), round(len(x_vals) * 0.8), replace=False)
test_indices = np.array(list(set(range(len(x_vals))) - set(train_indices)))
x_vals_train = x_vals[train_indices]
x_vals_test = x_vals[test_indices]
y_vals_train = y_vals[train_indices]
y_vals_test = y_vals[test_indices]
# Set the batch size, placeholders, and model variables
batch_size = 100
x_data = tf.placeholder(shape=[None, 2], dtype=tf.float32)
y_target = tf.placeholder(shape=[None, 1], dtype=tf.float32)
A = tf.Variable(tf.random_normal(shape=[2, 1]), dtype=tf.float32)
b = tf.Variable(tf.random_normal(shape=[1,1]), dtype=tf.float32)
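# model_output is Ax - b, the left-hand side of the hyperplane equation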
model_output = tf.subtract(tf.matmul(x_data, A), b)
# Maximum-margin (hinge) loss function
l2_norm = tf.reduce_sum(tf.square(A))
alpha = tf.constant([0.1])
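# classification_term below is the mean hinge loss max(0, 1 - y * (Ax - b))
# over the batch; alpha weights the ||A||^2 regularization term against it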
classification_term = tf.reduce_mean(tf.maximum(0., tf.subtract(1., tf.multiply(model_output, y_target))))
loss = tf.add(classification_term, tf.multiply(alpha, l2_norm))
# Prediction (the sign of Ax - b) and accuracy functions
prediction = tf.sign(model_output)
accuracy = tf.reduce_mean(tf.cast(tf.equal(prediction, y_target), tf.float32))
# Optimizer
my_opt = tf.train.GradientDescentOptimizer(0.01)
train_step = my_opt.minimize(loss)
init = tf.global_variables_initializer()
loss_vec = []
train_accuracy = []
test_accuracy = []
sess.run(init)
# Training loop: sample a random batch, take one gradient step, and record
# the loss and the train/test accuracies
for i in range(5000):
    rand_index = np.random.choice(len(x_vals_train), batch_size)
    rand_x = x_vals_train[rand_index]
    rand_y = np.transpose([y_vals_train[rand_index]])
    sess.run(train_step, feed_dict={x_data: rand_x, y_target: rand_y})
    temp_loss = sess.run(loss, feed_dict={x_data: rand_x, y_target: rand_y})
    loss_vec.append(temp_loss)
    train_acc_temp = sess.run(accuracy, feed_dict={x_data: x_vals_train, y_target: np.transpose([y_vals_train])})
    train_accuracy.append(train_acc_temp)
    test_acc_temp = sess.run(accuracy, feed_dict={x_data: x_vals_test, y_target: np.transpose([y_vals_test])})
    test_accuracy.append(test_acc_temp)
    if (i + 1) % 100 == 0:
        print("Step # " + str(i + 1) + " A = " + str(sess.run(A)) + " b = " + str(sess.run(b)))
        print("Loss = " + str(temp_loss))
[[a1], [a2]] = sess.run(A)
[[m]] = sess.run(b)
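# The separating line satisfies a1 * sepal_length + a2 * petal_width - b = 0;
# solving for sepal_length (the y-axis of the plot) gives
# sepal_length = -(a2 / a1) * petal_width + b / a1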
slope = -a2 / a1
y_intercept = m / a1
x1_vals = [d[1] for d in x_vals]
best_fit = []
for i in x1_vals:
    best_fit.append(slope * i + y_intercept)
setosa_x = [d[1] for i, d in enumerate(x_vals) if y_vals[i] == 1]
setosa_y = [d[0] for i, d in enumerate(x_vals) if y_vals[i] == 1]
not_setosa_x = [d[1] for i, d in enumerate(x_vals) if y_vals[i] == -1]
not_setosa_y = [d[0] for i, d in enumerate(x_vals) if y_vals[i] == -1]
plt.plot(setosa_x, setosa_y, "o", label="I. setosa")
plt.plot(not_setosa_x, not_setosa_y, "x", label="Non-setosa")
plt.plot(x1_vals, best_fit, "r-", label="Linear Separator", linewidth=3)
plt.ylim([0, 10])
plt.legend(loc="lower right")
plt.title("Sepal Length vs Petal Width")
plt.xlabel("Petal Width")
plt.ylabel("Sepal Length")
plt.show()
plt.plot(train_accuracy, "k-", label="Training Accuracy")
plt.plot(test_accuracy, "r--", label="Test Accuracy")
plt.title("Train and Test Set Accuracies")
plt.xlabel("generation")
plt.ylabel("Accuracy")
plt.legend(loc="lower right")
plt.show()
plt.plot(loss_vec, "k-")
plt.title("Loss per Gerneration")
plt.xlabel("Generation")
plt.ylabel("Loss")
plt.show()
Here are the resulting plots:
[Figure: Sepal Length vs Petal Width, showing the I. setosa and non-setosa points with the fitted linear separator]
[Figure: training and test set accuracies over the generations]
[Figure: loss per generation]
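A note on running the code: the listing above uses the TensorFlow 1.x API (tf.Session, tf.placeholder, and so on). If you only have TensorFlow 2.x installed, one option, assuming your version still ships the compatibility module, is to replace the plain import with the v1 compatibility layer:
import tensorflow.compat.v1 as tf
tf.disable_v2_behavior()
With these two lines, the rest of the script should run unchanged.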