TensorFlow from Scratch (1): Support Vector Machines (1)
All of the content in this article follows the book "TensorFlow Machine Learning Cookbook".
The linear support vector machine model
The support vector machine is a binary classification method. Its key difference from logistic regression is that only the points near the decision boundary actually determine the model. Logistic regression tries to find a regression line that minimizes the distance from the points to the line, while the SVM tries to maximize the margin between the two classes.
First, the equation of the SVM's separating hyperplane:
$$Ax - b = 0$$
where $A$ and $x$ are vectors. The choice of plus or minus sign here does not really matter; I use a minus sign to match the figure below.
[Figure: two linearly separable classes (circles and crosses) with the separating hyperplane $Ax - b = 0$ and its margin]
To separate the circles and crosses in the figure as widely as possible, that is, to maximize the margin between the two classes, we can equivalently minimize $\|A\|^2$, subject to the following constraint:
$$y_i(Ax_i - b) \geq 1 \quad \forall i$$
This constraint says that every point must be classified correctly (in practice the requirement is not enforced this strictly). For a point with $y_i = -1$, the inequality is satisfied if its predicted value $Ax_i - b$ is at most $-1$; the same reasoning applies when $y_i = 1$. From this we derive the SVM loss function:
$$\frac{1}{n}\sum_{i=1}^{n}\max\left(0,\ 1 - y_i(Ax_i - b)\right) + \|A\|^2$$
When every data point is classified correctly with a margin of at least 1, the first term is identically 0; it becomes nonzero only for points that are misclassified or fall inside the margin. Minimizing this expression therefore maximizes the margin while also avoiding misclassified points. You can, of course, put weights on the two terms to express a preference between avoiding errors and maximizing the margin.
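Before moving to TensorFlow, here is a minimal NumPy sketch (not from the book) of the decision rule $\mathrm{sign}(Ax - b)$ and the loss above, evaluated on a few hypothetical 2-D points; the values of A, b, X, and y are made up purely for illustration.
import numpy as np

# Made-up parameters, for illustration only
A = np.array([1.0, -2.0])  # weight vector
b = 0.5                    # offset

# Hypothetical points with labels +1 / -1
X = np.array([[3.0, 0.5], [0.2, 1.5], [2.0, 1.0]])
y = np.array([1.0, -1.0, 1.0])

scores = X @ A - b                         # Ax - b for each point
predictions = np.sign(scores)              # decision rule: sign(Ax - b)
hinge = np.maximum(0.0, 1.0 - y * scores)  # max(0, 1 - y(Ax - b))
total_loss = hinge.mean() + np.sum(A**2)   # hinge term + ||A||^2
print(predictions, total_loss)
Only the third point lands on the wrong side of the hyperplane, so it is the only one contributing to the hinge term.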
Now for the TensorFlow code:
# -*- coding: utf-8 -*-
# @Time : 2018/12/7 0:00
# @Author : chaucerhou
import matplotlib.pyplot as plt
import numpy as np
import tensorflow as tf
from sklearn import datasets
sess = tf.Session()
iris = datasets.load_iris()
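# Use sepal length (column 0) and petal width (column 3) as the two features;
# label I. setosa (target == 0) as +1 and the other species as -1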
x_vals = np.array([[x[0], x[3]] for x in iris.data])
y_vals = np.array([1 if y == 0 else -1 for y in iris.target])
# Split the data into training and test sets
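# (80% of the samples go to training; replace=False keeps the drawn indices
# distinct, and the test set is the complement)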
train_indices = np.random.choice(len(x_vals), round(len(x_vals) * 0.8), replace=False)
test_indices = np.array(list(set(range(len(x_vals))) - set(train_indices)))
x_vals_train = x_vals[train_indices]
x_vals_test = x_vals[test_indices]
y_vals_train = y_vals[train_indices]
y_vals_test = y_vals[test_indices]
# Set the batch size, placeholders, and model variables
batch_size = 100
x_data = tf.placeholder(shape=[None, 2], dtype=tf.float32)
y_target = tf.placeholder(shape=[None, 1], dtype=tf.float32)
A = tf.Variable(tf.random_normal(shape=[2, 1]), dtype=tf.float32)
b = tf.Variable(tf.random_normal(shape=[1,1]), dtype=tf.float32)
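# model_output is Ax - b, the left-hand side of the hyperplane equation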
model_output = tf.subtract(tf.matmul(x_data, A), b)
# Maximum-margin (hinge) loss function
l2_norm = tf.reduce_sum(tf.square(A))
alpha = tf.constant([0.1])
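# classification_term below is the mean hinge loss max(0, 1 - y * (Ax - b))
# over the batch; alpha weights the ||A||^2 regularization term against it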
classification_term = tf.reduce_mean(tf.maximum(0., tf.subtract(1., tf.multiply(model_output, y_target))))
loss = tf.add(classification_term, tf.multiply(alpha, l2_norm))
# Prediction (the sign of Ax - b) and accuracy functions
prediction = tf.sign(model_output)
accuracy = tf.reduce_mean(tf.cast(tf.equal(prediction, y_target), tf.float32))
# Optimizer
my_opt = tf.train.GradientDescentOptimizer(0.01)
train_step = my_opt.minimize(loss)
init = tf.global_variables_initializer()
loss_vec = []
train_accuracy = []
test_accuracy = []
sess.run(init)
# Training loop: sample a random batch, take one gradient step, and record
# the loss and the train/test accuracies
for i in range(5000):
    rand_index = np.random.choice(len(x_vals_train), batch_size)
    rand_x = x_vals_train[rand_index]
    rand_y = np.transpose([y_vals_train[rand_index]])
    sess.run(train_step, feed_dict={x_data: rand_x, y_target: rand_y})
    temp_loss = sess.run(loss, feed_dict={x_data: rand_x, y_target: rand_y})
    loss_vec.append(temp_loss)
    train_acc_temp = sess.run(accuracy, feed_dict={x_data: x_vals_train, y_target: np.transpose([y_vals_train])})
    train_accuracy.append(train_acc_temp)
    test_acc_temp = sess.run(accuracy, feed_dict={x_data: x_vals_test, y_target: np.transpose([y_vals_test])})
    test_accuracy.append(test_acc_temp)
    if (i + 1) % 100 == 0:
        print("Step # " + str(i + 1) + " A = " + str(sess.run(A)) + " b = " + str(sess.run(b)))
        print("Loss = " + str(temp_loss))
[[a1], [a2]] = sess.run(A)
[[m]] = sess.run(b)
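# The separating line satisfies a1 * sepal_length + a2 * petal_width - b = 0;
# solving for sepal_length (the y-axis of the plot) gives
# sepal_length = -(a2 / a1) * petal_width + b / a1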
slope = -a2 / a1
y_intercept = m / a1
x1_vals = [d[1] for d in x_vals]
best_fit = []
for i in x1_vals:
    best_fit.append(slope * i + y_intercept)
setosa_x = [d[1] for i, d in enumerate(x_vals) if y_vals[i] == 1]
setosa_y = [d[0] for i, d in enumerate(x_vals) if y_vals[i] == 1]
not_setosa_x = [d[1] for i, d in enumerate(x_vals) if y_vals[i] == -1]
not_setosa_y = [d[0] for i, d in enumerate(x_vals) if y_vals[i] == -1]
plt.plot(setosa_x, setosa_y, "o", label="I. setosa")
plt.plot(not_setosa_x, not_setosa_y, "x", label="Non-setosa")
plt.plot(x1_vals, best_fit, "r-", label="Linear Separator", linewidth=3)
plt.ylim([0, 10])
plt.legend(loc="lower right")
plt.title("Sepal Length vs Petal Width")
plt.xlabel("Petal Width")
plt.ylabel("Sepal Length")
plt.show()
plt.plot(train_accuracy, "k-", label="Training Accuracy")
plt.plot(test_accuracy, "r--", label="Test Accuracy")
plt.title("Train and Test Set Accuracies")
plt.xlabel("generation")
plt.ylabel("Accuracy")
plt.legend(loc="lower right")
plt.show()
plt.plot(loss_vec, "k-")
plt.title("Loss per Gerneration")
plt.xlabel("Generation")
plt.ylabel("Loss")
plt.show()
Here are the resulting plots:
[Figure: Sepal Length vs Petal Width, showing the I. setosa and non-setosa points with the fitted linear separator]
[Figure: training and test set accuracies over the generations]
[Figure: loss per generation]
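A note on running the code: the listing above uses the TensorFlow 1.x API (tf.Session, tf.placeholder, and so on). If you only have TensorFlow 2.x installed, one option, assuming your version still ships the compatibility module, is to replace the plain import with the v1 compatibility layer:
import tensorflow.compat.v1 as tf
tf.disable_v2_behavior()
With these two lines, the rest of the script should run unchanged.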