Week 11: Deep Learning Addendum: Support Vector Machines

Learning Support Vector Machines for Deep Learning

License: CC 4.0 BY-SA

Tags: #DeepLearning #SupportVectorMachine #ArtificialIntelligence

Abstract

This week continues to follow Professor Hung-yi Lee's course, covering support vector machines and the mathematical principles and derivations behind them.

1. Support Vector Machine

“Hinge Loss + Kernel Trick = Support Vector Machine”

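A basic binary classification problem can be described as follows: each training input $x^n$ comes with a label $\hat{y}^n$ that is either $+1$ or $-1$.
$\begin{bmatrix} x^1 & x^2 & x^3 & \dots \\ \hat{y}^1 & \hat{y}^2 & \hat{y}^3 & \dots \end{bmatrix}, \qquad \hat{y}^n \in \{+1,-1\}$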

1. Model function

$g(x)=\begin{cases} +1, & f(x)>0 \\ -1, & f(x)<0 \end{cases}$

2. Loss function

$L(f)=\sum_n \delta (g(x^n)\neq \hat{y}^n)$
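The classifier $g$ and the counting loss $L(f)$ can be sketched in a few lines of Python. This is only an illustration: the linear scorer `f_linear`, its parameters, and the toy data are assumptions that do not come from the lecture, and the case $f(x)=0$, which the definition leaves open, is arbitrarily mapped to $-1$ here.

```python
def g(f_value):
    """Turn a raw score f(x) into a class label.

    f(x) > 0 gives +1 and f(x) < 0 gives -1. The definition does not cover
    f(x) == 0; this sketch assigns it to -1 as an arbitrary assumption.
    """
    return 1 if f_value > 0 else -1


def ideal_loss(f, xs, ys):
    """Count the training examples whose predicted label differs from the target."""
    return sum(1 for x, y in zip(xs, ys) if g(f(x)) != y)


def f_linear(x, w=2.0, b=-1.0):
    """Hypothetical scoring function f(x) = w * x + b, used only for illustration."""
    return w * x + b


# Toy one-dimensional data (made up for this example).
xs = [-1.0, 0.0, 1.0, 2.0]
ys = [-1, -1, 1, -1]

loss = ideal_loss(f_linear, xs, ys)
print(loss)  # the scores are -3, -1, 1, 3, so only the last example is misclassified
```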

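Here $\delta(\cdot)$ equals 1 when its argument is true and 0 otherwise, so this loss simply counts the misclassified training examples. Because it is not differentiable (it is piecewise constant in the parameters of $f$), the model cannot be trained with gradient descent. We therefore replace the ideal loss with a differentiable approximation: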
$\begin{aligned} \text{Ideal Loss: }L(f)&=\sum_n\delta (g(x^n)\neq \hat{y}^n) \\ \text{Approximation: }L(f)&=\sum_nl(f(x^n),\hat{y}^n) \end{aligned}$
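To see why the ideal loss gives gradient descent nothing to work with, the sketch below estimates its derivative with a central finite difference. The linear form $f(x)=wx+b$, the helper name `ideal_loss_w`, the toy data, and the parameter values are all assumptions made for this illustration.

```python
def ideal_loss_w(w, b, xs, ys):
    """Ideal (counting) loss for a hypothetical linear scorer f(x) = w * x + b."""
    total = 0
    for x, y in zip(xs, ys):
        predicted = 1 if w * x + b > 0 else -1
        if predicted != y:
            total += 1
    return total


# Toy data and parameters (made up for this example).
xs = [-1.0, 0.0, 1.0, 2.0]
ys = [-1, -1, 1, -1]
w, b, eps = 2.0, -1.0, 1e-4

# Central finite-difference estimate of dL/dw.
finite_diff = (ideal_loss_w(w + eps, b, xs, ys)
               - ideal_loss_w(w - eps, b, xs, ys)) / (2 * eps)
print(finite_diff)  # 0.0: a small change in w flips no prediction, so the count is unchanged
```

The estimate is zero even though one example is still misclassified, so a gradient step would leave the parameters where they are.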
One candidate is the square loss: if $\hat{y}^n=1$, $f(x^n)$ should be close to $1$; if $\hat{y}^n=-1$, $f(x^n)$ should be close to $-1$. Both cases are captured by the single expression $l(f(x^n),\hat{y}^n)=(\hat{y}^nf(x^n)-1)^2$, which expands as follows:
$\begin{aligned} \text{Square Loss: }l(f(x^n),\hat{y}^n)&=(\hat{y}^nf(x^n)-1)^2 \\ &=\begin{cases} (f(x^n)-1)^2 & \text{if }\hat{y}^n=1 \\ (-f(x^n)-1)^2 & \text{if }\hat{y}^n=-1 \end{cases} \end{aligned}$
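As a quick numerical check of the square loss, the sketch below evaluates $(\hat{y}^nf(x^n)-1)^2$ and its derivative with respect to $f(x^n)$, which by the chain rule is $2\hat{y}^n(\hat{y}^nf(x^n)-1)$. The function names `square_loss` and `square_loss_grad` are illustrative choices, not part of the lecture.

```python
def square_loss(f_value, y):
    """Square loss (y * f(x) - 1)^2 for a label y in {+1, -1}."""
    return (y * f_value - 1.0) ** 2


def square_loss_grad(f_value, y):
    """Derivative of the square loss with respect to f(x): 2 * y * (y * f(x) - 1)."""
    return 2.0 * y * (y * f_value - 1.0)


# The loss vanishes exactly when f(x) equals the label.
print(square_loss(1.0, 1))    # 0.0
print(square_loss(-1.0, -1))  # 0.0

# Unlike the counting loss, the gradient is non-zero away from the target,
# so gradient descent has a direction to move in.
print(square_loss(0.0, 1))       # 1.0
print(square_loss_grad(0.0, 1))  # -2.0, i.e. increasing f(x) lowers the loss
```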