Introduction to Support Vector Machines


Goal

In this tutorial you will learn how to:

  • Use the OpenCV functions CvSVM::train to build a classifier based on SVMs and CvSVM::predict to test its performance.

What is a SVM?

A Support Vector Machine (SVM) is a discriminative classifier formally defined by a separating hyperplane. In other words, given labeled training data (supervised learning), the algorithm outputs an optimal hyperplane which categorizes new examples.

In which sense is the hyperplane obtained optimal? Let’s consider the following simple problem:

For a linearly separable set of 2D-points which belong to one of two classes, find a separating straight line.
A separation example

Note

In this example we deal with lines and points in the Cartesian plane instead of hyperplanes and vectors in a high dimensional space. This is a simplification of the problem. It is important to understand that this is done only because our intuition is better built from examples that are easy to imagine. However, the same concepts apply to tasks where the examples to classify lie in a space whose dimension is higher than two.

In the above picture you can see that there exist multiple lines that offer a solution to the problem. Is any of them better than the others? We can intuitively define a criterion to estimate the worth of the lines:

A line is bad if it passes too close to the points because it will be noise sensitive and it will not generalize correctly. Therefore, our goal should be to find the line passing as far as possible from all points.

Then, the operation of the SVM algorithm is based on finding the hyperplane that gives the largest minimum distance to the training examples. Twice this distance is called the margin in SVM theory. Therefore, the optimal separating hyperplane maximizes the margin of the training data.

The Optimal hyperplane

How is the optimal hyperplane computed?

Let’s introduce the notation used to formally define a hyperplane:

f(x) = \beta_{0} + \beta^{T} x,

where \beta is known as the weight vector and \beta_{0} as the bias.
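
As a quick concrete illustration (the numbers here are made up and are not part of the tutorial's data), take \beta = (1, -1)^{T} and \beta_{0} = 0 in the plane. Then

f(x) = x_{1} - x_{2},

and the hyperplane f(x) = 0 is simply the line x_{2} = x_{1}; points on one side of it give f(x) > 0 and points on the other give f(x) < 0.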

See also

A more in-depth description of this and of hyperplanes can be found in Section 4.5 (Separating Hyperplanes) of the book The Elements of Statistical Learning by T. Hastie, R. Tibshirani and J. H. Friedman.

The optimal hyperplane can be represented in an infinite number of different ways by scaling of \beta and \beta_{0}. As a matter of convention, among all the possible representations of the hyperplane, the one chosen is

|\beta_{0} + \beta^{T} x| = 1

where x symbolizes the training examples closest to the hyperplane. In general, the training examples that are closest to the hyperplane are called support vectors. This representation is known as the canonical hyperplane.
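
Continuing the made-up illustration, suppose the training example closest to the line x_{2} = x_{1} is x = (2, 0)^{T}. With \beta = (1, -1)^{T} and \beta_{0} = 0 we get |\beta_{0} + \beta^{T} x| = 2, so dividing both \beta and \beta_{0} by 2 gives the canonical representation \beta = (1/2, -1/2)^{T}, \beta_{0} = 0, for which |\beta_{0} + \beta^{T} x| = 1 at that closest example.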

Now, we use the result of geometry that gives the distance between a point x and a hyperplane (\beta, \beta_{0}):

\mathrm{distance} = \frac{|\beta_{0} + \beta^{T} x|}{||\beta||}.

In particular, for the canonical hyperplane, the numerator is equal to one and the distance to the support vectors is

\mathrm{distance}_{\text{ support vectors}} = \frac{|\beta_{0} + \beta^{T} x|}{||\beta||} = \frac{1}{||\beta||}.

Recall that the margin introduced in the previous section, here denoted as M, is twice the distance to the closest examples:

M = \frac{2}{||\beta||}
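
In the running made-up illustration, the canonical \beta = (1/2, -1/2)^{T} has ||\beta|| = 1/\sqrt{2}, so the distance from the support vector (2, 0)^{T} to the hyperplane is 1/||\beta|| = \sqrt{2}, and the margin is M = 2\sqrt{2}, i.e. twice that distance.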

Finally, the problem of maximizing M is equivalent to the problem of minimizing a function L(\beta) subject to some constraints. The constraints model the requirement for the hyperplane to classify correctly all the training examples x_{i}. Formally,

\min_{\beta, \beta_{0}} L(\beta) = \frac{1}{2}||\beta||^{2} \text{ subject to } y_{i}(\beta^{T} x_{i} + \beta_{0}) \geq 1 \text{ } \forall i,

where y_{i} \in \{-1, +1\} represents each of the labels of the training examples.

This is a problem of Lagrangian optimization that can be solved using Lagrange multipliers to obtain the weight vector \beta and the bias \beta_{0} of the optimal hyperplane.
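
For reference, here is a standard sketch of how the multipliers enter (a textbook derivation, not anything specific to OpenCV). Introducing one multiplier \alpha_{i} \geq 0 per constraint gives the Lagrangian

\mathcal{L}(\beta, \beta_{0}, \alpha) = \frac{1}{2}||\beta||^{2} - \sum_{i} \alpha_{i} \left[ y_{i}(\beta^{T} x_{i} + \beta_{0}) - 1 \right].

Setting its derivatives with respect to \beta and \beta_{0} to zero yields \beta = \sum_{i} \alpha_{i} y_{i} x_{i} and \sum_{i} \alpha_{i} y_{i} = 0. Only the training examples with \alpha_{i} > 0 contribute to \beta; these are precisely the support vectors.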

Source Code

#include <opencv2/core/core.hpp>
#include <opencv2/highgui/highgui.hpp>
#include <opencv2/ml/ml.hpp>

using namespace cv;

int main()
{
    // Data for visual representation
    int width = 512, height = 512;
    Mat image = Mat::zeros(height, width, CV_8UC3);

    // Set up training data
    float labels[4] = {1.0, -1.0, -1.0, -1.0};
    Mat labelsMat(4, 1, CV_32FC1, labels);

    float trainingData[4][2] = { {501, 10}, {255, 10}, {501, 255}, {10, 501} };
    Mat trainingDataMat(4, 2, CV_32FC1, trainingData);

    // Set up SVM's parameters
    CvSVMParams params;
    params.svm_type    = CvSVM::C_SVC;
    params.kernel_type = CvSVM::LINEAR;
    params.term_crit   = cvTermCriteria(CV_TERMCRIT_ITER, 100, 1e-6);

    // Train the SVM
    CvSVM SVM;
    SVM.train(trainingDataMat, labelsMat, Mat(), Mat(), params);

    Vec3b green(0,255,0), blue (255,0,0);
    // Show the decision regions given by the SVM
    for (int i = 0; i < image.rows; ++i)
        for (int j = 0; j < image.cols; ++j)
        {
            Mat sampleMat = (Mat_<float>(1,2) << j, i); // the pixel (row i, col j) is the point (x, y) = (j, i)
            float response = SVM.predict(sampleMat);

            if (response == 1)
                image.at<Vec3b>(i, j) = green;
            else if (response == -1)
                image.at<Vec3b>(i, j) = blue;
        }

    // Show the training data
    int thickness = -1;
    int lineType = 8;
    circle( image, Point(501,  10), 5, Scalar(  0,   0,   0), thickness, lineType);
    circle( image, Point(255,  10), 5, Scalar(255, 255, 255), thickness, lineType);
    circle( image, Point(501, 255), 5, Scalar(255, 255, 255), thickness, lineType);
    circle( image, Point( 10, 501), 5, Scalar(255, 255, 255), thickness, lineType);

    // Show support vectors
    thickness = 2;
    lineType  = 8;
    int c     = SVM.get_support_vector_count();

    for (int i = 0; i < c; ++i)
    {
        const float* v = SVM.get_support_vector(i);
        circle( image,  Point( (int) v[0], (int) v[1]),   6,  Scalar(128, 128, 128), thickness, lineType);
    }

    imwrite("result.png", image);        // save the image

    imshow("SVM Simple Example", image); // show it to the user
    waitKey(0);

    return 0;
}

Explanation

  1. Set up the training data

The training data of this exercise is formed by a set of labeled 2D-points that belong to one of two different classes; one of the classes consists of one point and the other of three points.

float labels[4] = {1.0, -1.0, -1.0, -1.0};
float trainingData[4][2] = {{501, 10}, {255, 10}, {501, 255}, {10, 501}};

The function CvSVM::train that will be used afterwards requires the training data to be stored as Mat objects of floats. Therefore, we create these objects from the arrays defined above:

Mat trainingDataMat(4, 2, CV_32FC1, trainingData);
Mat labelsMat      (4, 1, CV_32FC1, labels);
  2. Set up SVM’s parameters

    In this tutorial we have introduced the theory of SVMs in the most simple case, when the training examples are spread into two classes that are linearly separable. However, SVMs can be used in a wide variety of problems (e.g. problems with non-linearly separable data, a SVM using a kernel function to raise the dimensionality of the examples, etc). As a consequence of this, we have to define some parameters before training the SVM. These parameters are stored in an object of the class CvSVMParams .

    CvSVMParams params;
    params.svm_type    = CvSVM::C_SVC;
    params.kernel_type = CvSVM::LINEAR;
    params.term_crit   = cvTermCriteria(CV_TERMCRIT_ITER, 100, 1e-6);
    
    • Type of SVM. We choose here the type CvSVM::C_SVC that can be used for n-class classification (n \geq 2). This parameter is defined in the attribute CvSVMParams.svm_type.

      Note

      The important feature of the SVM type CvSVM::C_SVC is that it deals with imperfect separation of classes (i.e. when the training data is non-linearly separable). This feature is not important here, since the data is linearly separable; we chose this SVM type only because it is the most commonly used.

    • Type of SVM kernel. We have not talked about kernel functions since they are not interesting for the training data we are dealing with. Nevertheless, let’s explain briefly now the main idea behind a kernel function. It is a mapping done to the training data to improve its resemblance to a linearly separable set of data. This mapping consists of increasing the dimensionality of the data and is done efficiently using a kernel function. We choose here the type CvSVM::LINEAR which means that no mapping is done. This parameter is defined in the attribute CvSVMParams.kernel_type.

    • Termination criteria of the algorithm. The SVM training procedure is implemented by solving a constrained quadratic optimization problem in an iterative fashion. Here we specify a maximum number of iterations and a tolerance error, so we allow the algorithm to finish in fewer steps even if the optimal hyperplane has not been computed yet. This parameter is defined in a structure cvTermCriteria.

  3. Train the SVM

    We call the method CvSVM::train to build the SVM model.

    CvSVM SVM;
    SVM.train(trainingDataMat, labelsMat, Mat(), Mat(), params);
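
    As a side note, OpenCV 3.x and later replaced the CvSVM and CvSVMParams classes with the cv::ml::SVM interface. The following is only a rough sketch of the equivalent setup and training under the new API (it reuses trainingDataMat from above; the newLabels array and its name are ours, introduced because the new API expects class labels as signed 32-bit integers):

    #include <opencv2/ml.hpp>

    // OpenCV >= 3.x sketch, not the API used in this tutorial:
    // labels must be stored as CV_32SC1 in ml::SVM, hence the hypothetical newLabels array
    int newLabels[4] = {1, -1, -1, -1};
    Mat newLabelsMat(4, 1, CV_32SC1, newLabels);

    Ptr<ml::SVM> svm = ml::SVM::create();
    svm->setType(ml::SVM::C_SVC);
    svm->setKernel(ml::SVM::LINEAR);
    svm->setTermCriteria(TermCriteria(TermCriteria::MAX_ITER, 100, 1e-6));
    svm->train(trainingDataMat, ml::ROW_SAMPLE, newLabelsMat);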
    
  4. Regions classified by the SVM

The method CvSVM::predict is used to classify an input sample using a trained SVM. In this example we have used this method in order to color the space depending on the prediction done by the SVM. In other words, an image is traversed interpreting its pixels as points of the Cartesian plane. Each of the points is colored depending on the class predicted by the SVM; in green if it is the class with label 1 and in blue if it is the class with label -1.

Vec3b green(0,255,0), blue (255,0,0);

for (int i = 0; i < image.rows; ++i)
    for (int j = 0; j < image.cols; ++j)
    {
        Mat sampleMat = (Mat_<float>(1,2) << j, i);
        float response = SVM.predict(sampleMat);

        if (response == 1)
            image.at<Vec3b>(i, j) = green;
        else if (response == -1)
            image.at<Vec3b>(i, j) = blue;
    }
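
For example, once the SVM has been trained, a single extra point can be classified like this (the query coordinates below are made up purely for illustration):

Mat queryMat = (Mat_<float>(1,2) << 300.f, 200.f); // hypothetical point (x, y)
float label = SVM.predict(queryMat);               // returns the predicted class: 1 or -1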
  5. Support vectors

    We use here a couple of methods to obtain information about the support vectors. The method CvSVM::get_support_vector_count outputs the total number of support vectors used in the problem, and with the method CvSVM::get_support_vector we obtain each of the support vectors using an index. We have used these methods here to find the training examples that are support vectors and highlight them.

    int c     = SVM.get_support_vector_count();
    
    for (int i = 0; i < c; ++i)
    {
        const float* v = SVM.get_support_vector(i); // get and then highlight with grayscale
        circle( image, Point( (int) v[0], (int) v[1]), 6, Scalar(128, 128, 128), thickness, lineType);
    }
    
    

Results

  • The code opens an image and shows the training examples of both classes. The points of one class are represented with white circles and black ones are used for the other class.
  • The SVM is trained and used to classify all the pixels of the image. This results in a division of the image in a blue region and a green region. The boundary between both regions is the optimal separating hyperplane.
  • Finally the support vectors are shown using gray rings around the training examples.
The separated planes