Data normalization is generally performed during the data pre-processing step.
1. Why we need normalization
There are two major reasons why data normalization is essential for machine learning algorithms.
- Data normalization can improve performance on common machine learning problems.
- Data normalization can speed up the convergence of the gradient descent algorithm.
Andrew Ng's machine learning course illustrates the second point with contour plots of the cost function: with unnormalized features the contours are elongated ellipses and gradient descent zig-zags slowly toward the minimum, while with normalized features the contours are closer to circles and descent converges much faster.
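To make the convergence effect concrete, here is a small sketch on made-up data: a toy least-squares problem where one feature spans [0, 1] and another spans [0, 1000]. Gradient descent is run with a step size tuned to each scaling, and we count the steps it needs.

```python
import numpy as np

def gd_steps(X, y, max_steps=5000, tol=1e-8):
    """Count gradient descent steps on least squares until updates are tiny.

    The learning rate is set from the largest eigenvalue of the Gram matrix,
    so each run uses a step size that is safe for its own feature scaling.
    """
    gram = X.T @ X / len(y)
    lr = 1.0 / np.linalg.eigvalsh(gram).max()
    w = np.zeros(X.shape[1])
    for step in range(1, max_steps + 1):
        grad = X.T @ (X @ w - y) / len(y)
        w_new = w - lr * grad
        if np.linalg.norm(w_new - w) < tol:
            return step
        w = w_new
    return max_steps

rng = np.random.default_rng(0)
# Two features on very different scales: [0, 1] vs [0, 1000].
X = np.column_stack([rng.uniform(0, 1, 200), rng.uniform(0, 1000, 200)])
y = X @ np.array([3.0, 0.5]) + rng.normal(0, 0.1, 200)

# Zero-mean, unit-variance normalization of each feature.
Xn = (X - X.mean(axis=0)) / X.std(axis=0)

steps_raw, steps_norm = gd_steps(X, y), gd_steps(Xn, y)
print(steps_raw, steps_norm)  # normalized features need far fewer steps
```

The unnormalized Gram matrix is badly conditioned, so the safe step size is tiny along the small feature's direction and descent crawls; after normalization the contours are nearly circular and descent finishes in a handful of steps.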
2. How to normalize data
Three common methods are used to perform feature normalization in machine learning algorithms.
- Rescaling (min-max normalization)
$$x' = \frac{x - \min(x)}{\max(x) - \min(x)} \tag{1}$$
$$x' = 2\,\frac{x - \min(x)}{\max(x) - \min(x)} - 1 \tag{2}$$
where $x$ is the original value and $x'$ is the normalized value.
Equation (1) rescales data into [0,1], and equation (2) rescales data into [-1,1].
Note: the parameters $\min(x)$ and $\max(x)$ should be computed on the training data only, but are then applied to the training, validation, and testing data.
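As a minimal sketch of equation (1) with the train-only fit (the arrays here are made-up examples):

```python
import numpy as np

def fit_minmax(train):
    """Learn per-feature min and max on the training split only."""
    return train.min(axis=0), train.max(axis=0)

def rescale(X, mn, mx):
    """Equation (1): map each feature into [0, 1] using training statistics."""
    return (X - mn) / (mx - mn)

train = np.array([[1.0, 100.0],
                  [2.0, 200.0],
                  [3.0, 400.0]])
test = np.array([[2.0, 300.0]])

mn, mx = fit_minmax(train)
train_scaled = rescale(train, mn, mx)
test_scaled = rescale(test, mn, mx)  # reuse the *training* min/max, not the test set's
print(test_scaled)
```

Note that test values are scaled with the training min/max, so a test value outside the training range can legitimately fall outside [0, 1].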
There are also methods that normalize features using a non-linear function, such as the
logarithmic function: $x' = \dfrac{\log_{10}(x)}{\log_{10}(\max(x))}$
inverse tangent function: $x' = \dfrac{2}{\pi}\arctan(x)$
sigmoid function: $x' = \dfrac{1}{1 + e^{-x}}$
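A quick sketch of these three non-linear mappings on made-up sample values:

```python
import numpy as np

x = np.array([1.0, 10.0, 100.0, 1000.0])

# Logarithmic: maps x in [1, max(x)] into [0, 1].
log_scaled = np.log10(x) / np.log10(x.max())

# Inverse tangent: maps positive x into (0, 1).
atan_scaled = (2 / np.pi) * np.arctan(x)

# Sigmoid: maps any real x into (0, 1).
sigmoid_scaled = 1 / (1 + np.exp(-x))

print(log_scaled)  # [0, 1/3, 2/3, 1]
```

Non-linear mappings like these are useful when a feature spans several orders of magnitude, since a linear rescaling would squash most values near zero.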
- Standardization
Feature standardization makes the values of each feature in the data have zero mean and unit variance. This method is widely used for normalization in many machine learning algorithms (e.g., support vector machines, logistic regression, and neural networks). The general formula is:
$$x' = \frac{x - \bar{x}}{\sigma}$$
where $\bar{x}$ is the mean and $\sigma$ is the standard deviation of the feature.
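A minimal sketch of standardization, again fitting the statistics on the training data (the arrays are made-up examples):

```python
import numpy as np

def standardize(X, mean, std):
    """x' = (x - mean) / std, with statistics taken from the training data."""
    return (X - mean) / std

train = np.array([[1.0, 10.0],
                  [3.0, 20.0],
                  [5.0, 30.0]])
mean, std = train.mean(axis=0), train.std(axis=0)

Z = standardize(train, mean, std)
# Each column of Z now has zero mean and unit variance.
print(Z.mean(axis=0), Z.std(axis=0))
```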
- Scaling to unit length
This method rescales each sample (feature vector) to have unit Euclidean length:
$$x' = \frac{x}{\|x\|}$$
This is especially important if a scalar metric such as the Euclidean distance is used as a distance measure in the following learning steps.
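A small sketch of unit-length scaling; note that unlike the previous two methods it operates per sample (row), not per feature (column). The sample values are made up:

```python
import numpy as np

def scale_to_unit_length(X):
    """Divide each sample (row) by its Euclidean norm."""
    norms = np.linalg.norm(X, axis=1, keepdims=True)
    return X / norms

X = np.array([[3.0, 4.0],
              [1.0, 0.0]])
U = scale_to_unit_length(X)
print(U)  # [[0.6, 0.8], [1.0, 0.0]] -- each row now has length 1
```

After this scaling, the Euclidean distance between two samples depends only on the angle between them, which is why it matters when a distance measure is used downstream.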
3. Some cases where you don't need data normalization
3.1 Using a similarity function instead of a distance function
You can define a similarity function rather than a distance function and plug it into a kernel (technically, this function must generate positive-definite matrices).
3.2 Random forest
Random forests never compare one feature with another in magnitude, so feature ranges don't matter.