[ML] Principal Component Analysis (PCA)

Latest recommended article published on 2024-08-29 22:32:12

Hallucination

Category: Machine Learning. Tags: machine learning

Article link: https://blog.youkuaiyun.com/weixin_42761454/article/details/121197836

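PCA (principal component analysis) is a key method in data analysis, used to reduce the dimensionality of high-dimensional data and extract its main features. Linear PCA lowers the dimension through a linear transformation, while Kernel PCA introduces the kernel trick to handle nonlinear problems. Linear PCA consists of a linear encoding and a linear decoding, and its goal is to minimize the reconstruction error. The classical PCA solution computes the sample mean, the covariance matrix, and the eigendecomposition of that covariance matrix. When the data dimension is much larger than the number of samples, the eigendecomposition of the covariance matrix becomes difficult to compute, and a nonlinear dimensionality reduction method such as Kernel PCA can be considered instead.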

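PCA (Principal Component Analysis) is a common data analysis technique. It is often used to reduce the dimensionality of high-dimensional data and to extract the principal feature components of the data.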

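We first introduce the basic ideas of linear dimensionality reduction, which underlie the implementation of linear PCA. The second half of the article summarizes how to implement nonlinear PCA, that is, Kernel PCA.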

Linear PCA

Linear dimensionality reduction

Dimensionality reduction

Map a dataset of dimension d to a dataset of dimension k, where $k \leqslant d$.

$\begin{pmatrix} x_{1}\\ \vdots \\ x_{d} \end{pmatrix}_{d \times 1} \rightarrow \begin{pmatrix} \hat{y}_{1}\\ \vdots \\ \hat{y}_{k} \end{pmatrix}_{k \times 1} = \hat{y}, \ k \leqslant d$

Linear Encoding / Embedding

$\hat{y} = b + W^{T}x$

$b = \begin{pmatrix} b_{1} \\ \vdots \\ b_{k} \end{pmatrix}_{k \times 1}, \quad W = \begin{pmatrix} | & \cdots & | \\ w_{1} & \cdots & w_{k} \\ | & \cdots & | \end{pmatrix}_{d \times k}$

The dataset X is mapped onto Y through a linear relationship, which lowers the dimension from d to k.
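The encoding step can be sketched in a few lines of NumPy. This is only an illustration of the formula $\hat{y} = b + W^{T}x$: the function name `linear_encode` is hypothetical, and `W` and `b` are filled with arbitrary values here rather than learned from data.

```python
import numpy as np

def linear_encode(x, W, b):
    """Compute y_hat = b + W^T x for one sample x of shape (d,)."""
    return b + W.T @ x

d, k = 4, 2                       # original and reduced dimensions, k <= d
rng = np.random.default_rng(0)
W = rng.standard_normal((d, k))   # d x k matrix with columns w_1, ..., w_k (arbitrary values)
b = np.zeros(k)                   # k-dimensional offset (arbitrary choice)
x = rng.standard_normal(d)        # one d-dimensional sample

y_hat = linear_encode(x, W, b)
print(y_hat.shape)                # (2,)
```

Each component of `y_hat` is the inner product of `x` with one column of `W`, plus the matching entry of `b`.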

Linear Decoding / Reconstruction

$\hat{y} = \begin{pmatrix} \hat{y}_{1} \\ \vdots \\ \hat{y}_{k} \end{pmatrix}_{k \times 1} \rightarrow \begin{pmatrix} \hat{x}_{1} \\ \vdots \\ \hat{x}_{d} \end{pmatrix}_{d \times 1} = \hat{x}, \ \ d \geqslant k$
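The decoder formula itself is not shown in this part of the text, so the following NumPy sketch rests on assumptions. It takes the decoder to be affine, $\hat{x} = \mu + W\hat{y}$, and uses the classical PCA recipe mentioned in the summary (sample mean, covariance matrix, eigendecomposition) to choose $W$ as the top k eigenvectors. The encoder is then $\hat{y} = W^{T}(x - \mu)$, which corresponds to $b = -W^{T}\mu$. The function names `pca_fit`, `encode`, and `decode` are hypothetical, and the data is synthetic.

```python
import numpy as np

def pca_fit(X, k):
    """X has shape (n, d), one sample per row. Returns the mean (d,) and W (d, k)."""
    mu = X.mean(axis=0)
    Xc = X - mu                              # center the data
    cov = Xc.T @ Xc / X.shape[0]             # d x d sample covariance (1/n normalization)
    eigvals, eigvecs = np.linalg.eigh(cov)   # eigh returns eigenvalues in ascending order
    order = np.argsort(eigvals)[::-1][:k]    # indices of the k largest eigenvalues
    return mu, eigvecs[:, order]

def encode(X, mu, W):
    """Assumed encoder: y_hat = W^T (x - mu), applied row by row."""
    return (X - mu) @ W

def decode(Y, mu, W):
    """Assumed decoder: x_hat = mu + W y_hat, applied row by row."""
    return mu + Y @ W.T

rng = np.random.default_rng(0)
X = rng.standard_normal((200, 5)) @ rng.standard_normal((5, 5))  # synthetic data, d = 5

mu, W = pca_fit(X, k=2)
Y = encode(X, mu, W)          # shape (200, 2)
X_hat = decode(Y, mu, W)      # shape (200, 5)

# Mean squared reconstruction error over the samples.
mse = np.mean(np.sum((X - X_hat) ** 2, axis=1))
print(Y.shape, X_hat.shape, mse)
```

With k equal to d the reconstruction is exact up to floating-point error. For k smaller than d, the error under this choice of W equals the sum of the discarded eigenvalues of the covariance matrix.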
