1. Goal
This paper deals with sparse Principal Component Analysis (PCA) using a subspace method.
2. Theory
2.1 How to get their formulation
Notation: λ1,λ2,⋯,λp are in decreasing order.
From Ky Fan’s maximum principle, we know that
If we regard the last formula as a function of VV′, it is linear. So replacing the constraint set with its convex hull does not change the optimization problem. From the less well-known observation that
Putting the analysis together, we get
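The chain of identities behind this subsection can be written out as follows. This is a sketch of the standard statements, not a copy of the paper's own display equations. The symbols Σ (the p×p symmetric matrix whose eigenvalues are λ1,…,λp), F^d (the convex hull of the constraint set), and ⟨A,B⟩ = tr(A′B) are notation assumed here for illustration.

```latex
\begin{aligned}
\sum_{i=1}^{d} \lambda_i
  &= \max_{V \in \mathbb{R}^{p \times d},\; V'V = I_d} \langle \Sigma, VV' \rangle
  && \text{(Ky Fan's maximum principle)} \\
\operatorname{conv}\{ VV' : V'V = I_d \}
  &= \mathcal{F}^d := \{ H : 0 \preceq H \preceq I_p,\ \operatorname{tr}(H) = d \}
  && \text{(convex hull of the constraint set)} \\
\sum_{i=1}^{d} \lambda_i
  &= \max_{H \in \mathcal{F}^d} \langle \Sigma, H \rangle
  && \text{(linear objective over the convex hull)}
\end{aligned}
```

The last line holds because a linear function attains its maximum over a convex hull at one of the points that generate the hull, here a matrix of the form VV′.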
How is sparsity introduced, and which norm is suitable? The goal of this paper is to obtain sparse PCs, so the penalty should make V∗∈Rp×d sparse. For a matrix, there are two ways to get sparsity:
- columnwise sparsity: for a matrix A, each of its columns is sparse, i.e. only a few elements of A∗i are nonzero.
- row sparsity: for a matrix A, its rows are sparse as a group, i.e. only a few rows of A are nonzero, which produces group sparsity.
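The two patterns can be told apart with a small numerical sketch. The matrices `A_col` and `A_row` and the helper names below are hypothetical, chosen only to illustrate the definitions above.

```python
import numpy as np

# Hypothetical 5 x 3 matrices, used only to illustrate the two patterns.
# Columnwise sparse: every column has few nonzeros, but in different rows.
A_col = np.array([
    [1.0, 0.0, 0.0],
    [0.0, 2.0, 0.0],
    [0.0, 0.0, 3.0],
    [4.0, 0.0, 0.0],
    [0.0, 5.0, 0.0],
])

# Row sparse: only a few rows are nonzero, the remaining rows vanish entirely.
A_row = np.array([
    [1.0, 2.0, 3.0],
    [0.0, 0.0, 0.0],
    [0.0, 0.0, 0.0],
    [4.0, 5.0, 6.0],
    [0.0, 0.0, 0.0],
])


def nonzeros_per_column(A):
    """Number of nonzero entries in each column of A."""
    return np.count_nonzero(A, axis=0)


def nonzero_rows(A):
    """Indices of the rows of A that contain at least one nonzero entry."""
    return np.flatnonzero(np.any(A != 0, axis=1))


print(nonzeros_per_column(A_col), nonzero_rows(A_col))
print(nonzeros_per_column(A_row), nonzero_rows(A_row))
```

In the columnwise-sparse matrix every row (feature) is still used by some column, whereas in the row-sparse matrix whole features drop out, which is the group-sparsity effect.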
For sparse PCA, to select the important features, this paper uses row sparsity. An intuitive penalty is ∥V∥2,0. But in the high-dimensional setting, optimizing with the ℓ0 norm is NP-hard. A common trick is to replace ℓ0 with ℓ1, so the penalty becomes ∥V∥2,1. But our model is a function of H=VV′. So the question is what sparsity penalty on H gives a good approximation of
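To make the row-wise penalties concrete, here is a minimal sketch that computes ∥V∥2,0 and ∥V∥2,1 for a row-sparse orthonormal V and shows how the zero rows of V appear in H=VV′. The helper names, the dimensions p=5 and d=2, and the chosen support are assumptions made for illustration. The sketch does not claim to reproduce the penalty the paper finally puts on H.

```python
import numpy as np


def norm_2_0(V, tol=1e-12):
    """Number of rows of V with nonzero Euclidean norm (the l2,0 'norm')."""
    return int(np.sum(np.linalg.norm(V, axis=1) > tol))


def norm_2_1(V):
    """Sum of the Euclidean norms of the rows of V (the l2,1 norm)."""
    return float(np.sum(np.linalg.norm(V, axis=1)))


# Hypothetical example: p = 5 features, d = 2 components, 3 active features.
rng = np.random.default_rng(0)
support = [0, 2, 4]

# Orthonormal 3 x 2 block from a QR factorization, embedded into 5 rows.
Q, _ = np.linalg.qr(rng.standard_normal((3, 2)))
V = np.zeros((5, 2))
V[support] = Q

H = V @ V.T  # projection matrix onto the column space of V

print("||V||_{2,0} =", norm_2_0(V))
print("||V||_{2,1} =", norm_2_1(V))
print("zero pattern of H:")
print((np.abs(H) > 1e-12).astype(int))
```

Because rows 1 and 3 of V are zero, the corresponding rows and columns of H are zero as well, so row sparsity of V shows up as a sparse block pattern in H.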

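This article discusses sparse principal component analysis (PCA) using a subspace method. By relaxing the constraint set to its convex hull, the nonconvex sparse PCA problem is turned into a near-optimal convex relaxation. The theory part explains in detail how the formulation is derived from Ky Fan's maximum principle and discusses columnwise sparsity and row sparsity. Row sparsity is used to select the important features, with a norm as the penalty. Under certain regularity conditions, the method is shown to perform well in terms of strong consistency. Finally, the algorithm is introduced, including solving the problem with the alternating direction method of multipliers (ADMM), and an R package implementation is provided.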