PCA principles
Practical notes on applying PCA
- Notes
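1) PCA mainly addresses linear dependency between features;
2) The core idea of PCA: maximize the variance of the data points in the new feature space;
3) Before applying PCA, center the data first (subtract the mean of each feature).
- Choosing the hyperparameter k (number of principal components):
Choose k according to the fraction of variance you want to retain. For a given k, that fraction can be computed from the eigenvalues of the covariance matrix: the sum of the k largest eigenvalues divided by the sum of all eigenvalues.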
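A minimal sketch of this selection rule, assuming NumPy and a data matrix with one sample per row. The helper name `choose_k` and the default target of 0.95 are illustrative choices, not part of the notes. The eigenvalues of the covariance matrix are obtained from the singular values of the centered data.

```python
import numpy as np

def choose_k(X, target=0.95):
    """Return (k, ratio): the smallest k whose top-k components retain
    at least `target` of the total variance, and the cumulative ratios."""
    # Center each feature (column) before PCA.
    Xc = X - X.mean(axis=0)
    # Singular values of the centered data give the covariance eigenvalues:
    # lambda_i = s_i^2 / (n - 1), already sorted in descending order.
    s = np.linalg.svd(Xc, compute_uv=False)
    eigvals = s ** 2 / (X.shape[0] - 1)
    # ratio[k-1] = (sum of the k largest eigenvalues) / (sum of all eigenvalues)
    ratio = np.cumsum(eigvals) / eigvals.sum()
    # First position where the cumulative ratio reaches the target.
    k = int(np.searchsorted(ratio, target)) + 1
    # Guard against floating-point round-off when target is close to 1.
    return min(k, len(ratio)), ratio

# Toy data: variance 8/3 along the first axis, 2/3 along the second.
X_demo = np.array([[2.0, 0.0], [-2.0, 0.0], [0.0, 1.0], [0.0, -1.0]])
k_demo, ratio_demo = choose_k(X_demo, target=0.9)
```

On the toy data the first component alone retains 80% of the variance, so a 90% target requires both components.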
- PCA downsides
1) hard to interpret;
2) PCA is computed via SVD, so it is computationally expensive. Once the data reaches a few thousand features, PCA is best avoided.
- Suggestion
It is best not to apply PCA to raw counts (word counts, music play counts, movie viewing counts, etc.).
The reason for this is that such counts often contain large outliers. As we know, PCA looks for linear correlations within the features.
Correlation and variance statistics are very sensitive to large outliers; a single large number could change the statistics a lot. So, it is a good idea to first trim the data of large values (“Frequency-Based Filtering”), or apply a scaling transform like tf-idf (Chapter 4) or the log transform (“Log Transformation”).
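A small illustration of the outlier effect, assuming NumPy and synthetic Poisson counts invented for this example (the helper `top_component_share` is hypothetical). It compares how much of the total variance the first principal component captures on raw counts versus log-transformed counts when a single extreme value is present.

```python
import numpy as np

def top_component_share(X):
    """Fraction of total variance captured by the first principal component."""
    Xc = X - X.mean(axis=0)                      # center each feature
    s = np.linalg.svd(Xc, compute_uv=False)      # singular values, descending
    return float(s[0] ** 2 / np.sum(s ** 2))

# Synthetic count data (assumption): 200 samples, 4 count features.
rng = np.random.default_rng(0)
counts = rng.poisson(lam=5.0, size=(200, 4)).astype(float)
counts[0, 2] = 10_000.0                          # a single extreme count

raw_share = top_component_share(counts)            # dominated by the outlier
log_share = top_component_share(np.log1p(counts))  # log transform applied first

print(f"raw counts: {raw_share:.4f}, log1p counts: {log_share:.4f}")
```

With raw counts, the first component essentially points at the one outlier; after `np.log1p` the outlier's influence on the variance is much smaller.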
Application of PCA
- anomaly detection of time series
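My understanding: anomalous points are located by looking at how the eigenvalues decay.
- Using PCA to find common factors in the input ???
- Using ZCA to preprocess images so that the pixels of an image have no linear dependency on one another. This is not required for image tasks; adding ZCA only makes convergence faster.
- PCA and ZCA are not necessarily useful for all data.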
Whitening and ZCA whitening
- Whitening
Purpose of whitening: (i) features are less correlated with one another; (ii) all features have the same variance.
- ZCA whitening
ZCA produces a set of linearly uncorrelated features, with the same number of features as the original data.
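Both variants can be sketched in a few lines, assuming NumPy and one sample per row. The function name `whiten`, the `eps` regularizer, and the `zca` flag are illustrative choices. PCA whitening rotates the centered data onto the principal axes and rescales each axis to unit variance; ZCA whitening then rotates back, so the result stays in the original feature space.

```python
import numpy as np

def whiten(X, eps=1e-5, zca=True):
    """Whiten X (n_samples x n_features); returns an array of the same shape."""
    # Center each feature.
    Xc = X - X.mean(axis=0)
    # Sample covariance matrix and its eigendecomposition (symmetric, so eigh).
    cov = Xc.T @ Xc / (Xc.shape[0] - 1)
    eigvals, eigvecs = np.linalg.eigh(cov)
    # PCA whitening: project onto the eigenvectors, scale to unit variance.
    # eps avoids division by (near-)zero eigenvalues.
    X_pca = (Xc @ eigvecs) / np.sqrt(eigvals + eps)
    # ZCA whitening: rotate back to the original axes.
    return X_pca @ eigvecs.T if zca else X_pca

# Correlated synthetic data (assumption): mix independent Gaussians.
rng = np.random.default_rng(0)
mix = np.array([[2.0, 1.0, 0.0], [0.0, 1.0, 1.0], [0.0, 0.0, 0.5]])
X_corr = rng.normal(size=(500, 3)) @ mix

Z_zca = whiten(X_corr)              # same number of features as X_corr
Z_pca = whiten(X_corr, zca=False)
```

In both cases the covariance of the output is approximately the identity matrix: the features are decorrelated and have equal variance.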