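A First Look at R: Implementing PCA
Reviewing PCA
In an earlier post (老妪能解PCA) I wrote down some of my own thoughts on PCA; today I will try implementing PCA in R. As a quick recap: PCA analyzes the correlations among the features to find the principal components, keeps a chosen number of eigenvectors as a new basis, and projects the samples onto the space spanned by that basis. The projected coordinates become the new sample values, which is how the dimensionality is reduced.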
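The recap above can be sketched in a few lines of base R. This is only a minimal illustration under stated assumptions, not the implementation developed in this post: the matrix `X` is made-up toy data (not the dataset described below), the choice `k <- 2` is arbitrary, and the covariance matrix is used in place of a correlation matrix (standardizing the features first would give the correlation-based variant).

```r
# Toy data: 6 samples x 3 features (hypothetical values for illustration only)
X <- matrix(c(2.5, 2.4, 1.2,
              0.5, 0.7, 0.3,
              2.2, 2.9, 1.1,
              1.9, 2.2, 0.9,
              3.1, 3.0, 1.6,
              2.3, 2.7, 1.0),
            ncol = 3, byrow = TRUE)

# 1. Center each feature (column) at zero
Xc <- scale(X, center = TRUE, scale = FALSE)

# 2. Covariance matrix of the features
C <- cov(Xc)

# 3. Eigen-decomposition; for a symmetric matrix eigen() returns
#    the eigenvalues in decreasing order
e <- eigen(C, symmetric = TRUE)

# 4. Keep the first k eigenvectors as the new basis (k is an arbitrary choice here)
k <- 2
W <- e$vectors[, 1:k, drop = FALSE]

# 5. Project the centered samples onto the new basis
Z <- Xc %*% W

# Cross-check against the built-in prcomp(); the signs of the components may differ
p <- prcomp(X, center = TRUE, scale. = FALSE)
print(e$values / sum(e$values))  # proportion of variance explained by each component
```

The columns of `Z` should match the first `k` columns of `p$x` up to sign, since an eigenvector is only determined up to a factor of -1.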
Data description
This time I am using real data, described as follows:
The human body consists of about 70 trillion cells, and each cell has DNA molecules called the genome (Figure 1). Here, the genome is only a storage unit for genetic information, which needs to be partially copied into a smaller unit called RNA to actually be used (Figure 1). Each RNA molecule is much smaller than the genome and contains the information of only a single gene, while the genome holds the genetic information of every gene. The process of partially copying the genome into RNA is called transcription (Figure 1). After RNAs are transcribed from the genome, they are subsequently converted into polypeptides (or proteins), which are the actual machinery running cellular processes. This conversion is called translation to distinguish it from transcription (Figure 1).
Figure 1. Description of transcription process
Cell ne