Tip
Before working through the following example, you should be familiar with the principle of PCA:
- For an explanation in English, see https://en.wikipedia.org/wiki/Principal_component_analysis
- For an explanation in Chinese, see http://blog.codinglabs.org/articles/pca-tutorial.html
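
As a rough illustration of that principle, the sketch below centers the data, eigendecomposes the covariance matrix, and projects onto the directions of largest variance. The helper name `pca_project` and the random toy data are assumptions made for this illustration only; they are not part of the example that follows.

```python
import numpy as np

def pca_project(X, n_components):
    """Project the rows of X onto its top principal components (illustrative helper)."""
    # Center each feature so the covariance is computed around the mean
    X_centered = X - X.mean(axis=0)
    # Covariance matrix of the features (columns are variables)
    cov = np.cov(X_centered, rowvar=False)
    # eigh suits symmetric matrices; it returns eigenvalues in ascending order
    eigvals, eigvecs = np.linalg.eigh(cov)
    # Reorder so the largest-variance directions come first
    order = np.argsort(eigvals)[::-1]
    components = eigvecs[:, order[:n_components]]
    # Fraction of the total variance captured by each kept component
    explained = eigvals[order[:n_components]] / eigvals.sum()
    return X_centered @ components, explained

# Toy data (made up): the third column is almost a multiple of the first
rng = np.random.default_rng(0)
toy = rng.normal(size=(50, 3))
toy[:, 2] = 2.0 * toy[:, 0] + 0.1 * toy[:, 2]

projected, explained = pca_project(toy, n_components=2)
print(projected.shape)
print(explained)
```

Because one column of the toy data is nearly redundant, two components should retain most of the variance, which is the effect PCA exploits when reducing dimensions.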
The dataset used in this example is described at https://archive.ics.uci.edu/ml/datasets/Iris
Let’s use the Iris dataset to see how PCA can reduce the dimensionality of a dataset. The Iris dataset contains measurements for 150 iris flowers from three different species. The three classes in the Iris dataset are as follows:
- Iris Setosa
- Iris Versicolor
- Iris Virginica
The following are the four features in the Iris dataset:
- The sepal length in cm
- The sepal width in cm
- The petal length in cm
- The petal width in cm
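
One way to check the description above is to load the copy of the dataset bundled with scikit-learn and inspect it. This is a minimal sketch that assumes scikit-learn is installed; the variable names are chosen for illustration, and the bundled copy may not be the same file as the UCI download.

```python
from sklearn.datasets import load_iris

# Load the copy of Iris that ships with scikit-learn (assumed to be installed)
iris = load_iris()
X = iris.data      # one row per flower, one column per feature
y = iris.target    # integer class label for each flower

print(X.shape)             # (number of samples, number of features)
print(iris.feature_names)  # names of the four measurements
print(iris.target_names)   # names of the three species
```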
Show the original data with scatter plots: