Manifold Learning Algorithms
Manifold learning is an approach to non-linear dimensionality reduction. Algorithms for this task are based on the idea that the dimensionality of many data sets is only artificially high.
High-dimensional datasets can be very difficult to visualize. While data in two or three dimensions can be plotted to show the inherent structure of the data, equivalent high-dimensional plots are much less intuitive. To aid visualization of the structure of a dataset, the dimension must be reduced in some way.
Some Manifold Learning Algorithms
Manifold learning can be divided into linear and nonlinear methods.
- Linear methods, which have long been part of the statistician’s toolbox for analyzing multivariate data, include principal component analysis (PCA) and multidimensional scaling (MDS).
- Recently, there has been a flurry of research activity on nonlinear manifold learning, which includes Isomap, local linear embedding, Laplacian eigenmaps, Hessian eigenmaps, and diffusion maps. Some of these techniques are nonlinear generalizations of the linear methods.
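As a quick orientation, the short sketch below shows how several of these nonlinear methods can be called through scikit-learn's manifold module (diffusion maps are not included in scikit-learn and are omitted; Hessian eigenmaps are available via method='hessian' of LocallyLinearEmbedding). The digits dataset and all parameter values here are only illustrative choices, not recommendations.

import matplotlib.pyplot as plt
from sklearn import datasets, manifold

# Illustrative data: 8x8 handwritten digit images flattened to 64 dimensions
X, y = datasets.load_digits(return_X_y=True)

# A few nonlinear manifold learners from scikit-learn (parameters are illustrative)
methods = {
    'Isomap': manifold.Isomap(n_neighbors=10, n_components=2),
    'LLE': manifold.LocallyLinearEmbedding(n_neighbors=10, n_components=2, method='standard'),
    'Laplacian eigenmaps': manifold.SpectralEmbedding(n_neighbors=10, n_components=2),
}

fig, axes = plt.subplots(1, len(methods), figsize=(15, 5))
for ax, (name, method) in zip(axes, methods.items()):
    embedding = method.fit_transform(X)      # embed the 64-D data into 2-D
    ax.scatter(embedding[:, 0], embedding[:, 1], c=y, s=5, cmap=plt.cm.rainbow)
    ax.set_title(name)
plt.show()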
Locally Linear Embedding (LLE)
Locally linear embedding (LLE) seeks a lower-dimensional projection of the data which preserves distances within local neighborhoods. It can be thought of as a series of local Principal Component Analyses which are globally compared to find the best non-linear embedding.
The Principle of LLE
- LLE first assumes that the data are locally linear; that is, each sample can be expressed as a linear combination of a few samples in its neighborhood.
- For example, a sample Xi with nearest neighbors Xi1, ..., Xik can be approximated as Xi ≈ Wi1*Xi1 + Wi2*Xi2 + ... + Wik*Xik, where the weights sum to one.
- After reducing the dimension with LLE, we want the embedded samples to keep the same linear relationship, i.e. each low-dimensional point should still be reconstructed from its neighbors with the same weights.
This is somewhat like the idea of a limit in calculus: when computing the derivative of a function at a point, we treat the curve between that point and a very close neighboring point as a straight line.
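To make this concrete, here is a minimal sketch of the local reconstruction step: it picks one sample from some illustrative random data, finds its k nearest neighbors, and solves for the weights (summing to one) that reconstruct the sample from those neighbors. The data, the choice k = 10, and the regularization constant are all assumptions made for the example.

import numpy as np
from sklearn.neighbors import NearestNeighbors

rng = np.random.RandomState(0)
X = rng.rand(200, 3)                 # illustrative data: 200 points in 3 dimensions
i, k = 0, 10                         # reconstruct sample 0 from its 10 nearest neighbors

nbrs = NearestNeighbors(n_neighbors=k + 1).fit(X)
neighbors = nbrs.kneighbors(X[i:i + 1], return_distance=False)[0, 1:]   # drop Xi itself

# Solve for weights w minimizing ||Xi - sum_j w_j * X_j||^2 subject to sum_j w_j = 1
Z = X[neighbors] - X[i]              # neighbors centered on Xi
C = Z @ Z.T                          # local Gram matrix (k x k)
C += 1e-3 * np.trace(C) * np.eye(k)  # small regularization for numerical stability
w = np.linalg.solve(C, np.ones(k))
w /= w.sum()                         # enforce the sum-to-one constraint

print("reconstruction error:", np.linalg.norm(X[i] - w @ X[neighbors]))

LLE computes such weights for every sample and then searches for low-dimensional coordinates that are reconstructed by the same weights.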
The Complexity of LLE
The standard LLE algorithm consists of three stages (for N samples of input dimension D, k nearest neighbors, and output dimension d):
- Nearest neighbors search: find the k nearest neighbors of each sample, roughly O[D log(k) N log(N)] with a tree-based search.
- Weight matrix construction: solve one k x k linear system per neighborhood for the reconstruction weights, O[D N k^3].
- Partial eigenvalue decomposition: compute the bottom eigenvectors of the embedding cost matrix, roughly O[d N^2].
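To make the three stages concrete, here is a simplified, dense and unoptimized sketch of standard LLE (a real implementation such as sklearn.manifold.LocallyLinearEmbedding uses sparse matrices and a partial eigensolver rather than the full eigendecomposition below; the regularization constant is an illustrative choice):

import numpy as np
from sklearn.neighbors import NearestNeighbors

def lle_sketch(X, n_neighbors=10, n_components=2, reg=1e-3):
    """Simplified standard LLE, following the three stages listed above."""
    n = X.shape[0]

    # Stage 1: nearest neighbors search
    nbrs = NearestNeighbors(n_neighbors=n_neighbors + 1).fit(X)
    neighbors = nbrs.kneighbors(X, return_distance=False)[:, 1:]    # drop each point itself

    # Stage 2: weight matrix construction (one small k x k linear solve per sample)
    W = np.zeros((n, n))
    for i in range(n):
        Z = X[neighbors[i]] - X[i]                          # neighbors centered on X[i]
        C = Z @ Z.T                                         # local Gram matrix
        C += reg * np.trace(C) * np.eye(n_neighbors)        # regularization
        w = np.linalg.solve(C, np.ones(n_neighbors))
        W[i, neighbors[i]] = w / w.sum()                    # weights sum to one

    # Stage 3: eigenvalue decomposition of M = (I - W)^T (I - W);
    # keep the eigenvectors with the smallest nonzero eigenvalues
    I = np.eye(n)
    M = (I - W).T @ (I - W)
    eigenvalues, eigenvectors = np.linalg.eigh(M)
    return eigenvectors[:, 1:n_components + 1]              # skip the constant eigenvector

In practice one would call sklearn.manifold.LocallyLinearEmbedding instead; the sketch is only meant to show where the cost of each stage comes from.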
Advantages
- It can learn a locally linear manifold of any intrinsic dimension.
- The algorithm reduces to a sparse matrix eigendecomposition, so its computational cost is relatively low and it is easy to implement.
Disadvantages
- The algorithm can only learn manifolds that are not closed, and it requires the sample set to be dense and evenly distributed.
- The algorithm is sensitive to the choice of the number of nearest neighbors; different neighborhood sizes can strongly affect the final dimensionality reduction result.
The Applications of LLE
- Signal processing: noise reduction of ECG signals corrupted by white Gaussian noise, and feature extraction from sinusoidal signals mixed with weak impacts.
- Text classification: reducing the dimension of text data sets before training a classifier.
- Image recognition: extracting the intrinsic feature structure of high-dimensional image data.
- Face recognition: finding the low-dimensional manifold embedded in the high-dimensional space and reducing the dimension of high-dimensional face data.
The Implementation of Some Manifold Learning Algorithms
Principal Component Analysis (PCA)
Linear dimensionality reduction using Singular Value Decomposition of the data to project it to a lower dimensional space.
Below is an example of dimensionality reduction with PCA.
import numpy as np
import matplotlib.pyplot as plt
from mpl_toolkits.mplot3d import Axes3D  # kept for compatibility with older matplotlib
from sklearn.datasets import make_blobs

# Generate 5000 three-dimensional points around four cluster centers
X, y = make_blobs(n_samples=5000, n_features=3,
                  centers=[[0, 0, 0], [1, 1, 1], [2, 2, 2], [3, 3, 3]],
                  cluster_std=[0.1, 0.2, 0.2, 0.3], random_state=9)

# Plot the original data in 3D
fig = plt.figure()
ax = fig.add_subplot(111, projection='3d')
ax.view_init(elev=30, azim=20)
ax.scatter(X[:, 0], X[:, 1], X[:, 2], marker='o')
plt.show()
from sklearn.decomposition import PCA

# Keep all 3 components to inspect how the variance is distributed
pca = PCA(n_components=3)
pca.fit(X)
print(pca.explained_variance_ratio_)
print(pca.explained_variance_)

# Keep only the 2 components with the largest variance
pca = PCA(n_components=2)
pca.fit(X)
print(pca.explained_variance_ratio_)
print(pca.explained_variance_)

# Project the data onto the 2 retained components and plot the result
X_new = pca.transform(X)
plt.scatter(X_new[:, 0], X_new[:, 1], marker='o')
plt.show()
We print the explained variance of each component before and after dimensionality reduction and compare them. The component with the smallest variance carries the least information, so it is the one discarded in the reduction. After projecting the three-dimensional data onto the two retained components, we obtain a two-dimensional scatter plot that still reflects the cluster structure of the original data.
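As a follow-up to the variance comparison, scikit-learn can also pick the number of components automatically from the explained variance ratio; the 95% threshold below is only an illustrative value, and X is the same blob data generated above.

from sklearn.decomposition import PCA

# Keep as many components as are needed to explain at least 95% of the variance
pca_auto = PCA(n_components=0.95)
X_reduced = pca_auto.fit_transform(X)
print(pca_auto.n_components_)               # number of components actually kept
print(pca_auto.explained_variance_ratio_)   # their explained variance ratios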
Locally Linear Embedding (LLE)
import numpy as np
import matplotlib.pyplot as plt
from mpl_toolkits.mplot3d import Axes3D  # kept for compatibility with older matplotlib
from sklearn import manifold, datasets
from sklearn.utils import check_random_state

# Sample 500 points on a sphere and remove the polar caps so the surface is not closed
n_samples = 500
random_state = check_random_state(0)
p = random_state.rand(n_samples) * (2 * np.pi - 0.55)
t = random_state.rand(n_samples) * np.pi
indices = ((t < (np.pi - (np.pi / 8))) & (t > (np.pi / 8)))
colors = p[indices]
x, y, z = np.sin(t[indices]) * np.cos(p[indices]), \
          np.sin(t[indices]) * np.sin(p[indices]), \
          np.cos(t[indices])

# Plot the 3D point cloud
fig = plt.figure()
ax = fig.add_subplot(111, projection='3d')
ax.view_init(elev=30, azim=-20)
ax.scatter(x, y, z, c=p[indices], marker='o', cmap=plt.cm.rainbow)
plt.show()

# Run LLE with several neighborhood sizes k and compare the 2D embeddings
train_data = np.array([x, y, z]).T
for index, k in enumerate((5, 10, 20, 30)):
    plt.subplot(2, 2, index + 1)
    trans_data = manifold.LocallyLinearEmbedding(n_neighbors=k, n_components=2,
                                                 method='standard').fit_transform(train_data)
    plt.scatter(trans_data[:, 0], trans_data[:, 1], marker='o', c=colors)
    plt.text(.99, .01, 'LLE: k=%d' % k, transform=plt.gca().transAxes,
             size=10, horizontalalignment='right')
plt.show()
For the same algorithm, a larger number of nearest neighbors k gives a visually better two-dimensional embedding in this example. Of course, there is no free lunch: a better-looking embedding comes at the cost of a longer running time.
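To see this trade-off directly, one can time the embedding for the same values of k; the rough sketch below reuses train_data and manifold from the code above, and the absolute timings will of course depend on the machine.

import time

# Time standard LLE for several neighborhood sizes k
for k in (5, 10, 20, 30):
    lle = manifold.LocallyLinearEmbedding(n_neighbors=k, n_components=2, method='standard')
    start = time.perf_counter()
    lle.fit_transform(train_data)
    print('k=%d: %.3f seconds' % (k, time.perf_counter() - start))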