Multidimensional Scaling (MDS)

seamanj

License: CC 4.0 BY-SA

Category: Math Notes (数学积累)

Article link: https://blog.youkuaiyun.com/seamanj/article/details/51869863

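This article examines the basic principles and technical details of multidimensional scaling (MDS). A mathematical derivation shows how, starting from a given distance matrix, to find a set of low-dimensional points whose pairwise distances best approximate the original distance matrix.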

MDS aims to embed data in a lower dimensional space in such a way that pair-wise distances between data points are preserved.

Say we have $N$ points $x_i \in \mathbb{R}^n$ for $i \in [1, N]$, and let $X = [x_1, x_2, \cdots, x_N]$. We don’t know the positions of the $x_i$; we are only given the pairwise Euclidean distances among them. The objective is to find $N$ points $y_i \in \mathbb{R}^k$ with $k < n$, collected as $Y = [y_1, y_2, \cdots, y_N]$, such that the pairwise distances among the $y_i$ match those among the $x_i$ as closely as possible.

Let $D^X$ denote the matrix of squared pairwise distances. Each element of $D^X$ can be written as:
$D^X_{ij} = (x_i-x_j)^T(x_i-x_j)=\lVert x_i \rVert^2-2x^T_ix_j+\lVert x_j \rVert^2$
so that, in matrix form,
$D^X = Z-2X^TX+Z^T$

Here, $Z = ze^T$, where $z = [\lVert x_1 \rVert^2, \lVert x_2 \rVert^2, \cdots, \lVert x_N \rVert^2]^T$ and $e$ is the $N$-dimensional vector of all ones. Therefore $Z$ has constant rows, with row $i$ filled with $\lVert x_i \rVert^2$:

$Z = \begin{bmatrix} \lVert x_1 \rVert^2 &\lVert x_1 \rVert^2&\cdots&\lVert x_1 \rVert^2\\ \lVert x_2 \rVert^2 &\lVert x_2 \rVert^2&\cdots&\lVert x_2 \rVert^2\\ \vdots&\vdots& \ddots &\vdots\\ \lVert x_N \rVert^2 &\lVert x_N \rVert^2&\cdots&\lVert x_N \rVert^2\\ \end{bmatrix}$
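As a quick numerical sanity check of the matrix identity, the following sketch builds the squared-distance matrix in two ways and compares them. The dimensions, the random data, and the variable names (`D_direct`, `D_formula`) are arbitrary choices made for illustration, not part of the original derivation.

```python
import numpy as np

rng = np.random.default_rng(0)
n, N = 3, 5                      # ambient dimension and number of points (arbitrary)
X = rng.normal(size=(n, N))      # columns of X are the points x_i

# Squared distances computed directly from the definition.
diff = X[:, :, None] - X[:, None, :]    # diff[:, i, j] = x_i - x_j
D_direct = np.sum(diff ** 2, axis=0)    # D_ij = ||x_i - x_j||^2

# The same matrix via D = Z - 2 X^T X + Z^T with Z = z e^T.
z = np.sum(X ** 2, axis=0)              # z_i = ||x_i||^2
e = np.ones(N)
Z = np.outer(z, e)                      # row i holds ||x_i||^2 repeated N times
D_formula = Z - 2 * X.T @ X + Z.T

print(np.allclose(D_direct, D_formula))
```

Note that the identity holds for the squared distances; taking element-wise square roots would break the linear structure used in the rest of the derivation.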

Now, let’s translate the hypothetical point set $X$ so that its mean lies at the origin. Note that this operation does not change the Euclidean distance between any pair of points.
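The translation invariance can be checked numerically. This is a minimal sketch with random data; the helper name `squared_distances` is hypothetical and introduced only for this example.

```python
import numpy as np

def squared_distances(X):
    """Squared Euclidean distances between the columns of X."""
    z = np.sum(X ** 2, axis=0)
    return z[:, None] - 2 * X.T @ X + z[None, :]

rng = np.random.default_rng(1)
X = rng.normal(size=(3, 6))                       # 6 points in R^3, one per column
X_centered = X - X.mean(axis=1, keepdims=True)    # move the mean to the origin

# Distances are unchanged, and the new mean is (numerically) zero.
print(np.allclose(squared_distances(X), squared_distances(X_centered)))
print(np.allclose(X_centered.mean(axis=1), 0.0))
```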

For better understanding, we introduce $\frac{1}{N}Aee^T$ and $\frac{1}{N}ee^TA$. Here, $A$ is an $N$-by-$N$ matrix which takes the form:

$A = \begin{bmatrix} A_{11} &A_{12} &\cdots&A_{1N}\\ A_{21} &A_{22} &\cdots&A_{2N}\\ \vdots&\vdots& \ddots &\vdots\\ A_{N1} &A_{N2} &\cdots&A_{NN}\\ \end{bmatrix}$

Hence,

$\frac{1}{N}Aee^T=\frac{1}{N}\begin{bmatrix} A_{11} &A_{12} &\cdots&A_{1N}\\ A_{21} &A_{22} &\cdots&A_{2N}\\ \vdots&\vdots& \ddots &\vdots\\ A_{N1} &A_{N2} &\cdots&A_{NN}\\ \end{bmatrix}\begin{bmatrix} 1 &1 &\cdots&1\\ 1 &1 &\cdots&1\\ \vdots&\vdots& \ddots &\vdots\\ 1 &1 &\cdots&1\\ \end{bmatrix}=\begin{bmatrix} \frac{1}{N}\sum_{j=1}^N A_{1j} &\frac{1}{N}\sum_{j=1}^N A_{1j} &\cdots&\frac{1}{N}\sum_{j=1}^N A_{1j}\\ \frac{1}{N}\sum_{j=1}^N A_{2j} &\frac{1}{N}\sum_{j=1}^N A_{2j} &\cdots&\frac{1}{N}\sum_{j=1}^N A_{2j}\\ \vdots&\vdots& \ddots &\vdots\\ \frac{1}{N}\sum_{j=1}^N A_{Nj} &\frac{1}{N}\sum_{j=1}^N A_{Nj} &\cdots&\frac{1}{N}\sum_{j=1}^N A_{Nj}\\ \end{bmatrix}\\ = \begin{bmatrix} \text{mean of first row of A} &\text{mean of first row of A} &\cdots&\text{mean of first row of A}\\ \text{mean of second row of A} &\text{mean of second row of A} &\cdots&\text{mean of second row of A}\\ \vdots&\vdots& \ddots &\vdots\\ \text{mean of Nth row of A} &\text{mean of Nth row of A} &\cdots&\text{mean of Nth row of A}\\ \end{bmatrix}$
Similarly,

$\frac{1}{N}ee^TA=\frac{1}{N}\begin{bmatrix} 1 &1 &\cdots&1\\ 1 &1 &\cdots&1\\ \vdots&\vdots& \ddots &\vdots\\ 1 &1 &\cdots&1\\ \end{bmatrix}\begin{bmatrix} A_{11} &A_{12} &\cdots&A_{1N}\\ A_{21} &A_{22} &\cdots&A_{2N}\\ \vdots&\vdots& \ddots &\vdots\\ A_{N1} &A_{N2} &\cdots&A_{NN}\\ \end{bmatrix}=\begin{bmatrix} \frac{1}{N}\sum_{i=1}^N A_{i1} &\frac{1}{N}\sum_{i=1}^N A_{i2} &\cdots&\frac{1}{N}\sum_{i=1}^N A_{iN}\\ \frac{1}{N}\sum_{i=1}^N A_{i1} &\frac{1}{N}\sum_{i=1}^N A_{i2} &\cdots&\frac{1}{N}\sum_{i=1}^N A_{iN}\\ \vdots&\vdots& \ddots &\vdots\\ \frac{1}{N}\sum_{i=1}^N A_{i1} &\frac{1}{N}\sum_{i=1}^N A_{i2}&\cdots&\frac{1}{N}\sum_{i=1}^N A_{iN}\\ \end{bmatrix}\\ = \begin{bmatrix} \text{mean of first column of A} &\text{mean of second column of A} &\cdots&\text{mean of Nth column of A}\\ \text{mean of first column of A} &\text{mean of second column of A} &\cdots&\text{mean of Nth column of A}\\ \vdots&\vdots& \ddots &\vdots\\ \text{mean of first column of A} &\text{mean of second column of A} &\cdots&\text{mean of Nth column of A}\\ \end{bmatrix}$
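The two products are easy to confirm on a small example. In this sketch the matrix `A` is an arbitrary 4-by-4 array chosen for illustration.

```python
import numpy as np

N = 4
A = np.arange(N * N, dtype=float).reshape(N, N)   # arbitrary test matrix
e = np.ones((N, 1))                               # column vector of ones

row_means = A @ e @ e.T / N    # entry (i, j) is the mean of row i of A
col_means = e @ e.T @ A / N    # entry (i, j) is the mean of column j of A

# Compare against NumPy's own means, broadcast across the matrix.
print(np.allclose(row_means, A.mean(axis=1, keepdims=True)))
print(np.allclose(col_means, A.mean(axis=0, keepdims=True)))
```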

The centering matrix is defined as:

$H = I_N - \frac{1}{N}ee^T$
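A few properties of the centering matrix are useful in what follows: it is symmetric, idempotent, and annihilates the all-ones vector. The sketch below checks these numerically and also shows that multiplying a matrix by $H$ on both sides subtracts the row means and column means and adds back the grand mean. The test matrix `A` is random and purely illustrative.

```python
import numpy as np

N = 5
e = np.ones((N, 1))
H = np.eye(N) - e @ e.T / N          # centering matrix

print(np.allclose(H, H.T))           # symmetric
print(np.allclose(H @ H, H))         # idempotent: centering twice changes nothing
print(np.allclose(H @ e, 0.0))       # H e = 0, hence also e^T H = 0

rng = np.random.default_rng(2)
A = rng.normal(size=(N, N))
double_centered = H @ A @ H
expected = (A
            - A.mean(axis=1, keepdims=True)   # subtract row means
            - A.mean(axis=0, keepdims=True)   # subtract column means
            + A.mean())                       # add back the grand mean
print(np.allclose(double_centered, expected))
```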
Let’s now apply double centering to $D^X$. Because $e^TH = 0$ and $He = 0$, the terms containing $Z = ze^T$ vanish ($ZH = ze^TH = 0$ and $HZ^T = Hez^T = 0$), and we get

$A^X = HD^XH=( I_N - \frac{1}{N}ee^T)(Z-2X^TX+Z^T)( I_N - \frac{1}{N}ee^T)\\=HZH-2HX^TXH+HZ^TH\\=-2HX^TXH=-2(XH)^T(XH)=-2\tilde X^T\tilde X$
where $\tilde X = X( I_N - \frac{1}{N}ee^T) = XH$ is the mean-centered point set. We then define

$B^X = -\frac{1}{2}A^X= -\frac{1}{2}HD^XH=\tilde X^T\tilde X$
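The identity $B^X = \tilde X^T\tilde X$ says that double centering recovers the Gram matrix of the centered points from squared distances alone. A minimal numerical check, assuming random data and treating the known $X$ only as ground truth for comparison:

```python
import numpy as np

rng = np.random.default_rng(3)
n, N = 3, 7
X = rng.normal(size=(n, N))                    # ground-truth points (columns)

# Squared-distance matrix D = Z - 2 X^T X + Z^T.
z = np.sum(X ** 2, axis=0)
D = z[:, None] - 2 * X.T @ X + z[None, :]

H = np.eye(N) - np.ones((N, N)) / N            # centering matrix
B = -0.5 * H @ D @ H                           # double-centered matrix B^X

X_tilde = X @ H                                # mean-centered points
print(np.allclose(B, X_tilde.T @ X_tilde))
```

Since $B^X$ is a Gram matrix, it is symmetric positive semidefinite with rank at most $n$, which is what makes the factorization in the final step possible.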

Remember, the task was to find a concrete set of $N$ points $Y$ in $k$ dimensions so that the pairwise Euclidean distances between all pairs in $Y$ closely approximate the pairwise distances given to us in the matrix $D^X$, i.e., we want to find $D^Y$ such that

$D^Y = \arg\min_{D^Y}\lVert D^X - D^Y\rVert_F^2$
Note that after applying the “double centering” operation to both $D^X$ and $D^Y$, the problem above is recast in terms of the centered Gram matrices:

$B^Y = \arg\min_{B^Y}\lVert B^X - B^Y\rVert_F^2$, where $\lVert B^X - B^Y\rVert_F^2 = \lVert \tilde X^T\tilde X - \tilde Y^T\tilde Y\rVert_F^2$

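The above equation is a well-known optimization problem that can be solved via the Singular Value Decomposition (SVD) of $B^X$. Because $B^X$ is symmetric positive semidefinite, its SVD coincides with its eigendecomposition.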

$B^X \approx UDU^T=(UD^{\frac{1}{2}})(D^{\frac{1}{2}}U^T)=\tilde Y^T\tilde Y$

Here, $U$ is an $N$-by-$k$ matrix whose columns are the corresponding singular vectors, $D$ is a $k$-by-$k$ diagonal matrix with the $k$ largest singular values on the diagonal (not to be confused with the distance matrices $D^X$ and $D^Y$), and $\tilde Y = D^{\frac{1}{2}}U^T$ is a $k$-by-$N$ matrix. Finally, we get the $N$ embedding points in $k$ dimensions as the column vectors of $\tilde Y$.
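Putting the steps together, the whole procedure can be sketched in a few lines of NumPy. This is an illustrative sketch rather than a reference implementation: the function names `classical_mds` and `squared_distances` are hypothetical, the input is assumed to be a matrix of squared Euclidean distances, and `np.linalg.eigh` is used in place of an SVD because $B^X$ is symmetric (for a positive semidefinite matrix the two decompositions agree).

```python
import numpy as np

def squared_distances(Y):
    """Squared Euclidean distances between the columns of Y."""
    z = np.sum(Y ** 2, axis=0)
    return z[:, None] - 2 * Y.T @ Y + z[None, :]

def classical_mds(D_sq, k):
    """Embed N points in k dimensions from an N x N squared-distance matrix.

    Returns a k x N array whose columns are the embedded points.
    """
    N = D_sq.shape[0]
    H = np.eye(N) - np.ones((N, N)) / N        # centering matrix
    B = -0.5 * H @ D_sq @ H                    # double centering
    B = (B + B.T) / 2                          # symmetrize against round-off
    eigvals, eigvecs = np.linalg.eigh(B)       # eigenvalues in ascending order
    idx = np.argsort(eigvals)[::-1][:k]        # indices of the k largest
    lam = np.clip(eigvals[idx], 0.0, None)     # guard against tiny negatives
    U = eigvecs[:, idx]                        # N x k
    return np.sqrt(lam)[:, None] * U.T         # D^{1/2} U^T, shape k x N

rng = np.random.default_rng(4)
X = rng.normal(size=(3, 8))                    # 8 hidden points in R^3
D_sq = squared_distances(X)                    # all the algorithm gets to see

Y3 = classical_mds(D_sq, k=3)                  # same dimension: distances are recovered
Y2 = classical_mds(D_sq, k=2)                  # lower dimension: an approximation

print(np.allclose(squared_distances(Y3), D_sq))
print(np.linalg.norm(squared_distances(Y2) - D_sq))   # residual of the 2-D embedding
```

The recovered points match the originals only up to a rotation, reflection, and translation, since none of these change pairwise distances.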