Problem:
$$
X = \left[\begin{array}{c}
\text{---}\, x_1 \,\text{---} \\
\vdots \\
\text{---}\, x_M \,\text{---}
\end{array}\right]_{M\times C},
\quad
Y = \left[\begin{array}{c}
\text{---}\, y_1 \,\text{---} \\
\vdots \\
\text{---}\, y_N \,\text{---}
\end{array}\right]_{N\times C}
$$
where $x_i, y_j \in \mathbb{R}^C$.
We want the squared L2 distance between every row of $X$ and every row of $Y$, i.e. $S = \left[S_{ij}\right]_{M\times N} = \left[\|x_i - y_j\|_2^2\right]_{M\times N}$.
It is easy to see that
$$
S_{ij} = \|x_i - y_j\|_2^2 = (x_i - y_j)^\top (x_i - y_j) = x_i^\top x_i - 2 x_i^\top y_j + y_j^\top y_j
$$
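Stacking this identity over all $i$ and $j$ gives a matrix form, which is what makes a loop-free implementation possible. The following is a sketch: the symbols $a$, $b$ (vectors of squared row norms) and $\mathbf{1}_M$, $\mathbf{1}_N$ (all-ones column vectors) are notation introduced here for illustration and do not appear in the original.

$$
S = a\,\mathbf{1}_N^\top - 2\,X Y^\top + \mathbf{1}_M\, b^\top,
\qquad
a = \left[x_1^\top x_1, \dots, x_M^\top x_M\right]^\top \in \mathbb{R}^M,
\quad
b = \left[y_1^\top y_1, \dots, y_N^\top y_N\right]^\top \in \mathbb{R}^N
$$

Entry $(i, j)$ of the three terms is $x_i^\top x_i$, $-2 x_i^\top y_j$ and $y_j^\top y_j$ respectively, matching the expansion of $S_{ij}$ term by term. Note that the function below returns the elementwise square root of $S$, i.e. the distances themselves rather than the squared distances.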
import numpy as np


def pairwise_l2_distances(X, Y):
    r"""
    A fast, vectorized way to compute pairwise L2 distances between the rows
    of `X` and `Y`.

    Notes
    -----
    An entry of the pairwise Euclidean distance matrix for two vectors is

    .. math::

        d[i, j] &= \sqrt{(x_i - y_j) \cdot (x_i - y_j)} \\
                &= \sqrt{\sum (x_i - y_j)^2} \\
                &= \sqrt{\sum (x_i)^2 - 2 x_i y_j + (y_j)^2}

    The code below computes the third line using numpy broadcasting to avoid
    any for loops.

    Parameters
    ----------
    X : :py:class:`ndarray <numpy.ndarray>` of shape `(M, C)`
        Collection of `M` input vectors.
    Y : :py:class:`ndarray <numpy.ndarray>` of shape `(N, C)`
        Collection of `N` input vectors.

    Returns
    -------
    dists : :py:class:`ndarray <numpy.ndarray>` of shape `(M, N)`
        Pairwise distance matrix. Entry (i, j) contains the `L2` distance
        between :math:`x_i` and :math:`y_j`.
    """
    # ||y_j||^2 has shape (N,) and ||x_i||^2 has shape (M, 1); both broadcast to (M, N)
    D = -2 * X @ Y.T + np.sum(Y ** 2, axis=1) + np.sum(X ** 2, axis=1)[:, np.newaxis]
    D[D < 0] = 0  # clip any value less than 0 (a result of numerical imprecision)
    return np.sqrt(D)
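As a sanity check, the vectorized expansion can be compared against a plain double loop on small random inputs. This is a minimal sketch: `pairwise_l2_naive` and `pairwise_l2_vectorized` are hypothetical helper names introduced here (the second simply restates the same expansion so the block runs on its own), and the array sizes and seed are arbitrary.

```python
import numpy as np


def pairwise_l2_naive(X, Y):
    """Reference implementation with explicit loops (slow, for checking only)."""
    M, N = X.shape[0], Y.shape[0]
    D = np.zeros((M, N))
    for i in range(M):
        for j in range(N):
            diff = X[i] - Y[j]
            D[i, j] = np.sqrt(diff @ diff)
    return D


def pairwise_l2_vectorized(X, Y):
    """Same expansion as above: ||x||^2 - 2 x.y + ||y||^2, then sqrt."""
    sq_x = np.sum(X ** 2, axis=1)[:, np.newaxis]  # shape (M, 1)
    sq_y = np.sum(Y ** 2, axis=1)[np.newaxis, :]  # shape (1, N)
    D = sq_x - 2 * X @ Y.T + sq_y                 # broadcasts to (M, N)
    return np.sqrt(np.maximum(D, 0))              # clip tiny negatives before sqrt


rng = np.random.default_rng(0)
X = rng.normal(size=(5, 3))  # M = 5 vectors of dimension C = 3
Y = rng.normal(size=(4, 3))  # N = 4 vectors of dimension C = 3

D_fast = pairwise_l2_vectorized(X, Y)
D_slow = pairwise_l2_naive(X, Y)

print(D_fast.shape)
print(np.allclose(D_fast, D_slow))
```

The two results should agree up to floating-point error. The expansion can lose some precision when two rows are nearly identical, since it subtracts large, nearly equal numbers; that is why small negative values are clipped before taking the square root.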

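This post describes a fast, vectorized way to compute the L2 distance between every pair of rows of two matrices X and Y. It is implemented with NumPy and avoids explicit loops, which makes the computation more efficient.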