

sklearn.metrics.pairwise.euclidean_distances(X, Y=None, Y_norm_squared=None, squared=False)

Considering the rows of X (and of Y, which defaults to X when not given) as vectors, compute the distance matrix between each pair of vectors.

For efficiency reasons, the Euclidean distance between a pair of row vectors x and y is computed as:

dist(x, y) = sqrt(dot(x, x) - 2 * dot(x, y) + dot(y, y))

This formulation has two advantages over other ways of computing distances. First, it is computationally efficient when dealing with sparse data. Second, if x varies but y remains unchanged, then the right-most dot product dot(y, y) can be pre-computed.
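The expansion above can be sketched in plain Python to make the arithmetic visible. This is a minimal illustration, not the library's implementation: the helper names `dot` and `expanded_distances` are hypothetical, and the clamp to zero is a precaution added here against rounding error.

```python
import math


def dot(u, v):
    """Plain dot product of two equal-length sequences."""
    return sum(a * b for a, b in zip(u, v))


def expanded_distances(X, y):
    """Distance from every row of X to one fixed vector y.

    Uses dist(x, y) = sqrt(dot(x, x) - 2 * dot(x, y) + dot(y, y)).
    """
    yy = dot(y, y)  # computed once, reused for every row x
    out = []
    for x in X:
        sq = dot(x, x) - 2 * dot(x, y) + yy
        # Rounding can push the sum slightly below zero, so clamp it.
        out.append(math.sqrt(max(sq, 0.0)))
    return out


X = [[0, 1], [1, 1]]
d = expanded_distances(X, [0, 0])  # distances from each row of X to the origin
```

Because `yy` is computed outside the loop, each additional row of X costs only two dot products.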

However, this is not the most precise way of doing this computation, and the distance matrix returned by this function may not be exactly symmetric as required by, e.g., scipy.spatial.distance functions.
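The precision caveat can be explored with a small NumPy sketch that applies the expanded formula to all pairs of rows and compares it with a direct computation from coordinate differences. The data, the large offset, and the clean-up step at the end are assumptions made for illustration; averaging with the transpose is one possible way to obtain an exactly symmetric matrix, not something this function does for you.

```python
import numpy as np

rng = np.random.default_rng(0)
# A large offset makes cancellation error in the expanded formula more visible.
X = rng.normal(size=(5, 3)) + 1e3

# Expanded formula applied to all pairs of rows at once.
norms = (X ** 2).sum(axis=1)
sq = norms[:, None] - 2.0 * (X @ X.T) + norms[None, :]
D = np.sqrt(np.maximum(sq, 0.0))  # clamp small negative values from rounding

# Direct computation from coordinate differences, for comparison.
D_direct = np.sqrt(((X[:, None, :] - X[None, :, :]) ** 2).sum(axis=-1))
max_error = np.abs(D - D_direct).max()

# One possible clean-up before handing the matrix to code that
# requires exact symmetry and a zero diagonal.
D_sym = (D + D.T) / 2.0
np.fill_diagonal(D_sym, 0.0)
```

Inspecting `max_error` and `np.abs(D - D.T).max()` on your own data gives a sense of how large the discrepancy is in practice.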


Parameters

X : {array-like, sparse matrix}, shape (n_samples_1, n_features)

Y : {array-like, sparse matrix}, shape (n_samples_2, n_features)

Y_norm_squared : array-like, shape (n_samples_2, ), optional

Pre-computed dot-products of vectors in Y (e.g., (Y**2).sum(axis=1))

squared : boolean, optional

Return squared Euclidean distances.
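As a hedged usage sketch of the two optional parameters, the snippet below passes pre-computed squared norms of Y and, separately, requests squared distances. The arrays are made-up example data, and float64 input is assumed.

```python
import numpy as np
from sklearn.metrics.pairwise import euclidean_distances

X = np.array([[0.0, 1.0], [1.0, 1.0], [2.0, 0.0]])
Y = np.array([[0.0, 0.0], [1.0, 0.0]])

# Squared row norms of Y, computed once so they can be reused across calls.
Y_norm_squared = (Y ** 2).sum(axis=1)

# Same distances as euclidean_distances(X, Y), but reusing the norms.
D = euclidean_distances(X, Y, Y_norm_squared=Y_norm_squared)

# Squared distances, skipping the final square root.
D_sq = euclidean_distances(X, Y, squared=True)
```

Passing `Y_norm_squared` is mainly useful when Y stays fixed across many calls with different X.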


Returns

distances : {array, sparse matrix}, shape (n_samples_1, n_samples_2)

See also

paired_distances : distances between pairs of elements of X and Y.


Examples

>>> from sklearn.metrics.pairwise import euclidean_distances
>>> X = [[0, 1], [1, 1]]
>>> # distance between rows of X
>>> euclidean_distances(X, X)
array([[ 0.,  1.],
       [ 1.,  0.]])
>>> # get distance to origin
>>> euclidean_distances(X, [[0, 0]])
array([[ 1.        ],
       [ 1.41421356]])