sklearn.metrics.pairwise.euclidean_distances

sklearn.metrics.pairwise.euclidean_distances(X, Y=None, *, Y_norm_squared=None, squared=False, X_norm_squared=None)[source]

Considering the rows of X (and Y=X) as vectors, compute the distance matrix between each pair of vectors.

For efficiency reasons, the euclidean distance between a pair of row vector x and y is computed as:

dist(x, y) = sqrt(dot(x, x) - 2 * dot(x, y) + dot(y, y))

This formulation has two advantages over other ways of computing distances. First, it is computationally efficient when dealing with sparse data. Second, if one argument varies but the other remains unchanged, then dot(x, x) and/or dot(y, y) can be pre-computed.

However, this is not the most precise way of doing this computation, because this equation potentially suffers from “catastrophic cancellation”. Also, the distance matrix returned by this function may not be exactly symmetric as required by, e.g., scipy.spatial.distance functions.

Read more in the User Guide.

Parameters
X{array-like, sparse matrix} of shape (n_samples_X, n_features)
Y{array-like, sparse matrix} of shape (n_samples_Y, n_features), default=None

If None, uses Y=X.

Y_norm_squaredarray-like of shape (n_samples_Y,) or (n_samples_Y, 1) or (1, n_samples_Y), default=None

Pre-computed dot-products of vectors in Y (e.g., (Y**2).sum(axis=1)) May be ignored in some cases, see the note below.

squaredbool, default=False

Return squared Euclidean distances.

X_norm_squaredarray-like of shape (n_samples_X,) or (n_samples_X, 1) or (1, n_samples_X), default=None

Pre-computed dot-products of vectors in X (e.g., (X**2).sum(axis=1)) May be ignored in some cases, see the note below.

Returns
distancesndarray of shape (n_samples_X, n_samples_Y)

See also

paired_distances

Distances betweens pairs of elements of X and Y.

Notes

To achieve better accuracy, X_norm_squared and Y_norm_squared may be unused if they are passed as float32.

Examples

>>> from sklearn.metrics.pairwise import euclidean_distances
>>> X = [[0, 1], [1, 1]]
>>> # distance between rows of X
>>> euclidean_distances(X, X)
array([[0., 1.],
       [1., 0.]])
>>> # get distance to origin
>>> euclidean_distances(X, [[0, 0]])
array([[1.        ],
       [1.41421356]])