sklearn.manifold.trustworthiness(X, X_embedded, *, n_neighbors=5, metric='euclidean')[source]

Expresses to what extent the local structure is retained.

The trustworthiness is within [0, 1]. It is defined as

\[T(k) = 1 - \frac{2}{nk (2n - 3k - 1)} \sum^n_{i=1} \sum_{j \in \mathcal{N}_{i}^{k}} \max(0, (r(i, j) - k))\]

where for each sample i, \(\mathcal{N}_{i}^{k}\) are its k nearest neighbors in the output space, and every sample j is its \(r(i, j)\)-th nearest neighbor in the input space. In other words, any unexpected nearest neighbors in the output space are penalised in proportion to their rank in the input space.

Xndarray of shape (n_samples, n_features) or (n_samples, n_samples)

If the metric is ‘precomputed’ X must be a square distance matrix. Otherwise it contains a sample per row.

X_embeddedndarray of shape (n_samples, n_components)

Embedding of the training data in low-dimensional space.

n_neighborsint, default=5

The number of neighbors that will be considered. Should be fewer than n_samples / 2 to ensure the trustworthiness to lies within [0, 1], as mentioned in [1]. An error will be raised otherwise.

metricstr or callable, default=’euclidean’

Which metric to use for computing pairwise distances between samples from the original input space. If metric is ‘precomputed’, X must be a matrix of pairwise distances or squared distances. Otherwise, for a list of available metrics, see the documentation of argument metric in sklearn.pairwise.pairwise_distances and metrics listed in sklearn.metrics.pairwise.PAIRWISE_DISTANCE_FUNCTIONS. Note that the “cosine” metric uses cosine_distances.

New in version 0.20.


Trustworthiness of the low-dimensional embedding.



Jarkko Venna and Samuel Kaski. 2001. Neighborhood Preservation in Nonlinear Projection Methods: An Experimental Study. In Proceedings of the International Conference on Artificial Neural Networks (ICANN ‘01). Springer-Verlag, Berlin, Heidelberg, 485-491.


Laurens van der Maaten. Learning a Parametric Embedding by Preserving Local Structure. Proceedings of the Twelth International Conference on Artificial Intelligence and Statistics, PMLR 5:384-391, 2009.