
.. DO NOT EDIT.
.. THIS FILE WAS AUTOMATICALLY GENERATED BY SPHINX-GALLERY.
.. TO MAKE CHANGES, EDIT THE SOURCE PYTHON FILE:
.. "auto_examples/manifold/plot_mds.py"
.. LINE NUMBERS ARE GIVEN BELOW.

.. only:: html

    .. note::
        :class: sphx-glr-download-link-note

        :ref:`Go to the end <sphx_glr_download_auto_examples_manifold_plot_mds.py>`
        to download the full example code or to run this example in your browser via JupyterLite or Binder.

.. rst-class:: sphx-glr-example-title

.. _sphx_glr_auto_examples_manifold_plot_mds.py:


=========================
Multi-dimensional scaling
=========================

An illustration of metric and non-metric MDS on generated noisy data.

.. GENERATED FROM PYTHON SOURCE LINES 9-13

.. code-block:: Python


    # Authors: The scikit-learn developers
    # SPDX-License-Identifier: BSD-3-Clause


.. GENERATED FROM PYTHON SOURCE LINES 14-18

Dataset preparation
-------------------

We start by uniformly generating 20 points in a 2D space.

.. GENERATED FROM PYTHON SOURCE LINES 18-37

.. code-block:: Python


    import numpy as np
    from matplotlib import pyplot as plt
    from matplotlib.collections import LineCollection

    from sklearn import manifold
    from sklearn.decomposition import PCA
    from sklearn.metrics import euclidean_distances

    # Generate the data
    EPSILON = np.finfo(np.float32).eps
    n_samples = 20
    rng = np.random.RandomState(seed=3)
    X_true = rng.randint(0, 20, 2 * n_samples).astype(float)
    X_true = X_true.reshape((n_samples, 2))

    # Center the data
    X_true -= X_true.mean()


.. GENERATED FROM PYTHON SOURCE LINES 38-41

Now we compute pairwise distances between all points and add a small amount of
noise to the distance matrix. We make sure to keep the noisy distance matrix
symmetric.

.. GENERATED FROM PYTHON SOURCE LINES 41-51

.. code-block:: Python


    # Compute pairwise Euclidean distances
    distances = euclidean_distances(X_true)

    # Add noise to the distances
    noise = rng.rand(n_samples, n_samples)
    noise = noise + noise.T
    np.fill_diagonal(noise, 0)
    distances += noise


.. GENERATED FROM PYTHON SOURCE LINES 52-53

Here we compute metric and non-metric MDS of the noisy distance matrix.

.. GENERATED FROM PYTHON SOURCE LINES 53-77

.. code-block:: Python


    mds = manifold.MDS(
        n_components=2,
        max_iter=3000,
        eps=1e-9,
        n_init=1,
        random_state=42,
        dissimilarity="precomputed",
        n_jobs=1,
    )
    X_mds = mds.fit(distances).embedding_

    nmds = manifold.MDS(
        n_components=2,
        metric=False,
        max_iter=3000,
        eps=1e-12,
        dissimilarity="precomputed",
        random_state=42,
        n_jobs=1,
        n_init=1,
    )
    X_nmds = nmds.fit_transform(distances)


.. GENERATED FROM PYTHON SOURCE LINES 78-79

We rescale the non-metric MDS solution to match the spread of the original data.

.. GENERATED FROM PYTHON SOURCE LINES 79-82

.. code-block:: Python


    X_nmds *= np.sqrt((X_true**2).sum()) / np.sqrt((X_nmds**2).sum())


.. GENERATED FROM PYTHON SOURCE LINES 83-86

To make the visual comparisons easier, we rotate the original data and both MDS
solutions to their PCA axes. We also flip the horizontal and vertical MDS axes,
if needed, to match the orientation of the original data.

.. GENERATED FROM PYTHON SOURCE LINES 86-100

.. code-block:: Python


    # Rotate the data
    pca = PCA(n_components=2)
    X_true = pca.fit_transform(X_true)
    X_mds = pca.fit_transform(X_mds)
    X_nmds = pca.fit_transform(X_nmds)

    # Align the sign of PCs
    for i in [0, 1]:
        if np.corrcoef(X_mds[:, i], X_true[:, i])[0, 1] < 0:
            X_mds[:, i] *= -1
        if np.corrcoef(X_nmds[:, i], X_true[:, i])[0, 1] < 0:
            X_nmds[:, i] *= -1


.. GENERATED FROM PYTHON SOURCE LINES 101-102

Finally, we plot the original data and both MDS reconstructions.

.. GENERATED FROM PYTHON SOURCE LINES 102-130

.. code-block:: Python


    fig = plt.figure(1)
    ax = plt.axes([0.0, 0.0, 1.0, 1.0])

    s = 100
    plt.scatter(X_true[:, 0], X_true[:, 1], color="navy", s=s, lw=0, label="True Position")
    plt.scatter(X_mds[:, 0], X_mds[:, 1], color="turquoise", s=s, lw=0, label="MDS")
    plt.scatter(X_nmds[:, 0], X_nmds[:, 1], color="darkorange", s=s, lw=0, label="NMDS")
    plt.legend(scatterpoints=1, loc="best", shadow=False)

    # Plot the edges
    start_idx, end_idx = X_mds.nonzero()
    # a sequence of (*line0*, *line1*, *line2*), where::
    #            linen = (x0, y0), (x1, y1), ... (xm, ym)
    segments = [
        [X_true[i, :], X_true[j, :]] for i in range(len(X_true)) for j in range(len(X_true))
    ]
    edges = distances.max() / (distances + EPSILON) * 100
    np.fill_diagonal(edges, 0)
    edges = np.abs(edges)
    lc = LineCollection(
        segments, zorder=0, cmap=plt.cm.Blues, norm=plt.Normalize(0, edges.max())
    )
    lc.set_array(edges.flatten())
    lc.set_linewidths(np.full(len(segments), 0.5))
    ax.add_collection(lc)

    plt.show()



.. image-sg:: /auto_examples/manifold/images/sphx_glr_plot_mds_001.png
   :alt: plot mds
   :srcset: /auto_examples/manifold/images/sphx_glr_plot_mds_001.png
   :class: sphx-glr-single-img


.. rst-class:: sphx-glr-timing

   **Total running time of the script:** (0 minutes 0.251 seconds)


.. _sphx_glr_download_auto_examples_manifold_plot_mds.py:

.. only:: html

  .. container:: sphx-glr-footer sphx-glr-footer-example

    .. container:: binder-badge

      .. image:: images/binder_badge_logo.svg
        :target: https://mybinder.org/v2/gh/scikit-learn/scikit-learn/main?urlpath=lab/tree/notebooks/auto_examples/manifold/plot_mds.ipynb
        :alt: Launch binder
        :width: 150 px

    .. container:: lite-badge

      .. image:: images/jupyterlite_badge_logo.svg
        :target: ../../lite/lab/index.html?path=auto_examples/manifold/plot_mds.ipynb
        :alt: Launch JupyterLite
        :width: 150 px

    .. container:: sphx-glr-download sphx-glr-download-jupyter

      :download:`Download Jupyter notebook: plot_mds.ipynb <plot_mds.ipynb>`

    .. container:: sphx-glr-download sphx-glr-download-python

      :download:`Download Python source code: plot_mds.py <plot_mds.py>`

    .. container:: sphx-glr-download sphx-glr-download-zip

      :download:`Download zipped: plot_mds.zip <plot_mds.zip>`


.. include:: plot_mds.recommendations


.. only:: html

 .. rst-class:: sphx-glr-signature

    `Gallery generated by Sphinx-Gallery <https://sphinx-gallery.github.io>`_