.. _loading_other_datasets:

Loading other datasets
======================

.. currentmodule:: sklearn.datasets

.. _sample_images:

Sample images
-------------

Scikit-learn also embeds a couple of sample JPEG images published under a
Creative Commons license by their authors. Those images can be useful to test
algorithms and pipelines on 2D data.

.. autosummary::

   load_sample_images
   load_sample_image

.. plot::
   :context: close-figs
   :scale: 30
   :align: right
   :include-source: False

   import matplotlib.pyplot as plt
   from sklearn.datasets import load_sample_image

   china = load_sample_image("china.jpg")
   plt.imshow(china)
   plt.axis('off')
   plt.tight_layout()
   plt.show()

.. warning::

  The default coding of images is based on the ``uint8`` dtype to
  save memory. Machine learning algorithms often work best if the
  input is first converted to a floating point representation. Also,
  if you plan to use ``matplotlib.pyplot.imshow`` on the converted data,
  don't forget to scale it to the range 0 - 1, as done in the following
  example.

.. _libsvm_loader:

Datasets in svmlight / libsvm format
------------------------------------

scikit-learn includes utility functions for loading datasets in the svmlight /
libsvm format. In this format, each line takes the form ``