sklearn.datasets
Utilities to load popular datasets and artificial data generators.
User guide: see the Dataset loading utilities section for further details.
Loaders
clear_data_home | Delete all the content of the data home cache.
dump_svmlight_file | Dump the dataset in svmlight / libsvm file format.
fetch_20newsgroups | Load the filenames and data from the 20 newsgroups dataset (classification).
fetch_20newsgroups_vectorized | Load and vectorize the 20 newsgroups dataset (classification).
fetch_california_housing | Load the California housing dataset (regression).
fetch_covtype | Load the covertype dataset (classification).
fetch_file | Fetch a file from the web if not already present in the local folder.
fetch_kddcup99 | Load the kddcup99 dataset (classification).
fetch_lfw_pairs | Load the Labeled Faces in the Wild (LFW) pairs dataset (classification).
fetch_lfw_people | Load the Labeled Faces in the Wild (LFW) people dataset (classification).
fetch_olivetti_faces | Load the Olivetti faces dataset from AT&T (classification).
fetch_openml | Fetch a dataset from OpenML by name or dataset id.
fetch_rcv1 | Load the RCV1 multilabel dataset (classification).
fetch_species_distributions | Loader for the species distribution dataset from Phillips et al.
get_data_home | Return the path of the scikit-learn data directory.
load_breast_cancer | Load and return the breast cancer Wisconsin dataset (classification).
load_diabetes | Load and return the diabetes dataset (regression).
load_digits | Load and return the digits dataset (classification).
load_files | Load text files with categories as subfolder names.
load_iris | Load and return the iris dataset (classification).
load_linnerud | Load and return the physical exercise Linnerud dataset.
load_sample_image | Load the numpy array of a single sample image.
load_sample_images | Load sample images for image manipulation.
load_svmlight_file | Load a dataset in the svmlight / libsvm format into a sparse CSR matrix.
load_svmlight_files | Load a dataset from multiple files in svmlight / libsvm format.
load_wine | Load and return the wine dataset (classification).
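As a rough illustration of how the loaders fit together, the sketch below loads a bundled dataset with `load_iris` and round-trips it through the svmlight / libsvm format using `dump_svmlight_file` and `load_svmlight_file`. It assumes scikit-learn and NumPy are installed; the in-memory buffer and the explicit `zero_based=True` are choices made for this example, not something the listing prescribes.

```python
import io

import numpy as np
from sklearn.datasets import dump_svmlight_file, load_iris, load_svmlight_file

# Bundled loaders need no network access; return_X_y=True yields plain arrays.
X, y = load_iris(return_X_y=True)
print(X.shape, y.shape)

# Write the data in svmlight / libsvm format to an in-memory binary buffer
# (an illustrative choice; a file path would work as well).
buffer = io.BytesIO()
dump_svmlight_file(X, y, buffer, zero_based=True)
buffer.seek(0)

# Read it back; the features are returned as a sparse CSR matrix.
X_sparse, y_loaded = load_svmlight_file(
    buffer, n_features=X.shape[1], zero_based=True
)

# Compare the round-tripped data with the original arrays.
features_match = np.allclose(X_sparse.toarray(), X)
labels_match = np.array_equal(y_loaded, y)
print(features_match, labels_match)
```

The `fetch_*` functions follow the same calling pattern but download data on first use and cache it under the directory reported by `get_data_home`, so they are left out of this offline sketch.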
Sample generators
make_biclusters | Generate a constant block diagonal structure array for biclustering.
make_blobs | Generate isotropic Gaussian blobs for clustering.
make_checkerboard | Generate an array with block checkerboard structure for biclustering.
make_circles | Make a large circle containing a smaller circle in 2D.
make_classification | Generate a random n-class classification problem.
make_friedman1 | Generate the "Friedman #1" regression problem.
make_friedman2 | Generate the "Friedman #2" regression problem.
make_friedman3 | Generate the "Friedman #3" regression problem.
make_gaussian_quantiles | Generate isotropic Gaussian and label samples by quantile.
make_hastie_10_2 | Generate data for binary classification used in Hastie et al. 2009, Example 10.2.
make_low_rank_matrix | Generate a mostly low rank matrix with bell-shaped singular values.
make_moons | Make two interleaving half circles.
make_multilabel_classification | Generate a random multilabel classification problem.
make_regression | Generate a random regression problem.
make_s_curve | Generate an S curve dataset.
make_sparse_coded_signal | Generate a signal as a sparse combination of dictionary elements.
make_sparse_spd_matrix | Generate a sparse symmetric positive-definite matrix.
make_sparse_uncorrelated | Generate a random regression problem with sparse uncorrelated design.
make_spd_matrix | Generate a random symmetric, positive-definite matrix.
make_swiss_roll | Generate a Swiss roll dataset.
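The generators share a common calling pattern, which the following minimal sketch shows for `make_blobs`, `make_moons`, `make_classification`, and `make_regression`. It assumes scikit-learn and NumPy are installed; the sample counts, feature counts, noise levels, and `random_state=0` are arbitrary values picked for illustration.

```python
import numpy as np
from sklearn.datasets import (
    make_blobs,
    make_classification,
    make_moons,
    make_regression,
)

# Three isotropic Gaussian clusters in 2D; y holds the cluster index.
X_blobs, y_blobs = make_blobs(
    n_samples=300, centers=3, n_features=2, random_state=0
)

# Two interleaving half circles with a little Gaussian noise.
X_moons, y_moons = make_moons(n_samples=200, noise=0.1, random_state=0)

# A 3-class problem where only 5 of the 20 features are informative.
X_clf, y_clf = make_classification(
    n_samples=500,
    n_features=20,
    n_informative=5,
    n_redundant=2,
    n_classes=3,
    random_state=0,
)

# A linear regression problem with noisy continuous targets.
X_reg, y_reg = make_regression(
    n_samples=100, n_features=10, n_informative=3, noise=5.0, random_state=0
)

for name, X in [
    ("blobs", X_blobs),
    ("moons", X_moons),
    ("classification", X_clf),
    ("regression", X_reg),
]:
    print(name, X.shape)
```

Fixing `random_state` makes each call reproducible, which is convenient when the generated data feeds a test or a benchmark.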