sklearn.datasets
Utilities to load popular datasets and artificial data generators.
See the Dataset loading utilities section of the user guide for further details.
Loaders
| Function | Description |
|---|---|
| clear_data_home | Delete all the content of the data home cache. |
| dump_svmlight_file | Dump the dataset in svmlight / libsvm file format. |
| fetch_20newsgroups | Load the filenames and data from the 20 newsgroups dataset (classification). |
| fetch_20newsgroups_vectorized | Load and vectorize the 20 newsgroups dataset (classification). |
| fetch_california_housing | Load the California housing dataset (regression). |
| fetch_covtype | Load the covertype dataset (classification). |
| fetch_file | Fetch a file from the web if not already present in the local folder. |
| fetch_kddcup99 | Load the kddcup99 dataset (classification). |
| fetch_lfw_pairs | Load the Labeled Faces in the Wild (LFW) pairs dataset (classification). |
| fetch_lfw_people | Load the Labeled Faces in the Wild (LFW) people dataset (classification). |
| fetch_olivetti_faces | Load the Olivetti faces dataset from AT&T (classification). |
| fetch_openml | Fetch a dataset from OpenML by name or dataset id. |
| fetch_rcv1 | Load the RCV1 multilabel dataset (classification). |
| fetch_species_distributions | Loader for the species distribution dataset from Phillips et al. |
| get_data_home | Return the path of the scikit-learn data directory. |
| load_breast_cancer | Load and return the breast cancer Wisconsin dataset (classification). |
| load_diabetes | Load and return the diabetes dataset (regression). |
| load_digits | Load and return the digits dataset (classification). |
| load_files | Load text files with categories as subfolder names. |
| load_iris | Load and return the iris dataset (classification). |
| load_linnerud | Load and return the physical exercise Linnerud dataset. |
| load_sample_image | Load the numpy array of a single sample image. |
| load_sample_images | Load sample images for image manipulation. |
| load_svmlight_file | Load a dataset in the svmlight / libsvm format into a sparse CSR matrix. |
| load_svmlight_files | Load a dataset from multiple files in svmlight format. |
| load_wine | Load and return the wine dataset (classification). |
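
As a rough illustration of how the loaders fit together, the sketch below loads the bundled iris dataset and round-trips it through the svmlight / libsvm format. It uses only `load_iris`, `dump_svmlight_file` and `load_svmlight_file` from `sklearn.datasets`. The file name `iris.svmlight` and the temporary directory are arbitrary choices made for this example. The `fetch_*` loaders are left out because they download data on first use.

```python
import os
import tempfile

import numpy as np
from sklearn.datasets import dump_svmlight_file, load_iris, load_svmlight_file

# Bundled "load_*" datasets ship with scikit-learn and need no network access.
iris = load_iris()
X, y = iris.data, iris.target
print(X.shape, y.shape)  # (150, 4) (150,)

# return_X_y=True returns the (data, target) pair directly
# instead of the Bunch container.
X_direct, y_direct = load_iris(return_X_y=True)

# Round-trip through the svmlight / libsvm text format.
# The file name is an arbitrary choice for this sketch.
with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "iris.svmlight")
    dump_svmlight_file(X, y, path, zero_based=True)
    # The loader returns a SciPy sparse CSR matrix and a target array.
    X_sparse, y_loaded = load_svmlight_file(
        path, n_features=X.shape[1], zero_based=True
    )

# Convert back to a dense array to compare with the original data.
X_roundtrip = X_sparse.toarray()
print(np.allclose(X_roundtrip, X))
```

Passing `n_features` and `zero_based` explicitly when reading the file back avoids relying on the loader's inference from the file contents, which matters when several files must share the same column layout.
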
Sample generators
| Function | Description |
|---|---|
| make_biclusters | Generate a constant block diagonal structure array for biclustering. |
| make_blobs | Generate isotropic Gaussian blobs for clustering. |
| make_checkerboard | Generate an array with block checkerboard structure for biclustering. |
| make_circles | Make a large circle containing a smaller circle in 2d. |
| make_classification | Generate a random n-class classification problem. |
| make_friedman1 | Generate the "Friedman #1" regression problem. |
| make_friedman2 | Generate the "Friedman #2" regression problem. |
| make_friedman3 | Generate the "Friedman #3" regression problem. |
| make_gaussian_quantiles | Generate isotropic Gaussian and label samples by quantile. |
| make_hastie_10_2 | Generate data for binary classification used in Hastie et al. 2009, Example 10.2. |
| make_low_rank_matrix | Generate a mostly low rank matrix with bell-shaped singular values. |
| make_moons | Make two interleaving half circles. |
| make_multilabel_classification | Generate a random multilabel classification problem. |
| make_regression | Generate a random regression problem. |
| make_s_curve | Generate an S curve dataset. |
| make_sparse_coded_signal | Generate a signal as a sparse combination of dictionary elements. |
| make_sparse_spd_matrix | Generate a sparse symmetric positive-definite matrix. |
| make_sparse_uncorrelated | Generate a random regression problem with sparse uncorrelated design. |
| make_spd_matrix | Generate a random symmetric, positive-definite matrix. |
| make_swiss_roll | Generate a swiss roll dataset. |
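
The generators above all follow a similar calling pattern, which the sketch below illustrates with four of them: `make_blobs`, `make_classification`, `make_moons` and `make_regression`. The sample counts, feature counts and noise levels are arbitrary values chosen for this example, not recommended defaults. Fixing `random_state` makes each call reproducible.

```python
import numpy as np
from sklearn.datasets import (
    make_blobs,
    make_classification,
    make_moons,
    make_regression,
)

# Three isotropic Gaussian clusters in 2D, for clustering demos.
X_blobs, y_blobs = make_blobs(
    n_samples=300, centers=3, n_features=2, random_state=0
)

# Binary classification problem with 4 informative features out of 10.
X_clf, y_clf = make_classification(
    n_samples=200, n_features=10, n_informative=4, n_classes=2, random_state=0
)

# Two interleaving half circles with a little Gaussian noise.
X_moons, y_moons = make_moons(n_samples=100, noise=0.1, random_state=0)

# Regression problem; coef=True also returns the true linear coefficients,
# of which only n_informative are non-zero.
X_reg, y_reg, coef = make_regression(
    n_samples=50,
    n_features=5,
    n_informative=2,
    noise=1.0,
    coef=True,
    random_state=0,
)

print(X_blobs.shape, X_clf.shape, X_moons.shape, X_reg.shape)
print(np.count_nonzero(coef))

# The same random_state yields the same data on repeated calls.
X_again, y_again = make_blobs(
    n_samples=300, centers=3, n_features=2, random_state=0
)
print(np.array_equal(X_blobs, X_again))
```

Synthetic data of this kind is mostly useful for testing and illustration, because the generating process is known and can be compared directly against what an estimator recovers.
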