sklearn.datasets.fetch_mldata
sklearn.datasets.fetch_mldata(dataname, target_name='label', data_name='data', transpose_data=True, data_home=None)
DEPRECATED: fetch_mldata was deprecated in version 0.20 and will be removed in version 0.22. Please use fetch_openml instead.
Fetch an mldata.org data set.
mldata.org is no longer operational.
If the file does not exist locally yet, it is downloaded from mldata.org.
mldata.org does not have an enforced convention for storing data or naming the columns in a data set. The default behavior of this function works well with the most common cases:
- data values are stored in the column ‘data’, and target values in the column ‘label’
- alternatively, the first column stores target values, and the second data values
- the data array is stored as n_features x n_samples, and thus needs to be transposed to match the sklearn standard
Keyword arguments allow adapting these defaults to specific data sets (see the parameters target_name, data_name, and transpose_data, and the examples below).
mldata.org data sets may have multiple columns, which are stored in the Bunch object under their original names.
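The default conventions described above can be sketched in plain Python. This only illustrates the documented behavior and is not scikit-learn's implementation; the helper name `pick_columns` and the list-of-pairs input format are assumptions made for this example.

```python
def pick_columns(columns, target_name="label", data_name="data",
                 transpose_data=True):
    """Illustrative only: select data and target as the text describes.

    `columns` is a hypothetical list of (name, values) pairs in file order.
    """
    by_name = dict(columns)

    def resolve(key, fallback_index):
        # A column may be addressed by integer index or by name.
        if isinstance(key, int):
            return columns[key][1]
        if key in by_name:
            return by_name[key]
        # Fallback convention: first column is the target, second the data.
        return columns[fallback_index][1]

    target = resolve(target_name, 0)
    data = resolve(data_name, 1)
    if transpose_data:
        # n_features x n_samples -> n_samples x n_features
        data = [list(row) for row in zip(*data)]
    return data, target


# Columns without the default names: 2 features x 3 samples.
raw = [("y", [0, 1, 1]), ("x", [[1, 2, 3], [4, 5, 6]])]
data, target = pick_columns(raw)
print(data)    # [[1, 4], [2, 5], [3, 6]]
print(target)  # [0, 1, 1]
```

Passing `transpose_data=False` in this sketch returns the data array in its stored orientation.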
Deprecated since version 0.20: Will be removed in version 0.22
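Because the deprecation notice points to fetch_openml, a migration could look like the sketch below. It assumes scikit-learn 0.20 or later and network access to OpenML when the function is called. The wrapper name `fetch_via_openml` is hypothetical, and the data set name "mnist_784" is an OpenML name chosen for illustration, not one taken from this page.

```python
def fetch_via_openml(name="mnist_784", version=1, data_home=None):
    """Hypothetical wrapper around fetch_openml, the suggested replacement."""
    # Imported lazily so that defining this function needs neither
    # scikit-learn nor a network connection.
    from sklearn.datasets import fetch_openml

    # Returns a Bunch; its `data` and `target` attributes hold the samples
    # and labels. Calling this downloads the data set on first use.
    return fetch_openml(name=name, version=version, data_home=data_home)
```

Data set names on OpenML generally differ from the names used on mldata.org, so the `name` argument has to be looked up on OpenML rather than carried over.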
Parameters:
- dataname : str
Name of the data set on mldata.org, e.g. “leukemia” or “Whistler Daily Snowfall”. The raw name is automatically converted to an mldata.org URL.
- target_name : optional, default: ‘label’
Name or index of the column containing the target values.
- data_name : optional, default: ‘data’
Name or index of the column containing the data.
- transpose_data : optional, default: True
If True, transpose the downloaded data array.
- data_home : optional, default: None
Specify another download and cache folder for the data sets. By default all scikit-learn data is stored in ‘~/scikit_learn_data’ subfolders.
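To make the `dataname` and `data_home` parameters more concrete, the sketch below shows one way a raw name could be mapped to a location in the cache folder. The slug rules (lowercase, spaces to hyphens, parentheses and dots dropped), the "mldata" subfolder, and the ".mat" extension are assumptions for illustration; this page does not specify them. Both helper names are hypothetical.

```python
import os
import re


def slugify(dataname):
    # Assumed rule: lowercase, spaces -> hyphens, drop parentheses and dots.
    slug = dataname.lower().replace(" ", "-")
    return re.sub(r"[().]", "", slug)


def cache_path(dataname, data_home=None):
    # Default cache root named in the data_home description above.
    if data_home is None:
        data_home = os.path.join("~", "scikit_learn_data")
    root = os.path.expanduser(data_home)
    # The "mldata" subfolder and ".mat" extension are assumptions.
    return os.path.join(root, "mldata", slugify(dataname) + ".mat")


print(slugify("Whistler Daily Snowfall"))  # whistler-daily-snowfall
```

Passing an explicit `data_home` in this sketch simply swaps the cache root, which mirrors the purpose the parameter description gives.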
Returns:
- data : Bunch
Dictionary-like object. The interesting attributes are: ‘data’, the data to learn; ‘target’, the classification labels; ‘DESCR’, the full description of the dataset; and ‘COL_NAMES’, the original names of the dataset columns.
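"Dictionary-like" here means that the listed entries can be read either as keys or as attributes. The class below is a minimal stand-in written for illustration, not scikit-learn's own Bunch; the name `MiniBunch` and the toy values are assumptions.

```python
class MiniBunch(dict):
    """Hypothetical stand-in: a dict whose keys also read as attributes."""

    def __getattr__(self, key):
        try:
            return self[key]
        except KeyError:
            raise AttributeError(key)


dataset = MiniBunch(
    data=[[5.1, 3.5], [4.9, 3.0]],       # toy values, n_samples x n_features
    target=[0, 1],
    DESCR="toy data set for illustration",
    COL_NAMES=["label", "data"],
)

print(dataset.target)        # [0, 1]
print(dataset["COL_NAMES"])  # ['label', 'data']
```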