sklearn.datasets.fetch_covtype

sklearn.datasets.fetch_covtype(*, data_home=None, download_if_missing=True, random_state=None, shuffle=False, return_X_y=False)[source]

Load the covertype dataset (classification).

Download it if necessary.

Classes

7

Samples total

581012

Dimensionality

54

Features

int

Read more in the User Guide.

Parameters
data_homestring, optional

Specify another download and cache folder for the datasets. By default all scikit-learn data is stored in ‘~/scikit_learn_data’ subfolders.

download_if_missingboolean, default=True

If False, raise a IOError if the data is not locally available instead of trying to download the data from the source site.

random_stateint, RandomState instance, default=None

Determines random number generation for dataset shuffling. Pass an int for reproducible output across multiple function calls. See Glossary.

shufflebool, default=False

Whether to shuffle dataset.

return_X_yboolean, default=False.

If True, returns (data.data, data.target) instead of a Bunch object.

New in version 0.20.

Returns
datasetBunch

Dictionary-like object, with the following attributes.

datanumpy array of shape (581012, 54)

Each row corresponds to the 54 features in the dataset.

targetnumpy array of shape (581012,)

Each value corresponds to one of the 7 forest covertypes with values ranging between 1 to 7.

DESCRstr

Description of the forest covertype dataset.

(data, target)tuple if return_X_y is True

New in version 0.20.