make_gaussian_quantiles
- sklearn.datasets.make_gaussian_quantiles(*, mean=None, cov=1.0, n_samples=100, n_features=2, n_classes=3, shuffle=True, random_state=None)
Generate isotropic Gaussian and label samples by quantile.
This classification dataset is constructed by taking a multi-dimensional standard normal distribution and defining classes separated by nested concentric multi-dimensional spheres such that roughly equal numbers of samples are in each class (quantiles of the χ² distribution).
For an example of usage, see Plot randomly generated classification dataset.
Read more in the User Guide.
- Parameters:
- mean : array-like of shape (n_features,), default=None
The mean of the multi-dimensional normal distribution. If None then use the origin (0, 0, …).
- cov : float, default=1.0
The covariance matrix will be this value times the unit matrix. This dataset only produces symmetric normal distributions.
- n_samples : int, default=100
The total number of points equally divided among classes.
- n_features : int, default=2
The number of features for each sample.
- n_classes : int, default=3
The number of classes.
- shuffle : bool, default=True
Whether to shuffle the samples.
- random_state : int, RandomState instance or None, default=None
Determines random number generation for dataset creation. Pass an int for reproducible output across multiple function calls. See Glossary.
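As a hedged illustration of the parameters above, the following sketch draws samples with a non-default `mean` and `cov` and checks that a fixed integer `random_state` reproduces the same output. The specific values (a mean of (5, -3), `cov=4.0`, 2000 samples, 4 classes) are arbitrary choices made for this example, not values taken from the documentation.

```python
import numpy as np
from sklearn.datasets import make_gaussian_quantiles

# Arbitrary illustrative values: centre at (5, -3), variance 4 per feature.
mean = np.array([5.0, -3.0])
X, y = make_gaussian_quantiles(
    mean=mean, cov=4.0, n_samples=2000, n_features=2, n_classes=4, random_state=0
)

# The sample mean should land near `mean`, and the per-feature variance near `cov`.
print(X.mean(axis=0))
print(X.var(axis=0))

# Calling again with the same integer seed should return the same dataset.
X2, y2 = make_gaussian_quantiles(
    mean=mean, cov=4.0, n_samples=2000, n_features=2, n_classes=4, random_state=0
)
print(np.array_equal(X, X2) and np.array_equal(y, y2))
```

Because `cov` is a single float, both features share the same variance and are uncorrelated; an elliptical or correlated Gaussian cannot be requested through this function.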
- Returns:
- X : ndarray of shape (n_samples, n_features)
The generated samples.
- y : ndarray of shape (n_samples,)
The integer labels for quantile membership of each sample.
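To make the return values concrete, here is a minimal sketch that inspects the shapes of `X` and `y` and counts the samples per class. The choice of 300 samples and 3 classes is an assumption made so that the samples divide evenly among the classes.

```python
import numpy as np
from sklearn.datasets import make_gaussian_quantiles

# 300 samples over 3 classes divides evenly (an illustrative choice).
X, y = make_gaussian_quantiles(n_samples=300, n_classes=3, random_state=0)

print(X.shape, y.shape)

# Count how many samples carry each integer label 0, 1, 2.
counts = np.bincount(y)
print(counts)
```

When `n_samples` is not a multiple of `n_classes`, the class sizes can only be roughly equal, as the description above notes.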
Notes
The dataset is from Zhu et al. [1].
References
[1] J. Zhu, H. Zou, S. Rosset, T. Hastie, “Multi-class AdaBoost”, 2009.
Examples
>>> from sklearn.datasets import make_gaussian_quantiles
>>> X, y = make_gaussian_quantiles(random_state=42)
>>> X.shape
(100, 2)
>>> y.shape
(100,)
>>> list(y[:5])
[np.int64(2), np.int64(0), np.int64(1), np.int64(0), np.int64(2)]
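The nested-sphere structure described at the top of this page can be checked with a short sketch: if classes are concentric shells around the mean, the typical distance from the mean should grow with the class label. The sample count, class count, and use of the median as a summary are choices made for this illustration only.

```python
import numpy as np
from sklearn.datasets import make_gaussian_quantiles

X, y = make_gaussian_quantiles(n_samples=300, n_classes=3, random_state=0)

# Distance of each sample from the default mean (the origin).
radius = np.linalg.norm(X, axis=1)

# Median radius per class; it should increase with the class label.
median_radius = [float(np.median(radius[y == k])) for k in range(3)]
print(median_radius)
```

Because the classes are separated by radius rather than by a hyperplane, a linear classifier is not expected to separate them well, which is one reason the dataset is commonly used to exercise non-linear models.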
Gallery examples
Plot randomly generated classification dataset
Multi-class AdaBoosted Decision Trees