This is documentation for an old release of Scikit-learn (version 1.2). Try the latest stable release (version 1.6) or development (unstable) versions.
sklearn.datasets.make_blobs
- sklearn.datasets.make_blobs(n_samples=100, n_features=2, *, centers=None, cluster_std=1.0, center_box=(-10.0, 10.0), shuffle=True, random_state=None, return_centers=False)
Generate isotropic Gaussian blobs for clustering.
Read more in the User Guide.
- Parameters:
- n_samples : int or array-like, default=100
If int, it is the total number of points equally divided among clusters. If array-like, each element of the sequence indicates the number of samples per cluster.
Changed in version v0.20: one can now pass an array-like to the n_samples parameter.
- n_features : int, default=2
The number of features for each sample.
- centers : int or ndarray of shape (n_centers, n_features), default=None
The number of centers to generate, or the fixed center locations. If n_samples is an int and centers is None, 3 centers are generated. If n_samples is array-like, centers must be either None or an array of length equal to the length of n_samples.
- cluster_std : float or array-like of float, default=1.0
The standard deviation of the clusters. A single float applies to every cluster; an array-like gives one value per cluster.
- center_box : tuple of float (min, max), default=(-10.0, 10.0)
The bounding box for each cluster center when centers are generated at random.
- shuffle : bool, default=True
Shuffle the samples.
- random_state : int, RandomState instance or None, default=None
Determines random number generation for dataset creation. Pass an int for reproducible output across multiple function calls. See Glossary.
- return_centers : bool, default=False
If True, then return the centers of each cluster.
New in version 0.23.
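The parameters above can be combined. The following is a minimal sketch, assuming illustrative values that do not come from this reference (three fixed 2-D centers, unequal cluster sizes, and a separate spread per cluster):

```python
import numpy as np
from sklearn.datasets import make_blobs

# Illustrative values (assumptions, not from the reference): three fixed
# 2-D centers, unequal cluster sizes, and a different spread per cluster.
centers = np.array([[0.0, 0.0], [5.0, 5.0], [-5.0, 5.0]])

X, y = make_blobs(
    n_samples=[50, 100, 150],     # number of samples for each cluster
    centers=centers,              # fixed center locations, one row per cluster
    cluster_std=[0.5, 1.0, 2.0],  # one standard deviation per cluster
    shuffle=True,
    random_state=42,              # int seed for reproducible output
)

# Count how many samples carry each integer label.
counts = np.bincount(y)
print(X.shape, counts)
```

Because n_samples is array-like here, centers has one row per entry of n_samples, as the centers description requires.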
- Returns:
- X : ndarray of shape (n_samples, n_features)
The generated samples.
- y : ndarray of shape (n_samples,)
The integer labels for cluster membership of each sample.
- centers : ndarray of shape (n_centers, n_features)
The centers of each cluster. Only returned if return_centers=True.
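As a rough illustration of the return values, the sketch below requests the centers as a third output. The sample count, feature count, and center_box bounds are arbitrary choices made for this example:

```python
import numpy as np
from sklearn.datasets import make_blobs

# With return_centers=True the function returns a third array: the centers.
X, y, centers = make_blobs(
    n_samples=600,
    n_features=3,
    centers=4,               # four centers generated at random
    center_box=(-5.0, 5.0),  # random centers are drawn inside this box
    return_centers=True,
    random_state=0,
)

# Each cluster's sample mean should sit close to its returned center.
sample_means = np.array([X[y == k].mean(axis=0) for k in range(len(centers))])
max_offset = np.abs(sample_means - centers).max()
print(centers.shape, max_offset)
```

Comparing the per-cluster means with the returned centers is a quick sanity check that label k in y corresponds to row k of centers.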
See also
make_classification : A more intricate variant.
Examples
>>> from sklearn.datasets import make_blobs
>>> X, y = make_blobs(n_samples=10, centers=3, n_features=2,
...                   random_state=0)
>>> print(X.shape)
(10, 2)
>>> y
array([0, 0, 1, 0, 2, 2, 2, 1, 1, 0])
>>> X, y = make_blobs(n_samples=[3, 3, 4], centers=None, n_features=2,
...                   random_state=0)
>>> print(X.shape)
(10, 2)
>>> y
array([0, 1, 2, 0, 2, 2, 2, 1, 1, 0])
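Two points from the parameter descriptions can be checked with a short sketch: passing the same int as random_state should reproduce the data, and an integer n_samples that does not divide evenly among the centers is split as evenly as possible. The variable names below are chosen for this example only:

```python
import numpy as np
from sklearn.datasets import make_blobs

# Same integer seed twice: the generated data should match exactly.
X1, y1 = make_blobs(n_samples=10, centers=3, n_features=2, random_state=0)
X2, y2 = make_blobs(n_samples=10, centers=3, n_features=2, random_state=0)
same_data = np.array_equal(X1, X2) and np.array_equal(y1, y2)

# 10 samples cannot be split evenly across 3 clusters; count each label
# to see how the samples were distributed.
counts = np.bincount(y1, minlength=3)
print(same_data, counts)
```

The counts agree with the labels printed by the first call in the example above, where one cluster receives four samples and the other two receive three each.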
Examples using sklearn.datasets.make_blobs
- Probability Calibration for 3-class classification
- Normal, Ledoit-Wolf and OAS Linear Discriminant Analysis for classification
- Bisecting K-Means and Regular K-Means Performance Comparison
- Comparing different clustering algorithms on toy datasets
- Comparing different hierarchical linkage methods on toy datasets
- Comparison of the K-Means and MiniBatchKMeans clustering algorithms
- Selecting the number of clusters with silhouette analysis on KMeans clustering
- Plot multinomial and One-vs-Rest Logistic Regression
- Comparing anomaly detection algorithms for outlier detection on toy datasets
- Demonstrating the different strategies of KBinsDiscretizer