

sklearn.cluster.mean_shift(X, bandwidth=None, seeds=None, bin_seeding=False, min_bin_freq=1, cluster_all=True, max_iterations=300)

Perform mean shift clustering of data using a flat kernel.

Seed using a binning technique for scalability.
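As a rough illustration of what a flat kernel means here, the following pure-Python sketch shifts a single seed toward a local density mode. The helper names `euclidean` and `shift_seed`, the tolerance `tol`, and the sample points are assumptions made for this example; this is not scikit-learn's code, and the library's implementation may differ in its details.

```python
import math


def euclidean(p, q):
    """Euclidean distance between two equal-length coordinate sequences."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))


def shift_seed(seed, points, bandwidth, max_iterations=300, tol=1e-3):
    """Shift one seed toward a mode (illustrative helper, not library code)."""
    center = tuple(seed)
    for _ in range(max_iterations):
        # Flat kernel: every point within `bandwidth` of the current center
        # contributes with equal weight; points farther away contribute nothing.
        neighbors = [p for p in points if euclidean(p, center) <= bandwidth]
        if not neighbors:
            break  # empty window, the seed cannot move
        new_center = tuple(sum(axis) / len(neighbors) for axis in zip(*neighbors))
        moved = euclidean(new_center, center)
        center = new_center
        if moved < tol * bandwidth:
            break  # the shift is negligible, treat as converged
    return center


# Made-up sample data: two well-separated groups of 2-D points.
points = [(1.0, 1.0), (1.2, 0.9), (0.9, 1.1), (8.0, 8.0), (8.1, 7.9), (7.9, 8.2)]
mode = shift_seed((1.0, 1.0), points, bandwidth=2.0)
print(mode)
```

Starting from a point in the first group, the seed settles on the mean of that group, because the second group lies outside the bandwidth window and so has no influence.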

Parameters :

X : array-like, shape=[n_samples, n_features]

Input data.

bandwidth : float, optional

Kernel bandwidth. If bandwidth is not defined, it is set using a heuristic given by the median of all pairwise distances.

seeds : array [n_seeds, n_features], optional

Points used as initial kernel locations.

bin_seeding : boolean, optional

If True, initial kernel locations are not the locations of all points but the locations of a discretized version of the points, where points are binned onto a grid whose coarseness corresponds to the bandwidth. Setting this option to True speeds up the algorithm because fewer seeds are initialized. Default: False. Ignored if the seeds argument is not None.

min_bin_freq : int, optional

To speed up the algorithm, accept only those bins with at least min_bin_freq points as seeds. If not defined, set to 1.
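The interaction between bin_seeding and min_bin_freq can be sketched in a few lines of pure Python. The function name `bin_seeds` and the sample points are hypothetical and exist only for this illustration; the sketch assumes each point is rounded to the nearest node of a grid with spacing equal to the bandwidth, which may not match the library's implementation exactly.

```python
from collections import Counter


def bin_seeds(points, bandwidth, min_bin_freq=1):
    """Return grid-aligned seeds (illustrative helper, not library code)."""
    # Assign every point to the nearest grid node; grid spacing == bandwidth.
    counts = Counter(
        tuple(round(coord / bandwidth) for coord in point) for point in points
    )
    # Keep only bins that hold at least `min_bin_freq` points, then map the
    # integer bin indices back to coordinates in the original space.
    return [
        tuple(index * bandwidth for index in bin_index)
        for bin_index, freq in counts.items()
        if freq >= min_bin_freq
    ]


# Made-up sample data: two small groups plus one isolated point.
points = [
    (0.1, 0.2), (-0.2, 0.1), (0.3, -0.1),
    (4.1, 3.9), (3.8, 4.2),
    (10.1, 9.8),
]
seeds = bin_seeds(points, bandwidth=2.0, min_bin_freq=2)
print(seeds)
```

With min_bin_freq=2 the isolated point's bin is dropped, so six input points yield only two seeds. With the default of 1, every occupied bin becomes a seed.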

Returns :

cluster_centers : array [n_clusters, n_features]

Coordinates of cluster centers.

labels : array [n_samples]

Cluster labels for each point.
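A minimal usage sketch, assuming scikit-learn and NumPy are installed. The data and the bandwidth value are made up for illustration, and only the `X` and `bandwidth` arguments from the signature above are passed.

```python
import numpy as np
from sklearn.cluster import mean_shift

# Made-up sample data: two well-separated groups of 2-D points.
X = np.array([
    [1.0, 1.0], [1.2, 0.9], [0.9, 1.1],
    [8.0, 8.0], [8.1, 7.9], [7.9, 8.2],
])

# Pass the bandwidth explicitly instead of relying on the heuristic.
cluster_centers, labels = mean_shift(X, bandwidth=2.0)

print(cluster_centers.shape)  # (n_clusters, n_features)
print(labels)                 # one integer label per row of X
```

Points that share a label belong to the same cluster, and each label indexes a row of cluster_centers.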


See examples/cluster/ for an example.