sklearn.cluster.estimate_bandwidth

sklearn.cluster.estimate_bandwidth(X, quantile=0.3, n_samples=None, random_state=0, n_jobs=1)[source]

Estimate the bandwidth to use with the mean-shift algorithm.

That this function takes time at least quadratic in n_samples. For large datasets, it’s wise to set that parameter to a small value.

Parameters:

X : array-like, shape=[n_samples, n_features]

Input points.

quantile : float, default 0.3

should be between [0, 1] 0.5 means that the median of all pairwise distances is used.

n_samples : int, optional

The number of samples to use. If not given, all samples are used.

random_state : int, RandomState instance or None, optional (default=None)

If int, random_state is the seed used by the random number generator; If RandomState instance, random_state is the random number generator; If None, the random number generator is the RandomState instance used by np.random.

n_jobs : int, optional (default = 1)

The number of parallel jobs to run for neighbors search. If -1, then the number of jobs is set to the number of CPU cores.

Returns:

bandwidth : float

The bandwidth parameter.