sklearn.cluster.cluster_optics_xi

sklearn.cluster.cluster_optics_xi(*, reachability, predecessor, ordering, min_samples, min_cluster_size=None, xi=0.05, predecessor_correction=True)[source]

Automatically extract clusters according to the Xi-steep method.

Parameters:
reachabilityndarray of shape (n_samples,)

Reachability distances calculated by OPTICS (reachability_).

predecessorndarray of shape (n_samples,)

Predecessors calculated by OPTICS.

orderingndarray of shape (n_samples,)

OPTICS ordered point indices (ordering_).

min_samplesint > 1 or float between 0 and 1

The same as the min_samples given to OPTICS. Up and down steep regions can’t have more then min_samples consecutive non-steep points. Expressed as an absolute number or a fraction of the number of samples (rounded to be at least 2).

min_cluster_sizeint > 1 or float between 0 and 1, default=None

Minimum number of samples in an OPTICS cluster, expressed as an absolute number or a fraction of the number of samples (rounded to be at least 2). If None, the value of min_samples is used instead.

xifloat between 0 and 1, default=0.05

Determines the minimum steepness on the reachability plot that constitutes a cluster boundary. For example, an upwards point in the reachability plot is defined by the ratio from one point to its successor being at most 1-xi.

predecessor_correctionbool, default=True

Correct clusters based on the calculated predecessors.

Returns:
labelsndarray of shape (n_samples,)

The labels assigned to samples. Points which are not included in any cluster are labeled as -1.

clustersndarray of shape (n_clusters, 2)

The list of clusters in the form of [start, end] in each row, with all indices inclusive. The clusters are ordered according to (end, -start) (ascending) so that larger clusters encompassing smaller clusters come after such nested smaller clusters. Since labels does not reflect the hierarchy, usually len(clusters) > np.unique(labels).

Examples

>>> import numpy as np
>>> from sklearn.cluster import cluster_optics_xi, compute_optics_graph
>>> X = np.array([[1, 2], [2, 5], [3, 6],
...               [8, 7], [8, 8], [7, 3]])
>>> ordering, core_distances, reachability, predecessor = compute_optics_graph(
...     X,
...     min_samples=2,
...     max_eps=np.inf,
...     metric="minkowski",
...     p=2,
...     metric_params=None,
...     algorithm="auto",
...     leaf_size=30,
...     n_jobs=None
... )
>>> min_samples = 2
>>> labels, clusters = cluster_optics_xi(
...     reachability=reachability,
...     predecessor=predecessor,
...     ordering=ordering,
...     min_samples=min_samples,
... )
>>> labels
array([0, 0, 0, 1, 1, 1])
>>> clusters
array([[0, 2],
       [3, 5],
       [0, 5]])