`sklearn.neighbors`.KNeighborsClassifier

class sklearn.neighbors.KNeighborsClassifier(n_neighbors=5, *, weights='uniform', algorithm='auto', leaf_size=30, p=2, metric='minkowski', metric_params=None, n_jobs=None, **kwargs)

Classifier implementing the k-nearest neighbors vote.

See also

RadiusNeighborsClassifier
KNeighborsRegressor
RadiusNeighborsRegressor
NearestNeighbors

Notes

See Nearest Neighbors in the online documentation for a discussion of the choice of algorithm and leaf_size.

Warning

Regarding the Nearest Neighbors algorithms: if two neighbors, neighbor k and neighbor k+1, are at identical distances from the query point but have different labels, the result depends on the ordering of the training data.

https://en.wikipedia.org/wiki/K-nearest_neighbor_algorithm
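The tie situation described in the warning can be reproduced with a minimal sketch. The data set below is hypothetical and chosen only so that both training points lie at exactly the same distance from the query. No expected prediction is shown, because which label wins is precisely the part that depends on row order and on the search algorithm in use.

```python
import numpy as np
from sklearn.neighbors import KNeighborsClassifier

# Hypothetical data: two training points, both at distance 1.0 from the query,
# carrying different labels.
X = np.array([[0.0], [2.0]])
y = np.array([0, 1])
query = [[1.0]]

clf = KNeighborsClassifier(n_neighbors=1).fit(X, y)

# Ask for both neighbors to confirm that neighbor k and neighbor k+1 are tied.
dist, ind = clf.kneighbors(query, n_neighbors=2)
print(dist)

# Fit the same points in reversed row order and compare the two predictions.
clf_reversed = KNeighborsClassifier(n_neighbors=1).fit(X[::-1].copy(), y[::-1].copy())
pred_original = clf.predict(query)
pred_reversed = clf_reversed.predict(query)
print(pred_original, pred_reversed)
```

If the two printed predictions differ, the only thing that changed between the fits is the ordering of the training rows.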

Examples

>>> X = [[0], [1], [2], [3]]
>>> y = [0, 0, 1, 1]
>>> from sklearn.neighbors import KNeighborsClassifier
>>> neigh = KNeighborsClassifier(n_neighbors=3)
>>> neigh.fit(X, y)
KNeighborsClassifier(...)
>>> print(neigh.predict([[1.1]]))
[0]
>>> print(neigh.predict_proba([[0.9]]))
[[0.66666667 0.33333333]]

Methods

`fit`(X, y)	Fit the model using X as training data and y as target values.
`get_params`([deep])	Get parameters for this estimator.
`kneighbors`([X, n_neighbors, return_distance])	Finds the K-neighbors of a point.
`kneighbors_graph`([X, n_neighbors, mode])	Computes the (weighted) graph of k-Neighbors for points in X.
`predict`(X)	Predict the class labels for the provided data.
`predict_proba`(X)	Return probability estimates for the test data X.
`score`(X, y[, sample_weight])	Return the mean accuracy on the given test data and labels.
`set_params`(**params)	Set the parameters of this estimator.
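As a rough illustration of how `predict`, `predict_proba`, and `score` from the table above relate to one another, the sketch below reuses the toy data from the Examples section. The variable names (`labels`, `proba`, `manual_accuracy`) are illustrative only.

```python
import numpy as np
from sklearn.neighbors import KNeighborsClassifier

X = [[0], [1], [2], [3]]
y = [0, 0, 1, 1]

clf = KNeighborsClassifier(n_neighbors=3).fit(X, y)

labels = clf.predict(X)        # one class label per query row
proba = clf.predict_proba(X)   # one row per query, one column per class

# score() reports mean accuracy, i.e. the fraction of correct predictions.
accuracy = clf.score(X, y)
manual_accuracy = np.mean(labels == np.asarray(y))
print(accuracy, manual_accuracy)
```

Scoring on the training data is done here only to keep the sketch short; a held-out set would normally be used.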

__init__(n_neighbors=5, *, weights='uniform', algorithm='auto', leaf_size=30, p=2, metric='minkowski', metric_params=None, n_jobs=None, **kwargs): Initialize self. See help(type(self)) for accurate signature.

fit(X, y)

Fit the model using X as training data and y as target values.

Parameters

`X` {array-like, sparse matrix, BallTree, KDTree}: Training data. If array or matrix, shape [n_samples, n_features], or [n_samples, n_samples] if metric='precomputed'.
`y` {array-like, sparse matrix}: Target values of shape [n_samples] or [n_samples, n_outputs].
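The metric='precomputed' case mentioned for X can be sketched as follows. This is a minimal illustration, assuming Euclidean distances computed with `sklearn.metrics.pairwise_distances`; the names `D_train` and `D_query` are hypothetical. The training matrix is square, and the query matrix has one column per training sample.

```python
import numpy as np
from sklearn.metrics import pairwise_distances
from sklearn.neighbors import KNeighborsClassifier

X_train = np.array([[0.0], [1.0], [2.0], [3.0]])
y_train = np.array([0, 0, 1, 1])
X_query = np.array([[1.1]])

# Square matrix of train-to-train distances, shape (n_samples, n_samples).
D_train = pairwise_distances(X_train)
# Query-to-train distances, shape (n_queries, n_samples).
D_query = pairwise_distances(X_query, X_train)

clf_pre = KNeighborsClassifier(n_neighbors=3, metric='precomputed').fit(D_train, y_train)
pred_precomputed = clf_pre.predict(D_query)

# Reference: the same model fitted on raw features with the default metric.
clf_raw = KNeighborsClassifier(n_neighbors=3).fit(X_train, y_train)
pred_raw = clf_raw.predict(X_query)
print(pred_precomputed, pred_raw)
```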

get_params(deep=True)

Get parameters for this estimator.

Parameters

`deep` bool, default=True: If True, will return the parameters for this estimator and contained subobjects that are estimators.

Returns

`params` mapping of string to any: Parameter names mapped to their values.
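A short sketch of reading and updating parameters through `get_params` and `set_params`; the particular values chosen here are arbitrary.

```python
from sklearn.neighbors import KNeighborsClassifier

clf = KNeighborsClassifier(n_neighbors=3)

# get_params returns a dict keyed by constructor argument name.
params = clf.get_params()
print(params['n_neighbors'], params['weights'])

# set_params updates the estimator in place and returns the estimator itself,
# so calls can be chained.
returned = clf.set_params(n_neighbors=5, weights='distance')
updated = clf.get_params()
print(updated['n_neighbors'], updated['weights'])
```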

kneighbors(X=None, n_neighbors=None, return_distance=True)

Finds the K-neighbors of a point. Returns indices of and distances to the neighbors of each point.

Parameters

`X` array-like, shape (n_queries, n_features), or (n_queries, n_indexed) if metric == 'precomputed', default=None: The query point or points. If not provided, neighbors of each indexed point are returned. In this case, the query point is not considered its own neighbor.
`n_neighbors` int, default=None: Number of neighbors to get. If None, the value passed to the constructor is used.
`return_distance` bool, default=True: If False, distances are not returned.

Returns

`neigh_dist` array, shape (n_queries, n_neighbors): Distances to the neighbors of each query point. Only present if return_distance=True.
`neigh_ind` array, shape (n_queries, n_neighbors): Indices of the nearest points in the population matrix.

Examples

In the following example, we construct a NearestNeighbors instance from an array representing our data set and ask which point is closest to [1, 1, 1]:

>>> samples = [[0., 0., 0.], [0., .5, 0.], [1., 1., .5]]
>>> from sklearn.neighbors import NearestNeighbors
>>> neigh = NearestNeighbors(n_neighbors=1)
>>> neigh.fit(samples)
NearestNeighbors(n_neighbors=1)
>>> print(neigh.kneighbors([[1., 1., 1.]]))
(array([[0.5]]), array([[2]]))

As you can see, it returns [[0.5]] and [[2]], which means that the nearest element is at distance 0.5 and is the third element of samples (indices start at 0). You can also query for multiple points:

>>> X = [[0., 1., 0.], [1., 0., 1.]]
>>> neigh.kneighbors(X, return_distance=False)
array([[1],
       [2]]...)
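The difference between passing the training data as X and omitting X altogether can be sketched on the classifier itself. The data below is hypothetical and spaced so that no two candidate neighbors are tied.

```python
import numpy as np
from sklearn.neighbors import KNeighborsClassifier

X_train = [[0.0], [1.0], [3.0], [7.0]]
y_train = [0, 0, 1, 1]
clf = KNeighborsClassifier(n_neighbors=1).fit(X_train, y_train)

# Passing the training data explicitly: each point is found as its own
# nearest neighbor, at distance 0.
dist_self, ind_self = clf.kneighbors(X_train)

# Omitting X: neighbors of each indexed point are returned, and a point is
# not considered its own neighbor.
dist_other, ind_other = clf.kneighbors()

print(ind_self.ravel(), dist_self.ravel())
print(ind_other.ravel(), dist_other.ravel())
```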

kneighbors_graph(X=None, n_neighbors=None, mode='connectivity')

Computes the (weighted) graph of k-Neighbors for points in X.

Parameters

`X` array-like, shape (n_queries, n_features), or (n_queries, n_indexed) if metric == 'precomputed', default=None: The query point or points. If not provided, neighbors of each indexed point are returned. In this case, the query point is not considered its own neighbor.
`n_neighbors` int, default=None: Number of neighbors for each sample. If None, the value passed to the constructor is used.
`mode` {'connectivity', 'distance'}, default='connectivity': Type of returned matrix. 'connectivity' returns the connectivity matrix with ones and zeros; 'distance' stores on each edge the distance between the two points, measured with the estimator's metric (Euclidean under the default metric='minkowski', p=2).

Returns

`A` sparse graph in CSR format, shape = [n_queries, n_samples_fit]: n_samples_fit is the number of samples in the fitted data. A[i, j] is assigned the weight of the edge that connects i to j.
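The two values of `mode` can be compared with a minimal sketch. The data is hypothetical and chosen so that no neighbors are tied; because the training points are passed as X, each point appears as one of its own two neighbors.

```python
import numpy as np
from sklearn.neighbors import KNeighborsClassifier

X_train = [[0.0], [1.0], [3.0], [7.0]]
y_train = [0, 0, 1, 1]
clf = KNeighborsClassifier(n_neighbors=2).fit(X_train, y_train)

# Connectivity mode: A[i, j] is 1 when j is among the neighbors of i.
conn = clf.kneighbors_graph(X_train, mode='connectivity')
# Distance mode: A[i, j] holds the distance from i to its neighbor j.
dist_graph = clf.kneighbors_graph(X_train, mode='distance')

print(conn.shape, conn.format)
print(conn.toarray())
print(dist_graph.toarray())
```

In the dense view of the distance graph, the zero-distance edge from each point to itself is indistinguishable from a missing edge, so the connectivity graph is the safer one to inspect for structure.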

Examples using `sklearn.neighbors.KNeighborsClassifier`

Classifier comparison

Plot the decision boundaries of a VotingClassifier

Nearest Neighbors Classification

Caching nearest neighbors

Comparing Nearest Neighbors with and without Neighborhood Components Analysis

Dimensionality Reduction with Neighborhood Components Analysis

Digits Classification Exercise

Classification of text documents using sparse features
