sklearn.utils#

Various utilities to help with development.

Developer guide. See the Utilities for Developers section for further details.

Bunch

Container object exposing keys as attributes.
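
As a minimal sketch of what "exposing keys as attributes" means, the snippet below builds a small Bunch and reads the same value both ways. The key names `a`, `b` and `c` are placeholders chosen for illustration.

```python
from sklearn.utils import Bunch

# Keys passed to the constructor are reachable as items and as attributes.
b = Bunch(a=1, b=2)
same_value = b["a"] == b.a

# Setting an attribute also creates the corresponding dictionary key.
b.c = 3
keys = sorted(b.keys())
```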

_safe_indexing

Return rows, items or columns of X using indices.

as_float_array

Convert an array-like to an array of floats.

assert_all_finite

Throw a ValueError if X contains NaN or infinity.

deprecated

Decorator to mark a function or class as deprecated.
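
The following sketch shows one way the decorator might be applied to a function. `old_function` and the message text are hypothetical; the warning is captured only so the example can inspect it.

```python
import warnings

from sklearn.utils import deprecated


@deprecated("Use new_function instead.")  # message text is illustrative
def old_function():
    return 42


# Calling the decorated function still returns its result, but also warns.
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    result = old_function()

warned = any(issubclass(w.category, FutureWarning) for w in caught)
```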

estimator_html_repr

Build an HTML representation of an estimator.

gen_batches

Generator to create slices containing batch_size elements from 0 to n.
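
A short illustration, assuming `n=7` and `batch_size=3` as arbitrary example values: the generator yields slice objects, and the last one is shorter when `batch_size` does not divide `n`.

```python
from sklearn.utils import gen_batches

# Split the index range 0..7 into batches of at most 3 elements.
batches = list(gen_batches(7, 3))

# Each slice can be applied directly to a sequence or array.
data = list(range(7))
chunks = [data[s] for s in batches]
```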

gen_even_slices

Generator to create n_packs evenly spaced slices going up to n.

indexable

Make arrays indexable for cross-validation.

murmurhash3_32

Compute the 32-bit murmurhash3 of key at seed.

resample

Resample arrays or sparse matrices in a consistent way.
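
"Consistent" here means that the same row indices are drawn from every array passed in. The sketch below uses toy arrays in which row `i` of `X` starts with `2 * i`, so the alignment with `y` can be checked after resampling. The sizes and seed are arbitrary.

```python
import numpy as np
from sklearn.utils import resample

X = np.arange(10).reshape(5, 2)  # row i is [2*i, 2*i + 1]
y = np.arange(5)

# Draw 8 rows with replacement (the default), using the same indices for X and y.
X_boot, y_boot = resample(X, y, n_samples=8, random_state=0)
```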

safe_mask

Return a mask which is safe to use on X.

safe_sqr

Element-wise squaring of array-likes and sparse matrices.

shuffle

Shuffle arrays or sparse matrices in a consistent way.

Input and parameter validation#

Functions to validate input and parameters within scikit-learn estimators.

check_X_y

Input validation for standard estimators.

check_array

Input validation on an array, list, sparse matrix or similar.
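
A minimal sketch of typical use, with default settings assumed throughout: a nested list is converted to a 2D array, and input containing NaN is rejected.

```python
import numpy as np
from sklearn.utils import check_array

# A nested list is converted to a 2D ndarray.
X = check_array([[1, 2], [3, 4]])

# With default settings, non-finite values raise a ValueError.
try:
    check_array([[1.0, np.nan]])
    rejected = False
except ValueError:
    rejected = True
```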

check_consistent_length

Check that all arrays have consistent first dimensions.

check_random_state

Turn seed into a np.random.RandomState instance.
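
The sketch below illustrates two of the accepted seed types: an integer produces a new RandomState seeded with that value, and an existing RandomState instance is returned unchanged. The seed values are arbitrary.

```python
import numpy as np
from sklearn.utils import check_random_state

# An int seed yields a fresh RandomState seeded with that value.
rng_from_int = check_random_state(0)
first_draw = rng_from_int.rand()

# An existing RandomState instance is passed through as-is.
existing = np.random.RandomState(42)
rng_passthrough = check_random_state(existing)
```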

check_scalar

Validate the type and value of scalar parameters.

validation.check_is_fitted

Perform is_fitted validation for estimator.
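
A small sketch using LinearRegression as an example estimator: the check raises NotFittedError before `fit` is called and passes silently afterwards. The training data is a toy example.

```python
import numpy as np
from sklearn.exceptions import NotFittedError
from sklearn.linear_model import LinearRegression
from sklearn.utils.validation import check_is_fitted

model = LinearRegression()

# Before fitting, the check raises NotFittedError.
try:
    check_is_fitted(model)
    raised_before_fit = False
except NotFittedError:
    raised_before_fit = True

# After fitting, the same call returns without raising.
model.fit(np.array([[0.0], [1.0], [2.0]]), np.array([0.0, 1.0, 2.0]))
check_is_fitted(model)
passed_after_fit = True
```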

validation.check_memory

Check that memory is joblib.Memory-like.

validation.check_symmetric

Make sure that array is 2D, square and symmetric.

validation.column_or_1d

Ravel a column or 1d numpy array, else raise an error.
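
For illustration, a column vector of shape (3, 1) is flattened to shape (3,), while a genuinely two-dimensional array is rejected. The arrays are arbitrary examples.

```python
import numpy as np
from sklearn.utils.validation import column_or_1d

# A single-column 2D array is flattened to 1D.
flat = column_or_1d(np.array([[1], [2], [3]]))

# An array with more than one column raises ValueError.
try:
    column_or_1d(np.array([[1, 2], [3, 4]]))
    rejected = False
except ValueError:
    rejected = True
```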

validation.has_fit_parameter

Check whether the estimator's fit method supports the given parameter.

Meta-estimators#

Utilities for meta-estimators.

metaestimators.available_if

An attribute that is available only if check returns a truthy value.

Weight handling based on class labels#

Utilities for handling weights based on class labels.

class_weight.compute_class_weight

Estimate class weights for unbalanced datasets.
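
As a worked example with the `"balanced"` option, each class weight is `n_samples / (n_classes * count)`. With three samples of class 0 and one of class 1, that gives 4 / (2 * 3) and 4 / (2 * 1). The labels are toy data.

```python
import numpy as np
from sklearn.utils.class_weight import compute_class_weight

y = np.array([0, 0, 0, 1])
classes = np.unique(y)

# "balanced" weights are inversely proportional to class frequencies.
weights = compute_class_weight("balanced", classes=classes, y=y)
```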

class_weight.compute_sample_weight

Estimate sample weights by class for unbalanced datasets.

Dealing with multiclass target in classifiers#

Utilities to handle multiclass/multioutput target in classifiers.

multiclass.is_multilabel

Check if y is in a multilabel format.

multiclass.type_of_target

Determine the type of data indicated by the target.
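
A few representative inputs and the target type reported for each are sketched below. The label values are arbitrary examples.

```python
import numpy as np
from sklearn.utils.multiclass import type_of_target

binary = type_of_target([0, 1, 1, 0])
multiclass = type_of_target([1, 0, 2])
continuous = type_of_target([0.1, 0.6])
multilabel = type_of_target(np.array([[0, 1], [1, 1]]))
```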

multiclass.unique_labels

Extract an ordered array of unique labels.
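
For illustration, the function accepts one or more label arrays and returns the sorted union of their values. The inputs below are arbitrary.

```python
from sklearn.utils.multiclass import unique_labels

# Duplicates within a single array are removed.
single = unique_labels([3, 5, 5, 5, 7, 7])

# Labels from several arrays are merged and sorted.
merged = unique_labels([1, 2, 10], [5, 11])
```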

Optimal mathematical operations#

Utilities to perform optimal mathematical operations in scikit-learn.

extmath.density

Compute density of a sparse vector.

extmath.fast_logdet

Compute logarithm of determinant of a square matrix.

extmath.randomized_range_finder

Compute an orthonormal matrix whose range approximates the range of A.

extmath.randomized_svd

Compute a truncated randomized SVD.
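
The sketch below builds a matrix of exact rank 2, so a truncated SVD with two components should reconstruct it up to numerical precision. The matrix sizes and seeds are assumptions made for the example.

```python
import numpy as np
from sklearn.utils.extmath import randomized_svd

rng = np.random.RandomState(0)
# Product of (20, 2) and (2, 10) factors: a 20 x 10 matrix of rank 2.
A = rng.randn(20, 2) @ rng.randn(2, 10)

# Keep only the two leading singular triplets.
U, s, Vt = randomized_svd(A, n_components=2, random_state=0)

A_approx = U @ np.diag(s) @ Vt
```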

extmath.safe_sparse_dot

Dot product that handles the sparse matrix case correctly.
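
A minimal sketch with one sparse and one dense operand, plus a sparse-sparse product converted to a dense array through `dense_output=True`. The matrices are toy values.

```python
import numpy as np
from scipy import sparse
from sklearn.utils.extmath import safe_sparse_dot

A = sparse.csr_matrix(np.array([[1, 0], [0, 2]]))
B = np.array([[1, 2], [3, 4]])

# Sparse times dense.
mixed = safe_sparse_dot(A, B)

# Sparse times sparse, with the result returned as a dense ndarray.
dense_result = safe_sparse_dot(A, A, dense_output=True)
```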

extmath.weighted_mode

Return an array of the weighted modal (most common) value in the passed array.
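
As a worked example, the value 4 occurs most often below, but the weights sum to 2.5 for 4, 3.0 for 1 and 3.5 for 2, so the weighted mode is 2. The values and weights are illustrative.

```python
from sklearn.utils.extmath import weighted_mode

x = [4, 1, 4, 2, 4, 2]
weights = [1, 3, 0.5, 1.5, 1, 2]

# Returns the modal value(s) and the total weight associated with them.
mode, score = weighted_mode(x, weights)
```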

Working with sparse matrices and arrays#

A collection of utilities to work with sparse matrices and arrays.

sparsefuncs.incr_mean_variance_axis

Compute incremental mean and variance along an axis on a CSR or CSC matrix.

sparsefuncs.inplace_column_scale

In-place column scaling of a CSC/CSR matrix.

sparsefuncs.inplace_csr_column_scale

In-place column scaling of a CSR matrix.

sparsefuncs.inplace_row_scale

In-place row scaling of a CSR or CSC matrix.

sparsefuncs.inplace_swap_column

Swap two columns of a CSC/CSR matrix in-place.

sparsefuncs.inplace_swap_row

Swap two rows of a CSC/CSR matrix in-place.

sparsefuncs.mean_variance_axis

Compute mean and variance along an axis on a CSR or CSC matrix.
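
The sketch below compares the sparse result with the equivalent dense NumPy computation. It assumes a small float64 CSR matrix and `axis=0` (per-column statistics).

```python
import numpy as np
from scipy import sparse
from sklearn.utils.sparsefuncs import mean_variance_axis

dense = np.array([[1.0, 0.0], [3.0, 4.0], [5.0, 0.0]])
X = sparse.csr_matrix(dense)

# Per-column mean and variance, computed without densifying X.
means, variances = mean_variance_axis(X, axis=0)
```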

Utilities to work with sparse matrices and arrays written in Cython.

sparsefuncs_fast.inplace_csr_row_normalize_l1

Normalize the rows of a CSR matrix or array in place by their L1 norm.

sparsefuncs_fast.inplace_csr_row_normalize_l2

Normalize the rows of a CSR matrix or array in place by their L2 norm.

Working with graphs#

Graph utilities and algorithms.

graph.single_source_shortest_path_length

Return the length of the shortest path from source to all reachable nodes.
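
For illustration, the graph below is given as an adjacency matrix in which nodes 0, 1 and 2 form a chain and node 3 is isolated. The result maps each reachable node to its distance from the source, and `cutoff` limits the search depth.

```python
import numpy as np
from sklearn.utils.graph import single_source_shortest_path_length

# Adjacency matrix: 0 - 1 - 2 form a path; node 3 has no edges.
graph = np.array([
    [0, 1, 0, 0],
    [1, 0, 1, 0],
    [0, 1, 0, 0],
    [0, 0, 0, 0],
])

lengths = single_source_shortest_path_length(graph, 0)

# Stop exploring beyond paths of length 1.
lengths_cutoff = single_source_shortest_path_length(graph, 0, cutoff=1)
```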

Random sampling#

Utilities for random sampling.

random.sample_without_replacement

Sample integers without replacement.
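
A short sketch, assuming a population of 100 and a sample of 10 as arbitrary sizes: the returned integers lie in `[0, n_population)` and contain no repeats.

```python
from sklearn.utils.random import sample_without_replacement

# Draw 10 distinct integers from range(100).
idx = sample_without_replacement(100, 10, random_state=0)

n_unique = len(set(idx.tolist()))
```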

Auxiliary functions that operate on arrays#

A small collection of auxiliary functions that operate on arrays.

arrayfuncs.min_pos

Find the minimum value of an array over positive values.
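
As a small example, negative values and zero are ignored, so the result for the float array below is the smallest strictly positive entry. The input values are arbitrary.

```python
import numpy as np
from sklearn.utils.arrayfuncs import min_pos

# Only strictly positive entries (2.5 and 1.5) are considered.
X = np.array([-1.0, 0.0, 2.5, 1.5])
smallest = min_pos(X)
```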

Metadata routing#

Utilities to route metadata within scikit-learn estimators.

User guide. See the Metadata Routing section for further details.

metadata_routing.MetadataRequest

Contains the metadata request info of a consumer.

metadata_routing.MetadataRouter

Stores and handles metadata routing for a router object.

metadata_routing.MethodMapping

Stores the mapping between caller and callee methods for a router.

metadata_routing.get_routing_for_object

Get a Metadata{Router, Request} instance from the given object.

metadata_routing.process_routing

Validate and route input parameters.

Discovering scikit-learn objects#

Utilities to discover scikit-learn objects.

discovery.all_displays

Get a list of all displays from sklearn.

API compatibility checkers#

Various utilities to check the compatibility of estimators with scikit-learn API.

estimator_checks.check_estimator

Check if estimator adheres to scikit-learn conventions.

estimator_checks.parametrize_with_checks

Pytest specific decorator for parametrizing estimator checks.

Parallel computing#

Customizations of joblib tools for scikit-learn usage.

parallel.Parallel

Tweak of joblib.Parallel that propagates the scikit-learn configuration.

parallel.delayed

Decorator used to capture the arguments of a function.