validate_data
- sklearn.utils.validation.validate_data(_estimator, /, X='no_validation', y='no_validation', reset=True, validate_separately=False, skip_check_array=False, **check_params)
Validate input data and set or check feature names and counts of the input.
This helper function should be used in an estimator that requires input validation. It mutates the estimator and sets the n_features_in_ and feature_names_in_ attributes if reset=True.
Added in version 1.6.
- Parameters:
- _estimator : estimator instance
The estimator to validate the input for.
- X : {array-like, sparse matrix, dataframe} of shape (n_samples, n_features), default='no_validation'
The input samples. If 'no_validation', no validation is performed on X. This is useful for meta-estimators, which can delegate input validation to their underlying estimator(s). In that case y must be passed and the only accepted check_params are multi_output and y_numeric.
- y : array-like of shape (n_samples,), default='no_validation'
The targets.
If None, check_array is called on X. If the estimator's requires_y tag is True, then an error will be raised.
If 'no_validation', check_array is called on X and the estimator's requires_y tag is ignored. This is a default placeholder and is never meant to be explicitly set. In that case X must be passed.
Otherwise, only y is checked with _check_y, or both X and y are checked with either check_array or check_X_y, depending on validate_separately.
- reset : bool, default=True
Whether to reset the n_features_in_ attribute. If False, the input will be checked for consistency with data provided when reset was last True.
Note: It is recommended to pass reset=True in fit and in the first call to partial_fit. All other methods that validate X should set reset=False.
- validate_separately : False or tuple of dicts, default=False
Only used if y is not None. If False, call check_X_y. Else, it must be a tuple of kwargs to be used for calling check_array on X and y respectively.
estimator=self is automatically added to these dicts to generate more informative error messages in case of invalid input data.
- skip_check_array : bool, default=False
If True, X and y are unchanged and only feature_names_in_ and n_features_in_ are checked. Otherwise, check_array is called on X and y.
- **check_params : kwargs
Parameters passed to check_array or check_X_y. Ignored if validate_separately is not False.
estimator=self is automatically added to these params to generate more informative error messages in case of invalid input data.
- Returns:
- out : {ndarray, sparse matrix} or tuple of these
The validated input. A tuple is returned if both X and y are validated.
Gallery examples
Release Highlights for scikit-learn 1.6