`sklearn.impute`.MissingIndicator

class sklearn.impute.MissingIndicator(*, missing_values=nan, features='missing-only', sparse='auto', error_on_new=True)

Binary indicators for missing values.

Note that this component typically should not be used in a vanilla Pipeline consisting of transformers and a classifier. Instead, add it with a FeatureUnion or ColumnTransformer so that the indicator columns are appended to the other features rather than replacing them.
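As a minimal sketch of the note above, the indicator can be stacked next to an imputer with make_union (a FeatureUnion helper) and the result passed to a classifier. The toy data, the mean strategy and the DecisionTreeClassifier are illustrative choices, not something this page prescribes.

```python
import numpy as np
from sklearn.impute import MissingIndicator, SimpleImputer
from sklearn.pipeline import make_pipeline, make_union
from sklearn.tree import DecisionTreeClassifier

# Toy data (assumed for illustration): every column has one missing entry.
X = np.array([[np.nan, 1.0, 3.0],
              [4.0, 0.0, np.nan],
              [8.0, 1.0, 0.0],
              [5.0, np.nan, 2.0]])
y = np.array([0, 1, 0, 1])

# Imputed values and missingness flags are concatenated column-wise.
features = make_union(SimpleImputer(strategy="mean"), MissingIndicator())
Xt = features.fit_transform(X)
print(Xt.shape)  # 3 imputed columns followed by 3 indicator columns

# The union, not the bare indicator, is the step that feeds the classifier.
clf = make_pipeline(features, DecisionTreeClassifier(random_state=0))
clf.fit(X, y)
```

Placing MissingIndicator alone in front of the classifier would hand the model only the boolean mask and discard the original feature values.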

Read more in the User Guide.

New in version 0.20.

Parameters

missing_values : int, float, str, np.nan or None, default=np.nan

The placeholder for the missing values. All occurrences of missing_values will be flagged in the indicator mask. For pandas dataframes with nullable integer dtypes that contain missing values, missing_values should be set to np.nan, since pd.NA will be converted to np.nan.
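A short sketch of a non-NaN placeholder: here -1 is assumed to be the sentinel that marks a missing entry in an integer array. The data are made up for illustration.

```python
import numpy as np
from sklearn.impute import MissingIndicator

# Integer data where -1 (an assumed sentinel) stands for "missing".
X = np.array([[-1, 1, 3],
              [4, 0, -1],
              [8, 1, 0]])

indicator = MissingIndicator(missing_values=-1)
mask = indicator.fit_transform(X)

# Columns 0 and 2 contain the sentinel, so the mask has two columns.
print(mask)
```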

features : {'missing-only', 'all'}, default='missing-only'

Whether the imputer mask should represent all or a subset of features.

If ‘missing-only’ (default), the imputer mask will only represent features containing missing values during fit time.
If ‘all’, the imputer mask will represent all features.
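The two settings can be compared on the same input. This is a sketch on toy data in which only columns 0 and 2 contain missing values.

```python
import numpy as np
from sklearn.impute import MissingIndicator

X = np.array([[np.nan, 1, 3],
              [4, 0, np.nan],
              [8, 1, 0]])

# One indicator column per feature that had missing values at fit time.
mask_missing_only = MissingIndicator(features="missing-only").fit_transform(X)

# One indicator column per input feature, including the complete column 1.
mask_all = MissingIndicator(features="all").fit_transform(X)

print(mask_missing_only.shape, mask_all.shape)
```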

sparse : bool or 'auto', default='auto'

Whether the imputer mask format should be sparse or dense.

If ‘auto’ (default), the imputer mask will be of the same type as the input.
If True, the imputer mask will be a sparse matrix.
If False, the imputer mask will be a numpy array.
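A sketch of the output formats for a dense input. It relies on scipy.sparse.issparse only to inspect the result; the data are illustrative.

```python
import numpy as np
from scipy import sparse
from sklearn.impute import MissingIndicator

X = np.array([[np.nan, 1, 3],
              [4, 0, np.nan],
              [8, 1, 0]])

# Dense input with sparse='auto' gives a dense mask.
mask_auto = MissingIndicator(sparse="auto").fit_transform(X)

# sparse=True requests a sparse mask even though the input is dense.
mask_sparse = MissingIndicator(sparse=True).fit_transform(X)

print(type(mask_auto).__name__, sparse.issparse(mask_sparse))
```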

error_on_new : bool, default=True

If True, transform will raise an error when there are features with missing values in transform that have no missing values in fit. This is applicable only when features='missing-only'.
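The behaviour can be sketched by fitting on data where only column 0 has missing values and then transforming a row whose missing value sits in column 1. The arrays and the variable names (strict, lenient) are illustrative.

```python
import numpy as np
from sklearn.impute import MissingIndicator

# Only column 0 has a missing value at fit time.
X_fit = np.array([[np.nan, 1.0, 3.0],
                  [4.0, 0.0, 6.0],
                  [8.0, 1.0, 0.0]])
# Column 1 is missing here, which was never seen as missing during fit.
X_new = np.array([[5.0, np.nan, 2.0]])

strict = MissingIndicator(error_on_new=True).fit(X_fit)
try:
    strict.transform(X_new)
    raised = False
except ValueError:
    raised = True

# With error_on_new=False the unseen missing feature is silently ignored:
# the mask still covers only the features selected during fit.
lenient = MissingIndicator(error_on_new=False).fit(X_fit)
mask = lenient.transform(X_new)
print(raised, mask.shape)
```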

Attributes

features_ : ndarray, shape (n_missing_features,) or (n_features,)

The indices of the features that will be returned when calling transform. They are computed during fit. For features='all', it is equal to range(n_features).
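One use of features_ is mapping each indicator column back to the input column it describes. In this sketch the "missing_col" prefix is a hypothetical naming scheme, not part of the library.

```python
import numpy as np
from sklearn.impute import MissingIndicator

X = np.array([[np.nan, 1, 3],
              [4, 0, np.nan],
              [8, 1, 0]])

indicator = MissingIndicator().fit(X)
mask = indicator.transform(X)

# features_ holds the original column index behind each indicator column.
column_names = [f"missing_col{j}" for j in indicator.features_]
print(column_names)
```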

Examples

>>> import numpy as np
>>> from sklearn.impute import MissingIndicator
>>> X1 = np.array([[np.nan, 1, 3],
...                [4, 0, np.nan],
...                [8, 1, 0]])
>>> X2 = np.array([[5, 1, np.nan],
...                [np.nan, 2, 3],
...                [2, 4, 0]])
>>> indicator = MissingIndicator()
>>> indicator.fit(X1)
MissingIndicator()
>>> X2_tr = indicator.transform(X2)
>>> X2_tr
array([[False,  True],
       [ True, False],
       [False, False]])

Methods

`fit`(X[, y])	Fit the transformer on X.
`fit_transform`(X[, y])	Generate missing values indicator for X.
`get_params`([deep])	Get parameters for this estimator.
`set_params`(**params)	Set the parameters of this estimator.
`transform`(X)	Generate missing values indicator for X.

fit(X, y=None)

Fit the transformer on X.

Parameters

X : {array-like, sparse matrix}, shape (n_samples, n_features)

Input data, where n_samples is the number of samples and n_features is the number of features.

Returns

self : object

Returns self.

fit_transform(X, y=None)

Generate missing values indicator for X.

Parameters

X : {array-like, sparse matrix}, shape (n_samples, n_features)

The input data for which to generate the missing-value indicator.

Returns

Xt : {ndarray or sparse matrix}, shape (n_samples, n_features) or (n_samples, n_features_with_missing)

The missing indicator for input data. The data type of Xt will be boolean.

get_params(deep=True)

Get parameters for this estimator.

Parameters

deep : bool, default=True

If True, will return the parameters for this estimator and contained subobjects that are estimators.

Returns

params : dict

Parameter names mapped to their values.

set_params(**params)

Set the parameters of this estimator.

The method works on simple estimators as well as on nested objects (such as Pipeline). The latter have parameters of the form <component>__<parameter> so that it’s possible to update each component of a nested object.

Parameters

**params : dict

Estimator parameters.

Returns

self : estimator instance

Estimator instance.
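A sketch of both forms of set_params: directly on the indicator, and through a nested object using the <component>__<parameter> syntax. A FeatureUnion built with make_union is used as the nested object here; the step name "missingindicator" is the lowercase class name that make_union assigns automatically.

```python
from sklearn.impute import MissingIndicator, SimpleImputer
from sklearn.pipeline import make_union

# Direct update on a simple estimator; set_params returns the estimator.
indicator = MissingIndicator()
indicator.set_params(features="all", error_on_new=False)

# Nested update: address the indicator inside the union by its step name.
union = make_union(SimpleImputer(), MissingIndicator())
union.set_params(missingindicator__features="all")

print(indicator.get_params()["features"])
print(union.get_params()["missingindicator__features"])
```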

transform(X)

Generate missing values indicator for X.

Parameters

X : {array-like, sparse matrix}, shape (n_samples, n_features)

The input data for which to generate the missing-value indicator.

Returns

Xt : {ndarray or sparse matrix}, shape (n_samples, n_features) or (n_samples, n_features_with_missing)

The missing indicator for input data. The data type of Xt will be boolean.
