`sklearn.impute`.MissingIndicator

class sklearn.impute.MissingIndicator(missing_values=nan, features='missing-only', sparse='auto', error_on_new=True)

Binary indicators for missing values.

Note that this component typically should not be used as a step in a plain Pipeline of transformers followed by a classifier, because its output replaces the input features with the boolean mask. Instead, add it alongside the other transformers using a FeatureUnion or ColumnTransformer.
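As a minimal sketch of the FeatureUnion approach mentioned above, the indicator columns can be appended to imputed features rather than replacing them. The step names "imputed" and "indicators" and the toy data are assumptions chosen for illustration.

```python
import numpy as np
from sklearn.impute import MissingIndicator, SimpleImputer
from sklearn.pipeline import FeatureUnion

# Toy data (assumed for illustration): columns 0 and 2 contain NaN.
X = np.array([[np.nan, 1.0, 3.0],
              [4.0, 0.0, np.nan],
              [8.0, 1.0, 0.0]])

# Concatenate mean-imputed features with the missing-value mask.
union = FeatureUnion([
    ("imputed", SimpleImputer(strategy="mean")),
    ("indicators", MissingIndicator()),
])
Xt = union.fit_transform(X)

# Three imputed columns followed by two indicator columns.
print(Xt.shape)
```

The union output can then be passed to a downstream estimator, which sees both the filled-in values and the information about where values were missing.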

Read more in the User Guide.

Parameters

missing_values : number, string, np.nan (default) or None

The placeholder for the missing values. All occurrences of missing_values will be indicated (True in the output array); all other values will be marked as False.
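The placeholder does not have to be NaN. The sketch below assumes an integer dataset in which -1 encodes a missing entry; both the sentinel value and the data are illustrative assumptions.

```python
import numpy as np
from sklearn.impute import MissingIndicator

# Integer data where -1 (an assumed sentinel) marks a missing entry.
X_int = np.array([[-1, 1, 3],
                  [4, 0, -1],
                  [8, 1, 0]])

indicator = MissingIndicator(missing_values=-1)
mask = indicator.fit_transform(X_int)

# Only columns 0 and 2 contain the sentinel, so the mask has two columns.
print(mask)
```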

features : str, default='missing-only'

Whether the imputer mask should represent all or a subset of features.

If “missing-only” (default), the imputer mask will only represent features containing missing values during fit time.
If “all”, the imputer mask will represent all features.
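The difference between the two settings can be seen in the shape of the returned mask. This is a small sketch on assumed toy data in which only the first and last columns contain missing values.

```python
import numpy as np
from sklearn.impute import MissingIndicator

# Toy data (assumed): column 1 has no missing values.
X = np.array([[np.nan, 1.0, 3.0],
              [4.0, 0.0, np.nan],
              [8.0, 1.0, 0.0]])

# One mask column per feature that had missing values at fit time.
mask_missing_only = MissingIndicator(features="missing-only").fit_transform(X)

# One mask column per input feature, including fully observed ones.
mask_all = MissingIndicator(features="all").fit_transform(X)

print(mask_missing_only.shape, mask_all.shape)
```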

sparse : boolean or “auto”, default='auto'

Whether the imputer mask format should be sparse or dense.

If “auto” (default), the imputer mask will be of the same type as the input.
If True, the imputer mask will be a sparse matrix.
If False, the imputer mask will be a numpy array.
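The following sketch illustrates the three settings on a dense input array; the data are assumed for illustration. With a dense input, "auto" is expected to yield a dense mask, while True forces a SciPy sparse container.

```python
import numpy as np
from scipy import sparse
from sklearn.impute import MissingIndicator

# Dense toy input (assumed for illustration).
X = np.array([[np.nan, 1.0, 3.0],
              [4.0, 0.0, np.nan],
              [8.0, 1.0, 0.0]])

auto_mask = MissingIndicator(sparse="auto").fit_transform(X)   # follows the input type
dense_mask = MissingIndicator(sparse=False).fit_transform(X)   # always a numpy array
sparse_mask = MissingIndicator(sparse=True).fit_transform(X)   # always sparse

print(type(auto_mask).__name__, type(dense_mask).__name__, sparse.issparse(sparse_mask))
```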

error_on_new : boolean, default=True

If True (default), transform will raise an error when there are features with missing values in transform that have no missing values in fit. This is applicable only when features="missing-only".
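A small sketch of this behavior, using assumed toy arrays: the indicator is fitted on data with missing values in columns 0 and 2, then applied to data whose only missing value is in column 1.

```python
import numpy as np
from sklearn.impute import MissingIndicator

# Fit data (assumed): missing values in columns 0 and 2 only.
X_fit = np.array([[np.nan, 1.0, 3.0],
                  [4.0, 0.0, np.nan],
                  [8.0, 1.0, 0.0]])
# New data (assumed): a missing value appears in column 1 instead.
X_new = np.array([[5.0, np.nan, 1.0],
                  [2.0, 3.0, 4.0]])

strict = MissingIndicator(error_on_new=True).fit(X_fit)
try:
    strict.transform(X_new)
    raised = False
except ValueError:
    # Column 1 had no missing values during fit, so transform refuses.
    raised = True

# With error_on_new=False the new missing column is silently ignored
# and the mask still covers only the columns selected at fit time.
lenient = MissingIndicator(error_on_new=False).fit(X_fit)
mask = lenient.transform(X_new)
```

Disabling the check keeps the output width stable, at the cost of dropping information about missing values in columns that were fully observed during fit.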

Attributes

features_ : ndarray, shape (n_missing_features,) or (n_features,)

The feature indices that will be returned when calling transform. They are computed during fit. For features='all', it is equal to range(n_features).
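One possible use of features_ is to map indicator columns back to the input columns they describe. In the sketch below, the column names and the "_missing" suffix are hypothetical and chosen only for illustration.

```python
import numpy as np
from sklearn.impute import MissingIndicator

# Hypothetical column names for the toy data below.
feature_names = ["age", "height", "weight"]
X = np.array([[np.nan, 1.0, 3.0],
              [4.0, 0.0, np.nan],
              [8.0, 1.0, 0.0]])

indicator = MissingIndicator().fit(X)
kept = indicator.features_  # indices of the input columns represented in the mask

# Build a label for each indicator column (naming scheme is an assumption).
indicator_names = [f"{feature_names[i]}_missing" for i in kept]

# With features='all', every input column index is kept.
all_kept = MissingIndicator(features="all").fit(X).features_
```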

Examples

>>> import numpy as np
>>> from sklearn.impute import MissingIndicator
>>> X1 = np.array([[np.nan, 1, 3],
...                [4, 0, np.nan],
...                [8, 1, 0]])
>>> X2 = np.array([[5, 1, np.nan],
...                [np.nan, 2, 3],
...                [2, 4, 0]])
>>> indicator = MissingIndicator()
>>> indicator.fit(X1)
MissingIndicator()
>>> X2_tr = indicator.transform(X2)
>>> X2_tr
array([[False,  True],
       [ True, False],
       [False, False]])

Methods

`fit`(self, X[, y])	Fit the transformer on X.
`fit_transform`(self, X[, y])	Generate missing values indicator for X.
`get_params`(self[, deep])	Get parameters for this estimator.
`set_params`(self, **params)	Set the parameters of this estimator.
`transform`(self, X)	Generate missing values indicator for X.

__init__(self, missing_values=nan, features='missing-only', sparse='auto', error_on_new=True)

Initialize self. See help(type(self)) for accurate signature.

fit(self, X, y=None)

Fit the transformer on X.

Parameters

X : {array-like, sparse matrix}, shape (n_samples, n_features)

Input data, where n_samples is the number of samples and n_features is the number of features.

Returns

self : object

Returns self.

fit_transform(self, X, y=None)

Generate missing values indicator for X.

Parameters

X : {array-like, sparse matrix}, shape (n_samples, n_features)

The input data to complete.

Returns

Xt : {ndarray or sparse matrix}, shape (n_samples, n_features) or (n_samples, n_features_with_missing)

The missing indicator for input data. The data type of Xt will be boolean.

get_params(self, deep=True)

Get parameters for this estimator.

Parameters

deep : bool, default=True

If True, will return the parameters for this estimator and contained subobjects that are estimators.

Returns

params : mapping of string to any

Parameter names mapped to their values.

set_params(self, **params)

Set the parameters of this estimator.

The method works on simple estimators as well as on nested objects (such as pipelines). The latter have parameters of the form <component>__<parameter> so that it’s possible to update each component of a nested object.
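As a brief sketch of both forms, the snippet below sets a parameter directly on an indicator and then through a FeatureUnion using the <component>__<parameter> syntax. The step names "imputed" and "indicators" are assumptions for illustration.

```python
from sklearn.impute import MissingIndicator, SimpleImputer
from sklearn.pipeline import FeatureUnion

# Simple estimator: parameters are set by name.
indicator = MissingIndicator()
indicator.set_params(error_on_new=False)

# Nested object: address the component by its step name (assumed here),
# followed by a double underscore and the parameter name.
union = FeatureUnion([
    ("imputed", SimpleImputer()),
    ("indicators", MissingIndicator()),
])
union.set_params(indicators__features="all")

print(indicator.get_params()["error_on_new"])
print(union.get_params()["indicators__features"])
```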

Parameters

**params : dict

Estimator parameters.

Returns

self : object

Estimator instance.

transform(self, X)

Generate missing values indicator for X.

Parameters

X : {array-like, sparse matrix}, shape (n_samples, n_features)

The input data to complete.

Returns

Xt : {ndarray or sparse matrix}, shape (n_samples, n_features) or (n_samples, n_features_with_missing)

The missing indicator for input data. The data type of Xt will be boolean.

Examples using `sklearn.impute.MissingIndicator`

Imputing missing values before building an estimator
