Loading [Contrib]/a11y/accessibility-menu.js

Fork me on GitHub

Previous
sklearn.prepr... sklearn.preprocessing.robust_scale

Next
sklearn.prepr... sklearn.preprocessing.power_transform

Up
API Reference API Reference

scikit-learn v0.20.4
Other versions

Please cite us if you use the software.

sklearn.preprocessing.scale
- Examples using sklearn.preprocessing.scale

This is documentation for an old release of Scikit-learn (version 0.20). Try the latest stable release (version 1.6) or development (unstable) versions.

`sklearn.preprocessing`.scale¶

sklearn.preprocessing.scale(X, axis=0, with_mean=True, with_std=True, copy=True)[source]¶

Standardize a dataset along any axis

Center to the mean and component wise scale to unit variance.

Read more in the User Guide.

Parameters:

X : {array-like, sparse matrix}: The data to center and scale.
axis : int (0 by default): axis used to compute the means and standard deviations along. If 0, independently standardize each feature, otherwise (if 1) standardize each sample.
with_mean : boolean, True by default: If True, center the data before scaling.
with_std : boolean, True by default: If True, scale the data to unit variance (or equivalently, unit standard deviation).
copy : boolean, optional, default True: set to False to perform inplace row normalization and avoid a copy (if the input is already a numpy array or a scipy.sparse CSC matrix and if axis is 1).

See also

StandardScaler: Performs scaling to unit variance using the``Transformer`` API (e.g. as part of a preprocessing sklearn.pipeline.Pipeline).

Notes

This implementation will refuse to center scipy.sparse matrices since it would make them non-sparse and would potentially crash the program with memory exhaustion problems.

Instead the caller is expected to either set explicitly with_mean=False (in that case, only variance scaling will be performed on the features of the CSC matrix) or to call X.toarray() if he/she expects the materialized dense array to fit in memory.

To avoid memory copy the caller should pass a CSC matrix.

NaNs are treated as missing values: disregarded to compute the statistics, and maintained during the data transformation.

We use a biased estimator for the standard deviation, equivalent to numpy.std(x, ddof=0). Note that the choice of ddof is unlikely to affect model performance.

For a comparison of the different scalers, transformers, and normalizers, see examples/preprocessing/plot_all_scaling.py.

Examples using `sklearn.preprocessing.scale`¶

../../_images/sphx_glr_plot_kmeans_digits_thumb.png

A demo of K-Means clustering on the handwritten digits data

Next