# 4.4. Unsupervised dimensionality reduction

If your number of features is high, it may be useful to reduce it with an
unsupervised step prior to supervised steps. Many of the
*Unsupervised learning* methods implement a `transform` method that
can be used to reduce the dimensionality. Below we discuss three specific
examples of this pattern that are heavily used.

**Pipelining**

The unsupervised data reduction and the supervised estimator can be
chained in one step. See *Pipeline: chaining estimators*.
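
As a minimal sketch of this chaining, the example below puts a PCA reduction in front of a classifier. The digits dataset, the choice of `LogisticRegression`, the step names, and `n_components=16` are illustrative assumptions, not recommendations from this guide.

```python
from sklearn.datasets import load_digits
from sklearn.decomposition import PCA
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import Pipeline

# Illustrative data: 1797 samples with 64 features each
X, y = load_digits(return_X_y=True)

# Step names ("reduce", "classify") are arbitrary labels chosen for this sketch
pipe = Pipeline([
    ("reduce", PCA(n_components=16)),                 # unsupervised reduction
    ("classify", LogisticRegression(max_iter=1000)),  # supervised estimator
])

# A single fit call fits the PCA step, transforms X, then fits the classifier
pipe.fit(X, y)

# The fitted reduction step can still be used on its own
X_reduced = pipe.named_steps["reduce"].transform(X)
print(X.shape, "->", X_reduced.shape)
```

Because the reduction is part of the pipeline, it is refit on each training split when the pipeline is passed to cross-validation utilities.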

## 4.4.1. PCA: principal component analysis

`decomposition.PCA` looks for combinations of features that
capture the variance of the original features well. See *Decomposing signals in components (matrix factorization problems)*.
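
The following sketch illustrates the idea on synthetic data. The data (five features generated from two latent factors plus small noise) and all variable names are assumptions made up for this example.

```python
import numpy as np
from sklearn.decomposition import PCA

rng = np.random.RandomState(0)

# Synthetic data (assumption): 5 observed features driven by 2 latent factors
latent = rng.normal(size=(200, 2))
mixing = rng.normal(size=(2, 5))
X = latent @ mixing + 0.01 * rng.normal(size=(200, 5))

# Keep the two directions of largest variance
pca = PCA(n_components=2)
X_reduced = pca.fit_transform(X)

print(X_reduced.shape)
# Fraction of the total variance retained by the two components
print(pca.explained_variance_ratio_.sum())
```

Since the data here are close to two-dimensional by construction, two components should retain almost all of the variance. On real data, `explained_variance_ratio_` is one way to judge how many components to keep.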

## 4.4.2. Random projections

The `random_projection` module provides several tools for data
reduction by random projections. See the relevant section of the
documentation: *Random Projection*.
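
As a hedged illustration, the sketch below uses `GaussianRandomProjection`, one of the transformers in that module, on random data. The data shape and `n_components=500` are arbitrary choices for the example.

```python
import numpy as np
from sklearn.random_projection import GaussianRandomProjection

rng = np.random.RandomState(0)

# Illustrative high-dimensional data: 100 samples, 10000 features
X = rng.rand(100, 10000)

# Project onto 500 random directions; random_state fixes the projection matrix
transformer = GaussianRandomProjection(n_components=500, random_state=0)
X_new = transformer.fit_transform(X)

print(X.shape, "->", X_new.shape)
```

The projection matrix is drawn at random rather than learned from the data, so fitting only needs the shape of `X`.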

## 4.4.3. Feature agglomeration

`cluster.FeatureAgglomeration` applies
*Hierarchical clustering* to group together features that behave
similarly.
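
A minimal sketch of this transformer, assuming the digits dataset and an arbitrary target of 16 feature clusters:

```python
from sklearn.cluster import FeatureAgglomeration
from sklearn.datasets import load_digits

# Illustrative data: 64 pixel features per sample
X, _ = load_digits(return_X_y=True)

# Merge the 64 features into 16 groups of similarly behaving features
agglo = FeatureAgglomeration(n_clusters=16)
X_reduced = agglo.fit_transform(X)

print(X_reduced.shape)
# labels_ gives the cluster index assigned to each original feature
print(agglo.labels_.shape)
```

Each output column pools the original features assigned to one cluster (by default, their mean).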

**Feature scaling**

Note that if features have very different scaling or statistical
properties, `cluster.FeatureAgglomeration` may not be able to
capture the links between related features. Using a
`preprocessing.StandardScaler` can be useful in these settings.
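
The effect of scaling can be sketched on synthetic data. In this made-up example, the first two features follow the same underlying signal but on very different scales, and the other two are independent noise. The data and variable names are assumptions for illustration only.

```python
import numpy as np
from sklearn.cluster import FeatureAgglomeration
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

rng = np.random.RandomState(0)
base = rng.normal(size=(300, 1))

# Features 0 and 1 are related (both follow `base`) but differ in scale by 1000x;
# features 2 and 3 are unrelated noise on the same two scales
X = np.hstack([
    base + 0.05 * rng.normal(size=(300, 1)),
    1000.0 * base + 50.0 * rng.normal(size=(300, 1)),
    rng.normal(size=(300, 1)),
    1000.0 * rng.normal(size=(300, 1)),
])

# Without scaling: grouping is driven by feature magnitude
raw_agglo = FeatureAgglomeration(n_clusters=3).fit(X)

# With scaling: features are standardized before being clustered
agglo = FeatureAgglomeration(n_clusters=3)
scaled_pipe = make_pipeline(StandardScaler(), agglo)
X_reduced = scaled_pipe.fit_transform(X)

print("raw labels:   ", raw_agglo.labels_)
print("scaled labels:", agglo.labels_)
```

Comparing the two label arrays shows whether the related features 0 and 1 end up in the same cluster with and without standardization.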