.. only:: html
.. note::
:class: sphx-glr-download-link-note
Click :ref:`here ` to download the full example code or to run this example in your browser via Binder
.. rst-class:: sphx-glr-example-title
.. _sphx_glr_auto_examples_compose_plot_feature_union.py:
=================================================
Concatenating multiple feature extraction methods
=================================================
In many real-world examples, there are many ways to extract features from a
dataset. Often it is beneficial to combine several methods to obtain good
performance. This example shows how to use ``FeatureUnion`` to combine
features obtained by PCA and univariate selection.
Combining features using this transformer has the benefit that it allows
cross validation and grid searches over the whole process.
The combination used in this example is not particularly helpful on this
dataset and is only used to illustrate the usage of FeatureUnion.
.. rst-class:: sphx-glr-script-out
Out:
.. code-block:: none
Combined space has 3 features
Fitting 5 folds for each of 18 candidates, totalling 90 fits
[Parallel(n_jobs=1)]: Using backend SequentialBackend with 1 concurrent workers.
[CV] features__pca__n_components=1, features__univ_select__k=1, svm__C=0.1
[CV] features__pca__n_components=1, features__univ_select__k=1, svm__C=0.1, score=0.933, total= 0.0s
[Parallel(n_jobs=1)]: Done 1 out of 1 | elapsed: 0.0s remaining: 0.0s
[CV] features__pca__n_components=1, features__univ_select__k=1, svm__C=0.1
[CV] features__pca__n_components=1, features__univ_select__k=1, svm__C=0.1, score=0.933, total= 0.0s
[Parallel(n_jobs=1)]: Done 2 out of 2 | elapsed: 0.0s remaining: 0.0s
[CV] features__pca__n_components=1, features__univ_select__k=1, svm__C=0.1
[CV] features__pca__n_components=1, features__univ_select__k=1, svm__C=0.1, score=0.867, total= 0.0s
[Parallel(n_jobs=1)]: Done 3 out of 3 | elapsed: 0.0s remaining: 0.0s
[CV] features__pca__n_components=1, features__univ_select__k=1, svm__C=0.1
[CV] features__pca__n_components=1, features__univ_select__k=1, svm__C=0.1, score=0.933, total= 0.0s
[Parallel(n_jobs=1)]: Done 4 out of 4 | elapsed: 0.0s remaining: 0.0s
[CV] features__pca__n_components=1, features__univ_select__k=1, svm__C=0.1
[CV] features__pca__n_components=1, features__univ_select__k=1, svm__C=0.1, score=1.000, total= 0.0s
[Parallel(n_jobs=1)]: Done 5 out of 5 | elapsed: 0.0s remaining: 0.0s
[CV] features__pca__n_components=1, features__univ_select__k=1, svm__C=1
[CV] features__pca__n_components=1, features__univ_select__k=1, svm__C=1, score=0.900, total= 0.0s
[Parallel(n_jobs=1)]: Done 6 out of 6 | elapsed: 0.0s remaining: 0.0s
[CV] features__pca__n_components=1, features__univ_select__k=1, svm__C=1
[CV] features__pca__n_components=1, features__univ_select__k=1, svm__C=1, score=1.000, total= 0.0s
[Parallel(n_jobs=1)]: Done 7 out of 7 | elapsed: 0.0s remaining: 0.0s
[CV] features__pca__n_components=1, features__univ_select__k=1, svm__C=1
[CV] features__pca__n_components=1, features__univ_select__k=1, svm__C=1, score=0.867, total= 0.0s
[Parallel(n_jobs=1)]: Done 8 out of 8 | elapsed: 0.0s remaining: 0.0s
[CV] features__pca__n_components=1, features__univ_select__k=1, svm__C=1
[CV] features__pca__n_components=1, features__univ_select__k=1, svm__C=1, score=0.933, total= 0.0s
[Parallel(n_jobs=1)]: Done 9 out of 9 | elapsed: 0.0s remaining: 0.0s
[CV] features__pca__n_components=1, features__univ_select__k=1, svm__C=1
[CV] features__pca__n_components=1, features__univ_select__k=1, svm__C=1, score=1.000, total= 0.0s
[CV] features__pca__n_components=1, features__univ_select__k=1, svm__C=10
[CV] features__pca__n_components=1, features__univ_select__k=1, svm__C=10, score=0.933, total= 0.0s
[CV] features__pca__n_components=1, features__univ_select__k=1, svm__C=10
[CV] features__pca__n_components=1, features__univ_select__k=1, svm__C=10, score=1.000, total= 0.0s
[CV] features__pca__n_components=1, features__univ_select__k=1, svm__C=10
[CV] features__pca__n_components=1, features__univ_select__k=1, svm__C=10, score=0.900, total= 0.0s
[CV] features__pca__n_components=1, features__univ_select__k=1, svm__C=10
[CV] features__pca__n_components=1, features__univ_select__k=1, svm__C=10, score=0.933, total= 0.0s
[CV] features__pca__n_components=1, features__univ_select__k=1, svm__C=10
[CV] features__pca__n_components=1, features__univ_select__k=1, svm__C=10, score=1.000, total= 0.0s
[CV] features__pca__n_components=1, features__univ_select__k=2, svm__C=0.1
[CV] features__pca__n_components=1, features__univ_select__k=2, svm__C=0.1, score=0.933, total= 0.0s
[CV] features__pca__n_components=1, features__univ_select__k=2, svm__C=0.1
[CV] features__pca__n_components=1, features__univ_select__k=2, svm__C=0.1, score=0.967, total= 0.0s
[CV] features__pca__n_components=1, features__univ_select__k=2, svm__C=0.1
[CV] features__pca__n_components=1, features__univ_select__k=2, svm__C=0.1, score=0.933, total= 0.0s
[CV] features__pca__n_components=1, features__univ_select__k=2, svm__C=0.1
[CV] features__pca__n_components=1, features__univ_select__k=2, svm__C=0.1, score=0.933, total= 0.0s
[CV] features__pca__n_components=1, features__univ_select__k=2, svm__C=0.1
[CV] features__pca__n_components=1, features__univ_select__k=2, svm__C=0.1, score=1.000, total= 0.0s
[CV] features__pca__n_components=1, features__univ_select__k=2, svm__C=1
[CV] features__pca__n_components=1, features__univ_select__k=2, svm__C=1, score=0.933, total= 0.0s
[CV] features__pca__n_components=1, features__univ_select__k=2, svm__C=1
[CV] features__pca__n_components=1, features__univ_select__k=2, svm__C=1, score=0.967, total= 0.0s
[CV] features__pca__n_components=1, features__univ_select__k=2, svm__C=1
[CV] features__pca__n_components=1, features__univ_select__k=2, svm__C=1, score=0.933, total= 0.0s
[CV] features__pca__n_components=1, features__univ_select__k=2, svm__C=1
[CV] features__pca__n_components=1, features__univ_select__k=2, svm__C=1, score=0.933, total= 0.0s
[CV] features__pca__n_components=1, features__univ_select__k=2, svm__C=1
[CV] features__pca__n_components=1, features__univ_select__k=2, svm__C=1, score=1.000, total= 0.0s
[CV] features__pca__n_components=1, features__univ_select__k=2, svm__C=10
[CV] features__pca__n_components=1, features__univ_select__k=2, svm__C=10, score=0.967, total= 0.0s
[CV] features__pca__n_components=1, features__univ_select__k=2, svm__C=10
[CV] features__pca__n_components=1, features__univ_select__k=2, svm__C=10, score=0.967, total= 0.0s
[CV] features__pca__n_components=1, features__univ_select__k=2, svm__C=10
[CV] features__pca__n_components=1, features__univ_select__k=2, svm__C=10, score=0.933, total= 0.0s
[CV] features__pca__n_components=1, features__univ_select__k=2, svm__C=10
[CV] features__pca__n_components=1, features__univ_select__k=2, svm__C=10, score=0.933, total= 0.0s
[CV] features__pca__n_components=1, features__univ_select__k=2, svm__C=10
[CV] features__pca__n_components=1, features__univ_select__k=2, svm__C=10, score=1.000, total= 0.0s
[CV] features__pca__n_components=2, features__univ_select__k=1, svm__C=0.1
[CV] features__pca__n_components=2, features__univ_select__k=1, svm__C=0.1, score=0.933, total= 0.0s
[CV] features__pca__n_components=2, features__univ_select__k=1, svm__C=0.1
[CV] features__pca__n_components=2, features__univ_select__k=1, svm__C=0.1, score=1.000, total= 0.0s
[CV] features__pca__n_components=2, features__univ_select__k=1, svm__C=0.1
[CV] features__pca__n_components=2, features__univ_select__k=1, svm__C=0.1, score=0.867, total= 0.0s
[CV] features__pca__n_components=2, features__univ_select__k=1, svm__C=0.1
[CV] features__pca__n_components=2, features__univ_select__k=1, svm__C=0.1, score=0.933, total= 0.0s
[CV] features__pca__n_components=2, features__univ_select__k=1, svm__C=0.1
[CV] features__pca__n_components=2, features__univ_select__k=1, svm__C=0.1, score=1.000, total= 0.0s
[CV] features__pca__n_components=2, features__univ_select__k=1, svm__C=1
[CV] features__pca__n_components=2, features__univ_select__k=1, svm__C=1, score=0.967, total= 0.0s
[CV] features__pca__n_components=2, features__univ_select__k=1, svm__C=1
[CV] features__pca__n_components=2, features__univ_select__k=1, svm__C=1, score=1.000, total= 0.0s
[CV] features__pca__n_components=2, features__univ_select__k=1, svm__C=1
[CV] features__pca__n_components=2, features__univ_select__k=1, svm__C=1, score=0.933, total= 0.0s
[CV] features__pca__n_components=2, features__univ_select__k=1, svm__C=1
[CV] features__pca__n_components=2, features__univ_select__k=1, svm__C=1, score=0.933, total= 0.0s
[CV] features__pca__n_components=2, features__univ_select__k=1, svm__C=1
[CV] features__pca__n_components=2, features__univ_select__k=1, svm__C=1, score=1.000, total= 0.0s
[CV] features__pca__n_components=2, features__univ_select__k=1, svm__C=10
[CV] features__pca__n_components=2, features__univ_select__k=1, svm__C=10, score=0.967, total= 0.0s
[CV] features__pca__n_components=2, features__univ_select__k=1, svm__C=10
[CV] features__pca__n_components=2, features__univ_select__k=1, svm__C=10, score=0.967, total= 0.0s
[CV] features__pca__n_components=2, features__univ_select__k=1, svm__C=10
[CV] features__pca__n_components=2, features__univ_select__k=1, svm__C=10, score=0.900, total= 0.0s
[CV] features__pca__n_components=2, features__univ_select__k=1, svm__C=10
[CV] features__pca__n_components=2, features__univ_select__k=1, svm__C=10, score=0.933, total= 0.0s
[CV] features__pca__n_components=2, features__univ_select__k=1, svm__C=10
[CV] features__pca__n_components=2, features__univ_select__k=1, svm__C=10, score=1.000, total= 0.0s
[CV] features__pca__n_components=2, features__univ_select__k=2, svm__C=0.1
[CV] features__pca__n_components=2, features__univ_select__k=2, svm__C=0.1, score=0.967, total= 0.0s
[CV] features__pca__n_components=2, features__univ_select__k=2, svm__C=0.1
[CV] features__pca__n_components=2, features__univ_select__k=2, svm__C=0.1, score=1.000, total= 0.0s
[CV] features__pca__n_components=2, features__univ_select__k=2, svm__C=0.1
[CV] features__pca__n_components=2, features__univ_select__k=2, svm__C=0.1, score=0.933, total= 0.0s
[CV] features__pca__n_components=2, features__univ_select__k=2, svm__C=0.1
[CV] features__pca__n_components=2, features__univ_select__k=2, svm__C=0.1, score=0.933, total= 0.0s
[CV] features__pca__n_components=2, features__univ_select__k=2, svm__C=0.1
[CV] features__pca__n_components=2, features__univ_select__k=2, svm__C=0.1, score=1.000, total= 0.0s
[CV] features__pca__n_components=2, features__univ_select__k=2, svm__C=1
[CV] features__pca__n_components=2, features__univ_select__k=2, svm__C=1, score=0.967, total= 0.0s
[CV] features__pca__n_components=2, features__univ_select__k=2, svm__C=1
[CV] features__pca__n_components=2, features__univ_select__k=2, svm__C=1, score=1.000, total= 0.0s
[CV] features__pca__n_components=2, features__univ_select__k=2, svm__C=1
[CV] features__pca__n_components=2, features__univ_select__k=2, svm__C=1, score=0.933, total= 0.0s
[CV] features__pca__n_components=2, features__univ_select__k=2, svm__C=1
[CV] features__pca__n_components=2, features__univ_select__k=2, svm__C=1, score=0.967, total= 0.0s
[CV] features__pca__n_components=2, features__univ_select__k=2, svm__C=1
[CV] features__pca__n_components=2, features__univ_select__k=2, svm__C=1, score=1.000, total= 0.0s
[CV] features__pca__n_components=2, features__univ_select__k=2, svm__C=10
[CV] features__pca__n_components=2, features__univ_select__k=2, svm__C=10, score=0.967, total= 0.0s
[CV] features__pca__n_components=2, features__univ_select__k=2, svm__C=10
[CV] features__pca__n_components=2, features__univ_select__k=2, svm__C=10, score=1.000, total= 0.0s
[CV] features__pca__n_components=2, features__univ_select__k=2, svm__C=10
[CV] features__pca__n_components=2, features__univ_select__k=2, svm__C=10, score=0.900, total= 0.0s
[CV] features__pca__n_components=2, features__univ_select__k=2, svm__C=10
[CV] features__pca__n_components=2, features__univ_select__k=2, svm__C=10, score=0.933, total= 0.0s
[CV] features__pca__n_components=2, features__univ_select__k=2, svm__C=10
[CV] features__pca__n_components=2, features__univ_select__k=2, svm__C=10, score=1.000, total= 0.0s
[CV] features__pca__n_components=3, features__univ_select__k=1, svm__C=0.1
[CV] features__pca__n_components=3, features__univ_select__k=1, svm__C=0.1, score=0.967, total= 0.0s
[CV] features__pca__n_components=3, features__univ_select__k=1, svm__C=0.1
[CV] features__pca__n_components=3, features__univ_select__k=1, svm__C=0.1, score=1.000, total= 0.0s
[CV] features__pca__n_components=3, features__univ_select__k=1, svm__C=0.1
[CV] features__pca__n_components=3, features__univ_select__k=1, svm__C=0.1, score=0.933, total= 0.0s
[CV] features__pca__n_components=3, features__univ_select__k=1, svm__C=0.1
[CV] features__pca__n_components=3, features__univ_select__k=1, svm__C=0.1, score=0.967, total= 0.0s
[CV] features__pca__n_components=3, features__univ_select__k=1, svm__C=0.1
[CV] features__pca__n_components=3, features__univ_select__k=1, svm__C=0.1, score=1.000, total= 0.0s
[CV] features__pca__n_components=3, features__univ_select__k=1, svm__C=1
[CV] features__pca__n_components=3, features__univ_select__k=1, svm__C=1, score=0.967, total= 0.0s
[CV] features__pca__n_components=3, features__univ_select__k=1, svm__C=1
[CV] features__pca__n_components=3, features__univ_select__k=1, svm__C=1, score=1.000, total= 0.0s
[CV] features__pca__n_components=3, features__univ_select__k=1, svm__C=1
[CV] features__pca__n_components=3, features__univ_select__k=1, svm__C=1, score=0.933, total= 0.0s
[CV] features__pca__n_components=3, features__univ_select__k=1, svm__C=1
[CV] features__pca__n_components=3, features__univ_select__k=1, svm__C=1, score=0.967, total= 0.0s
[CV] features__pca__n_components=3, features__univ_select__k=1, svm__C=1
[CV] features__pca__n_components=3, features__univ_select__k=1, svm__C=1, score=1.000, total= 0.0s
[CV] features__pca__n_components=3, features__univ_select__k=1, svm__C=10
[CV] features__pca__n_components=3, features__univ_select__k=1, svm__C=10, score=1.000, total= 0.0s
[CV] features__pca__n_components=3, features__univ_select__k=1, svm__C=10
[CV] features__pca__n_components=3, features__univ_select__k=1, svm__C=10, score=1.000, total= 0.0s
[CV] features__pca__n_components=3, features__univ_select__k=1, svm__C=10
[CV] features__pca__n_components=3, features__univ_select__k=1, svm__C=10, score=0.933, total= 0.0s
[CV] features__pca__n_components=3, features__univ_select__k=1, svm__C=10
[CV] features__pca__n_components=3, features__univ_select__k=1, svm__C=10, score=0.967, total= 0.0s
[CV] features__pca__n_components=3, features__univ_select__k=1, svm__C=10
[CV] features__pca__n_components=3, features__univ_select__k=1, svm__C=10, score=1.000, total= 0.0s
[CV] features__pca__n_components=3, features__univ_select__k=2, svm__C=0.1
[CV] features__pca__n_components=3, features__univ_select__k=2, svm__C=0.1, score=0.967, total= 0.0s
[CV] features__pca__n_components=3, features__univ_select__k=2, svm__C=0.1
[CV] features__pca__n_components=3, features__univ_select__k=2, svm__C=0.1, score=1.000, total= 0.0s
[CV] features__pca__n_components=3, features__univ_select__k=2, svm__C=0.1
[CV] features__pca__n_components=3, features__univ_select__k=2, svm__C=0.1, score=0.933, total= 0.0s
[CV] features__pca__n_components=3, features__univ_select__k=2, svm__C=0.1
[CV] features__pca__n_components=3, features__univ_select__k=2, svm__C=0.1, score=0.967, total= 0.0s
[CV] features__pca__n_components=3, features__univ_select__k=2, svm__C=0.1
[CV] features__pca__n_components=3, features__univ_select__k=2, svm__C=0.1, score=1.000, total= 0.0s
[CV] features__pca__n_components=3, features__univ_select__k=2, svm__C=1
[CV] features__pca__n_components=3, features__univ_select__k=2, svm__C=1, score=0.967, total= 0.0s
[CV] features__pca__n_components=3, features__univ_select__k=2, svm__C=1
[CV] features__pca__n_components=3, features__univ_select__k=2, svm__C=1, score=1.000, total= 0.0s
[CV] features__pca__n_components=3, features__univ_select__k=2, svm__C=1
[CV] features__pca__n_components=3, features__univ_select__k=2, svm__C=1, score=0.967, total= 0.0s
[CV] features__pca__n_components=3, features__univ_select__k=2, svm__C=1
[CV] features__pca__n_components=3, features__univ_select__k=2, svm__C=1, score=0.967, total= 0.0s
[CV] features__pca__n_components=3, features__univ_select__k=2, svm__C=1
[CV] features__pca__n_components=3, features__univ_select__k=2, svm__C=1, score=1.000, total= 0.0s
[CV] features__pca__n_components=3, features__univ_select__k=2, svm__C=10
[CV] features__pca__n_components=3, features__univ_select__k=2, svm__C=10, score=1.000, total= 0.0s
[CV] features__pca__n_components=3, features__univ_select__k=2, svm__C=10
[CV] features__pca__n_components=3, features__univ_select__k=2, svm__C=10, score=1.000, total= 0.0s
[CV] features__pca__n_components=3, features__univ_select__k=2, svm__C=10
[CV] features__pca__n_components=3, features__univ_select__k=2, svm__C=10, score=0.900, total= 0.0s
[CV] features__pca__n_components=3, features__univ_select__k=2, svm__C=10
[CV] features__pca__n_components=3, features__univ_select__k=2, svm__C=10, score=0.967, total= 0.0s
[CV] features__pca__n_components=3, features__univ_select__k=2, svm__C=10
[CV] features__pca__n_components=3, features__univ_select__k=2, svm__C=10, score=1.000, total= 0.0s
[Parallel(n_jobs=1)]: Done 90 out of 90 | elapsed: 0.4s finished
Pipeline(steps=[('features',
FeatureUnion(transformer_list=[('pca', PCA(n_components=3)),
('univ_select',
SelectKBest(k=1))])),
('svm', SVC(C=10, kernel='linear'))])
|
.. code-block:: default
# Author: Andreas Mueller
#
# License: BSD 3 clause
from sklearn.pipeline import Pipeline, FeatureUnion
from sklearn.model_selection import GridSearchCV
from sklearn.svm import SVC
from sklearn.datasets import load_iris
from sklearn.decomposition import PCA
from sklearn.feature_selection import SelectKBest
iris = load_iris()
X, y = iris.data, iris.target
# This dataset is way too high-dimensional. Better do PCA:
pca = PCA(n_components=2)
# Maybe some original features where good, too?
selection = SelectKBest(k=1)
# Build estimator from PCA and Univariate selection:
combined_features = FeatureUnion([("pca", pca), ("univ_select", selection)])
# Use combined features to transform dataset:
X_features = combined_features.fit(X, y).transform(X)
print("Combined space has", X_features.shape[1], "features")
svm = SVC(kernel="linear")
# Do grid search over k, n_components and C:
pipeline = Pipeline([("features", combined_features), ("svm", svm)])
param_grid = dict(features__pca__n_components=[1, 2, 3],
features__univ_select__k=[1, 2],
svm__C=[0.1, 1, 10])
grid_search = GridSearchCV(pipeline, param_grid=param_grid, verbose=10)
grid_search.fit(X, y)
print(grid_search.best_estimator_)
.. rst-class:: sphx-glr-timing
**Total running time of the script:** ( 0 minutes 0.386 seconds)
.. _sphx_glr_download_auto_examples_compose_plot_feature_union.py:
.. only :: html
.. container:: sphx-glr-footer
:class: sphx-glr-footer-example
.. container:: binder-badge
.. image:: https://mybinder.org/badge_logo.svg
:target: https://mybinder.org/v2/gh/scikit-learn/scikit-learn/0.23.X?urlpath=lab/tree/notebooks/auto_examples/compose/plot_feature_union.ipynb
:width: 150 px
.. container:: sphx-glr-download sphx-glr-download-python
:download:`Download Python source code: plot_feature_union.py `
.. container:: sphx-glr-download sphx-glr-download-jupyter
:download:`Download Jupyter notebook: plot_feature_union.ipynb `
.. only:: html
.. rst-class:: sphx-glr-signature
`Gallery generated by Sphinx-Gallery `_