.. DO NOT EDIT.
.. THIS FILE WAS AUTOMATICALLY GENERATED BY SPHINX-GALLERY.
.. TO MAKE CHANGES, EDIT THE SOURCE PYTHON FILE:
.. "auto_examples/ensemble/plot_feature_transformation.py"
.. LINE NUMBERS ARE GIVEN BELOW.

.. only:: html

    .. note::
        :class: sphx-glr-download-link-note

        Click :ref:`here <sphx_glr_download_auto_examples_ensemble_plot_feature_transformation.py>`
        to download the full example code or to run this example in your browser via Binder

.. rst-class:: sphx-glr-example-title

.. _sphx_glr_auto_examples_ensemble_plot_feature_transformation.py:


===============================================
Feature transformations with ensembles of trees
===============================================

Transform your features into a higher dimensional, sparse space. Then train a
linear model on these features.

First fit an ensemble of trees (totally random trees, a random forest, or
gradient boosted trees) on the training set. Then each leaf of each tree in the
ensemble is assigned a fixed arbitrary feature index in a new feature space.
These leaf indices are then encoded in a one-hot fashion.

Each sample goes through the decisions of each tree of the ensemble and ends up
in one leaf per tree. The sample is encoded by setting feature values for these
leaves to 1 and the other feature values to 0.

The resulting transformer has then learned a supervised, sparse,
high-dimensional categorical embedding of the data.
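
The encoding described above can be sketched on a small dataset. This is a
minimal illustration rather than the code used in this example: the names
`X_demo`, `y_demo`, `forest` and `encoder`, as well as the dataset size and
tree settings, are assumptions chosen to keep the output easy to inspect.

.. code-block:: python

    import numpy as np
    from sklearn.datasets import make_classification
    from sklearn.ensemble import RandomForestClassifier
    from sklearn.preprocessing import OneHotEncoder

    # Small illustrative dataset; sizes and seeds are arbitrary choices.
    X_demo, y_demo = make_classification(n_samples=200, random_state=0)

    forest = RandomForestClassifier(
        n_estimators=3, max_depth=2, random_state=0)
    forest.fit(X_demo, y_demo)

    # For every sample, `apply` returns the index of the leaf reached in
    # each tree, so the result has one column per tree.
    leaves = forest.apply(X_demo)

    # One-hot encode the leaf indices: one output column per visited leaf.
    encoder = OneHotEncoder()
    encoded = encoder.fit_transform(leaves)

    # Each sample ends up in exactly one leaf per tree, so every row of the
    # encoded matrix contains as many ones as there are trees.
    row_sums = np.asarray(encoded.sum(axis=1)).ravel()
    print(leaves.shape, encoded.shape, row_sums[:5])

With trees of depth 2, each tree has at most 4 leaves, so the encoded matrix
here has at most 12 columns, most of them zero for any given sample.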

.. GENERATED FROM PYTHON SOURCE LINES 21-31

.. code-block:: default


    # Author: Tim Head <betatim@gmail.com>
    #
    # License: BSD 3 clause

    print(__doc__)

    from sklearn import set_config
    set_config(display='diagram')


.. GENERATED FROM PYTHON SOURCE LINES 32-41

First, we will create a large dataset and split it into three sets:

- a set to train the ensemble methods, which are later used as feature
  engineering transformers;
- a set to train the linear model;
- a set to test the linear model.

Splitting the data this way is important: fitting the linear model on the
same samples used to train the ensemble would leak data and lead to
overfitting.
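
The proportions and disjointness of such a three-way split can be checked on
a stand-in array of sample indices. This is only a sketch: the array
`indices` and the variable names are hypothetical, and the split ratios simply
mirror the two successive 50% splits used in this example.

.. code-block:: python

    import numpy as np
    from sklearn.model_selection import train_test_split

    # Stand-in for sample identifiers (hypothetical, not the real dataset).
    indices = np.arange(1000)

    # First split: half of the samples are held out for testing.
    idx_full_train, idx_test = train_test_split(
        indices, test_size=0.5, random_state=10)
    # Second split: the remaining half is shared between the ensemble and
    # the linear model.
    idx_ensemble, idx_linear = train_test_split(
        idx_full_train, test_size=0.5, random_state=10)

    sizes = (len(idx_ensemble), len(idx_linear), len(idx_test))

    # No sample may appear in more than one of the three sets.
    no_overlap = (
        not set(idx_ensemble) & set(idx_linear)
        and not set(idx_ensemble) & set(idx_test)
        and not set(idx_linear) & set(idx_test)
    )
    print(sizes, no_overlap)

The ensemble and the linear model each receive a quarter of the samples, and
the test set keeps the remaining half.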

.. GENERATED FROM PYTHON SOURCE LINES 41-53

.. code-block:: default


    from sklearn.datasets import make_classification
    from sklearn.model_selection import train_test_split

    X, y = make_classification(n_samples=80000, random_state=10)

    X_full_train, X_test, y_full_train, y_test = train_test_split(
        X, y, test_size=0.5, random_state=10)
    X_train_ensemble, X_train_linear, y_train_ensemble, y_train_linear = \
        train_test_split(X_full_train, y_full_train, test_size=0.5,
                         random_state=10)


.. GENERATED FROM PYTHON SOURCE LINES 54-56

For each of the ensemble methods, we will use 10 estimators and a maximum
depth of 3 levels.

.. GENERATED FROM PYTHON SOURCE LINES 56-60

.. code-block:: default


    n_estimators = 10
    max_depth = 3


.. GENERATED FROM PYTHON SOURCE LINES 61-63

First, we train the random forest and gradient boosting models on the
training set reserved for the ensembles.

.. GENERATED FROM PYTHON SOURCE LINES 63-74

.. code-block:: default


    from sklearn.ensemble import RandomForestClassifier, GradientBoostingClassifier

    random_forest = RandomForestClassifier(
        n_estimators=n_estimators, max_depth=max_depth, random_state=10)
    random_forest.fit(X_train_ensemble, y_train_ensemble)

    gradient_boosting = GradientBoostingClassifier(
        n_estimators=n_estimators, max_depth=max_depth, random_state=10)
    _ = gradient_boosting.fit(X_train_ensemble, y_train_ensemble)


.. GENERATED FROM PYTHON SOURCE LINES 75-77

The :class:`~sklearn.ensemble.RandomTreesEmbedding` is an unsupervised method
and thus does not need to be trained independently.
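
A short sketch of what "unsupervised" means in practice: the embedding can be
fitted and applied without passing any target. The names `X_demo` and
`embedding` and the chosen sizes are illustrative assumptions, not part of
this example.

.. code-block:: python

    import numpy as np
    from sklearn.datasets import make_classification
    from sklearn.ensemble import RandomTreesEmbedding

    # The target returned by make_classification is deliberately discarded.
    X_demo, _ = make_classification(n_samples=100, random_state=0)

    embedding = RandomTreesEmbedding(
        n_estimators=5, max_depth=3, random_state=0)

    # No `y` is needed: the trees are built from random splits and the
    # output is already a sparse one-hot encoding of the leaves.
    X_embedded = embedding.fit_transform(X_demo)

    # One active feature per tree for every sample.
    active_per_sample = np.asarray(X_embedded.sum(axis=1)).ravel()
    print(X_embedded.shape, active_per_sample[:5])

Because the transformer produces the one-hot encoding itself, no separate
encoding step is required after it.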

.. GENERATED FROM PYTHON SOURCE LINES 77-83

.. code-block:: default


    from sklearn.ensemble import RandomTreesEmbedding

    random_tree_embedding = RandomTreesEmbedding(
        n_estimators=n_estimators, max_depth=max_depth, random_state=0)


.. GENERATED FROM PYTHON SOURCE LINES 84-89

Now, we will create three pipelines that use the above embeddings as
a preprocessing stage.

The random trees embedding can be directly pipelined with the logistic
regression because it is a standard scikit-learn transformer.

.. GENERATED FROM PYTHON SOURCE LINES 89-97

.. code-block:: default


    from sklearn.linear_model import LogisticRegression
    from sklearn.pipeline import make_pipeline

    rt_model = make_pipeline(
        random_tree_embedding, LogisticRegression(max_iter=1000))
    rt_model.fit(X_train_linear, y_train_linear)


.. raw:: html

    <div class="output_subarea output_html rendered_html output_result">
    <style>#sk-d35a9670-5f05-44bd-ac6c-eda316fc94dc {color: black;background-color: white;}#sk-d35a9670-5f05-44bd-ac6c-eda316fc94dc pre{padding: 0;}#sk-d35a9670-5f05-44bd-ac6c-eda316fc94dc div.sk-toggleable {background-color: white;}#sk-d35a9670-5f05-44bd-ac6c-eda316fc94dc label.sk-toggleable__label {cursor: pointer;display: block;width: 100%;margin-bottom: 0;padding: 0.2em 0.3em;box-sizing: border-box;text-align: center;}#sk-d35a9670-5f05-44bd-ac6c-eda316fc94dc div.sk-toggleable__content {max-height: 0;max-width: 0;overflow: hidden;text-align: left;background-color: #f0f8ff;}#sk-d35a9670-5f05-44bd-ac6c-eda316fc94dc div.sk-toggleable__content pre {margin: 0.2em;color: black;border-radius: 0.25em;background-color: #f0f8ff;}#sk-d35a9670-5f05-44bd-ac6c-eda316fc94dc input.sk-toggleable__control:checked~div.sk-toggleable__content {max-height: 200px;max-width: 100%;overflow: auto;}#sk-d35a9670-5f05-44bd-ac6c-eda316fc94dc div.sk-estimator input.sk-toggleable__control:checked~label.sk-toggleable__label {background-color: #d4ebff;}#sk-d35a9670-5f05-44bd-ac6c-eda316fc94dc div.sk-label input.sk-toggleable__control:checked~label.sk-toggleable__label {background-color: #d4ebff;}#sk-d35a9670-5f05-44bd-ac6c-eda316fc94dc input.sk-hidden--visually {border: 0;clip: rect(1px 1px 1px 1px);clip: rect(1px, 1px, 1px, 1px);height: 1px;margin: -1px;overflow: hidden;padding: 0;position: absolute;width: 1px;}#sk-d35a9670-5f05-44bd-ac6c-eda316fc94dc div.sk-estimator {font-family: monospace;background-color: #f0f8ff;margin: 0.25em 0.25em;border: 1px dotted black;border-radius: 0.25em;box-sizing: border-box;}#sk-d35a9670-5f05-44bd-ac6c-eda316fc94dc div.sk-estimator:hover {background-color: #d4ebff;}#sk-d35a9670-5f05-44bd-ac6c-eda316fc94dc div.sk-parallel-item::after {content: "";width: 100%;border-bottom: 1px solid gray;flex-grow: 1;}#sk-d35a9670-5f05-44bd-ac6c-eda316fc94dc div.sk-label:hover label.sk-toggleable__label {background-color: #d4ebff;}#sk-d35a9670-5f05-44bd-ac6c-eda316fc94dc 
div.sk-serial::before {content: "";position: absolute;border-left: 1px solid gray;box-sizing: border-box;top: 2em;bottom: 0;left: 50%;}#sk-d35a9670-5f05-44bd-ac6c-eda316fc94dc div.sk-serial {display: flex;flex-direction: column;align-items: center;background-color: white;}#sk-d35a9670-5f05-44bd-ac6c-eda316fc94dc div.sk-item {z-index: 1;}#sk-d35a9670-5f05-44bd-ac6c-eda316fc94dc div.sk-parallel {display: flex;align-items: stretch;justify-content: center;background-color: white;}#sk-d35a9670-5f05-44bd-ac6c-eda316fc94dc div.sk-parallel-item {display: flex;flex-direction: column;position: relative;background-color: white;}#sk-d35a9670-5f05-44bd-ac6c-eda316fc94dc div.sk-parallel-item:first-child::after {align-self: flex-end;width: 50%;}#sk-d35a9670-5f05-44bd-ac6c-eda316fc94dc div.sk-parallel-item:last-child::after {align-self: flex-start;width: 50%;}#sk-d35a9670-5f05-44bd-ac6c-eda316fc94dc div.sk-parallel-item:only-child::after {width: 0;}#sk-d35a9670-5f05-44bd-ac6c-eda316fc94dc div.sk-dashed-wrapped {border: 1px dashed gray;margin: 0.2em;box-sizing: border-box;padding-bottom: 0.1em;background-color: white;position: relative;}#sk-d35a9670-5f05-44bd-ac6c-eda316fc94dc div.sk-label label {font-family: monospace;font-weight: bold;background-color: white;display: inline-block;line-height: 1.2em;}#sk-d35a9670-5f05-44bd-ac6c-eda316fc94dc div.sk-label-container {position: relative;z-index: 2;text-align: center;}#sk-d35a9670-5f05-44bd-ac6c-eda316fc94dc div.sk-container {display: inline-block;position: relative;}</style><div id="sk-d35a9670-5f05-44bd-ac6c-eda316fc94dc" class"sk-top-container"><div class="sk-container"><div class="sk-item sk-dashed-wrapped"><div class="sk-label-container"><div class="sk-label sk-toggleable"><input class="sk-toggleable__control sk-hidden--visually" id="11f40a69-3e85-48b0-a265-be77d435012f" type="checkbox" ><label class="sk-toggleable__label" for="11f40a69-3e85-48b0-a265-be77d435012f">Pipeline</label><div 
class="sk-toggleable__content"><pre>Pipeline(steps=[('randomtreesembedding',
                     RandomTreesEmbedding(max_depth=3, n_estimators=10,
                                          random_state=0)),
                    ('logisticregression', LogisticRegression(max_iter=1000))])</pre></div></div></div><div class="sk-serial"><div class="sk-item"><div class="sk-estimator sk-toggleable"><input class="sk-toggleable__control sk-hidden--visually" id="be9d4ab3-1f94-4f4a-be91-0e8f5aeba12d" type="checkbox" ><label class="sk-toggleable__label" for="be9d4ab3-1f94-4f4a-be91-0e8f5aeba12d">RandomTreesEmbedding</label><div class="sk-toggleable__content"><pre>RandomTreesEmbedding(max_depth=3, n_estimators=10, random_state=0)</pre></div></div></div><div class="sk-item"><div class="sk-estimator sk-toggleable"><input class="sk-toggleable__control sk-hidden--visually" id="6912c510-fa3e-40aa-87f0-de03ddb83339" type="checkbox" ><label class="sk-toggleable__label" for="6912c510-fa3e-40aa-87f0-de03ddb83339">LogisticRegression</label><div class="sk-toggleable__content"><pre>LogisticRegression(max_iter=1000)</pre></div></div></div></div></div></div></div>
    </div>
    <br />
    <br />

.. GENERATED FROM PYTHON SOURCE LINES 98-102

Then, we can pipeline random forest or gradient boosting with a logistic
regression. However, these models expose the leaf indices through the method
`apply`, whereas a scikit-learn pipeline expects a call to `transform`.
Therefore, we wrap the call to `apply` within a `FunctionTransformer`.
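
The output of `apply` differs slightly between the two ensembles, which is
why each one needs its own wrapper function. The sketch below compares the
shapes on a small binary problem; the names `rf`, `gb` and `X_demo` and the
model settings are illustrative assumptions.

.. code-block:: python

    from sklearn.datasets import make_classification
    from sklearn.ensemble import (
        GradientBoostingClassifier, RandomForestClassifier)

    X_demo, y_demo = make_classification(n_samples=100, random_state=0)

    rf = RandomForestClassifier(
        n_estimators=4, max_depth=2, random_state=0).fit(X_demo, y_demo)
    gb = GradientBoostingClassifier(
        n_estimators=4, max_depth=2, random_state=0).fit(X_demo, y_demo)

    # Random forest: one leaf index per sample and per tree.
    rf_leaves = rf.apply(X_demo)

    # Gradient boosting: an extra trailing axis indexes the trees built at
    # each boosting stage; it has length 1 for binary classification.
    gb_leaves = gb.apply(X_demo)

    # Dropping the trailing axis yields the 2D layout expected by a
    # one-hot encoder.
    gb_leaves_2d = gb_leaves[:, :, 0]
    print(rf_leaves.shape, gb_leaves.shape, gb_leaves_2d.shape)

For a binary target, selecting index 0 on the last axis therefore loses no
information.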

.. GENERATED FROM PYTHON SOURCE LINES 102-120

.. code-block:: default


    from sklearn.preprocessing import FunctionTransformer
    from sklearn.preprocessing import OneHotEncoder


    def rf_apply(X, model):
        return model.apply(X)


    rf_leaves_yielder = FunctionTransformer(
        rf_apply, kw_args={"model": random_forest})

    rf_model = make_pipeline(
        rf_leaves_yielder, OneHotEncoder(handle_unknown="ignore"),
        LogisticRegression(max_iter=1000))
    rf_model.fit(X_train_linear, y_train_linear)


.. raw:: html

    <div class="output_subarea output_html rendered_html output_result">
    <style>#sk-4b8d60ba-900b-44a1-85a2-d2a1e3c19258 {color: black;background-color: white;}#sk-4b8d60ba-900b-44a1-85a2-d2a1e3c19258 pre{padding: 0;}#sk-4b8d60ba-900b-44a1-85a2-d2a1e3c19258 div.sk-toggleable {background-color: white;}#sk-4b8d60ba-900b-44a1-85a2-d2a1e3c19258 label.sk-toggleable__label {cursor: pointer;display: block;width: 100%;margin-bottom: 0;padding: 0.2em 0.3em;box-sizing: border-box;text-align: center;}#sk-4b8d60ba-900b-44a1-85a2-d2a1e3c19258 div.sk-toggleable__content {max-height: 0;max-width: 0;overflow: hidden;text-align: left;background-color: #f0f8ff;}#sk-4b8d60ba-900b-44a1-85a2-d2a1e3c19258 div.sk-toggleable__content pre {margin: 0.2em;color: black;border-radius: 0.25em;background-color: #f0f8ff;}#sk-4b8d60ba-900b-44a1-85a2-d2a1e3c19258 input.sk-toggleable__control:checked~div.sk-toggleable__content {max-height: 200px;max-width: 100%;overflow: auto;}#sk-4b8d60ba-900b-44a1-85a2-d2a1e3c19258 div.sk-estimator input.sk-toggleable__control:checked~label.sk-toggleable__label {background-color: #d4ebff;}#sk-4b8d60ba-900b-44a1-85a2-d2a1e3c19258 div.sk-label input.sk-toggleable__control:checked~label.sk-toggleable__label {background-color: #d4ebff;}#sk-4b8d60ba-900b-44a1-85a2-d2a1e3c19258 input.sk-hidden--visually {border: 0;clip: rect(1px 1px 1px 1px);clip: rect(1px, 1px, 1px, 1px);height: 1px;margin: -1px;overflow: hidden;padding: 0;position: absolute;width: 1px;}#sk-4b8d60ba-900b-44a1-85a2-d2a1e3c19258 div.sk-estimator {font-family: monospace;background-color: #f0f8ff;margin: 0.25em 0.25em;border: 1px dotted black;border-radius: 0.25em;box-sizing: border-box;}#sk-4b8d60ba-900b-44a1-85a2-d2a1e3c19258 div.sk-estimator:hover {background-color: #d4ebff;}#sk-4b8d60ba-900b-44a1-85a2-d2a1e3c19258 div.sk-parallel-item::after {content: "";width: 100%;border-bottom: 1px solid gray;flex-grow: 1;}#sk-4b8d60ba-900b-44a1-85a2-d2a1e3c19258 div.sk-label:hover label.sk-toggleable__label {background-color: #d4ebff;}#sk-4b8d60ba-900b-44a1-85a2-d2a1e3c19258 
div.sk-serial::before {content: "";position: absolute;border-left: 1px solid gray;box-sizing: border-box;top: 2em;bottom: 0;left: 50%;}#sk-4b8d60ba-900b-44a1-85a2-d2a1e3c19258 div.sk-serial {display: flex;flex-direction: column;align-items: center;background-color: white;}#sk-4b8d60ba-900b-44a1-85a2-d2a1e3c19258 div.sk-item {z-index: 1;}#sk-4b8d60ba-900b-44a1-85a2-d2a1e3c19258 div.sk-parallel {display: flex;align-items: stretch;justify-content: center;background-color: white;}#sk-4b8d60ba-900b-44a1-85a2-d2a1e3c19258 div.sk-parallel-item {display: flex;flex-direction: column;position: relative;background-color: white;}#sk-4b8d60ba-900b-44a1-85a2-d2a1e3c19258 div.sk-parallel-item:first-child::after {align-self: flex-end;width: 50%;}#sk-4b8d60ba-900b-44a1-85a2-d2a1e3c19258 div.sk-parallel-item:last-child::after {align-self: flex-start;width: 50%;}#sk-4b8d60ba-900b-44a1-85a2-d2a1e3c19258 div.sk-parallel-item:only-child::after {width: 0;}#sk-4b8d60ba-900b-44a1-85a2-d2a1e3c19258 div.sk-dashed-wrapped {border: 1px dashed gray;margin: 0.2em;box-sizing: border-box;padding-bottom: 0.1em;background-color: white;position: relative;}#sk-4b8d60ba-900b-44a1-85a2-d2a1e3c19258 div.sk-label label {font-family: monospace;font-weight: bold;background-color: white;display: inline-block;line-height: 1.2em;}#sk-4b8d60ba-900b-44a1-85a2-d2a1e3c19258 div.sk-label-container {position: relative;z-index: 2;text-align: center;}#sk-4b8d60ba-900b-44a1-85a2-d2a1e3c19258 div.sk-container {display: inline-block;position: relative;}</style><div id="sk-4b8d60ba-900b-44a1-85a2-d2a1e3c19258" class"sk-top-container"><div class="sk-container"><div class="sk-item sk-dashed-wrapped"><div class="sk-label-container"><div class="sk-label sk-toggleable"><input class="sk-toggleable__control sk-hidden--visually" id="26fd096f-2232-43c7-8e26-43fa0fddd625" type="checkbox" ><label class="sk-toggleable__label" for="26fd096f-2232-43c7-8e26-43fa0fddd625">Pipeline</label><div 
class="sk-toggleable__content"><pre>Pipeline(steps=[('functiontransformer',
                     FunctionTransformer(func=<function rf_apply at 0x7efbff7741f0>,
                                         kw_args={'model': RandomForestClassifier(max_depth=3,
                                                                                  n_estimators=10,
                                                                                  random_state=10)})),
                    ('onehotencoder', OneHotEncoder(handle_unknown='ignore')),
                    ('logisticregression', LogisticRegression(max_iter=1000))])</pre></div></div></div><div class="sk-serial"><div class="sk-item"><div class="sk-estimator sk-toggleable"><input class="sk-toggleable__control sk-hidden--visually" id="287181e4-c531-43e3-a518-8d0f34820800" type="checkbox" ><label class="sk-toggleable__label" for="287181e4-c531-43e3-a518-8d0f34820800">FunctionTransformer</label><div class="sk-toggleable__content"><pre>FunctionTransformer(func=<function rf_apply at 0x7efbff7741f0>,
                        kw_args={'model': RandomForestClassifier(max_depth=3,
                                                                 n_estimators=10,
                                                                 random_state=10)})</pre></div></div></div><div class="sk-item"><div class="sk-estimator sk-toggleable"><input class="sk-toggleable__control sk-hidden--visually" id="d01996cd-fe2d-4635-97f0-0dafcfe39f4c" type="checkbox" ><label class="sk-toggleable__label" for="d01996cd-fe2d-4635-97f0-0dafcfe39f4c">OneHotEncoder</label><div class="sk-toggleable__content"><pre>OneHotEncoder(handle_unknown='ignore')</pre></div></div></div><div class="sk-item"><div class="sk-estimator sk-toggleable"><input class="sk-toggleable__control sk-hidden--visually" id="a79c4032-fd71-4b01-8792-c3b85294300c" type="checkbox" ><label class="sk-toggleable__label" for="a79c4032-fd71-4b01-8792-c3b85294300c">LogisticRegression</label><div class="sk-toggleable__content"><pre>LogisticRegression(max_iter=1000)</pre></div></div></div></div></div></div></div>
    </div>
    <br />
    <br />

.. GENERATED FROM PYTHON SOURCE LINES 121-133

.. code-block:: default

    def gbdt_apply(X, model):
        return model.apply(X)[:, :, 0]


    gbdt_leaves_yielder = FunctionTransformer(
        gbdt_apply, kw_args={"model": gradient_boosting})

    gbdt_model = make_pipeline(
        gbdt_leaves_yielder, OneHotEncoder(handle_unknown="ignore"),
        LogisticRegression(max_iter=1000))
    gbdt_model.fit(X_train_linear, y_train_linear)


.. raw:: html

    <div class="output_subarea output_html rendered_html output_result">
    <style>#sk-1e77715c-ce78-4ae4-9cf9-b408affc519b {color: black;background-color: white;}#sk-1e77715c-ce78-4ae4-9cf9-b408affc519b pre{padding: 0;}#sk-1e77715c-ce78-4ae4-9cf9-b408affc519b div.sk-toggleable {background-color: white;}#sk-1e77715c-ce78-4ae4-9cf9-b408affc519b label.sk-toggleable__label {cursor: pointer;display: block;width: 100%;margin-bottom: 0;padding: 0.2em 0.3em;box-sizing: border-box;text-align: center;}#sk-1e77715c-ce78-4ae4-9cf9-b408affc519b div.sk-toggleable__content {max-height: 0;max-width: 0;overflow: hidden;text-align: left;background-color: #f0f8ff;}#sk-1e77715c-ce78-4ae4-9cf9-b408affc519b div.sk-toggleable__content pre {margin: 0.2em;color: black;border-radius: 0.25em;background-color: #f0f8ff;}#sk-1e77715c-ce78-4ae4-9cf9-b408affc519b input.sk-toggleable__control:checked~div.sk-toggleable__content {max-height: 200px;max-width: 100%;overflow: auto;}#sk-1e77715c-ce78-4ae4-9cf9-b408affc519b div.sk-estimator input.sk-toggleable__control:checked~label.sk-toggleable__label {background-color: #d4ebff;}#sk-1e77715c-ce78-4ae4-9cf9-b408affc519b div.sk-label input.sk-toggleable__control:checked~label.sk-toggleable__label {background-color: #d4ebff;}#sk-1e77715c-ce78-4ae4-9cf9-b408affc519b input.sk-hidden--visually {border: 0;clip: rect(1px 1px 1px 1px);clip: rect(1px, 1px, 1px, 1px);height: 1px;margin: -1px;overflow: hidden;padding: 0;position: absolute;width: 1px;}#sk-1e77715c-ce78-4ae4-9cf9-b408affc519b div.sk-estimator {font-family: monospace;background-color: #f0f8ff;margin: 0.25em 0.25em;border: 1px dotted black;border-radius: 0.25em;box-sizing: border-box;}#sk-1e77715c-ce78-4ae4-9cf9-b408affc519b div.sk-estimator:hover {background-color: #d4ebff;}#sk-1e77715c-ce78-4ae4-9cf9-b408affc519b div.sk-parallel-item::after {content: "";width: 100%;border-bottom: 1px solid gray;flex-grow: 1;}#sk-1e77715c-ce78-4ae4-9cf9-b408affc519b div.sk-label:hover label.sk-toggleable__label {background-color: #d4ebff;}#sk-1e77715c-ce78-4ae4-9cf9-b408affc519b 
div.sk-serial::before {content: "";position: absolute;border-left: 1px solid gray;box-sizing: border-box;top: 2em;bottom: 0;left: 50%;}#sk-1e77715c-ce78-4ae4-9cf9-b408affc519b div.sk-serial {display: flex;flex-direction: column;align-items: center;background-color: white;}#sk-1e77715c-ce78-4ae4-9cf9-b408affc519b div.sk-item {z-index: 1;}#sk-1e77715c-ce78-4ae4-9cf9-b408affc519b div.sk-parallel {display: flex;align-items: stretch;justify-content: center;background-color: white;}#sk-1e77715c-ce78-4ae4-9cf9-b408affc519b div.sk-parallel-item {display: flex;flex-direction: column;position: relative;background-color: white;}#sk-1e77715c-ce78-4ae4-9cf9-b408affc519b div.sk-parallel-item:first-child::after {align-self: flex-end;width: 50%;}#sk-1e77715c-ce78-4ae4-9cf9-b408affc519b div.sk-parallel-item:last-child::after {align-self: flex-start;width: 50%;}#sk-1e77715c-ce78-4ae4-9cf9-b408affc519b div.sk-parallel-item:only-child::after {width: 0;}#sk-1e77715c-ce78-4ae4-9cf9-b408affc519b div.sk-dashed-wrapped {border: 1px dashed gray;margin: 0.2em;box-sizing: border-box;padding-bottom: 0.1em;background-color: white;position: relative;}#sk-1e77715c-ce78-4ae4-9cf9-b408affc519b div.sk-label label {font-family: monospace;font-weight: bold;background-color: white;display: inline-block;line-height: 1.2em;}#sk-1e77715c-ce78-4ae4-9cf9-b408affc519b div.sk-label-container {position: relative;z-index: 2;text-align: center;}#sk-1e77715c-ce78-4ae4-9cf9-b408affc519b div.sk-container {display: inline-block;position: relative;}</style><div id="sk-1e77715c-ce78-4ae4-9cf9-b408affc519b" class"sk-top-container"><div class="sk-container"><div class="sk-item sk-dashed-wrapped"><div class="sk-label-container"><div class="sk-label sk-toggleable"><input class="sk-toggleable__control sk-hidden--visually" id="7786f956-f807-4fd3-a8db-030cee156f2b" type="checkbox" ><label class="sk-toggleable__label" for="7786f956-f807-4fd3-a8db-030cee156f2b">Pipeline</label><div 
class="sk-toggleable__content"><pre>Pipeline(steps=[('functiontransformer',
                     FunctionTransformer(func=<function gbdt_apply at 0x7efbff9be310>,
                                         kw_args={'model': GradientBoostingClassifier(n_estimators=10,
                                                                                      random_state=10)})),
                    ('onehotencoder', OneHotEncoder(handle_unknown='ignore')),
                    ('logisticregression', LogisticRegression(max_iter=1000))])</pre></div></div></div><div class="sk-serial"><div class="sk-item"><div class="sk-estimator sk-toggleable"><input class="sk-toggleable__control sk-hidden--visually" id="3be371cb-2a98-441e-9bbb-bfa5487b74d2" type="checkbox" ><label class="sk-toggleable__label" for="3be371cb-2a98-441e-9bbb-bfa5487b74d2">FunctionTransformer</label><div class="sk-toggleable__content"><pre>FunctionTransformer(func=<function gbdt_apply at 0x7efbff9be310>,
                        kw_args={'model': GradientBoostingClassifier(n_estimators=10,
                                                                     random_state=10)})</pre></div></div></div><div class="sk-item"><div class="sk-estimator sk-toggleable"><input class="sk-toggleable__control sk-hidden--visually" id="84c93798-4a89-4631-af55-7884bca243b9" type="checkbox" ><label class="sk-toggleable__label" for="84c93798-4a89-4631-af55-7884bca243b9">OneHotEncoder</label><div class="sk-toggleable__content"><pre>OneHotEncoder(handle_unknown='ignore')</pre></div></div></div><div class="sk-item"><div class="sk-estimator sk-toggleable"><input class="sk-toggleable__control sk-hidden--visually" id="31900477-5839-4db4-b387-edda52b370de" type="checkbox" ><label class="sk-toggleable__label" for="31900477-5839-4db4-b387-edda52b370de">LogisticRegression</label><div class="sk-toggleable__content"><pre>LogisticRegression(max_iter=1000)</pre></div></div></div></div></div></div></div>
    </div>
    <br />
    <br />

.. GENERATED FROM PYTHON SOURCE LINES 134-135

We can finally show the different ROC curves for all the models.
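
Alongside the plots, the same kind of comparison can be summarised
numerically. The sketch below computes the points of a ROC curve and the area
under it for a single stand-in classifier; the names `clf`, `X_demo` and
`scores` are assumptions made for illustration. It relies only on
`roc_curve` and `roc_auc_score`, so it does not depend on `plot_roc_curve`,
which newer scikit-learn releases no longer provide.

.. code-block:: python

    from sklearn.datasets import make_classification
    from sklearn.linear_model import LogisticRegression
    from sklearn.metrics import roc_auc_score, roc_curve
    from sklearn.model_selection import train_test_split

    X_demo, y_demo = make_classification(n_samples=500, random_state=0)
    X_tr, X_te, y_tr, y_te = train_test_split(
        X_demo, y_demo, random_state=0)

    clf = LogisticRegression(max_iter=1000).fit(X_tr, y_tr)

    # Probability of the positive class is used as the ranking score.
    scores = clf.predict_proba(X_te)[:, 1]

    # False positive rate and true positive rate at each threshold.
    fpr, tpr, thresholds = roc_curve(y_te, scores)

    # Area under the ROC curve: 0.5 is chance level, 1.0 is perfect ranking.
    auc = roc_auc_score(y_te, scores)
    print(f"ROC AUC: {auc:.3f}")

Any of the fitted pipelines in this example could be substituted for `clf`,
since they all expose `predict_proba`.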

.. GENERATED FROM PYTHON SOURCE LINES 135-155

.. code-block:: default


    import matplotlib.pyplot as plt
    from sklearn.metrics import plot_roc_curve

    fig, ax = plt.subplots()

    models = [
        ("RT embedding -> LR", rt_model),
        ("RF", random_forest),
        ("RF embedding -> LR", rf_model),
        ("GBDT", gradient_boosting),
        ("GBDT embedding -> LR", gbdt_model),
    ]

    model_displays = {}
    for name, pipeline in models:
        model_displays[name] = plot_roc_curve(
            pipeline, X_test, y_test, ax=ax, name=name)
    _ = ax.set_title('ROC curve')


.. image:: /auto_examples/ensemble/images/sphx_glr_plot_feature_transformation_001.png
    :alt: ROC curve
    :class: sphx-glr-single-img


.. GENERATED FROM PYTHON SOURCE LINES 156-163

.. code-block:: default

    fig, ax = plt.subplots()
    for name, pipeline in models:
        model_displays[name].plot(ax=ax)

    ax.set_xlim(0, 0.2)
    ax.set_ylim(0.8, 1)
    _ = ax.set_title('ROC curve (zoomed in at top left)')


.. image:: /auto_examples/ensemble/images/sphx_glr_plot_feature_transformation_002.png
    :alt: ROC curve (zoomed in at top left)
    :class: sphx-glr-single-img


.. rst-class:: sphx-glr-timing

   **Total running time of the script:** ( 0 minutes  4.190 seconds)


.. _sphx_glr_download_auto_examples_ensemble_plot_feature_transformation.py:


.. only :: html

 .. container:: sphx-glr-footer
    :class: sphx-glr-footer-example


  .. container:: binder-badge

    .. image:: images/binder_badge_logo.svg
      :target: https://mybinder.org/v2/gh/scikit-learn/scikit-learn/0.24.X?urlpath=lab/tree/notebooks/auto_examples/ensemble/plot_feature_transformation.ipynb
      :alt: Launch binder
      :width: 150 px


  .. container:: sphx-glr-download sphx-glr-download-python

     :download:`Download Python source code: plot_feature_transformation.py <plot_feature_transformation.py>`


  .. container:: sphx-glr-download sphx-glr-download-jupyter

     :download:`Download Jupyter notebook: plot_feature_transformation.ipynb <plot_feature_transformation.ipynb>`


.. only:: html

 .. rst-class:: sphx-glr-signature

    `Gallery generated by Sphinx-Gallery <https://sphinx-gallery.github.io>`_