.. _example_ensemble_plot_random_forest_embedding.py:


=========================================================
Hashing feature transformation using Totally Random Trees
=========================================================

RandomTreesEmbedding provides a way to map data to a
very high-dimensional, sparse representation, which might
be beneficial for classification.
The mapping is completely unsupervised and very efficient.

This example visualizes the partitions given by several
trees and shows how the transformation can also be used for
non-linear dimensionality reduction or non-linear classification.

Points that are neighboring often share the same leaf of a tree and therefore
share large parts of their hashed representation. This allows to
separate two concentric circles simply based on the principal components of the
transformed data.

In high-dimensional spaces, linear classifiers often achieve
excellent accuracy. For sparse binary data, BernoulliNB
is particularly well-suited. The bottom row compares the
decision boundary obtained by BernoulliNB in the transformed
space with an ExtraTreesClassifier forests learned on the
original data.


.. image:: images/plot_random_forest_embedding_001.png
    :align: center


**Python source code:** :download:`plot_random_forest_embedding.py <plot_random_forest_embedding.py>`

.. literalinclude:: plot_random_forest_embedding.py
    :lines: 27-

**Total running time of the example:**  0.60 seconds
( 0 minutes  0.60 seconds)