.. _sphx_glr_auto_examples_text:

.. _text_examples:

Working with text documents
----------------------------

Examples concerning the :mod:`sklearn.feature_extraction.text` module.


.. raw:: html

    <div class="sphx-glr-thumbnails">


.. raw:: html

    <div class="sphx-glr-thumbcontainer" tooltip="This is an example showing how scikit-learn can be used to classify documents by topics using a...">

.. only:: html

  .. image:: /auto_examples/text/images/thumb/sphx_glr_plot_document_classification_20newsgroups_thumb.png
    :alt: Classification of text documents using sparse features

  :ref:`sphx_glr_auto_examples_text_plot_document_classification_20newsgroups.py`

.. raw:: html

      <div class="sphx-glr-thumbnail-title">Classification of text documents using sparse features</div>
    </div>


.. raw:: html

    <div class="sphx-glr-thumbcontainer" tooltip="This is an example showing how the scikit-learn API can be used to cluster documents by topics ...">

.. only:: html

  .. image:: /auto_examples/text/images/thumb/sphx_glr_plot_document_clustering_thumb.png
    :alt: Clustering text documents using k-means

  :ref:`sphx_glr_auto_examples_text_plot_document_clustering.py`

.. raw:: html

      <div class="sphx-glr-thumbnail-title">Clustering text documents using k-means</div>
    </div>


.. raw:: html

    <div class="sphx-glr-thumbcontainer" tooltip="In this example we illustrate text vectorization, which is the process of representing non-nume...">

.. only:: html

  .. image:: /auto_examples/text/images/thumb/sphx_glr_plot_hashing_vs_dict_vectorizer_thumb.png
    :alt: FeatureHasher and DictVectorizer Comparison

  :ref:`sphx_glr_auto_examples_text_plot_hashing_vs_dict_vectorizer.py`

.. raw:: html

      <div class="sphx-glr-thumbnail-title">FeatureHasher and DictVectorizer Comparison</div>
    </div>


.. raw:: html

    </div>


.. toctree::
   :hidden:

   /auto_examples/text/plot_document_classification_20newsgroups
   /auto_examples/text/plot_document_clustering
   /auto_examples/text/plot_hashing_vs_dict_vectorizer