sklearn.feature_extraction

Feature extraction from raw data.

User guide: see the Feature extraction section for further details.

DictVectorizer

Transforms lists of feature-value mappings to vectors.

FeatureHasher

Implements feature hashing, aka the hashing trick.
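As a rough illustration, the sketch below hashes two small dicts into a fixed-width sparse matrix. The token names and `n_features=16` are arbitrary choices for the example; real uses typically pick a much larger width to limit collisions.

```python
from sklearn.feature_extraction import FeatureHasher

# Feature names are hashed to column indices, so no vocabulary is stored.
hasher = FeatureHasher(n_features=16, input_type="dict")

# Illustrative samples (hypothetical data): feature name -> value.
samples = [{"dog": 1, "cat": 2}, {"dog": 2, "run": 5}]

# transform is stateless; the result is a SciPy sparse matrix.
X = hasher.transform(samples)
```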

From images

Utilities to extract features from images.

image.PatchExtractor

Extracts patches from a collection of images.
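A small sketch, assuming two synthetic 8x8 grayscale images; the patch size, `max_patches` and `random_state` values are arbitrary and only chosen to keep the output small.

```python
import numpy as np
from sklearn.feature_extraction.image import PatchExtractor

# Two synthetic 8x8 grayscale images, shape (n_images, height, width).
images = np.arange(2 * 8 * 8, dtype=float).reshape(2, 8, 8)

# Sample 3 random 4x4 patches from each image.
extractor = PatchExtractor(patch_size=(4, 4), max_patches=3, random_state=0)
patches = extractor.fit(images).transform(images)
```

The result stacks the patches from all images along the first axis.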

image.extract_patches_2d

Reshape a 2D image into a collection of patches.
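For a single image, a minimal sketch on a synthetic 4x4 array might look like this. With no `max_patches` given, every overlapping patch position is returned.

```python
import numpy as np
from sklearn.feature_extraction.image import extract_patches_2d

# Synthetic 4x4 image whose pixel values are 0..15.
image = np.arange(16).reshape(4, 4)

# All overlapping 2x2 patches: (4 - 2 + 1) ** 2 = 9 of them.
patches = extract_patches_2d(image, (2, 2))
```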

image.grid_to_graph

Graph of the pixel-to-pixel connections.
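A brief sketch for a 3x3 grid; the grid size is an arbitrary choice. Pixels are numbered in row-major order, and the returned sparse matrix connects each pixel to its horizontal and vertical neighbours.

```python
from sklearn.feature_extraction.image import grid_to_graph

# Connectivity of a 3x3 pixel grid, returned as a SciPy sparse matrix.
graph = grid_to_graph(n_x=3, n_y=3)

# Dense view for inspection: entry (i, j) is nonzero when pixels i and j are linked.
adjacency = graph.toarray()
```

Such a matrix is commonly passed as a connectivity constraint to clustering estimators.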

image.img_to_graph

Graph of the pixel-to-pixel gradient connections.
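A minimal sketch on a synthetic 2x2 image. Unlike `grid_to_graph`, the edges here are weighted by the gradient between neighbouring pixel values.

```python
import numpy as np
from sklearn.feature_extraction.image import img_to_graph

# Synthetic 2x2 image (hypothetical data).
img = np.arange(4, dtype=float).reshape(2, 2)

# One graph node per pixel; the result is a SciPy sparse matrix.
graph = img_to_graph(img)
```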

image.reconstruct_from_patches_2d

Reconstruct the image from all of its patches.
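A round-trip sketch, assuming a synthetic 4x4 image: extract every 2x2 patch, then rebuild the image. Overlapping regions are averaged, so unmodified patches should recover the original values.

```python
import numpy as np
from sklearn.feature_extraction.image import (
    extract_patches_2d,
    reconstruct_from_patches_2d,
)

# Synthetic 4x4 image (hypothetical data).
image = np.arange(16, dtype=float).reshape(4, 4)

# Extract all overlapping patches, then rebuild an image of the same size.
patches = extract_patches_2d(image, (2, 2))
restored = reconstruct_from_patches_2d(patches, image.shape)

# True when the round trip recovers the original pixels.
is_recovered = np.allclose(restored, image)
```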

From text

Utilities to build feature vectors from text documents.

text.CountVectorizer

Convert a collection of text documents to a matrix of token counts.

text.HashingVectorizer

Convert a collection of text documents to a matrix of token occurrences.
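A sketch of the stateless variant, using the same kind of toy corpus. The width `n_features=2**4` is deliberately tiny for readability; it is an assumption of this example, not a recommended setting.

```python
from sklearn.feature_extraction.text import HashingVectorizer

# Toy corpus (hypothetical data).
corpus = ["the cat sat", "the cat sat on the mat"]

# Tokens are hashed to column indices, so no vocabulary is stored and no fit is needed.
vectorizer = HashingVectorizer(n_features=2**4)
X = vectorizer.transform(corpus)
```

Because no vocabulary is kept, columns cannot be mapped back to the original tokens.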

text.TfidfTransformer

Transform a count matrix to a normalized tf or tf-idf representation.
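A minimal sketch, assuming a small hand-written count matrix. With default settings each output row is scaled to unit L2 norm.

```python
import numpy as np
from sklearn.feature_extraction.text import TfidfTransformer

# Hand-written term counts (hypothetical data): rows are documents, columns are terms.
counts = np.array([[3, 0, 1], [2, 0, 0], [3, 0, 0]])

transformer = TfidfTransformer()
tfidf = transformer.fit_transform(counts)  # sparse matrix of tf-idf weights

# Per-row L2 norms of the weighted matrix.
row_norms = np.linalg.norm(tfidf.toarray(), axis=1)
```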

text.TfidfVectorizer

Convert a collection of raw documents to a matrix of TF-IDF features.