This is documentation for an old release of scikit-learn (version 0.20).

Pipeline Anova SVM

A simple use of Pipeline that chains two steps: univariate feature selection based on ANOVA F-scores, followed by a C-SVM trained on the selected features.

Out:

              precision    recall  f1-score   support

           0       0.75      0.50      0.60         6
           1       0.60      1.00      0.75         6
           2       0.67      0.80      0.73         5
           3       1.00      0.62      0.77         8

   micro avg       0.72      0.72      0.72        25
   macro avg       0.75      0.73      0.71        25
weighted avg       0.78      0.72      0.72        25
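The summary rows of the report can be checked by hand from the per-class rows. The sketch below recomputes the recall averages from the rounded values printed above, so it agrees with the report only to about two decimal places. In a single-label multiclass problem like this one, the support-weighted recall and the micro average both equal the overall accuracy.

```python
# Per-class recall and support, copied from the report above (rounded values).
recall = [0.50, 1.00, 0.80, 0.62]
support = [6, 6, 5, 8]

# Macro average: plain mean over classes, every class counts equally.
macro = sum(recall) / len(recall)

# Weighted average: mean over classes, weighted by the number of true samples.
weighted = sum(r * s for r, s in zip(recall, support)) / sum(support)

print("macro recall:    %.2f" % macro)
print("weighted recall: %.2f" % weighted)
```

The two averages differ because class 3 has the largest support and the lowest recall but one, which pulls the weighted figure down slightly.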

from sklearn import svm
from sklearn.datasets import make_classification
from sklearn.feature_selection import SelectKBest, f_classif
from sklearn.pipeline import make_pipeline
from sklearn.model_selection import train_test_split
from sklearn.metrics import classification_report

print(__doc__)

# Generate a synthetic 4-class dataset: 20 features, only 3 of them informative
X, y = make_classification(
    n_features=20, n_informative=3, n_redundant=0, n_classes=4,
    n_clusters_per_class=2)

X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=42)

# ANOVA SVM-C
# 1) ANOVA filter: keep the 3 features with the highest F-scores.
#    f_classif is the ANOVA F-test for classification targets
#    (f_regression is meant for continuous targets).
anova_filter = SelectKBest(f_classif, k=3)
# 2) linear-kernel C-SVM trained on the selected features
clf = svm.SVC(kernel='linear')

# Chain both steps so that feature selection is fit on the training data only
anova_svm = make_pipeline(anova_filter, clf)
anova_svm.fit(X_train, y_train)
y_pred = anova_svm.predict(X_test)
print(classification_report(y_test, y_pred))
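To see which features the ANOVA filter kept, one can look up the fitted selector inside the pipeline. The following is a minimal, self-contained sketch rather than part of the original example. It assumes the step names `selectkbest` and `svc`, which are the lowercase class names that `make_pipeline` assigns automatically. It uses `f_classif` as the score function, and the `random_state=0` passed to `make_classification` is added here only to make the run repeatable.

```python
from sklearn.datasets import make_classification
from sklearn.feature_selection import SelectKBest, f_classif
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.svm import SVC

# Same synthetic data settings as the example; random_state is an added assumption.
X, y = make_classification(
    n_features=20, n_informative=3, n_redundant=0, n_classes=4,
    n_clusters_per_class=2, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=42)

anova_svm = make_pipeline(SelectKBest(f_classif, k=3), SVC(kernel='linear'))
anova_svm.fit(X_train, y_train)

# Look up the fitted steps by the names make_pipeline generated for them.
selector = anova_svm.named_steps['selectkbest']
svc = anova_svm.named_steps['svc']

# Column indices (in the original 20-feature space) that survived the filter.
selected = selector.get_support(indices=True)
print("Selected feature indices:", selected)

# Map the linear SVM coefficients back to the original feature space.
# Columns of features that were filtered out are filled with zeros.
coef_full = selector.inverse_transform(svc.coef_)
print("Coefficient matrix shape:", coef_full.shape)
```

With four classes, `SVC` trains one-vs-one classifiers, so the coefficient matrix has six rows (one per pair of classes) and, after `inverse_transform`, twenty columns.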

Total running time of the script: ( 0 minutes 0.073 seconds)
