This is documentation for an old release of Scikit-learn (version 0.20).
Pipeline Anova SVM
Simple usage of Pipeline that chains two steps: a univariate feature selection with ANOVA, followed by a C-SVM trained on the selected features.
Out:
              precision    recall  f1-score   support

           0       0.75      0.50      0.60         6
           1       0.60      1.00      0.75         6
           2       0.67      0.80      0.73         5
           3       1.00      0.62      0.77         8

   micro avg       0.72      0.72      0.72        25
   macro avg       0.75      0.73      0.71        25
weighted avg       0.78      0.72      0.72        25
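The summary rows of the report follow from the per-class rows. The sketch below recomputes a few of them from the rounded values printed above; it is an illustration only, and the names `rows` and `f1_from` are made up for it. Because the inputs are already rounded to two decimals, a recomputed value can differ from the report in the last digit.

```python
# Per-class rows copied from the report above: (precision, recall, f1, support)
rows = {
    0: (0.75, 0.50, 0.60, 6),
    1: (0.60, 1.00, 0.75, 6),
    2: (0.67, 0.80, 0.73, 5),
    3: (1.00, 0.62, 0.77, 8),
}


def f1_from(precision, recall):
    """F1 is the harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)


# Total number of test samples (the "support" column summed over classes).
total = sum(support for _, _, _, support in rows.values())

# Macro average: unweighted mean over the classes.
macro_recall = sum(r for _, r, _, _ in rows.values()) / len(rows)

# Weighted average: mean over the classes, weighted by support.
weighted_precision = sum(p * s for p, _, _, s in rows.values()) / total

for label, (p, r, f1, _) in rows.items():
    print(label, "recomputed f1:", round(f1_from(p, r), 2), "reported:", f1)
print("total support:", total)
print("macro recall:", round(macro_recall, 2))
print("weighted precision:", round(weighted_precision, 2))
```

The macro average treats every class equally, while the weighted average lets class 3 (8 samples) count more than class 2 (5 samples).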
from sklearn import svm
from sklearn.datasets import make_classification
from sklearn.feature_selection import SelectKBest, f_classif
from sklearn.pipeline import make_pipeline
from sklearn.model_selection import train_test_split
from sklearn.metrics import classification_report

print(__doc__)

# Generate a 4-class toy dataset: 20 features, only 3 of them informative.
X, y = make_classification(
    n_features=20, n_informative=3, n_redundant=0, n_classes=4,
    n_clusters_per_class=2)

X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=42)

# ANOVA SVM-C
# 1) ANOVA filter: keep the 3 best-ranked features.
#    f_classif is the ANOVA F-test for classification targets;
#    f_regression is meant for continuous targets.
anova_filter = SelectKBest(f_classif, k=3)
# 2) linear SVM classifier
clf = svm.SVC(kernel='linear')

# Chain both steps; fit() runs the filter, then trains the SVM on its output.
anova_svm = make_pipeline(anova_filter, clf)
anova_svm.fit(X_train, y_train)

# predict() applies the same feature selection before classifying.
y_pred = anova_svm.predict(X_test)
print(classification_report(y_test, y_pred))
Total running time of the script: ( 0 minutes 0.073 seconds)
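Once the pipeline is fitted, its individual steps can be inspected to see which features the ANOVA filter kept. The following is a minimal sketch rather than part of the original example: it assumes `f_classif` as the score function, fixes `random_state=0` so the run is repeatable, and fits on the full dataset for brevity. It relies on `make_pipeline` naming each step after its lowercased class name.

```python
from sklearn.datasets import make_classification
from sklearn.feature_selection import SelectKBest, f_classif
from sklearn.pipeline import make_pipeline
from sklearn.svm import SVC

# random_state=0 is an assumption added here to make the sketch repeatable.
X, y = make_classification(
    n_features=20, n_informative=3, n_redundant=0, n_classes=4,
    n_clusters_per_class=2, random_state=0)

anova_svm = make_pipeline(SelectKBest(f_classif, k=3), SVC(kernel='linear'))
anova_svm.fit(X, y)

# Steps created by make_pipeline are keyed by lowercased class name.
selector = anova_svm.named_steps['selectkbest']
svc = anova_svm.named_steps['svc']

# Column indices (in the original 20-feature space) that the filter kept.
selected = selector.get_support(indices=True)
print("selected feature indices:", selected)

# Map the linear SVM weights back to the original feature space;
# columns for features that were filtered out are filled with zeros.
coef_full = selector.inverse_transform(svc.coef_)
print("coefficient matrix shape:", coef_full.shape)
```

With 4 classes, `SVC` trains one-vs-one classifiers, so `coef_` has one row per pair of classes (6 rows here) and one column per selected feature before it is mapped back to 20 columns.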