.. _example_applications_plot_model_complexity_influence.py:

==========================
Model Complexity Influence
==========================

Demonstrate how model complexity influences both prediction accuracy and
computational performance.

The Boston Housing dataset is used for the regression task and the
20 Newsgroups dataset for the classification task.

For each class of models we vary the model complexity through the choice of
a relevant model parameter and measure the influence on both computational
performance (prediction latency) and predictive power (MSE for regression,
Hamming Loss for classification).

.. rst-class:: horizontal

    *

      .. image:: images/plot_model_complexity_influence_001.png
            :scale: 47

    *

      .. image:: images/plot_model_complexity_influence_002.png
            :scale: 47

    *

      .. image:: images/plot_model_complexity_influence_003.png
            :scale: 47
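The full benchmarking loop lives in the downloadable script at the bottom of
this page. Its pattern is simple: for each value of a complexity-controlling
parameter, fit a model, time its predictions on a held-out set, and record the
prediction error. A minimal sketch of that pattern, using a synthetic
regression dataset as a stand-in for Boston Housing (so its numbers will not
match the output below)::

    import time

    from sklearn.datasets import make_regression
    from sklearn.ensemble import GradientBoostingRegressor
    from sklearn.metrics import mean_squared_error
    from sklearn.model_selection import train_test_split

    # Synthetic stand-in for the Boston Housing data (13 features).
    X, y = make_regression(n_samples=500, n_features=13, noise=10.0,
                           random_state=0)
    X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

    # Vary the complexity-controlling parameter (here: the ensemble size).
    for n_estimators in (10, 50, 100, 200, 500):
        model = GradientBoostingRegressor(n_estimators=n_estimators,
                                          random_state=0)
        model.fit(X_train, y_train)

        start = time.time()  # time only the prediction step (latency)
        y_pred = model.predict(X_test)
        latency = time.time() - start

        mse = mean_squared_error(y_test, y_pred)
        print("Complexity: %d | MSE: %.4f | Pred. Time: %.6fs"
              % (n_estimators, mse, latency))

The classification benchmark follows the same loop with ``SGDClassifier``,
varying ``l1_ratio`` and scoring with Hamming loss instead of MSE.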
**Script output**::

  Benchmarking SGDClassifier(alpha=0.001, class_weight=None, epsilon=0.1,
         eta0=0.0, fit_intercept=True, l1_ratio=0.25, learning_rate='optimal',
         loss='modified_huber', n_iter=5, n_jobs=1, penalty='elasticnet',
         power_t=0.5, random_state=None, shuffle=False, verbose=0,
         warm_start=False)
  Complexity: 4538 | Hamming Loss (Misclassification Ratio): 0.2554 | Pred. Time: 0.027533s

  Benchmarking SGDClassifier(alpha=0.001, class_weight=None, epsilon=0.1,
         eta0=0.0, fit_intercept=True, l1_ratio=0.5, learning_rate='optimal',
         loss='modified_huber', n_iter=5, n_jobs=1, penalty='elasticnet',
         power_t=0.5, random_state=None, shuffle=False, verbose=0,
         warm_start=False)
  Complexity: 1666 | Hamming Loss (Misclassification Ratio): 0.2920 | Pred. Time: 0.019870s

  Benchmarking SGDClassifier(alpha=0.001, class_weight=None, epsilon=0.1,
         eta0=0.0, fit_intercept=True, l1_ratio=0.75, learning_rate='optimal',
         loss='modified_huber', n_iter=5, n_jobs=1, penalty='elasticnet',
         power_t=0.5, random_state=None, shuffle=False, verbose=0,
         warm_start=False)
  Complexity: 890 | Hamming Loss (Misclassification Ratio): 0.3175 | Pred. Time: 0.015984s

  Benchmarking SGDClassifier(alpha=0.001, class_weight=None, epsilon=0.1,
         eta0=0.0, fit_intercept=True, l1_ratio=0.9, learning_rate='optimal',
         loss='modified_huber', n_iter=5, n_jobs=1, penalty='elasticnet',
         power_t=0.5, random_state=None, shuffle=False, verbose=0,
         warm_start=False)
  Complexity: 674 | Hamming Loss (Misclassification Ratio): 0.3334 | Pred. Time: 0.014499s

  Benchmarking NuSVR(C=1000.0, cache_size=200, coef0=0.0, degree=3,
         gamma=3.0517578125e-05, kernel='rbf', max_iter=-1, nu=0.1,
         probability=False, random_state=None, shrinking=True, tol=0.001,
         verbose=False)
  Complexity: 69 | MSE: 31.8133 | Pred. Time: 0.000393s

  Benchmarking NuSVR(C=1000.0, cache_size=200, coef0=0.0, degree=3,
         gamma=3.0517578125e-05, kernel='rbf', max_iter=-1, nu=0.25,
         probability=False, random_state=None, shrinking=True, tol=0.001,
         verbose=False)
  Complexity: 136 | MSE: 25.6140 | Pred. Time: 0.000726s

  Benchmarking NuSVR(C=1000.0, cache_size=200, coef0=0.0, degree=3,
         gamma=3.0517578125e-05, kernel='rbf', max_iter=-1, nu=0.5,
         probability=False, random_state=None, shrinking=True, tol=0.001,
         verbose=False)
  Complexity: 243 | MSE: 22.3315 | Pred. Time: 0.001269s

  Benchmarking NuSVR(C=1000.0, cache_size=200, coef0=0.0, degree=3,
         gamma=3.0517578125e-05, kernel='rbf', max_iter=-1, nu=0.75,
         probability=False, random_state=None, shrinking=True, tol=0.001,
         verbose=False)
  Complexity: 350 | MSE: 21.3679 | Pred. Time: 0.001808s

  Benchmarking NuSVR(C=1000.0, cache_size=200, coef0=0.0, degree=3,
         gamma=3.0517578125e-05, kernel='rbf', max_iter=-1, nu=0.9,
         probability=False, random_state=None, shrinking=True, tol=0.001,
         verbose=False)
  Complexity: 404 | MSE: 21.0915 | Pred. Time: 0.002309s

  Benchmarking GradientBoostingRegressor(alpha=0.9, init=None, learning_rate=0.1,
         loss='ls', max_depth=3, max_features=None, max_leaf_nodes=None,
         min_samples_leaf=1, min_samples_split=2, n_estimators=10,
         random_state=None, subsample=1.0, verbose=0, warm_start=False)
  Complexity: 10 | MSE: 27.9672 | Pred. Time: 0.000061s

  Benchmarking GradientBoostingRegressor(alpha=0.9, init=None, learning_rate=0.1,
         loss='ls', max_depth=3, max_features=None, max_leaf_nodes=None,
         min_samples_leaf=1, min_samples_split=2, n_estimators=50,
         random_state=None, subsample=1.0, verbose=0, warm_start=False)
  Complexity: 50 | MSE: 8.0288 | Pred. Time: 0.000145s

  Benchmarking GradientBoostingRegressor(alpha=0.9, init=None, learning_rate=0.1,
         loss='ls', max_depth=3, max_features=None, max_leaf_nodes=None,
         min_samples_leaf=1, min_samples_split=2, n_estimators=100,
         random_state=None, subsample=1.0, verbose=0, warm_start=False)
  Complexity: 100 | MSE: 6.7578 | Pred. Time: 0.000241s

  Benchmarking GradientBoostingRegressor(alpha=0.9, init=None, learning_rate=0.1,
         loss='ls', max_depth=3, max_features=None, max_leaf_nodes=None,
         min_samples_leaf=1, min_samples_split=2, n_estimators=200,
         random_state=None, subsample=1.0, verbose=0, warm_start=False)
  Complexity: 200 | MSE: 5.8592 | Pred. Time: 0.000423s

  Benchmarking GradientBoostingRegressor(alpha=0.9, init=None, learning_rate=0.1,
         loss='ls', max_depth=3, max_features=None, max_leaf_nodes=None,
         min_samples_leaf=1, min_samples_split=2, n_estimators=500,
         random_state=None, subsample=1.0, verbose=0, warm_start=False)
  Complexity: 500 | MSE: 6.0492 | Pred. Time: 0.001017s

**Python source code:** :download:`plot_model_complexity_influence.py <plot_model_complexity_influence.py>`

.. literalinclude:: plot_model_complexity_influence.py
    :lines: 16-

**Total running time of the example:** 20.01 seconds (0 minutes 20.01 seconds)
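Note that the ``Complexity`` column is measured differently for each model
family: in the output above it tracks the number of nonzero coefficients of
the elastic-net ``SGDClassifier`` (fewer as ``l1_ratio`` grows), the number of
support vectors of ``NuSVR`` (more as ``nu`` grows), and the number of trees
of the ``GradientBoostingRegressor`` (it equals ``n_estimators`` exactly). The
authoritative definitions are in the script above; a sketch of complexity
measures consistent with that output::

    import numpy as np

    def sgd_complexity(clf):
        # Elastic-net regularization zeroes out coefficients; model size is
        # the number of surviving nonzero weights.
        return np.count_nonzero(clf.coef_)

    def nusvr_complexity(clf):
        # nu lower-bounds the fraction of support vectors, so larger nu
        # keeps more of them around.
        return len(clf.support_vectors_)

    def gbr_complexity(clf):
        # A boosted ensemble adds one regression tree per stage.
        return clf.n_estimators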