ML · 8 min read

Ensemble Stacking: Combining Multiple Models

Learn how stacking combines predictions from multiple models using a meta-learner.

Sarah Chen
December 19, 2025


Random Forest averages trees. Gradient Boosting chains them. Stacking goes further: train different model types, then train another model to combine their predictions.

Stacking Architecture

```
Level 0 (Base Models):
┌─────────────┬─────────────┬─────────────┐
│  Logistic   │   Random    │     SVM     │
│ Regression  │   Forest    │             │
└──────┬──────┴──────┬──────┴──────┬──────┘
       │             │             │
       v             v             v
    pred_1        pred_2        pred_3
       │             │             │
       └─────────────┼─────────────┘
                     │
                     v
Level 1 (Meta-Learner):
              ┌─────────────┐
              │   XGBoost   │
              └──────┬──────┘
                     │
                     v
              Final Prediction
```

Why Stacking Works

Different models capture different patterns:

- Linear models: linear relationships
- Trees: non-linear interactions
- SVMs: complex boundaries

The meta-learner learns when to trust each model.
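
A quick way to see this is to check how often two different model families disagree on the same held-out points; wherever they disagree, a meta-learner has something to arbitrate. The sketch below uses make_moons purely for illustration (the dataset and settings are assumptions, not from the original):

```python
from sklearn.datasets import make_moons
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

# Illustrative toy dataset (an assumption for this sketch)
X, y = make_moons(n_samples=2000, noise=0.3, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

lr_pred = LogisticRegression().fit(X_tr, y_tr).predict(X_te)
rf_pred = RandomForestClassifier(random_state=0).fit(X_tr, y_tr).predict(X_te)

# Fraction of test points where the two models disagree:
# nonzero disagreement is the headroom a meta-learner can exploit
print(f"Disagreement rate: {(lr_pred != rf_pred).mean():.1%}")
```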

Implementation with Scikit-Learn

```python
from sklearn.ensemble import StackingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.ensemble import RandomForestClassifier, GradientBoostingClassifier
from sklearn.svm import SVC
from sklearn.neighbors import KNeighborsClassifier

# Base models
base_models = [
    ('lr', LogisticRegression(max_iter=1000)),
    ('rf', RandomForestClassifier(n_estimators=100)),
    ('svm', SVC(probability=True)),
    ('knn', KNeighborsClassifier())
]

# Meta-learner
meta_model = GradientBoostingClassifier(n_estimators=50)

# Stack them
stacking_clf = StackingClassifier(
    estimators=base_models,
    final_estimator=meta_model,
    cv=5,                          # Cross-validation for base model predictions
    stack_method='predict_proba'   # Use probabilities as features
)

stacking_clf.fit(X_train, y_train)
print(f"Accuracy: {stacking_clf.score(X_test, y_test):.3f}")
```
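
The block above assumes X_train, X_test, y_train, y_test already exist. Here is a minimal end-to-end sketch on synthetic data (the make_classification setup and sizes are illustrative assumptions) that also scores each base model on its own, so you can see what the stack adds:

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split

# Illustrative synthetic dataset (an assumption, not part of the original post)
X, y = make_classification(n_samples=5000, n_features=20,
                           n_informative=10, random_state=42)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2,
                                                    random_state=42)

# Score each base model alone for comparison
for name, model in base_models:
    model.fit(X_train, y_train)
    print(f"{name}: {model.score(X_test, y_test):.3f}")

stacking_clf.fit(X_train, y_train)
print(f"stack: {stacking_clf.score(X_test, y_test):.3f}")
```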

Manual Implementation (More Control)

```python
from sklearn.model_selection import cross_val_predict
import numpy as np

class SimpleStacker:
    def __init__(self, base_models, meta_model):
        self.base_models = base_models
        self.meta_model = meta_model

    def fit(self, X, y):
        # Get cross-validated predictions from base models
        base_predictions = []
        for name, model in self.base_models:
            preds = cross_val_predict(model, X, y, cv=5, method='predict_proba')
            base_predictions.append(preds[:, 1])  # Probability of positive class

        # Create meta-features
        meta_features = np.column_stack(base_predictions)

        # Train base models on full data
        for name, model in self.base_models:
            model.fit(X, y)

        # Train meta-model
        self.meta_model.fit(meta_features, y)
        return self

    def predict(self, X):
        # Get predictions from base models
        base_predictions = []
        for name, model in self.base_models:
            preds = model.predict_proba(X)[:, 1]
            base_predictions.append(preds)

        meta_features = np.column_stack(base_predictions)
        return self.meta_model.predict(meta_features)
```
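
Using the manual stacker looks like any scikit-learn-style estimator. The short sketch below reuses the base_models and meta_model defined earlier and assumes a train/test split already exists:

```python
# Fit on training data, predict on held-out data (X_train, y_train, X_test assumed)
stacker = SimpleStacker(base_models, meta_model)
stacker.fit(X_train, y_train)
y_pred = stacker.predict(X_test)
```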

Stacking for Regression

```python
from sklearn.ensemble import StackingRegressor
from sklearn.linear_model import Ridge, Lasso
from sklearn.ensemble import RandomForestRegressor, GradientBoostingRegressor

base_models = [
    ('ridge', Ridge()),
    ('lasso', Lasso()),
    ('rf', RandomForestRegressor(n_estimators=100))
]

stacking_reg = StackingRegressor(
    estimators=base_models,
    final_estimator=GradientBoostingRegressor(n_estimators=50),
    cv=5
)

stacking_reg.fit(X_train, y_train)
```
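
As with classification, it is worth checking the stack against the strongest single base model. A small sketch, reusing the imports above and assuming the same X_train/X_test split, compares held-out R² scores:

```python
# R^2 on held-out data: best single model vs. the stack (data names assumed)
ridge = Ridge().fit(X_train, y_train)
print(f"Ridge alone: {ridge.score(X_test, y_test):.3f}")
print(f"Stacked:     {stacking_reg.score(X_test, y_test):.3f}")
```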

Tips for Better Stacking

### 1. Use Diverse Base Models

```python
from sklearn.neural_network import MLPClassifier  # needed for the 'nn' entry below

# Good: Different types
base_models = [
    ('linear', LogisticRegression()),    # Linear
    ('tree', RandomForestClassifier()),  # Tree-based
    ('svm', SVC(probability=True)),      # Kernel-based
    ('nn', MLPClassifier())              # Neural network
]

# Less good: All trees
base_models = [
    ('rf1', RandomForestClassifier(n_estimators=100)),
    ('rf2', RandomForestClassifier(n_estimators=200)),
    ('gb', GradientBoostingClassifier())
]
```
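
One quick diversity check is to measure how correlated the base models' out-of-fold predictions are: near-perfectly correlated predictions give the meta-learner little to combine. A rough sketch, assuming X and y are your training arrays and every model in base_models supports predict_proba:

```python
import numpy as np
from sklearn.model_selection import cross_val_predict

# Out-of-fold positive-class probabilities for each base model (X, y assumed)
oof = [cross_val_predict(model, X, y, cv=5, method='predict_proba')[:, 1]
       for _, model in base_models]
print(np.corrcoef(np.vstack(oof)))  # Entries near 1.0 flag redundant models
```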

### 2. Include Original Features (Optional)

```python
stacking_clf = StackingClassifier(
    estimators=base_models,
    final_estimator=meta_model,
    passthrough=True  # Include original features for meta-learner
)
```
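
With passthrough=True the meta-learner is trained on the base models' predictions concatenated with the original feature matrix, which can recover signal the base models miss but also gives the meta-learner more room to overfit, so it pairs best with the simple meta-learners discussed next.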

### 3. Use Simple Meta-Learner

Complex meta-learners can overfit to the base predictions:

```python
# Often works well
meta_model = LogisticRegression()

# Or simple tree model
meta_model = GradientBoostingClassifier(n_estimators=50, max_depth=3)
```
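
A side benefit of a linear meta-learner is that you can read off how much it trusts each base model. The sketch below assumes stacking_clf was fitted with final_estimator=LogisticRegression(), a binary target, stack_method='predict_proba', and the default passthrough=False, so there is one coefficient per base model:

```python
# Learned weight on each base model's probability column
# (binary task, LogisticRegression meta-learner, passthrough=False assumed)
for (name, _), w in zip(base_models, stacking_clf.final_estimator_.coef_[0]):
    print(f"{name}: {w:+.2f}")
```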

When to Use Stacking

**Good for:**

- Kaggle competitions (squeezing every bit of accuracy)
- When different models excel at different subsets
- Final boost after individual models are tuned

**Not ideal for:**

- When interpretability is needed
- Real-time predictions (slow)
- When you need a quick solution

Key Takeaway

Stacking combines diverse models through a meta-learner that learns how to weight each model's predictions. Use diverse base models (different types, not just different hyperparameters), cross-validate to prevent data leakage, and keep the meta-learner simple. Stacking often gives a 1-3% improvement over the best single model: small, but it can be decisive in competitions!

#Machine Learning #Ensemble Learning #Stacking #Intermediate