Model Interpretability: Understanding Predictions
Learn how to explain your ML model predictions using SHAP, LIME, and feature importance.
A model that says "loan denied" isn't enough. You need to know WHY. Interpretability techniques explain model decisions.
Why Interpretability Matters
- Trust - Can you trust the model?
- Debugging - Why is it making mistakes?
- Compliance - Regulations require explanations
- Fairness - Is it using problematic features?
Level 1: Feature Importance
Built into tree-based models:
from sklearn.ensemble import RandomForestClassifier
import pandas as pd
# Train model
model = RandomForestClassifier(n_estimators=100)
model.fit(X_train, y_train)
# Get importance (assumes X_train is a pandas DataFrame)
feature_names = list(X_train.columns)
importance = pd.DataFrame({
    'feature': feature_names,
    'importance': model.feature_importances_
}).sort_values('importance', ascending=False)
print(importance.head(10))
# Plot the top 15 features
importance.head(15).plot(x='feature', y='importance', kind='barh')
Limitation: this is global importance only; it doesn't explain why the model made any individual prediction.
Level 2: Permutation Importance
Works for any model:
from sklearn.inspection import permutation_importance
# Calculate
result = permutation_importance(model, X_test, y_test, n_repeats=10)
# Display
importance = pd.DataFrame({
    'feature': feature_names,
    'importance': result.importances_mean
}).sort_values('importance', ascending=False)
Logic: Shuffle each feature and see how much performance drops.
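This logic is simple enough to code by hand. Here is a minimal sketch, assuming (as in the snippets above) a fitted model, a pandas DataFrame X_test whose columns match feature_names, and accuracy as the metric:
import numpy as np
from sklearn.metrics import accuracy_score
# Baseline score on the untouched test set
baseline = accuracy_score(y_test, model.predict(X_test))
# Shuffle one column at a time and record how much accuracy drops
drops = {}
for col in feature_names:
    X_shuffled = X_test.copy()
    X_shuffled[col] = np.random.permutation(X_shuffled[col].values)
    drops[col] = baseline - accuracy_score(y_test, model.predict(X_shuffled))
# Features whose shuffling hurts the most matter the most
print(sorted(drops.items(), key=lambda kv: kv[1], reverse=True)[:10])
scikit-learn's permutation_importance does the same thing, but repeats the shuffle n_repeats times and averages the drops.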
Level 3: SHAP (Best for Deep Explanation)
SHAP gives consistent, theoretically grounded explanations.
import shap
# Create explainer
explainer = shap.TreeExplainer(model)
# For a binary classifier, shap_values is a list with one array per class (older SHAP API)
shap_values = explainer.shap_values(X_test)
# Global importance plot (positive class)
shap.summary_plot(shap_values[1], X_test, feature_names=feature_names)
# Force plot for a single prediction (positive class, first test row)
shap.force_plot(
    explainer.expected_value[1],
    shap_values[1][0],
    X_test.iloc[0],
    feature_names=feature_names
)
Understanding SHAP Values
Base value: 0.3 (the average model prediction)
Feature contributions:
- Income: +0.15 (pushes toward positive)
- Age: -0.05 (pushes toward negative)
- Debt: +0.10 (pushes toward positive)
Final prediction: 0.3 + 0.15 - 0.05 + 0.10 = 0.5
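You can check this additivity directly. The sketch below assumes the TreeExplainer output from above, where shap_values is a list with one array per class (index 1 = positive class):
# SHAP values for the positive class, first test row
row_contribs = shap_values[1][0]
# Base value plus the sum of contributions reconstructs the predicted probability
reconstructed = explainer.expected_value[1] + row_contribs.sum()
predicted = model.predict_proba(X_test.iloc[[0]])[0, 1]
print(reconstructed, predicted)  # should agree up to floating-point error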
SHAP Plots
# Summary plot: importance + direction (positive class)
shap.summary_plot(shap_values[1], X_test)
# Dependence plot: how one feature affects predictions
shap.dependence_plot('income', shap_values[1], X_test)
# Waterfall for a single prediction
shap.waterfall_plot(shap.Explanation(
    values=shap_values[1][0],
    base_values=explainer.expected_value[1],
    data=X_test.iloc[0].values,
    feature_names=feature_names
))
Level 4: LIME (Local Explanations)
Explains individual predictions by approximating locally with simple model:
from lime.lime_tabular import LimeTabularExplainer
# Create the LIME explainer (kept separate from the SHAP explainer above)
lime_explainer = LimeTabularExplainer(
    X_train.values,
    feature_names=feature_names,
    class_names=['No', 'Yes'],
    mode='classification'
)
# Explain a single prediction
exp = lime_explainer.explain_instance(
    X_test.iloc[0].values,
    model.predict_proba,
    num_features=10
)
# Show explanation in a notebook
exp.show_in_notebook()
# Or get as a list of (feature condition, weight) pairs
exp.as_list()
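Outside a notebook, exp.as_list() is the easiest way to log or display the explanation. Each entry is a (feature condition, weight) pair from LIME's local surrogate:
# Print each condition with its signed weight
for condition, weight in exp.as_list():
    print(f"{condition}: {weight:+.3f}")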
SHAP vs LIME
| Aspect | SHAP | LIME |
|---|---|---|
| Consistency | Theoretically grounded | Approximation |
| Speed | Can be slow | Faster per instance |
| Global view | Yes | No (local only) |
| Interpretability | Additive contributions | Local linear surrogate |
Practical Workflow
# 1. Train your model
model.fit(X_train, y_train)
# 2. Global understanding with feature importance
print(pd.DataFrame({
    'feature': feature_names,
    'importance': model.feature_importances_
}).sort_values('importance', ascending=False).head(10))
# 3. Deep dive with SHAP
explainer = shap.TreeExplainer(model)
shap_values = explainer.shap_values(X_test)
shap.summary_plot(shap_values[1], X_test)
# 4. Explain specific predictions
idx = 0  # prediction to explain
shap.waterfall_plot(shap.Explanation(
    values=shap_values[1][idx],
    base_values=explainer.expected_value[1],
    data=X_test.iloc[idx].values,
    feature_names=feature_names
))
Key Takeaway
Start with built-in feature importance for quick insights. Use SHAP for thorough understanding - it gives both global feature importance and individual prediction explanations. Remember: a model you can't explain is a model you shouldn't trust in production. Always be able to answer "why did the model make this prediction?"