AI · 6 min read
Model Interpretability
Understand what your AI models are doing.
Robert Anderson
December 18, 2025
Interpretability means being able to explain why a model made a particular decision.
Why Interpretability Matters
- Trust: users need to trust a model's output before acting on it
- Debugging: explanations help you find and fix model errors
- Compliance: some regulations require explanations for automated decisions
- Improvement: understanding what the model relies on shows where to improve it
Feature Importance
See which features matter most:
from sklearn.ensemble import RandomForestClassifier
import pandas as pd

# Assumes X_train is a pandas DataFrame and y_train its labels
model = RandomForestClassifier()
model.fit(X_train, y_train)

# Impurity-based importance: one score per feature, summing to 1
importances = model.feature_importances_
features = pd.DataFrame({
    'feature': X_train.columns,
    'importance': importances
}).sort_values('importance', ascending=False)

print(features)
# age         0.45
# income      0.30
# location    0.15
# ...
SHAP Values
Explain individual predictions:
import shap

# TreeExplainer is optimized for tree ensembles like RandomForest
explainer = shap.TreeExplainer(model)

# For a binary classifier, older SHAP versions return a list of
# arrays (one per class); index [1] is the positive class
shap_values = explainer.shap_values(X_test)

# Explain a single prediction (first test row, positive class)
shap.force_plot(
    explainer.expected_value[1],
    shap_values[1][0],
    X_test.iloc[0]
)

# Summary of all predictions
shap.summary_plot(shap_values[1], X_test)
LIME
Build a local surrogate model to explain a single prediction:
from lime.lime_tabular import LimeTabularExplainer

# The explainer learns the training data distribution for perturbations
explainer = LimeTabularExplainer(
    X_train.values,
    feature_names=list(X_train.columns),
    class_names=['Not Approved', 'Approved'],
    mode='classification'
)

# Explain one prediction by fitting a local surrogate model around it
i = 0
exp = explainer.explain_instance(
    X_test.iloc[i].values,
    model.predict_proba
)
exp.show_in_notebook()

# Feature contributions to this prediction
exp.as_list()
# [('age > 30', 0.35), ('income > 50k', 0.28), ...]
Partial Dependence Plots
Show how a single feature affects the average prediction:
from sklearn.inspection import PartialDependenceDisplay

# plot_partial_dependence was removed in newer scikit-learn;
# PartialDependenceDisplay.from_estimator is the current API
PartialDependenceDisplay.from_estimator(
    model,
    X_train,
    features=['age', 'income'],
    grid_resolution=50
)
# Shows: how does the average prediction change with age/income?
Permutation Importance
Measure how much shuffling each feature hurts test performance:
from sklearn.inspection import permutation_importance

# Shuffle each feature in turn and record the drop in the test score
result = permutation_importance(
    model, X_test, y_test,
    n_repeats=10,
    random_state=42
)

importance_df = pd.DataFrame({
    'feature': X_train.columns,
    'importance': result.importances_mean
}).sort_values('importance', ascending=False)
print(importance_df)
For Neural Networks
import tensorflow as tf

# Assumes `model` here is a Keras/TensorFlow network,
# not the scikit-learn model used above
inputs = tf.convert_to_tensor(X_test.values[:1], dtype=tf.float32)

with tf.GradientTape() as tape:
    tape.watch(inputs)
    predictions = model(inputs)

# Gradient of the prediction with respect to each input feature
gradients = tape.gradient(predictions, inputs)
# Features with large absolute gradients have the most local influence
Model-Agnostic Methods
Work with any model:
- Anchors: rule-based explanations, e.g. "If age > 30 AND income > 50k, then Approved"
- Counterfactuals: the smallest change that flips the outcome, e.g. "Change income from $40k to $55k → Approved" (see the sketch below)
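As a rough illustration of the counterfactual idea, the sketch below simply re-runs the model while nudging one feature until the prediction flips. It assumes the scikit-learn model and X_test DataFrame from earlier, that class 1 means "Approved", and that a column named 'income' exists; these names are illustrative. Dedicated libraries such as alibi or DiCE perform this search properly across all features.
def find_income_counterfactual(model, row, step=1000, max_income=200_000):
    # Naive counterfactual search: raise 'income' until the model approves.
    # `row` is a one-row DataFrame; real counterfactual methods search all
    # features and keep the change minimal and realistic.
    candidate = row.copy()
    while candidate['income'].iloc[0] < max_income:
        if model.predict(candidate)[0] == 1:  # 1 = Approved
            return candidate
        candidate['income'] += step
    return None  # no approval found within the search range

# Usage: find the income change that would flip a denied customer
denied = X_test.iloc[[5]]            # double brackets keep a DataFrame
flipped = find_income_counterfactual(model, denied)
if flipped is not None:
    print("Approved at income:", flipped['income'].iloc[0])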
Real Example - Loan Approval
# Train the model (same RandomForest as above)
model.fit(X_train, y_train)

# A customer who was denied
customer = X_test.iloc[[5]]              # double brackets keep a one-row DataFrame
prediction = model.predict(customer)[0]  # 0 = Denied

# Explain why
explainer = shap.TreeExplainer(model)
shap_values = explainer.shap_values(customer)

# Illustrative output:
# Age (-0.2): too young
# Income (-0.3): below threshold
# Credit score (+0.1): good, but not enough
Remember
- Always explain high-stakes decisions
- SHAP's TreeExplainer is fast and exact for tree models
- LIME works with any model via local surrogates
- Combine multiple methods for a fuller picture
#AI #Intermediate #Interpretability