
Model Interpretability

Understand what your AI models are doing.

Robert Anderson
December 18, 2025

Interpretability is about explaining why a model made the decision it did, not just what it predicted.

Why Interpretability Matters

  • Trust: Users need to trust AI decisions
  • Debugging: Find and fix model errors
  • Compliance: Regulations require explanations
  • Improvement: Understand a model's behavior to improve it

Feature Importance

See which features matter most:

from sklearn.ensemble import RandomForestClassifier
import pandas as pd

model = RandomForestClassifier()
model.fit(X_train, y_train)

# Get feature importance
importances = model.feature_importances_
features = pd.DataFrame({
    'feature': X_train.columns,
    'importance': importances
}).sort_values('importance', ascending=False)

print(features)
# age          0.45
# income       0.30
# location     0.15
# ...
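
A quick horizontal bar chart makes the ranking easier to scan. A minimal sketch using matplotlib on the features DataFrame built above:

import matplotlib.pyplot as plt

# Plot the ranked importances computed above
features.plot.barh(x='feature', y='importance', legend=False)
plt.gca().invert_yaxis()   # most important feature at the top
plt.xlabel('Impurity-based importance')
plt.tight_layout()
plt.show()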

SHAP Values

Explain individual predictions:

import shap

# Create explainer (fast, exact SHAP values for tree models)
explainer = shap.TreeExplainer(model)
# For a binary classifier this returns a list with one array per class
# (older SHAP API); index 1 below refers to the positive class
shap_values = explainer.shap_values(X_test)

# Explain a single prediction (in a notebook, call shap.initjs() first)
shap.force_plot(
    explainer.expected_value[1],
    shap_values[1][0],
    X_test.iloc[0]
)

# Summary of all predictions
shap.summary_plot(shap_values[1], X_test)

LIME

LIME explains a single prediction by fitting a simple, interpretable model around it:

from lime.lime_tabular import LimeTabularExplainer

# Create explainer
explainer = LimeTabularExplainer(
    X_train.values,
    feature_names=X_train.columns,
    class_names=['Not Approved', 'Approved']
)

# Explain one prediction
i = 0
exp = explainer.explain_instance(
    X_test.iloc[i].values,
    model.predict_proba
)

exp.show_in_notebook()

# Feature contributions
exp.as_list()
# [('age > 30', 0.35), ('income > 50k', 0.28), ...]

Partial Dependence Plots

Show how the model's average prediction changes as one feature varies:

from sklearn.inspection import PartialDependenceDisplay

# plot_partial_dependence was removed in newer scikit-learn; use the Display API
PartialDependenceDisplay.from_estimator(
    model,
    X_train,
    features=['age', 'income'],
    grid_resolution=50
)

# Shows: How does prediction change with age/income?
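
The same display class can also draw one curve per sample (ICE) instead of only the average, via the kind parameter. A small sketch on the 'age' feature used above:

# ICE curves: 'average' = PDP only, 'individual' = one line per sample, 'both' = overlay
PartialDependenceDisplay.from_estimator(
    model,
    X_train,
    features=['age'],
    kind='both'
)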

Permutation Importance

Shuffle each feature and measure how much the test score drops:

from sklearn.inspection import permutation_importance

result = permutation_importance(
    model, X_test, y_test,
    n_repeats=10,
    random_state=42
)

importance_df = pd.DataFrame({
    'feature': X_train.columns,
    'importance': result.importances_mean
}).sort_values('importance', ascending=False)

print(importance_df)
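
The result also includes the spread across repeats (importances_std), which helps judge whether an importance is meaningfully above zero:

# Mean ± standard deviation over the 10 shuffles
for feature, mean, std in zip(X_train.columns,
                              result.importances_mean,
                              result.importances_std):
    print(f"{feature}: {mean:.3f} +/- {std:.3f}")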

For Neural Networks

Use input gradients (this assumes a differentiable model such as a Keras network, not the random forest above):

import tensorflow as tf

# Gradient-based explanations: how sensitive is the output to each input?
inputs = tf.constant(X_test.values[:1], dtype=tf.float32)

with tf.GradientTape() as tape:
    tape.watch(inputs)            # track gradients w.r.t. the inputs
    predictions = model(inputs)

# Gradient of the prediction w.r.t. each input feature
gradients = tape.gradient(predictions, inputs)

# Features with large-magnitude gradients influence the prediction most

Model-Agnostic Methods

Work with any model:

Anchors: "If age > 30 AND income > 50k, then Approved" (a rule that is by itself sufficient for the prediction)

Counterfactuals: "Change income from $40k to $55k → Approved" (the smallest change that would flip the decision)
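
Libraries exist for both (e.g. alibi, DiCE), but a counterfactual can also be brute-forced for a single feature. A minimal sketch, assuming an 'income' column and a $1,000 step size (both illustrative, not from any library):

def income_counterfactual(model, customer, step=1000, max_income=200_000):
    """Raise 'income' until the model predicts class 1 (Approved), if ever."""
    candidate = customer.copy()
    while candidate['income'] <= max_income:
        if model.predict(candidate.to_frame().T)[0] == 1:
            return candidate['income']   # smallest approving income found
        candidate['income'] += step
    return None                          # no counterfactual within the range

# e.g. income_counterfactual(model, X_test.iloc[5])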

Real Example - Loan Approval

# Train model
model.fit(X_train, y_train)

# Customer denied
customer = X_test.iloc[[5]]               # keep as a one-row DataFrame
prediction = model.predict(customer)[0]   # 0 = Denied

# Explain why
explainer = shap.TreeExplainer(model)
shap_values = explainer.shap_values(customer)

# Output (illustrative):
# Age (-0.2): Too young
# Income (-0.3): Below threshold
# Credit score (+0.1): Good, but not enough
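
To produce readable reasons like those above, pair each feature with its SHAP contribution and sort by impact. A minimal sketch, assuming the older SHAP API where shap_values is a list with one array per class:

# Contributions toward class 1 ("Approved") for this customer
contribs = pd.Series(shap_values[1][0], index=customer.columns)

# Most negative values pushed the prediction toward "Denied"
for feature, value in contribs.sort_values().items():
    direction = "toward Denied" if value < 0 else "toward Approved"
    print(f"{feature}: {value:+.2f} (pushed {direction})")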

Remember

  • Always explain important decisions
  • SHAP for tree models
  • LIME for any model
  • Combine multiple methods
#AI #Intermediate #Interpretability