
Model Interpretability

Understand what your AI models are doing.

Robert Anderson
December 18, 2025

Interpretability means being able to explain why a model made the decision it did.

Why Interpretability Matters

- **Trust**: users need to trust AI decisions before acting on them
- **Debugging**: explanations help you find and fix model errors
- **Compliance**: regulations (e.g., the EU's GDPR) require explanations for automated decisions
- **Improvement**: you have to understand a model before you can improve it

Feature Importance

See which features matter most:

```python
from sklearn.ensemble import RandomForestClassifier
import pandas as pd

model = RandomForestClassifier()
model.fit(X_train, y_train)

# Get feature importances
importances = model.feature_importances_
features = pd.DataFrame({
    'feature': X_train.columns,
    'importance': importances
}).sort_values('importance', ascending=False)

print(features)
#   feature   importance
#   age       0.45
#   income    0.30
#   location  0.15
#   ...
```
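A quick way to eyeball the ranking is to plot it; a minimal sketch using the `features` DataFrame built above:

```python
import matplotlib.pyplot as plt

# Horizontal bar chart of the top 10 features, most important on top
features.head(10).iloc[::-1].plot.barh(x='feature', y='importance', legend=False)
plt.show()
```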

SHAP Values

Explain individual predictions:

```python
import shap

# Create explainer
explainer = shap.TreeExplainer(model)
shap_values = explainer.shap_values(X_test)

# Explain a single prediction (index 1 selects the positive class)
shap.force_plot(
    explainer.expected_value[1],
    shap_values[1][0],
    X_test.iloc[0]
)

# Summary of all predictions
shap.summary_plot(shap_values[1], X_test)
```
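The indexing above follows the older SHAP API, where `shap_values` is a list with one array per class. Newer releases also expose a unified `Explainer` interface; a minimal sketch, assuming a recent SHAP version:

```python
import shap

# Unified API: calling the explainer returns an Explanation object
explainer = shap.Explainer(model, X_train)
sv = explainer(X_test)

# Waterfall plot for one prediction, beeswarm for the whole set
# (for a binary classifier you may need to select a class, e.g. sv[0, :, 1])
shap.plots.waterfall(sv[0])
shap.plots.beeswarm(sv)
```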

LIME

LIME (Local Interpretable Model-agnostic Explanations) fits a simple surrogate model around one prediction to explain it locally:

```python
from lime.lime_tabular import LimeTabularExplainer

# Create explainer
explainer = LimeTabularExplainer(
    X_train.values,
    feature_names=list(X_train.columns),
    class_names=['Not Approved', 'Approved']
)

# Explain one prediction
i = 0
exp = explainer.explain_instance(
    X_test.iloc[i].values,
    model.predict_proba
)

exp.show_in_notebook()

# Feature contributions
exp.as_list()
# [('age > 30', 0.35), ('income > 50k', 0.28), ...]
```

Partial Dependence Plots

Show how a feature affects predictions on average:

```python
from sklearn.inspection import PartialDependenceDisplay

# plot_partial_dependence was removed in scikit-learn 1.2;
# use PartialDependenceDisplay.from_estimator instead
PartialDependenceDisplay.from_estimator(
    model,
    X_train,
    features=['age', 'income'],
    grid_resolution=50
)

# Shows: how does the prediction change with age/income?
```
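The same helper can overlay individual conditional expectation (ICE) curves, which show the effect per sample instead of the average; a minimal sketch:

```python
# kind='both' draws the average PDP plus one ICE curve per sample
PartialDependenceDisplay.from_estimator(
    model,
    X_train,
    features=['age'],
    kind='both'
)
```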

Permutation Importance

```python
from sklearn.inspection import permutation_importance

result = permutation_importance(
    model, X_test, y_test,
    n_repeats=10,
    random_state=42
)

importance_df = pd.DataFrame({
    'feature': X_train.columns,
    'importance': result.importances_mean
}).sort_values('importance', ascending=False)

print(importance_df)
```

For Neural Networks

```python
import tensorflow as tf

# Gradient-based explanation (saliency)
inputs = tf.constant(X_test[:1].values, dtype=tf.float32)

with tf.GradientTape() as tape:
    tape.watch(inputs)
    predictions = model(inputs)

# Get gradients of the prediction with respect to the inputs
gradients = tape.gradient(predictions, inputs)

# Features with high-magnitude gradients are important
```
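Raw gradients can be noisy; integrated gradients average them along a path from a baseline to the input. A minimal sketch, reusing `model` and `inputs` from above and assuming an all-zeros baseline:

```python
# Baseline: an all-zeros input (a modeling choice, not a requirement)
baseline = tf.zeros_like(inputs)

# Average the gradients at 50 points along the straight line
# from the baseline to the actual input
grads = []
for alpha in tf.linspace(0.0, 1.0, 50):
    point = baseline + alpha * (inputs - baseline)
    with tf.GradientTape() as tape:
        tape.watch(point)
        preds = model(point)
    grads.append(tape.gradient(preds, point))

avg_grads = tf.reduce_mean(tf.stack(grads), axis=0)

# Scale by the input difference; attributions approximately sum
# to the prediction difference between input and baseline
integrated_gradients = (inputs - baseline) * avg_grads
```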

Model-Agnostic Methods

Work with any model:

**Anchors**: "If age > 30 AND income > 50k, then Approved"
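A full implementation searches for the smallest rule that holds with high precision (see the `anchor-exp` package); the core check is simple. A minimal hand-rolled sketch, assuming the loan model and `X_train` from earlier:

```python
# Precision of the candidate anchor "age > 30 AND income > 50k":
# among training rows matching the rule, how often does the model
# predict 'Approved' (class 1)?
matches = X_train[(X_train['age'] > 30) & (X_train['income'] > 50_000)]
precision = (model.predict(matches) == 1).mean()
print(f'Anchor precision: {precision:.2f}')
```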

**Counterfactuals**: "Change income from $40k to $55k → Approved"
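Libraries such as DiCE automate this search; a minimal hand-rolled sketch that raises one feature until the prediction flips, assuming the loan model and test set from earlier:

```python
# Naive counterfactual search on one denied applicant:
# raise income in $1k steps until the prediction flips to 'Approved'
customer = X_test.iloc[5].copy()  # a single row as a Series
for _ in range(100):
    customer['income'] += 1_000
    if model.predict(customer.to_frame().T)[0] == 1:
        print(f"Approved at income = {customer['income']:,.0f}")
        break
```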

Real Example: Loan Approval

```python
# Train model
model.fit(X_train, y_train)

# A customer who was denied
customer = X_test.iloc[[5]]              # keep as a one-row DataFrame
prediction = model.predict(customer)[0]  # 0 = Denied

# Explain why
explainer = shap.TreeExplainer(model)
shap_values = explainer.shap_values(customer)

# Output (illustrative):
# Age (-0.2): too young
# Income (-0.3): below threshold
# Credit score (+0.1): good, but not enough
```
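To get from raw SHAP values to a readable summary like the one above, rank features by the magnitude of their contribution; a minimal sketch, assuming the older list-per-class SHAP output used here:

```python
# Pair each feature with its class-1 SHAP value, largest magnitude first
contributions = sorted(
    zip(customer.columns, shap_values[1][0]),
    key=lambda pair: abs(pair[1]),
    reverse=True,
)

for feature, value in contributions[:3]:
    direction = 'pushed toward approval' if value > 0 else 'pushed toward denial'
    print(f'{feature} ({value:+.2f}): {direction}')
```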

Remember

- Always explain important decisions
- Use SHAP for tree models
- Use LIME for any model
- Combine multiple methods

#AI #Intermediate #Interpretability