LIME & SHAP: Interpreting the Black Box
When using complex models like XGBoost, Random Forests, or Neural Networks, we often need to know: "Why did the model deny this specific loan?" or "What features are driving the predictions for all users?" LIME and SHAP are the industry standards for answering these questions.
1. LIME (Local Interpretable Model-Agnostic Explanations)
LIME works on the principle that while a model may be incredibly complex globally, it can be approximated by a simple, linear model locally around a specific data point.
How LIME Works:
- Select a point: Choose the specific instance you want to explain.
- Perturb the data: Create new "fake" samples by slightly varying the features of that point.
- Get predictions: Run these fake samples through the "black box" model.
- Weight the samples: Give more weight to samples that are closer to the original point.
- Train a surrogate: Fit a simple, interpretable model (typically a weighted linear regression) on this perturbed dataset.
- Explain: The coefficients of that surrogate model act as the explanation. A from-scratch sketch of this loop follows below.
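The loop above can be hand-rolled in a few lines to build intuition. The sketch below is not the `lime` library itself; the black-box model `black_box`, the instance `x0`, the training matrix used for scaling, and the kernel width are all illustrative assumptions.

```python
# Hand-rolled LIME-style local surrogate, for intuition only.
# Assumes: `black_box` is a fitted binary classifier with predict_proba,
# `x0` is the row to explain, `X_train` is a numpy array of training rows.
import numpy as np
from sklearn.linear_model import Ridge

rng = np.random.default_rng(0)

def lime_sketch(black_box, x0, X_train, n_samples=5000, kernel_width=0.75):
    # Perturb: sample fake points around x0 at the training data's feature scale
    scale = X_train.std(axis=0) + 1e-12   # tiny epsilon guards constant features
    Z = x0 + rng.normal(0.0, scale, size=(n_samples, len(x0)))
    # Get predictions: run the fake samples through the black box
    y = black_box.predict_proba(Z)[:, 1]
    # Weight the samples by proximity to x0 (exponential kernel on scaled distance)
    dist = np.linalg.norm((Z - x0) / scale, axis=1)
    weights = np.exp(-(dist ** 2) / kernel_width ** 2)
    # Train a surrogate: a simple weighted linear model
    surrogate = Ridge(alpha=1.0)
    surrogate.fit(Z, y, sample_weight=weights)
    # Explain: the coefficients are the local explanation
    return surrogate.coef_

# Example usage: coefs = lime_sketch(black_box, X_test_array[0], X_train_array)
```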
2. SHAP (SHapley Additive exPlanations)
SHAP is based on Shapley Values from cooperative game theory. It treats every feature of a model as a "player" in a game where the "payout" is the model's prediction.
The Core Concept:
SHAP calculates the contribution of each feature by comparing what the model predicts with the feature versus without it, across all possible combinations of features.
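To make "all possible combinations" concrete, here is a brute-force Shapley computation for a hand-made three-feature model. The function `f`, the instance `x`, and the all-zeros baseline (standing in for an "absent" feature) are illustrative assumptions; the `shap` library uses much faster algorithms than this enumeration.

```python
# Exact Shapley values by enumerating every coalition of features.
# Only feasible for a handful of features; shown purely to illustrate the definition.
from itertools import combinations
from math import factorial

import numpy as np

def exact_shapley(f, x, baseline):
    m = len(x)
    phi = np.zeros(m)
    for i in range(m):
        others = [j for j in range(m) if j != i]
        for size in range(m):
            for coalition in combinations(others, size):
                # Shapley weight for a coalition S: |S|! (M - |S| - 1)! / M!
                weight = factorial(size) * factorial(m - size - 1) / factorial(m)
                # Features outside the coalition take baseline values ("absent")
                with_i, without_i = baseline.copy(), baseline.copy()
                for j in coalition:
                    with_i[j] = without_i[j] = x[j]
                with_i[i] = x[i]
                # Marginal contribution of feature i to this coalition
                phi[i] += weight * (f(with_i) - f(without_i))
    return phi

# Toy three-feature model and a toy instance
f = lambda v: 3 * v[0] + 2 * v[1] * v[2]
x, baseline = np.array([1.0, 2.0, 3.0]), np.zeros(3)
phi = exact_shapley(f, x, baseline)
print(phi)                      # per-feature contributions: [3. 6. 6.]
print(f(baseline) + phi.sum())  # additivity: equals f(x) = 15.0
```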
The result is "Additive":
- Base Value: The average prediction of the model across the training set.
- SHAP Value: The amount a specific feature pushed the prediction higher or lower than the average.
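In symbols, for a single instance x with M features:

f(x) = base value + φ₁ + φ₂ + ... + φ_M

where φᵢ is the SHAP value of feature i and the base value is the average model output E[f(X)]. Because the values sum exactly to the gap between this prediction and the average, a row's explanation can be drawn as a sequence of pushes above or below the base value, which is precisely what force plots and waterfall plots show.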
3. LIME vs. SHAP: The Comparison
| Feature | LIME | SHAP |
|---|---|---|
| Foundation | Local Linear Surrogates | Game Theory (Shapley Values) |
| Consistency | Can be unstable (results vary with the random perturbations) | Mathematically consistent and fair (satisfies the Shapley axioms) |
| Speed | Fast per instance | Depends on the explainer: fast for trees (TreeExplainer), expensive for the model-agnostic KernelExplainer |
| Scope | Strictly Local | Both Local and Global |
4. Visualization Logic
The following diagram illustrates the flow of SHAP from the model output down to the feature-level contribution.
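That same flow, from the model's output through the base value down to each feature's contribution, can also be reproduced directly with SHAP's waterfall plot. A minimal sketch, assuming a fitted model `my_model` and a pandas DataFrame `X_test` (names are placeholders), using the newer `shap.Explainer` API:

```python
# Waterfall plot: starts at the base value and stacks one feature's
# contribution at a time until it reaches the model's output for a single row.
import shap

explainer = shap.Explainer(my_model, X_test)  # auto-selects a suitable algorithm
explanation = explainer(X_test)               # a shap.Explanation object
shap.plots.waterfall(explanation[0])          # decomposition for the first row
```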
5. Global Interpretation with SHAP
One of SHAP's greatest strengths is the Summary Plot. By aggregating the SHAP values of all points in a dataset, we can see:
- Which features are the most important.
- How the value of a feature (high vs. low) impacts the output.
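A minimal call, reusing the `shap_values` and `X_test` from the implementation sketch in section 6 below:

```python
# Summary (beeswarm) plot: one dot per row and feature; horizontal position is
# the SHAP value, colour encodes the feature's own value (high vs. low).
import shap

shap.summary_plot(shap_values, X_test)
```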
6. Implementation Sketch (Python)
Both LIME and SHAP ship as mature Python libraries (`shap` and `lime`). The sketch below assumes a fitted tree-based model `my_model` and pandas DataFrames `X_train` and `X_test`.
import shap
import lime.lime_tabular

# --- SHAP Example ---
explainer = shap.TreeExplainer(my_model)     # fast, exact explainer for tree-based models
shap_values = explainer.shap_values(X_test)  # note: a list with one array per class for multi-class models

# Visualize the first prediction's explanation
shap.initjs()
shap.force_plot(explainer.expected_value, shap_values[0, :], X_test.iloc[0, :])

# --- LIME Example ---
explainer_lime = lime.lime_tabular.LimeTabularExplainer(
    X_train.values,
    feature_names=list(X_train.columns),
    class_names=['Negative', 'Positive'],
    mode='classification',
)
exp = explainer_lime.explain_instance(X_test.values[0], my_model.predict_proba)
exp.show_in_notebook()
References
- SHAP GitHub: Official Repository and Documentation
- LIME Paper: Ribeiro, Singh & Guestrin (KDD 2016), "Why Should I Trust You?": Explaining the Predictions of Any Classifier
- Distill.pub: Interpretable Machine Learning Guide
LIME and SHAP are most commonly applied to tabular and text data. But what if we are using CNNs for images? How do we know which pixels the model is looking at?