Quick Setup: SHAP and LIME in Five Minutes
Install both libraries and train a model you can actually explain:
That gives you a working credit scoring model. Now you need to answer the question every regulator, product manager, and end user asks: why did the model make that decision?
SHAP: Global and Local Explanations from Game Theory
SHAP assigns each feature an importance value for a specific prediction. It’s grounded in Shapley values from cooperative game theory — each feature is a “player” contributing to the prediction. The math is solid, and the API is straightforward.
TreeExplainer for XGBoost and LightGBM
TreeExplainer is the fast path. For tree ensembles it computes exact SHAP values in polynomial time, instead of the exponential enumeration of feature coalitions that a brute-force Shapley computation requires.
The summary plot shows which features push predictions up or down across the entire test set. Red dots on the right mean high feature values increase the prediction. This is far more informative than raw feature importance because you see the direction of each feature’s effect.
Waterfall and Force Plots for Single Predictions
When a loan application gets denied, you need to explain that specific decision:
The waterfall plot reads top-to-bottom: the base value (average model output) gets pushed up or down by each feature until you reach the final prediction. Force plots pack the same information into a horizontal bar.
KernelExplainer for Any Model
If you’re not using a tree model — maybe it’s a logistic regression, SVM, or some scikit-learn pipeline — KernelExplainer handles anything with a predict function. The tradeoff is speed: it’s model-agnostic but much slower.
Use KernelExplainer when you have no other option. For tree models, always prefer TreeExplainer. For deep learning, use DeepExplainer.
DeepExplainer for Neural Networks
DeepExplainer uses DeepLIFT under the hood. It’s faster than KernelExplainer for neural networks but slightly less exact. For most practical purposes the difference doesn’t matter.
LIME: Local Explanations for Individual Predictions
LIME takes a different approach. It perturbs the input, observes how predictions change, and fits a simple interpretable model (usually linear regression) around that single data point. The result is a local approximation that tells you which features mattered for this specific prediction.
LIME explanations are intuitive — you get a list of features with positive or negative weights. Positive means the feature pushed toward the predicted class.
When to Use SHAP vs LIME
Pick SHAP when you need global explanations across the full dataset, when you want mathematically consistent feature attributions, or when you’re working with tree models (TreeExplainer is fast). SHAP values have nice theoretical properties: they always sum to the difference between the prediction and the base value.
Pick LIME when you need quick local explanations, when your stakeholders want a simple “these five features mattered” answer, or when you need model-agnostic explanations and speed isn’t critical. LIME is easier to explain to non-technical audiences.
In practice, use both. SHAP for your dashboards and monitoring. LIME for ad-hoc explanations in customer-facing tools.
Integrating Explanations into Production APIs
Don’t just generate plots during development. Ship explanations alongside predictions:
This gives every API response a top_factors field explaining why the model made its decision. Cache the explainer — creating a new TreeExplainer per request is wasteful. For KernelExplainer, precompute the background dataset at startup and consider async processing since it’s slow.
Common Errors
shap.TreeExplainer raises XGBoostError: Invalid feature_names
This happens when you pass a numpy array but the model was trained with a DataFrame that had feature names. Fix it by passing a DataFrame with the same column names:
LIME gives inconsistent explanations across runs
LIME is stochastic. It samples perturbations randomly, so two runs can produce different explanations. Set random_state in the constructor and increase num_samples to reduce variance:
KernelExplainer is unbearably slow
KernelExplainer approximates Shapley values by evaluating your model on many perturbed coalitions for every row it explains, and that cost multiplies with the size of the background dataset. Use a small (or summarized) background dataset, cap the number of samples per row, and explain fewer rows at a time:
SHAP summary plot is blank or crashes with matplotlib backend errors
This usually happens in headless environments (CI, Docker, SSH). Set the matplotlib backend before importing shap:
Related Guides
- How to Build Adversarial Test Suites for ML Models
- How to Build Adversarial Robustness Testing for Vision Models
- How to Build Fairness-Aware ML Pipelines with Fairlearn
- How to Build Membership Inference Attack Detection for ML Models
- How to Build Automated Fairness Testing for LLM-Generated Content
- How to Build Hallucination Scoring and Grounding Verification for LLMs
- How to Build Automated Prompt Leakage Detection for LLM Apps
- How to Build Automated PII Redaction Testing for LLM Outputs
- How to Interpret LLM Decisions with Attention Visualization
- How to Build Output Grounding and Fact-Checking for LLM Apps