AI Interpretability with LIME and SHAP: A Practical Guide (2026)
By Learnia Team
📚 This is Part 3 of the Responsible AI Engineering Series. After understanding alignment challenges and training techniques, this article covers how to inspect and explain what models have learned.
Table of Contents
- →Why Interpretability Matters
- →The Interpretability Landscape
- →LIME: Local Interpretable Explanations
- →SHAP: Game-Theoretic Feature Attribution
- →LIME vs SHAP: When to Use Which
- →Implementation Guide
- →Regulatory Requirements
- →Advanced Techniques
- →Common Pitfalls
- →FAQ
Why Interpretability Matters
Machine learning models increasingly make decisions that affect people's lives—loan approvals, medical diagnoses, hiring decisions, content recommendations. Yet many of these models are "black boxes" that provide predictions without explanations.
The Black Box Problem
Example: A user submits a loan application with:
- →Age: 35
- →Income: $75,000
- →Debt: $20,000
- →Credit History: 5 years
The model returns: LOAN DENIED
❓ Question: Why was the loan denied?
🤷 Answer: The model provides no explanation.
This opacity creates problems:
| Stakeholder | Problem |
|---|---|
| Users | Can't understand or contest decisions |
| Developers | Can't debug or improve models |
| Regulators | Can't verify fairness or compliance |
| Organizations | Face legal and reputational risk |
Interpretability vs Explainability
These terms are often used interchangeably, but there's a distinction:
Interpretability: Understanding how a model works internally
- →What features matter?
- →How do features interact?
- →What patterns has the model learned?
Explainability: Communicating model behavior to humans
- →Why did the model make this prediction?
- →What would change the prediction?
- →Is this prediction trustworthy?
LIME and SHAP are primarily explainability tools—they help communicate why predictions were made, even if we don't fully understand the model's internal mechanisms.
The Interpretability Landscape
Before diving into LIME and SHAP, let's understand where they fit in the broader interpretability toolkit:
Types of Interpretability Methods
By Scope:
- →Global: Explain overall model behavior (feature importance, decision boundaries)
- →Local: Explain individual predictions (LIME, SHAP per-prediction)
By Timing:
- →Intrinsic: Built into model architecture (decision trees, linear models, attention weights)
- →Post-hoc: Applied after training (LIME, SHAP, saliency maps)
By Model Dependence:
- →Model-specific: Only work with certain models (Tree SHAP, attention visualization)
- →Model-agnostic: Work with any model (LIME, Kernel SHAP)
Key Methods Overview
| Method | Type | Approach | Best For |
|---|---|---|---|
| LIME | Local, Agnostic | Local linear approximation | Quick individual explanations |
| SHAP | Local+Global, Agnostic* | Shapley values | Rigorous feature attribution |
| Attention | Local, Intrinsic | Visualize attention weights | Transformers, NLP |
| Saliency Maps | Local, Specific | Input gradients | Image models |
| Feature Importance | Global, Specific | Permutation/Gini | Tree models |
| Partial Dependence | Global, Agnostic | Marginal effects | Feature relationships |
*SHAP has both model-agnostic (Kernel SHAP) and model-specific (Tree SHAP, Deep SHAP) implementations.
LIME: Local Interpretable Explanations
LIME (Local Interpretable Model-agnostic Explanations) was introduced by Ribeiro, Singh, and Guestrin in 2016. The core idea is elegantly simple:
"Intuitively, an explanation is a local linear approximation of the model's behaviour." — "Why Should I Trust You?": Explaining the Predictions of Any Classifier
The LIME Intuition
Complex models may have intricate global behavior, but locally—in the neighborhood of any single prediction—they're often approximately linear.
Global Model: The decision boundary is complex and non-linear, with curves and irregular patterns separating different classes.
Local Approximation: When we zoom in on a single prediction point (★), the complex boundary looks almost like a straight line. LIME exploits this by fitting a simple linear model just in that local neighborhood.
How LIME Works
LIME ALGORITHM:
INPUT:
- f: Black-box model to explain
- x: Instance to explain
- N: Number of samples to generate
OUTPUT:
- Explanation: Feature weights for this prediction
PROCESS:
1. PERTURB: Generate N samples near x
FOR i = 1 to N:
x'[i] = perturb(x) # Randomly modify features
y'[i] = f(x'[i]) # Get model predictions
w[i] = proximity(x, x'[i]) # Weight by distance to x
2. FIT: Train interpretable model on perturbed samples
g = train_linear_model(x', y', weights=w)
# g is a simple model (linear regression, decision tree)
# that approximates f locally around x
3. EXPLAIN: Extract feature contributions from g
explanation = g.coefficients # For linear model
RETURN explanation
LIME Pseudo-code Implementation
PSEUDO-CODE: LIME for Tabular Data
def explain_instance(model, instance, num_samples=5000):
"""
Explain a single prediction using LIME
Args:
model: Black-box model with predict() method
instance: Data point to explain (numpy array)
num_samples: Number of perturbed samples
Returns:
Dictionary mapping features to importance scores
"""
# Step 1: Generate perturbed samples
perturbations = []
predictions = []
weights = []
FOR i in range(num_samples):
# Create perturbed version
perturbed = instance.copy()
# Randomly turn features "on" or "off"
mask = random_binary_mask(len(instance))
perturbed = apply_mask(perturbed, mask, training_data)
perturbations.append(mask) # Store binary representation
predictions.append(model.predict(perturbed))
# Weight by similarity to original instance
distance = hamming_distance(mask, ones_vector)
weight = exp(-distance / kernel_width)
weights.append(weight)
# Step 2: Fit weighted linear model
X = array(perturbations)
y = array(predictions)
w = array(weights)
# Weighted ridge regression
linear_model = Ridge(alpha=1.0)
linear_model.fit(X, y, sample_weight=w)
# Step 3: Extract explanation
feature_names = get_feature_names()
explanation = {}
FOR i, coef in enumerate(linear_model.coef_):
explanation[feature_names[i]] = coef
RETURN explanation
# Example usage
instance = [35, 75000, 20000, 5] # Age, Income, Debt, History
explanation = explain_instance(loan_model, instance)
# Output:
# {
# "Age": 0.02,
# "Income": 0.45, # Strong positive
# "Debt": -0.38, # Strong negative
# "History": 0.15
# }
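The pseudo-code above maps almost line-for-line onto a short runnable sketch. The following is a minimal illustration of the perturb-predict-fit loop, assuming a binary classifier with a scikit-learn-style `predict_proba` method; it is not the official `lime` package (that appears in the Implementation Guide below).
```python
import numpy as np
from sklearn.linear_model import Ridge

def lime_explain(model, x, X_train, num_samples=5000, kernel_width=0.75, seed=0):
    """Minimal LIME-style local explanation for one tabular instance.

    model:   any classifier exposing predict_proba (assumption)
    x:       1-D numpy array, the instance to explain
    X_train: training data, used to draw replacement values for "off" features
    """
    rng = np.random.default_rng(seed)
    n_features = len(x)

    # 1. PERTURB: binary masks decide which features keep their original value;
    #    "off" features are replaced by values drawn from the training data.
    masks = rng.integers(0, 2, size=(num_samples, n_features))
    replacements = X_train[rng.integers(0, len(X_train), size=num_samples)]
    perturbed = np.where(masks == 1, x, replacements)

    # 2. PREDICT and weight each sample by its similarity to x
    probs = model.predict_proba(perturbed)[:, 1]       # positive class (binary case)
    distances = 1.0 - masks.mean(axis=1)               # fraction of features switched off
    weights = np.exp(-(distances ** 2) / kernel_width ** 2)

    # 3. FIT a weighted linear surrogate on the binary masks
    surrogate = Ridge(alpha=1.0)
    surrogate.fit(masks, probs, sample_weight=weights)
    return dict(enumerate(surrogate.coef_))            # feature index -> local weight
```
The coefficients of the weighted Ridge surrogate play the role of the explanation dictionary in the pseudo-code above.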
LIME for Different Data Types
Tabular Data: Perturb by replacing feature values with samples from training distribution
Text: Perturb by removing words and observing prediction changes
ORIGINAL: "This movie was absolutely fantastic and wonderful!"
PREDICTION: Positive (0.95)
PERTURBATIONS:
"This movie was absolutely [MASK] and wonderful!" -> 0.82
"This movie was [MASK] fantastic and wonderful!" -> 0.91
"This [MASK] was absolutely fantastic and [MASK]!" -> 0.78
...
EXPLANATION:
"fantastic" -> +0.25 (most important positive word)
"wonderful" -> +0.18
"absolutely" -> +0.08
Images: Perturb by masking superpixels (coherent regions)
ORIGINAL IMAGE: Cat photo
PREDICTION: Cat (0.92)
PERTURBATIONS:
[Mask ears] -> 0.71 # Ears matter
[Mask eyes] -> 0.65 # Eyes matter a lot
[Mask background] -> 0.89 # Background doesn't matter much
EXPLANATION: Heatmap showing ears and eyes as most important
SHAP: Game-Theoretic Feature Attribution
SHAP (SHapley Additive exPlanations) was introduced by Lundberg and Lee in 2017. It grounds feature attribution in game theory, providing theoretically consistent explanations.
"SHAP assigns each feature an importance value for a particular prediction." — A Unified Approach to Interpreting Model Predictions
Shapley Values Explained
SHAP is based on Shapley values from cooperative game theory. The intuition:
THE GAME: Predicting the output is a "game"
THE PLAYERS: Features are "players"
THE PAYOUT: Prediction value is the "payout"
QUESTION: How do we fairly distribute credit among players?
SHAPLEY'S ANSWER:
Consider every possible coalition (subset) of players.
For each coalition, measure each player's marginal contribution.
Average over all possible orderings.
FORMAL:
φᵢ = Σ_{S ⊆ N\{i}} [ |S|! (|N|-|S|-1)! / |N|! ] × [f(S ∪ {i}) - f(S)]
WHERE:
- φᵢ: Shapley value for feature i
- S: A subset of features not including i
- N: All features
- f(S): Model output with only features in S
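The formula is easy to verify by brute force on a toy problem. The sketch below enumerates every subset explicitly, so it is only feasible for a handful of features; `f` is a stand-in value function that fills in "missing" features with a baseline, which is one common (but not the only) convention.
```python
from itertools import combinations
from math import factorial

def shapley_values(f, x, baseline):
    """Exact Shapley values by enumerating all subsets (exponential: toy use only)."""
    n = len(x)
    players = list(range(n))

    def value(subset):
        # Features in the subset take their real value, the rest the baseline.
        z = [x[j] if j in subset else baseline[j] for j in players]
        return f(z)

    phi = [0.0] * n
    for i in players:
        others = [j for j in players if j != i]
        for size in range(n):
            for S in combinations(others, size):
                weight = factorial(len(S)) * factorial(n - len(S) - 1) / factorial(n)
                phi[i] += weight * (value(set(S) | {i}) - value(set(S)))
    return phi

# Toy linear "model": 2*income_score - debt_score + 0.5*history_score
f = lambda z: 2 * z[0] - z[1] + 0.5 * z[2]
x, baseline = [1.0, 1.0, 1.0], [0.0, 0.0, 0.0]
phi = shapley_values(f, x, baseline)
print(phi)                                             # [2.0, -1.0, 0.5] for a linear model
print(abs(sum(phi) - (f(x) - f(baseline))) < 1e-9)     # Efficiency: contributions sum to the gap
```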
Why Shapley Values?
Shapley values uniquely satisfy four desirable properties:
| Property | Meaning |
|---|---|
| Efficiency | Feature contributions sum to the prediction minus baseline |
| Symmetry | Equal features get equal attribution |
| Dummy | Irrelevant features get zero attribution |
| Linearity | Combining models combines attributions linearly |
No other attribution method satisfies all four properties.
SHAP Pseudo-code
PSEUDO-CODE: Kernel SHAP (Model-Agnostic)
def shap_values(model, instance, background_data, num_samples=2000):
"""
Compute SHAP values for an instance
Args:
model: Black-box model
instance: Data point to explain
background_data: Reference dataset for baseline
num_samples: Number of coalition samples
Returns:
SHAP values for each feature
"""
num_features = len(instance)
# Expected value (baseline prediction)
baseline = mean([model.predict(x) for x in background_data])
# Sample coalitions (subsets of features)
coalitions = []
predictions = []
weights = []
FOR i in range(num_samples):
# Random coalition (binary mask)
coalition_size = random_int(0, num_features)
coalition = random_subset(num_features, coalition_size)
# Create instance with coalition features from instance,
# non-coalition features from background
masked_instance = instance.copy()
background_sample = random_choice(background_data)
FOR j in range(num_features):
IF j not in coalition:
masked_instance[j] = background_sample[j]
coalitions.append(binary_mask(coalition, num_features))
predictions.append(model.predict(masked_instance))
# Shapley kernel weight
k = len(coalition)
IF k == 0 OR k == num_features:
weight = 1e6 # Very high weight for empty/full coalitions
ELSE:
weight = (num_features - 1) / (binomial(num_features, k) * k * (num_features - k))
weights.append(weight)
# Solve weighted linear regression
X = array(coalitions)
y = array(predictions) - baseline
w = array(weights)
# Constraint: coefficients must sum to (prediction - baseline)
model_prediction = model.predict(instance)
shap_values = weighted_constrained_regression(X, y, w,
sum_constraint=model_prediction - baseline)
RETURN shap_values, baseline
# Example output
shap_vals, base = shap_values(loan_model, applicant, training_data)
# Interpretation:
# Base prediction: 0.60 (average approval probability)
#
# SHAP values:
# Income: +0.25 (income increases approval by 0.25)
# Debt: -0.15 (debt decreases approval by 0.15)
# Age: +0.03 (age slightly increases approval)
# History: +0.07 (history increases approval)
#
# Final prediction: 0.60 + 0.25 - 0.15 + 0.03 + 0.07 = 0.80
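With the `shap` package, the equivalent of the pseudo-code above is a few lines. `loan_model`, `X_background`, and `applicant` are placeholders; `shap.KernelExplainer` and its `shap_values` method are the library's model-agnostic API (exact argument names may vary slightly between shap versions).
```python
# pip install shap
import shap

# A small, representative background set keeps Kernel SHAP tractable.
background = shap.sample(X_background, 100)

explainer = shap.KernelExplainer(loan_model.predict_proba, background)
shap_vals = explainer.shap_values(applicant, nsamples=2000)
# Note: for multi-class predict_proba, older shap versions return one array per class.

print("Base value:", explainer.expected_value)
print("Per-feature SHAP values:", shap_vals)
# Efficiency: base value + sum of SHAP values ≈ the model's prediction for `applicant`.
```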
SHAP Visualization Types
1. Force Plot: Shows how features push prediction from base value
Starting from the base prediction (0.60), each feature either pushes the prediction up or down:
- →Income: +0.25 (pushes up)
- →History: +0.07 (pushes up)
- →Age: +0.03 (pushes up)
- →Debt: -0.15 (pushes down)
- →Final prediction: 0.80
2. Summary Plot: Global view of feature importance across all predictions
Shows the distribution of SHAP values for each feature across the dataset, revealing which features have the most impact overall.
3. Dependence Plot: How a feature's value affects its SHAP value
Shows the relationship between a feature's actual value (x-axis) and its SHAP value (y-axis), revealing non-linear relationships.
Tree SHAP: Fast Exact Computation
For tree-based models (Random Forest, XGBoost, LightGBM), Tree SHAP computes exact Shapley values in polynomial time:
| Algorithm | Complexity | Notes |
|---|---|---|
| Kernel SHAP | O(2^n) | Exponential in features |
| Tree SHAP | O(TLD²) | Polynomial - much faster |
Where: T = Number of trees, L = Maximum leaves, D = Maximum depth
Example: For a model with 100 trees, maximum depth 10, and 20 features:
- →Kernel SHAP: exact computation would enumerate 2^20 ≈ 1 million coalitions, each of which requires running the full model
- →Tree SHAP: cost depends only on the tree structure (T, L, D), not on the number of features, so it is typically orders of magnitude faster, as the sketch below illustrates
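The scaling difference is easy to see with rough numbers (illustrative node-visit counts only, assuming worst-case full trees; real costs depend on implementation details):
```python
n_features = 20
T, D = 100, 10          # number of trees, maximum depth
L = 2 ** D              # worst-case leaf count for a depth-10 binary tree

# Exact Kernel SHAP: every coalition needs a model evaluation (~T*D node visits each)
kernel_coalitions = 2 ** n_features
kernel_node_visits = kernel_coalitions * T * D       # ~1.0e9

# Tree SHAP: polynomial in the tree structure, independent of n_features
tree_shap_ops = T * L * D ** 2                       # ~1.0e7

print(f"{kernel_node_visits:.1e} vs {tree_shap_ops:.1e}")
# Adding one more feature doubles the first number and leaves the second unchanged.
```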
LIME vs SHAP: When to Use Which
Comparison Table
| Aspect | LIME | SHAP |
|---|---|---|
| Theoretical foundation | Intuitive, ad-hoc | Game theory (Shapley values) |
| Consistency | Can vary with random seed | Exact for Tree SHAP; reproducible with fixed seed and background |
| Additivity | Features don't sum to prediction | Features sum to prediction - baseline |
| Computation | Fast (single regression) | Slower (many evaluations) |
| Global explanations | Not built-in | Summary plots, interactions |
| Model-specific speedups | No | Yes (Tree SHAP, Deep SHAP) |
| Interpretability | Very intuitive | Requires understanding Shapley |
| Implementation | Simple | More complex |
Decision Framework
Use LIME when:
- →You need quick, intuitive explanations
- →Exact consistency isn't critical
- →You're explaining to non-technical stakeholders
- →You're working with text or images
- →You're prototyping or exploring
Use SHAP when:
- →You need theoretically grounded explanations
- →Consistency across explanations matters
- →You want global + local explanations
- →You're working with tree-based models (Tree SHAP is fast)
- →You need to satisfy regulatory requirements
- →You need features to sum to prediction
Use both when:
- →You want to validate explanations
- →Different stakeholders need different views
- →You're building a comprehensive explanation system
Practical Recommendations
| Scenario | Recommendation | Reason |
|---|---|---|
| Explaining loan decisions to applicants | LIME | Simple, intuitive explanations for non-technical users |
| Auditing model fairness for regulators | SHAP | Consistent, additive, theoretically grounded |
| Debugging XGBoost model predictions | Tree SHAP | Fast, exact, shows feature interactions |
| Explaining image classification to researchers | Both LIME + Gradient-based | Different methods highlight different patterns |
| Production system with latency constraints | LIME or precomputed SHAP | LIME is faster; SHAP can be cached |
Implementation Guide
Setting Up LIME
PSEUDO-CODE: LIME Setup and Usage
# Installation (conceptual)
# pip install lime
# For tabular data
class LIMETabularExplainer:
def __init__(self, training_data, feature_names, class_names):
"""
Initialize LIME explainer with training data context
"""
self.training_data = training_data
self.feature_names = feature_names
self.class_names = class_names
# Compute statistics for perturbation
self.means = compute_means(training_data)
self.stds = compute_stds(training_data)
self.feature_types = infer_types(training_data)
def explain_instance(self, instance, predict_fn, num_features=10):
"""
Generate explanation for a single instance
"""
# Generate perturbed samples
samples = self.generate_perturbations(instance, n=5000)
# Get predictions for samples
predictions = predict_fn(samples)
# Compute sample weights
weights = self.compute_weights(instance, samples)
# Fit local linear model
explanation = self.fit_local_model(samples, predictions, weights)
# Return top features
RETURN explanation.top_features(num_features)
# Usage example
explainer = LIMETabularExplainer(
training_data=X_train,
feature_names=['age', 'income', 'debt', 'history'],
class_names=['denied', 'approved']
)
explanation = explainer.explain_instance(
instance=applicant,
predict_fn=model.predict_proba,
num_features=4
)
# Display
print("Prediction: Approved (0.80)")
print("Explanation:")
FOR feature, weight in explanation:
direction = "↑" if weight > 0 else "↓"
print(f" {feature}: {direction} {abs(weight):.3f}")
# Output:
# Prediction: Approved (0.80)
# Explanation:
# income: ↑ 0.342
# debt: ↓ 0.256
# history: ↑ 0.124
# age: ↑ 0.045
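For reference, the same workflow with the actual `lime` package looks roughly like this; the class and method names are the package's, while `X_train`, `applicant`, and `model` are placeholders:
```python
# pip install lime
import numpy as np
from lime.lime_tabular import LimeTabularExplainer

explainer = LimeTabularExplainer(
    training_data=np.asarray(X_train),
    feature_names=["age", "income", "debt", "history"],
    class_names=["denied", "approved"],
    mode="classification",
)

exp = explainer.explain_instance(
    data_row=np.asarray(applicant),
    predict_fn=model.predict_proba,
    num_features=4,
)
print(exp.as_list())  # [("income > 60000", 0.34), ("debt > 15000", -0.26), ...] (values vary)
```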
Setting Up SHAP
PSEUDO-CODE: SHAP Setup and Usage
# Installation (conceptual)
# pip install shap
# For any model (Kernel SHAP)
class KernelSHAPExplainer:
def __init__(self, predict_fn, background_data):
"""
Initialize SHAP explainer with background data
background_data: Reference dataset (typically 100-1000 samples)
"""
self.predict_fn = predict_fn
self.background = background_data
self.expected_value = mean(predict_fn(background_data))
def explain(self, instances):
"""
Compute SHAP values for instances
"""
shap_values = []
FOR instance in instances:
values = self.compute_shap_values(instance)
shap_values.append(values)
RETURN array(shap_values)
def compute_shap_values(self, instance):
# Implements Kernel SHAP algorithm
# (See pseudo-code in SHAP section above)
...
# For tree models (Tree SHAP - much faster)
class TreeSHAPExplainer:
def __init__(self, tree_model):
"""
Initialize with tree-based model
Supports: XGBoost, LightGBM, RandomForest, etc.
"""
self.model = tree_model
self.expected_value = self.compute_base_value()
def explain(self, instances):
"""
Compute exact SHAP values using tree structure
"""
# Uses polynomial-time algorithm
RETURN self.tree_shap_algorithm(instances)
# Usage example
explainer = TreeSHAPExplainer(xgboost_model)
shap_values = explainer.explain(X_test)
# Visualization
def plot_summary(shap_values, X_test, feature_names):
"""
Create summary plot showing feature importance
"""
# Sort features by mean absolute SHAP value
importance = mean(abs(shap_values), axis=0)
sorted_idx = argsort(importance)[::-1]
FOR idx in sorted_idx[:10]:
print(f"{feature_names[idx]}: {importance[idx]:.4f}")
# Plot distribution of SHAP values for this feature
plot_beeswarm(shap_values[:, idx], X_test[:, idx])
# Force plot for single prediction
def plot_force(shap_values, instance, expected_value, feature_names):
"""
Show how features push prediction from base value
"""
prediction = expected_value + sum(shap_values)
print(f"Base value: {expected_value:.3f}")
print(f"Prediction: {prediction:.3f}")
print("\nFeature contributions:")
FOR i, (name, value) in enumerate(zip(feature_names, shap_values)):
IF abs(value) > 0.01: # Only show significant features
arrow = "→↑" if value > 0 else "→↓"
print(f" {name}: {arrow} {value:+.3f}")
Production Considerations
PRODUCTION CHECKLIST:
1. CACHING
- Precompute SHAP values for common cases
- Cache background data statistics
- Store explainer objects between requests
2. LATENCY
- LIME: ~100-500ms per explanation
- Kernel SHAP: ~1-5s per explanation
- Tree SHAP: ~10-50ms per explanation
If latency matters, use Tree SHAP or precompute
3. MEMORY
- Background data for SHAP: ~1000 samples typical
- LIME training data statistics: ~10KB per feature
Consider sampling for large datasets
4. CONSISTENCY
- Set random seeds for reproducible LIME
- Use consistent background data for SHAP
- Document explanation methodology
5. MONITORING
- Log explanation distributions over time
- Alert on unexpected feature importance changes
- Track explanation generation failures
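A minimal sketch of the caching idea from point 1, assuming explanations can be keyed by a hash of the raw feature values (all helper names here are hypothetical):
```python
import hashlib
import json

_explanation_cache = {}

def cached_shap_values(explainer, instance):
    """Reuse a previously computed explanation for identical inputs."""
    # Key the cache on the raw feature values (assumes a flat numeric instance).
    key = hashlib.sha256(json.dumps(list(map(float, instance))).encode()).hexdigest()
    if key not in _explanation_cache:
        _explanation_cache[key] = explainer.shap_values(instance)
    return _explanation_cache[key]
```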
Regulatory Requirements
EU AI Act Explainability Requirements
The EU AI Act, which entered into force in 2024 with obligations phasing in through 2026 and beyond, mandates explainability for high-risk AI systems:
"It is often not possible to find out why an AI system has made a decision or prediction... So, it may become difficult to assess whether someone has been unfairly disadvantaged." — EU AI Act Recitals
Compliance Requirements
HIGH-RISK AI SYSTEMS MUST:
1. TRANSPARENCY
- Provide clear information about AI use
- Explain the logic involved in decision-making
- Inform affected persons of their rights
2. DOCUMENTATION
- Maintain logs of AI decisions
- Document explanation methodology
- Record feature importance for audits
3. HUMAN OVERSIGHT
- Enable human understanding of AI outputs
- Allow intervention in automated decisions
- Provide meaningful human review
4. AFFECTED PERSON RIGHTS
- Right to explanation for automated decisions
- Right to human review
- Right to contest AI decisions
LIME/SHAP for Compliance
COMPLIANCE STRATEGY:
FOR each high-risk AI decision:
1. GENERATE EXPLANATION
explanation = shap_explainer.explain(instance)
# or
explanation = lime_explainer.explain_instance(instance)
2. LOG FOR AUDIT
audit_log.record({
"timestamp": now(),
"decision_id": unique_id,
"prediction": model_output,
"explanation": explanation,
"top_features": explanation.top(5),
"model_version": model.version
})
3. PRESENT TO USER (if requested)
user_explanation = format_for_humans(explanation)
# Example output:
# "Your loan application was assessed based primarily on:
# - Your income level (positive factor)
# - Your current debt (negative factor)
# - Your credit history length (positive factor)
#
# You may request human review of this decision."
4. ENABLE CONTESTATION
IF user.requests_review():
route_to_human_reviewer(decision_id, explanation)
NIST AI Risk Management Framework
The NIST AI RMF provides guidance on interpretability:
NIST AI RMF Functions:
| Function | Interpretability Requirements |
|---|---|
| GOVERN | Establish interpretability requirements by risk level |
| MAP | Identify where explanations are needed; Define stakeholders who need explanations |
| MEASURE | Evaluate explanation quality and consistency; Test explanation faithfulness to model |
| MANAGE | Implement explanation systems; Monitor explanation drift; Update methodologies as needed |
Advanced Techniques
SHAP Interaction Values
Beyond individual feature attributions, SHAP can measure feature interactions:
SHAP INTERACTIONS:
Standard SHAP: How much does feature i contribute?
Interaction SHAP: How much do features i and j contribute together,
beyond their individual contributions?
PSEUDO-CODE:
interaction_values = shap_explainer.shap_interaction_values(X)
# interaction_values[sample, feature_i, feature_j]
# Diagonal: main effects
# Off-diagonal: interaction effects
EXAMPLE:
# Age alone: +0.05
# Income alone: +0.20
# Age × Income interaction: +0.10
#
# Interpretation: High income matters more for younger applicants
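For tree models, interaction values are exposed directly by `shap.TreeExplainer`. A short sketch (`xgboost_model`, `X_test`, and `feature_names` are placeholders):
```python
import shap

explainer = shap.TreeExplainer(xgboost_model)
inter = explainer.shap_interaction_values(X_test)   # shape: (n_samples, n_feat, n_feat)

i, j = feature_names.index("age"), feature_names.index("income")
main_age    = inter[:, i, i].mean()                       # average main effect of age
main_income = inter[:, j, j].mean()                       # average main effect of income
age_income  = (inter[:, i, j] + inter[:, j, i]).mean()    # average pairwise interaction

print(f"age main: {main_age:+.3f}, income main: {main_income:+.3f}, "
      f"age x income interaction: {age_income:+.3f}")
```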
Anchors: Rule-Based Explanations
Anchors complement LIME/SHAP with rule-based explanations:
LIME EXPLANATION:
"Income contributed +0.34 to approval probability"
ANCHOR EXPLANATION:
"IF income > 60000 AND debt < 15000 THEN approved
(with 95% precision)"
PSEUDO-CODE:
def find_anchor(model, instance, precision_threshold=0.95):
"""
Find minimal rule that guarantees prediction
"""
rules = []
current_precision = 0
WHILE current_precision < precision_threshold:
# Add most informative rule
best_rule = find_best_rule(instance, rules, model)
rules.append(best_rule)
# Measure precision of current rule set
current_precision = evaluate_precision(rules, model)
RETURN rules
Counterfactual Explanations
What minimal change would flip the prediction?
COUNTERFACTUAL EXPLANATION:
Original: Loan DENIED
"If your income were $65,000 instead of $50,000,
your loan would be APPROVED"
PSEUDO-CODE:
def find_counterfactual(model, instance, target_class):
"""
Find minimal perturbation that changes prediction
"""
# Start from original instance
counterfactual = instance.copy()
# Optimize to flip prediction with minimal change
FOR iteration in range(max_iterations):
# Compute gradient toward target class
gradient = compute_gradient(model, counterfactual, target_class)
# Update counterfactual
counterfactual += learning_rate * gradient
# Encourage minimal changes
counterfactual = project_to_valid_range(counterfactual)
IF model.predict(counterfactual) == target_class:
break
# Report changes
changes = []
FOR i, (orig, new) in enumerate(zip(instance, counterfactual)):
IF abs(orig - new) > threshold:
changes.append((feature_names[i], orig, new))
RETURN changes
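A gradient-free variant of the idea above, for models where gradients are unavailable: sweep each feature over a grid of valid values and keep the smallest single-feature change that flips the class. Everything here is a minimal sketch (`model`, `feature_ranges`, and the distance metric are assumptions).
```python
import numpy as np

def one_feature_counterfactual(model, x, target_class, feature_ranges, steps=50):
    """Find the smallest single-feature change that flips the prediction.

    model:          classifier with a scikit-learn-style predict() (assumption)
    feature_ranges: list of (low, high) bounds, one per feature (assumption)
    Returns (feature_index, new_value) or None if no single-feature flip exists.
    """
    x = np.asarray(x, dtype=float)
    best = None   # (normalised change, feature index, new value)
    for i, (lo, hi) in enumerate(feature_ranges):
        candidates = np.linspace(lo, hi, steps)
        # Try candidate values closest to the current value first.
        for cand in sorted(candidates, key=lambda c: abs(c - x[i])):
            x_cf = x.copy()
            x_cf[i] = cand
            if model.predict(x_cf.reshape(1, -1))[0] == target_class:
                change = abs(cand - x[i]) / (hi - lo + 1e-12)
                if best is None or change < best[0]:
                    best = (change, i, cand)
                break   # nearest flipping value for this feature found
    return None if best is None else (best[1], best[2])

# Hypothetical usage: "raise income to ~$65,000 and the loan flips to approved"
# result = one_feature_counterfactual(loan_model, applicant, target_class=1,
#     feature_ranges=[(18, 80), (0, 200_000), (0, 100_000), (0, 30)])
```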
Common Pitfalls
Pitfall 1: Treating Explanations as Ground Truth
PROBLEM:
Explanations are approximations, not the actual model logic.
LIME and SHAP can disagree, and both can be wrong.
MITIGATION:
- Use multiple explanation methods
- Validate explanations with domain experts
- Test explanation faithfulness (do features actually matter?)
Pitfall 2: Ignoring Feature Correlation
PROBLEM:
When features are correlated, attribution can be distributed
arbitrarily between them.
EXAMPLE:
- height and weight are correlated
- SHAP might attribute importance to one arbitrarily
- The "true" importance is shared
MITIGATION:
- Use SHAP interaction values
- Group correlated features
- Be cautious interpreting individual correlated features
Pitfall 3: Wrong Background Data (SHAP)
PROBLEM:
SHAP explanations depend on background (reference) data.
Wrong background = wrong explanations.
BAD:
background = entire_training_set # May include irrelevant subgroups
GOOD:
background = relevant_subpopulation # E.g., same demographic
MITIGATION:
- Choose background data carefully
- Consider multiple reference points
- Document background data choice
Pitfall 4: Instability (LIME)
PROBLEM:
LIME explanations can vary with random seed.
Running twice may give different answers.
MITIGATION:
- Set random seed for reproducibility
- Run multiple times and average
- Use SHAP if consistency is critical
- Report confidence intervals
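One way to quantify this instability (and to check whether the mitigations help) is simply to run LIME several times and report the spread of each feature's weight. A sketch using the tabular explainer from the Implementation Guide:
```python
import numpy as np

def lime_weight_spread(explainer, instance, predict_fn, runs=10, num_features=4):
    """Run LIME repeatedly and report mean +/- std of each feature's weight."""
    weights = {}
    for _ in range(runs):
        exp = explainer.explain_instance(instance, predict_fn, num_features=num_features)
        for name, w in exp.as_list():           # feature descriptions are used as keys
            weights.setdefault(name, []).append(w)
    return {name: (np.mean(ws), np.std(ws)) for name, ws in weights.items()}

# A standard deviation that is large relative to the mean suggests increasing
# num_samples, fixing the random seed, or switching to SHAP for that use case.
```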
Pitfall 5: Computational Cost (SHAP)
PROBLEM:
Kernel SHAP is expensive: O(2^n) for n features
SYMPTOMS:
- Explanation takes minutes
- Memory errors for large datasets
- Production latency issues
MITIGATION:
- Use Tree SHAP for tree models (O(TLD²))
- Limit number of background samples
- Precompute explanations offline
- Sample features for high-dimensional data
FAQ
Q: Are LIME and SHAP faithful to the model? A: Not perfectly. LIME is a local approximation that may miss non-linearities. SHAP is theoretically consistent but can give misleading attributions for correlated features. Always validate with domain knowledge and multiple methods.
Q: Can I use LIME/SHAP for deep learning? A: Yes, but with caveats. Kernel SHAP works with any model but is slow. Deep SHAP uses gradient-based approximations. For images, consider saliency maps or integrated gradients in addition to LIME.
Q: How many background samples do I need for SHAP? A: Typically 100-1000 samples. More is better for accuracy but slower. Diminishing returns after ~1000. Ensure background is representative of your data distribution.
Q: Do I need explanations for every prediction? A: Not necessarily. Consider: (1) High-stakes decisions need explanations, (2) Regulatory requirements may mandate logging, (3) On-demand explanations may suffice for low-risk cases.
Q: How do I explain to non-technical users? A: Focus on: (1) What factors mattered most, (2) Which way each factor pushed the decision, (3) What changes might lead to different outcomes. Avoid technical jargon like "SHAP values."
Q: Can explanations be gamed or manipulated? A: Yes. Adversarial examples exist for explanations. Someone could create inputs that give misleading explanations. Monitor for unusual patterns and use multiple explanation methods.
Conclusion
Interpretability is essential for responsible AI deployment. LIME and SHAP provide complementary approaches to understanding model predictions, each with distinct strengths.
Key Takeaways:
- →LIME is fast and intuitive — Best for quick local explanations and non-technical stakeholders
- →SHAP is rigorous and consistent — Best for compliance, debugging, and theoretical soundness
- →Use both when possible — Different methods highlight different patterns
- →Explanations are approximations — Validate with domain knowledge
- →Regulatory requirements are growing — Plan for explainability from the start
As AI systems become more prevalent in high-stakes decisions, interpretability moves from "nice to have" to essential requirement. LIME and SHAP are foundational tools in this landscape.
📚 Responsible AI Series
| Part | Article | Status |
|---|---|---|
| 1 | Understanding AI Alignment | ✓ |
| 2 | RLHF & Constitutional AI | ✓ |
| 3 | AI Interpretability with LIME & SHAP (You are here) | ✓ |
| 4 | Automated Red Teaming with PyRIT | Coming Soon |
| 5 | AI Runtime Governance & Circuit Breakers | Coming Soon |
← Previous: RLHF & Constitutional AI
Next →: Automated Red Teaming with PyRIT
🚀 Ready to Master Responsible AI?
Our training modules cover practical implementation of AI safety techniques, from prompt engineering to production governance.
References:
- →Ribeiro, Singh, Guestrin (2016). "Why Should I Trust You?": Explaining the Predictions of Any Classifier
- →Lundberg, Lee (2017). A Unified Approach to Interpreting Model Predictions
- →LIME Documentation
- →SHAP Documentation
- →NIST AI Risk Management Framework
- →EU AI Act
Last Updated: January 29, 2026
Part 3 of the Responsible AI Engineering Series