
AI Interpretability with LIME and SHAP: A Practical Guide (2026)

By Learnia Team


This article is written in English. Our training modules are available in multiple languages.

📚 This is Part 3 of the Responsible AI Engineering Series. After understanding alignment challenges and training techniques, this article covers how to inspect and explain what models have learned.


Table of Contents

  1. Why Interpretability Matters
  2. The Interpretability Landscape
  3. LIME: Local Interpretable Explanations
  4. SHAP: Game-Theoretic Feature Attribution
  5. LIME vs SHAP: When to Use Which
  6. Implementation Guide
  7. Regulatory Requirements
  8. Advanced Techniques
  9. Common Pitfalls
  10. FAQ


Why Interpretability Matters

Machine learning models increasingly make decisions that affect people's lives—loan approvals, medical diagnoses, hiring decisions, content recommendations. Yet many of these models are "black boxes" that provide predictions without explanations.

The Black Box Problem

Example: A user submits a loan application with:

  • Age: 35
  • Income: $75,000
  • Debt: $20,000
  • Credit History: 5 years

The model returns: LOAN DENIED

Question: Why was the loan denied?
🤷 Answer: The model provides no explanation.

This opacity creates problems:

Stakeholder   | Problem
--------------|------------------------------------------
Users         | Can't understand or contest decisions
Developers    | Can't debug or improve models
Regulators    | Can't verify fairness or compliance
Organizations | Face legal and reputational risk

Interpretability vs Explainability

These terms are often used interchangeably, but there's a distinction:

Interpretability: Understanding how a model works internally

  • What features matter?
  • How do features interact?
  • What patterns has the model learned?

Explainability: Communicating model behavior to humans

  • Why did the model make this prediction?
  • What would change the prediction?
  • Is this prediction trustworthy?

LIME and SHAP are primarily explainability tools—they help communicate why predictions were made, even if we don't fully understand the model's internal mechanisms.


The Interpretability Landscape

Before diving into LIME and SHAP, let's understand where they fit in the broader interpretability toolkit:

Types of Interpretability Methods

By Scope:

  • Global: Explain overall model behavior (feature importance, decision boundaries)
  • Local: Explain individual predictions (LIME, SHAP per-prediction)

By Timing:

  • Intrinsic: Built into model architecture (decision trees, linear models, attention weights)
  • Post-hoc: Applied after training (LIME, SHAP, saliency maps)

By Model Dependence:

  • Model-specific: Only work with certain models (Tree SHAP, attention visualization)
  • Model-agnostic: Work with any model (LIME, Kernel SHAP)

Key Methods Overview

Method             | Type                      | Approach                     | Best For
-------------------|---------------------------|------------------------------|-------------------------------
LIME               | Local, Agnostic           | Local linear approximation   | Quick individual explanations
SHAP               | Local + Global, Agnostic* | Shapley values               | Rigorous feature attribution
Attention          | Local, Intrinsic          | Visualize attention weights  | Transformers, NLP
Saliency Maps      | Local, Specific           | Input gradients              | Image models
Feature Importance | Global, Specific          | Permutation/Gini             | Tree models
Partial Dependence | Global, Agnostic          | Marginal effects             | Feature relationships

*SHAP has both model-agnostic (Kernel SHAP) and model-specific (Tree SHAP, Deep SHAP) implementations.


LIME: Local Interpretable Explanations

LIME (Local Interpretable Model-agnostic Explanations) was introduced by Ribeiro, Singh, and Guestrin in 2016. The core idea is elegantly simple:

"Intuitively, an explanation is a local linear approximation of the model's behaviour." — "Why Should I Trust You?": Explaining the Predictions of Any Classifier

The LIME Intuition

Complex models may have intricate global behavior, but locally—in the neighborhood of any single prediction—they're often approximately linear.

Global Model: The decision boundary is complex and non-linear, with curves and irregular patterns separating different classes.

Local Approximation: When we zoom in on a single prediction point (★), the complex boundary looks almost like a straight line. LIME exploits this by fitting a simple linear model just in that local neighborhood.

How LIME Works

LIME ALGORITHM:

INPUT: 
    - f: Black-box model to explain
    - x: Instance to explain
    - N: Number of samples to generate

OUTPUT:
    - Explanation: Feature weights for this prediction

PROCESS:

1. PERTURB: Generate N samples near x
   FOR i = 1 to N:
       x'[i] = perturb(x)  # Randomly modify features
       y'[i] = f(x'[i])    # Get model predictions
       w[i] = proximity(x, x'[i])  # Weight by distance to x

2. FIT: Train interpretable model on perturbed samples
   g = train_linear_model(x', y', weights=w)
   
   # g is a simple model (linear regression, decision tree)
   # that approximates f locally around x

3. EXPLAIN: Extract feature contributions from g
   explanation = g.coefficients  # For linear model
   
   RETURN explanation

LIME Pseudo-code Implementation

PSEUDO-CODE: LIME for Tabular Data

def explain_instance(model, instance, num_samples=5000):
    """
    Explain a single prediction using LIME
    
    Args:
        model: Black-box model with predict() method
        instance: Data point to explain (numpy array)
        num_samples: Number of perturbed samples
    
    Returns:
        Dictionary mapping features to importance scores
    """
    
    # Step 1: Generate perturbed samples
    perturbations = []
    predictions = []
    weights = []
    
    FOR i in range(num_samples):
        # Create perturbed version
        perturbed = instance.copy()
        
        # Randomly turn features "on" or "off"
        mask = random_binary_mask(len(instance))
        perturbed = apply_mask(perturbed, mask, training_data)
        
        perturbations.append(mask)  # Store binary representation
        predictions.append(model.predict(perturbed))
        
        # Weight by similarity to original instance
        distance = hamming_distance(mask, ones_vector)
        weight = exp(-distance / kernel_width)
        weights.append(weight)
    
    # Step 2: Fit weighted linear model
    X = array(perturbations)
    y = array(predictions)
    w = array(weights)
    
    # Weighted ridge regression
    linear_model = Ridge(alpha=1.0)
    linear_model.fit(X, y, sample_weight=w)
    
    # Step 3: Extract explanation
    feature_names = get_feature_names()
    explanation = {}
    
    FOR i, coef in enumerate(linear_model.coef_):
        explanation[feature_names[i]] = coef
    
    RETURN explanation


# Example usage
instance = [35, 75000, 20000, 5]  # Age, Income, Debt, History
explanation = explain_instance(loan_model, instance)

# Output:
# {
#   "Age": 0.02,
#   "Income": 0.45,      # Strong positive
#   "Debt": -0.38,       # Strong negative
#   "History": 0.15
# }
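
The pseudo-code above maps closely onto the real lime package. Below is a minimal runnable sketch, assuming a scikit-learn style `loan_model` with a `predict_proba` method and a NumPy training matrix `X_train` (both names are illustrative, not from the example above).

PYTHON SKETCH: LIME with the lime library

import numpy as np
from lime.lime_tabular import LimeTabularExplainer

feature_names = ["Age", "Income", "Debt", "History"]
explainer = LimeTabularExplainer(
    training_data=X_train,              # used to learn perturbation statistics
    feature_names=feature_names,
    class_names=["denied", "approved"],
    mode="classification",
)

instance = np.array([35, 75000, 20000, 5])
exp = explainer.explain_instance(
    instance,
    loan_model.predict_proba,           # perturbed samples are scored here
    num_features=4,
    num_samples=5000,
)
print(exp.as_list())                    # [(feature condition, weight), ...]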

LIME for Different Data Types

Tabular Data: Perturb by replacing feature values with samples from training distribution

Text: Perturb by removing words and observing prediction changes

ORIGINAL: "This movie was absolutely fantastic and wonderful!"
PREDICTION: Positive (0.95)

PERTURBATIONS:
"This movie was absolutely [MASK] and wonderful!" -> 0.82
"This movie was [MASK] fantastic and wonderful!" -> 0.91
"This [MASK] was absolutely fantastic and [MASK]!" -> 0.78
...

EXPLANATION:
"fantastic" -> +0.25 (most important positive word)
"wonderful" -> +0.18
"absolutely" -> +0.08

Images: Perturb by masking superpixels (coherent regions)

ORIGINAL IMAGE: Cat photo
PREDICTION: Cat (0.92)

PERTURBATIONS:
[Mask ears] -> 0.71      # Ears matter
[Mask eyes] -> 0.65      # Eyes matter a lot
[Mask background] -> 0.89  # Background doesn't matter much

EXPLANATION: Heatmap showing ears and eyes as most important
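
For images, lime's LimeImageExplainer handles the superpixel masking. A minimal sketch, assuming `image` is an RGB NumPy array and `cnn_predict` returns class probabilities for a batch of images (both names are illustrative).

PYTHON SKETCH: LIME for images

from lime import lime_image
from skimage.segmentation import mark_boundaries

explainer = lime_image.LimeImageExplainer()
explanation = explainer.explain_instance(
    image,
    cnn_predict,                     # batch of perturbed images -> probabilities
    top_labels=1,
    num_samples=1000,                # superpixels are toggled on/off per sample
)
# Highlight the superpixels that most support the top predicted class
img, mask = explanation.get_image_and_mask(
    explanation.top_labels[0], positive_only=True, num_features=5
)
overlay = mark_boundaries(img, mask)  # heatmap-style overlay of the important regions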

SHAP: Game-Theoretic Feature Attribution

SHAP (SHapley Additive exPlanations) was introduced by Lundberg and Lee in 2017. It grounds feature attribution in game theory, providing theoretically consistent explanations.

"SHAP assigns each feature an importance value for a particular prediction." — A Unified Approach to Interpreting Model Predictions

Shapley Values Explained

SHAP is based on Shapley values from cooperative game theory. The intuition:

THE GAME: Predicting the output is a "game"
THE PLAYERS: Features are "players"
THE PAYOUT: Prediction value is the "payout"

QUESTION: How do we fairly distribute credit among players?

SHAPLEY'S ANSWER: 
Consider every possible coalition (subset) of players.
For each coalition, measure each player's marginal contribution.
Average over all possible orderings.

FORMAL:
φᵢ = Σ |S|!(|N|-|S|-1)! / |N|! × [f(S ∪ {i}) - f(S)]
     S⊆N\{i}

WHERE:
- φᵢ: Shapley value for feature i
- S: A subset of features not including i
- N: All features
- f(S): Model output with only features in S
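
The formula is small enough to compute exactly for a toy model. The self-contained sketch below enumerates every subset S for a 3-feature linear model, replacing "absent" features with a fixed baseline value (a simplification of how f(S) is defined); all names and numbers are illustrative.

PYTHON SKETCH: Exact Shapley values by enumeration

from itertools import combinations
from math import factorial

baseline = [0.0, 0.0, 0.0]            # values used for "absent" features
x = [35.0, 75.0, 20.0]                # instance to explain (toy units)

def f(values):                        # toy model: a simple linear score
    return 0.01 * values[0] + 0.02 * values[1] - 0.03 * values[2]

def f_subset(S):                      # model output with only the features in S present
    return f([x[i] if i in S else baseline[i] for i in range(len(x))])

n = len(x)
for i in range(n):
    phi = 0.0
    others = [j for j in range(n) if j != i]
    for size in range(n):
        for S in combinations(others, size):
            weight = factorial(len(S)) * factorial(n - len(S) - 1) / factorial(n)
            phi += weight * (f_subset(set(S) | {i}) - f_subset(set(S)))
    print(f"phi_{i} = {phi:.3f}")     # the phi values sum to f(x) - f(baseline)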

Why Shapley Values?

Shapley values uniquely satisfy four desirable properties:

Property   | Meaning
-----------|----------------------------------------------------------------
Efficiency | Feature contributions sum to the prediction minus the baseline
Symmetry   | Equal features get equal attribution
Dummy      | Irrelevant features get zero attribution
Linearity  | Combining models combines attributions linearly

No other attribution method satisfies all four properties.

SHAP Pseudo-code

PSEUDO-CODE: Kernel SHAP (Model-Agnostic)

def shap_values(model, instance, background_data, num_samples=2000):
    """
    Compute SHAP values for an instance
    
    Args:
        model: Black-box model
        instance: Data point to explain
        background_data: Reference dataset for baseline
        num_samples: Number of coalition samples
    
    Returns:
        SHAP values for each feature
    """
    
    num_features = len(instance)
    
    # Expected value (baseline prediction)
    baseline = mean([model.predict(x) for x in background_data])
    
    # Sample coalitions (subsets of features)
    coalitions = []
    predictions = []
    weights = []
    
    FOR i in range(num_samples):
        # Random coalition (binary mask)
        coalition_size = random_int(0, num_features)
        coalition = random_subset(num_features, coalition_size)
        
        # Create instance with coalition features from instance,
        # non-coalition features from background
        masked_instance = instance.copy()
        background_sample = random_choice(background_data)
        
        FOR j in range(num_features):
            IF j not in coalition:
                masked_instance[j] = background_sample[j]
        
        coalitions.append(binary_mask(coalition, num_features))
        predictions.append(model.predict(masked_instance))
        
        # Shapley kernel weight
        k = len(coalition)
        IF k == 0 OR k == num_features:
            weight = 1e6  # Very high weight for empty/full coalitions
        ELSE:
            weight = (num_features - 1) / (binomial(num_features, k) * k * (num_features - k))
        weights.append(weight)
    
    # Solve weighted linear regression
    X = array(coalitions)
    y = array(predictions) - baseline
    w = array(weights)
    
    # Constraint: coefficients must sum to (prediction - baseline)
    model_prediction = model.predict(instance)
    
    shap_values = weighted_constrained_regression(X, y, w, 
                                                   sum_constraint=model_prediction - baseline)
    
    RETURN shap_values, baseline


# Example output
shap_vals, base = shap_values(loan_model, applicant, training_data)

# Interpretation:
# Base prediction: 0.60 (average approval probability)
# 
# SHAP values:
#   Income: +0.25   (income increases approval by 0.25)
#   Debt:   -0.15   (debt decreases approval by 0.15)
#   Age:    +0.03   (age slightly increases approval)
#   History: +0.07  (history increases approval)
#
# Final prediction: 0.60 + 0.25 - 0.15 + 0.03 + 0.07 = 0.80
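
In practice you would rarely implement Kernel SHAP yourself. A minimal sketch with the real shap package, assuming a trained `loan_model`, a training matrix `X_train`, and a single `applicant` row (all names are illustrative); note that for `predict_proba` the library typically returns one array of SHAP values per output class.

PYTHON SKETCH: Kernel SHAP with the shap library

import shap

background = shap.sample(X_train, 100)    # small reference set keeps the cost manageable
explainer = shap.KernelExplainer(loan_model.predict_proba, background)

shap_values = explainer.shap_values(applicant, nsamples=2000)
print("Base value(s):", explainer.expected_value)
print("SHAP values:", shap_values)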

SHAP Visualization Types

1. Force Plot: Shows how features push prediction from base value

Starting from the base prediction (0.60), each feature either pushes the prediction up or down:

  • Income: +0.25 (pushes up)
  • History: +0.07 (pushes up)
  • Age: +0.03 (pushes up)
  • Debt: -0.15 (pushes down)
  • Final prediction: 0.80

2. Summary Plot: Global view of feature importance across all predictions

Shows the distribution of SHAP values for each feature across the dataset, revealing which features have the most impact overall.

3. Dependence Plot: How a feature's value affects its SHAP value

Shows the relationship between a feature's actual value (x-axis) and its SHAP value (y-axis), revealing non-linear relationships.
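
The shap package ships plotting helpers for each of these views. A minimal sketch, assuming a single-output model, a 2-D `shap_values` array, a DataFrame `X_test`, and the `explainer` from above (names and shapes are illustrative).

PYTHON SKETCH: SHAP plots

import shap

# 1. Force plot: one prediction pushed up/down from the base value
shap.force_plot(explainer.expected_value, shap_values[0, :], X_test.iloc[0, :],
                matplotlib=True)

# 2. Summary plot: global importance across the whole test set
shap.summary_plot(shap_values, X_test)

# 3. Dependence plot: a feature's value vs. its SHAP value
shap.dependence_plot("Income", shap_values, X_test)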

Tree SHAP: Fast Exact Computation

For tree-based models (Random Forest, XGBoost, LightGBM), Tree SHAP computes exact Shapley values in polynomial time:

Algorithm   | Complexity | Notes
------------|------------|----------------------------------------
Kernel SHAP | O(2^n)     | Exponential in the number of features
Tree SHAP   | O(TLD²)    | Polynomial, much faster

Where: T = Number of trees, L = Maximum leaves, D = Maximum depth

Example: For a 20-feature model, exact Kernel SHAP must account for 2^20 ≈ 1 million feature coalitions per prediction. Tree SHAP's cost scales with the tree structure (T, L, D) rather than the number of features, so for a typical ensemble of 100 trees of depth 10 it stays fast, and adding more features does not slow it down.
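
A minimal sketch of Tree SHAP via shap.TreeExplainer, assuming a fitted tree ensemble (XGBoost, LightGBM, or scikit-learn) named `tree_model` and a test matrix `X_test` (both illustrative).

PYTHON SKETCH: Tree SHAP

import shap

explainer = shap.TreeExplainer(tree_model)
shap_values = explainer.shap_values(X_test)   # exact values, fast even for many rows
print("Base value:", explainer.expected_value)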

LIME vs SHAP: When to Use Which

Comparison Table

Aspect                  | LIME                                 | SHAP
------------------------|--------------------------------------|----------------------------------------------
Theoretical foundation  | Intuitive, ad hoc                    | Game theory (Shapley values)
Consistency             | Can vary with random seed            | Deterministic (with the same background data)
Additivity              | Weights don't sum to the prediction  | Values sum to prediction minus baseline
Computation             | Fast (single regression)             | Slower (many model evaluations)
Global explanations     | Not built in                         | Summary plots, interactions
Model-specific speedups | No                                   | Yes (Tree SHAP, Deep SHAP)
Interpretability        | Very intuitive                       | Requires understanding Shapley values
Implementation          | Simple                               | More complex

Decision Framework

Use LIME when:

  • You need quick, intuitive explanations
  • Exact consistency isn't critical
  • You're explaining to non-technical stakeholders
  • You're working with text or images
  • You're prototyping or exploring

Use SHAP when:

  • You need theoretically grounded explanations
  • Consistency across explanations matters
  • You want global + local explanations
  • You're working with tree-based models (Tree SHAP is fast)
  • You need to satisfy regulatory requirements
  • You need features to sum to prediction

Use both when:

  • You want to validate explanations
  • Different stakeholders need different views
  • You're building a comprehensive explanation system

Practical Recommendations

Scenario                                       | Recommendation                | Reason
-----------------------------------------------|-------------------------------|--------------------------------------------------------
Explaining loan decisions to applicants        | LIME                          | Simple, intuitive explanations for non-technical users
Auditing model fairness for regulators         | SHAP                          | Consistent, additive, theoretically grounded
Debugging XGBoost model predictions            | Tree SHAP                     | Fast, exact, shows feature interactions
Explaining image classification to researchers | LIME + gradient-based methods | Different methods highlight different patterns
Production system with latency constraints     | LIME or precomputed SHAP      | LIME is faster; SHAP values can be cached

Implementation Guide

Setting Up LIME

PSEUDO-CODE: LIME Setup and Usage

# Installation (conceptual)
# pip install lime

# For tabular data
class LIMETabularExplainer:
    def __init__(self, training_data, feature_names, class_names):
        """
        Initialize LIME explainer with training data context
        """
        self.training_data = training_data
        self.feature_names = feature_names
        self.class_names = class_names
        
        # Compute statistics for perturbation
        self.means = compute_means(training_data)
        self.stds = compute_stds(training_data)
        self.feature_types = infer_types(training_data)
    
    def explain_instance(self, instance, predict_fn, num_features=10):
        """
        Generate explanation for a single instance
        """
        # Generate perturbed samples
        samples = self.generate_perturbations(instance, n=5000)
        
        # Get predictions for samples
        predictions = predict_fn(samples)
        
        # Compute sample weights
        weights = self.compute_weights(instance, samples)
        
        # Fit local linear model
        explanation = self.fit_local_model(samples, predictions, weights)
        
        # Return top features
        RETURN explanation.top_features(num_features)


# Usage example
explainer = LIMETabularExplainer(
    training_data=X_train,
    feature_names=['age', 'income', 'debt', 'history'],
    class_names=['denied', 'approved']
)

explanation = explainer.explain_instance(
    instance=applicant,
    predict_fn=model.predict_proba,
    num_features=4
)

# Display
print("Prediction: Approved (0.80)")
print("Explanation:")
FOR feature, weight in explanation:
    direction = "↑" if weight > 0 else "↓"
    print(f"  {feature}: {direction} {abs(weight):.3f}")

# Output:
# Prediction: Approved (0.80)
# Explanation:
#   income: ↑ 0.342
#   debt: ↓ 0.256
#   history: ↑ 0.124
#   age: ↑ 0.045

Setting Up SHAP

PSEUDO-CODE: SHAP Setup and Usage

# Installation (conceptual)
# pip install shap

# For any model (Kernel SHAP)
class KernelSHAPExplainer:
    def __init__(self, predict_fn, background_data):
        """
        Initialize SHAP explainer with background data
        
        background_data: Reference dataset (typically 100-1000 samples)
        """
        self.predict_fn = predict_fn
        self.background = background_data
        self.expected_value = mean(predict_fn(background_data))
    
    def explain(self, instances):
        """
        Compute SHAP values for instances
        """
        shap_values = []
        
        FOR instance in instances:
            values = self.compute_shap_values(instance)
            shap_values.append(values)
        
        RETURN array(shap_values)
    
    def compute_shap_values(self, instance):
        # Implements Kernel SHAP algorithm
        # (See pseudo-code in SHAP section above)
        ...


# For tree models (Tree SHAP - much faster)
class TreeSHAPExplainer:
    def __init__(self, tree_model):
        """
        Initialize with tree-based model
        Supports: XGBoost, LightGBM, RandomForest, etc.
        """
        self.model = tree_model
        self.expected_value = self.compute_base_value()
    
    def explain(self, instances):
        """
        Compute exact SHAP values using tree structure
        """
        # Uses polynomial-time algorithm
        RETURN self.tree_shap_algorithm(instances)


# Usage example
explainer = TreeSHAPExplainer(xgboost_model)
shap_values = explainer.explain(X_test)

# Visualization
def plot_summary(shap_values, X_test, feature_names):
    """
    Create summary plot showing feature importance
    """
    # Sort features by mean absolute SHAP value
    importance = mean(abs(shap_values), axis=0)
    sorted_idx = argsort(importance)[::-1]
    
    FOR idx in sorted_idx[:10]:
        print(f"{feature_names[idx]}: {importance[idx]:.4f}")
        
        # Plot distribution of SHAP values for this feature
        plot_beeswarm(shap_values[:, idx], X_test[:, idx])


# Force plot for single prediction
def plot_force(shap_values, instance, expected_value, feature_names):
    """
    Show how features push prediction from base value
    """
    prediction = expected_value + sum(shap_values)
    
    print(f"Base value: {expected_value:.3f}")
    print(f"Prediction: {prediction:.3f}")
    print("\nFeature contributions:")
    
    FOR i, (name, value) in enumerate(zip(feature_names, shap_values)):
        IF abs(value) > 0.01:  # Only show significant features
            arrow = "→↑" if value > 0 else "→↓"
            print(f"  {name}: {arrow} {value:+.3f}")

Production Considerations

PRODUCTION CHECKLIST:

1. CACHING
   - Precompute SHAP values for common cases
   - Cache background data statistics
   - Store explainer objects between requests

2. LATENCY
   - LIME: ~100-500ms per explanation
   - Kernel SHAP: ~1-5s per explanation
   - Tree SHAP: ~10-50ms per explanation
   
   If latency matters, use Tree SHAP or precompute

3. MEMORY
   - Background data for SHAP: ~1000 samples typical
   - LIME training data statistics: ~10KB per feature
   
   Consider sampling for large datasets

4. CONSISTENCY
   - Set random seeds for reproducible LIME
   - Use consistent background data for SHAP
   - Document explanation methodology

5. MONITORING
   - Log explanation distributions over time
   - Alert on unexpected feature importance changes
   - Track explanation generation failures
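
A sketch of the "precompute and cache" pattern from the checklist, assuming a single-output tree model and an in-memory dict keyed by record id (the storage layer and all names are illustrative).

PYTHON SKETCH: Caching precomputed SHAP values

import numpy as np
import shap

explainer = shap.TreeExplainer(model)               # built once, reused across requests
explanation_cache = {}

def precompute_explanations(ids, records):
    """Batch-compute SHAP values offline and store them by record id."""
    values = explainer.shap_values(records)         # shape: (n_records, n_features)
    for record_id, vals in zip(ids, values):
        explanation_cache[record_id] = vals

def get_explanation(record_id, record):
    """Serve from cache when possible; fall back to on-demand computation."""
    if record_id in explanation_cache:
        return explanation_cache[record_id]
    return explainer.shap_values(np.asarray(record).reshape(1, -1))[0]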

Regulatory Requirements

EU AI Act Explainability Requirements

The EU AI Act, which entered into force in 2024 and whose obligations phase in through 2026 and 2027, mandates explainability for high-risk AI systems:

"It is often not possible to find out why an AI system has made a decision or prediction... So, it may become difficult to assess whether someone has been unfairly disadvantaged." — EU AI Act Recitals

Compliance Requirements

HIGH-RISK AI SYSTEMS MUST:

1. TRANSPARENCY
   - Provide clear information about AI use
   - Explain the logic involved in decision-making
   - Inform affected persons of their rights

2. DOCUMENTATION
   - Maintain logs of AI decisions
   - Document explanation methodology
   - Record feature importance for audits

3. HUMAN OVERSIGHT
   - Enable human understanding of AI outputs
   - Allow intervention in automated decisions
   - Provide meaningful human review

4. AFFECTED PERSON RIGHTS
   - Right to explanation for automated decisions
   - Right to human review
   - Right to contest AI decisions

LIME/SHAP for Compliance

COMPLIANCE STRATEGY:

FOR each high-risk AI decision:
    
    1. GENERATE EXPLANATION
       explanation = shap_explainer.explain(instance)
       # or
       explanation = lime_explainer.explain_instance(instance)
    
    2. LOG FOR AUDIT
       audit_log.record({
           "timestamp": now(),
           "decision_id": unique_id,
           "prediction": model_output,
           "explanation": explanation,
           "top_features": explanation.top(5),
           "model_version": model.version
       })
    
    3. PRESENT TO USER (if requested)
       user_explanation = format_for_humans(explanation)
       
       # Example output:
       # "Your loan application was assessed based primarily on:
       #  - Your income level (positive factor)
       #  - Your current debt (negative factor)
       #  - Your credit history length (positive factor)
       #  
       #  You may request human review of this decision."
    
    4. ENABLE CONTESTATION
       IF user.requests_review():
           route_to_human_reviewer(decision_id, explanation)

NIST AI Risk Management Framework

The NIST AI RMF provides guidance on interpretability:

NIST AI RMF Functions:

Function | Interpretability Requirements
---------|------------------------------------------------------------------------------------------
GOVERN   | Establish interpretability requirements by risk level
MAP      | Identify where explanations are needed; define the stakeholders who need them
MEASURE  | Evaluate explanation quality and consistency; test explanation faithfulness to the model
MANAGE   | Implement explanation systems; monitor explanation drift; update methodologies as needed

Advanced Techniques

SHAP Interaction Values

Beyond individual feature attributions, SHAP can measure feature interactions:

SHAP INTERACTIONS:

Standard SHAP: How much does feature i contribute?
Interaction SHAP: How much do features i and j contribute together,
                  beyond their individual contributions?

PSEUDO-CODE:
interaction_values = shap_explainer.shap_interaction_values(X)

# interaction_values[sample, feature_i, feature_j]
# Diagonal: main effects
# Off-diagonal: interaction effects

EXAMPLE:
# Age alone: +0.05
# Income alone: +0.20
# Age × Income interaction: +0.10
# 
# Interpretation: High income matters more for younger applicants
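
For tree ensembles, the shap library exposes exact pairwise interaction values directly. A minimal sketch, assuming a fitted tree model `tree_model` and a feature matrix `X` with Age in column 0 and Income in column 1 (all assumptions, for illustration).

PYTHON SKETCH: SHAP interaction values

import shap

explainer = shap.TreeExplainer(tree_model)
inter = explainer.shap_interaction_values(X)   # shape: (n_samples, n_features, n_features)

main_effect_age = inter[:, 0, 0]               # diagonal: main effects
age_income_interaction = inter[:, 0, 1]        # off-diagonal: interaction effects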

Anchors: Rule-Based Explanations

Anchors complement LIME/SHAP with rule-based explanations:

LIME EXPLANATION: 
"Income contributed +0.34 to approval probability"

ANCHOR EXPLANATION:
"IF income > 60000 AND debt < 15000 THEN approved
 (with 95% precision)"

PSEUDO-CODE:
def find_anchor(model, instance, precision_threshold=0.95):
    """
    Find minimal rule that guarantees prediction
    """
    rules = []
    current_precision = 0
    
    WHILE current_precision < precision_threshold:
        # Add most informative rule
        best_rule = find_best_rule(instance, rules, model)
        rules.append(best_rule)
        
        # Measure precision of current rule set
        current_precision = evaluate_precision(rules, model)
    
    RETURN rules

Counterfactual Explanations

What minimal change would flip the prediction?

COUNTERFACTUAL EXPLANATION:

Original: Loan DENIED
"If your income were $65,000 instead of $50,000,
 your loan would be APPROVED"

PSEUDO-CODE:
def find_counterfactual(model, instance, target_class):
    """
    Find minimal perturbation that changes prediction
    """
    # Start from original instance
    counterfactual = instance.copy()
    
    # Optimize to flip prediction with minimal change
    FOR iteration in range(max_iterations):
        # Compute gradient toward target class
        gradient = compute_gradient(model, counterfactual, target_class)
        
        # Update counterfactual
        counterfactual += learning_rate * gradient
        
        # Encourage minimal changes
        counterfactual = project_to_valid_range(counterfactual)
        
        IF model.predict(counterfactual) == target_class:
            break
    
    # Report changes
    changes = []
    FOR i, (orig, new) in enumerate(zip(instance, counterfactual)):
        IF abs(orig - new) > threshold:
            changes.append((feature_names[i], orig, new))
    
    RETURN changes

Common Pitfalls

Pitfall 1: Treating Explanations as Ground Truth

PROBLEM:
Explanations are approximations, not the actual model logic.
LIME and SHAP can disagree, and both can be wrong.

MITIGATION:
- Use multiple explanation methods
- Validate explanations with domain experts
- Test explanation faithfulness (do features actually matter?)

Pitfall 2: Ignoring Feature Correlation

PROBLEM:
When features are correlated, attribution can be distributed
arbitrarily between them.

EXAMPLE:
- height and weight are correlated
- SHAP might attribute importance to one arbitrarily
- The "true" importance is shared

MITIGATION:
- Use SHAP interaction values
- Group correlated features
- Be cautious interpreting individual correlated features

Pitfall 3: Wrong Background Data (SHAP)

PROBLEM:
SHAP explanations depend on background (reference) data.
Wrong background = wrong explanations.

BAD:
background = entire_training_set  # May include irrelevant subgroups

GOOD:
background = relevant_subpopulation  # E.g., same demographic

MITIGATION:
- Choose background data carefully
- Consider multiple reference points
- Document background data choice

Pitfall 4: Instability (LIME)

PROBLEM:
LIME explanations can vary with random seed.
Running twice may give different answers.

MITIGATION:
- Set random seed for reproducibility
- Run multiple times and average
- Use SHAP if consistency is critical
- Report confidence intervals
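
A sketch of the "run multiple times and average" mitigation, assuming a lime tabular `explainer` and a model `predict_fn` as in the implementation guide (illustrative names).

PYTHON SKETCH: Averaging LIME over several runs

from collections import defaultdict

def stable_explanation(explainer, instance, predict_fn, runs=10):
    """Average LIME weights over several runs to reduce sampling noise."""
    totals = defaultdict(float)
    for _ in range(runs):
        exp = explainer.explain_instance(instance, predict_fn, num_features=4)
        for feature, weight in exp.as_list():
            totals[feature] += weight
    return {feature: total / runs for feature, total in totals.items()}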

Pitfall 5: Computational Cost (SHAP)

PROBLEM:
Kernel SHAP is expensive: O(2^n) for n features

SYMPTOMS:
- Explanation takes minutes
- Memory errors for large datasets
- Production latency issues

MITIGATION:
- Use Tree SHAP for tree models (O(TLD²))
- Limit number of background samples
- Precompute explanations offline
- Sample features for high-dimensional data
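
A sketch of shrinking the background set before Kernel SHAP, using two helpers that ship with the shap package (`X_train` and `model` are illustrative).

PYTHON SKETCH: Smaller SHAP background sets

import shap

background = shap.sample(X_train, 100)    # random subsample of the reference data
# or summarize it with weighted k-means centroids:
background = shap.kmeans(X_train, 50)

explainer = shap.KernelExplainer(model.predict, background)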

FAQ

Q: Are LIME and SHAP faithful to the model? A: Not perfectly. LIME is a local approximation that may miss non-linearities. SHAP is theoretically consistent but can give misleading attributions for correlated features. Always validate with domain knowledge and multiple methods.

Q: Can I use LIME/SHAP for deep learning? A: Yes, but with caveats. Kernel SHAP works with any model but is slow. Deep SHAP uses gradient-based approximations. For images, consider saliency maps or integrated gradients in addition to LIME.

Q: How many background samples do I need for SHAP? A: Typically 100-1000 samples. More is better for accuracy but slower. Diminishing returns after ~1000. Ensure background is representative of your data distribution.

Q: Do I need explanations for every prediction? A: Not necessarily. Consider: (1) High-stakes decisions need explanations, (2) Regulatory requirements may mandate logging, (3) On-demand explanations may suffice for low-risk cases.

Q: How do I explain to non-technical users? A: Focus on: (1) What factors mattered most, (2) Which way each factor pushed the decision, (3) What changes might lead to different outcomes. Avoid technical jargon like "SHAP values."

Q: Can explanations be gamed or manipulated? A: Yes. Adversarial examples exist for explanations. Someone could create inputs that give misleading explanations. Monitor for unusual patterns and use multiple explanation methods.


Conclusion

Interpretability is essential for responsible AI deployment. LIME and SHAP provide complementary approaches to understanding model predictions, each with distinct strengths.

Key Takeaways:

  1. LIME is fast and intuitive — Best for quick local explanations and non-technical stakeholders
  2. SHAP is rigorous and consistent — Best for compliance, debugging, and theoretical soundness
  3. Use both when possible — Different methods highlight different patterns
  4. Explanations are approximations — Validate with domain knowledge
  5. Regulatory requirements are growing — Plan for explainability from the start

As AI systems become more prevalent in high-stakes decisions, interpretability moves from "nice to have" to essential requirement. LIME and SHAP are foundational tools in this landscape.


📚 Responsible AI Series

Part | Article                                   | Status
-----|-------------------------------------------|--------------
1    | Understanding AI Alignment                |
2    | RLHF & Constitutional AI                  |
3    | AI Interpretability with LIME & SHAP      | You are here
4    | Automated Red Teaming with PyRIT          | Coming Soon
5    | AI Runtime Governance & Circuit Breakers  | Coming Soon

← Previous: RLHF & Constitutional AI
Next →: Automated Red Teaming with PyRIT


🚀 Ready to Master Responsible AI?

Our training modules cover practical implementation of AI safety techniques, from prompt engineering to production governance.

📚 Explore Our Training Modules | Start Module 0


References:

  • Ribeiro, Singh, and Guestrin (2016). "Why Should I Trust You?": Explaining the Predictions of Any Classifier. KDD 2016.
  • Lundberg and Lee (2017). A Unified Approach to Interpreting Model Predictions. NeurIPS 2017.


Last Updated: January 29, 2026
Part 3 of the Responsible AI Engineering Series
