Understanding SHAP: A Complete Guide
What is SHAP?
SHAP (SHapley Additive exPlanations) explains machine learning predictions by decomposing them into contributions from each feature. It is based on Shapley values from cooperative game theory.
Step-by-Step Understanding
Step 1: The Basic Question
Given a model prediction, SHAP answers: "How much did each feature contribute to this specific prediction?"
Step 2: The Decomposition
Every prediction is broken down as:
f(x) = baseline + A_contribution + B_contribution + C_contribution
Where:
- f(x) = actual model prediction
- baseline = same for all predictions
- A_contribution = how much feature A helps/hurts
- B_contribution = how much feature B helps/hurts
- C_contribution = how much feature C helps/hurts
Step 3: The Baseline
The baseline is the arithmetic mean of all training predictions:
baseline = (1/n) × sum of all model predictions on training data
This is what the model predicts "on average".
💡 Key Insight: The baseline is the same for all explanations. Only the feature contributions change from prediction to prediction.
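A minimal sketch of this step in Python (the model and data here are made up for illustration):

```python
import numpy as np
from sklearn.linear_model import LinearRegression

rng = np.random.default_rng(0)
X_train = rng.normal(size=(100, 3))                       # toy training features
y_train = X_train @ np.array([50.0, 100.0, 200.0]) + 300  # toy target

model = LinearRegression().fit(X_train, y_train)

# The baseline ("expected value") is the mean prediction over the training data.
baseline = np.mean(model.predict(X_train))
print(baseline)
```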
Step 4: How to Get SHAP Values for Each Feature
To get the SHAP value for feature A:
- A alone: f({A}) - f({}) = "How much does A help vs nothing?"
- A with B: f({A,B}) - f({B}) = "How much does A help when B is present?"
- A with C: f({A,C}) - f({C}) = "How much does A help when C is present?"
- A with B,C: f({A,B,C}) - f({B,C}) = "How much does A help when B,C are present?"
- Average these 4 numbers (with the Shapley weights described below) = SHAP value for A
To get the SHAP value for feature B:
- B alone: f({B}) - f({}) = "How much does B help vs nothing?"
- B with A: f({A,B}) - f({A}) = "How much does B help when A is present?"
- B with C: f({B,C}) - f({C}) = "How much does B help when C is present?"
- B with A,C: f({A,B,C}) - f({A,C}) = "How much does B help when A,C are present?"
- Average these 4 numbers (same weights) = SHAP value for B
To get the SHAP value for feature C:
- C alone: f({C}) - f({}) = "How much does C help vs nothing?"
- C with A: f({A,C}) - f({A}) = "How much does C help when A is present?"
- C with B: f({B,C}) - f({B}) = "How much does C help when B is present?"
- C with A,B: f({A,B,C}) - f({A,B}) = "How much does C help when A,B are present?"
- Average these 4 numbers (same weights) = SHAP value for C
Note: strictly, the Shapley value is a weighted average over all orderings of the features, not a plain mean. With 3 features, the empty and two-feature coalitions each carry weight 1/3 and the single-feature coalitions weight 1/6. For the linear models below the distinction vanishes, because all 4 numbers turn out to be identical. A brute-force implementation follows.
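Here is a brute-force sketch of that computation with the exact Shapley weights, written from scratch for illustration. It masks absent features with their training means (one simple choice; see the masking discussion later in this guide), and its cost grows as 2^n, so it is only practical for a handful of features:

```python
import math
from itertools import combinations
import numpy as np

def shapley_values(f, x, background_means):
    """Exact Shapley values of f at x; absent features are set to their means."""
    n = len(x)

    def f_coalition(present):
        z = background_means.copy()
        for i in present:
            z[i] = x[i]           # features in the coalition keep their real values
        return f(z)

    phi = np.zeros(n)
    for i in range(n):
        others = [j for j in range(n) if j != i]
        for size in range(n):
            # Shapley weight for a coalition of this size
            w = math.factorial(size) * math.factorial(n - size - 1) / math.factorial(n)
            for S in combinations(others, size):
                phi[i] += w * (f_coalition(S + (i,)) - f_coalition(S))
    return phi

# Toy nonlinear model over 3 features
f = lambda z: z[0] * z[1] + z[2] ** 2
x = np.array([1.0, 2.0, 3.0])
means = np.array([0.5, 0.5, 0.5])
phi = shapley_values(f, x, means)
print(phi, phi.sum(), f(x) - f(means))   # the contributions sum to f(x) - baseline
```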
Linear Models: The Simple Case
Why Linear Models Are Special
For linear models, we don't need the complex averaging process! There's a direct formula.
Linear Model:
f(x) = coeff_A × A + coeff_B × B + coeff_C × C + intercept
SHAP Decomposition:
- Baseline: coeff_A × mean(A) + coeff_B × mean(B) + coeff_C × mean(C) + intercept
- A contribution: coeff_A × (A - mean(A))
- B contribution: coeff_B × (B - mean(B))
- C contribution: coeff_C × (C - mean(C))
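In code, the whole decomposition is a couple of lines of arithmetic. A sketch, where `coef`, `intercept`, `X_train`, and `x` are illustrative names for a fitted linear model's parameters, its training data, and the point being explained:

```python
import numpy as np

def linear_shap(coef, intercept, X_train, x):
    """Closed-form SHAP for a linear model f(x) = coef @ x + intercept."""
    means = X_train.mean(axis=0)
    baseline = coef @ means + intercept
    contributions = coef * (x - means)        # one SHAP value per feature
    # Additivity check: baseline + contributions recovers the prediction.
    assert np.isclose(baseline + contributions.sum(), coef @ x + intercept)
    return baseline, contributions
```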
How We Get the Linear Formula: Step by Step
Remember the general SHAP process? For feature A, we need to calculate:
- f({A}) - f({})
- f({A,B}) - f({B})
- f({A,C}) - f({C})
- f({A,B,C}) - f({B,C})
- Average these 4 contributions
Let's work this out for a linear model!
Model: f(A,B,C) = coeff_A × A + coeff_B × B + coeff_C × C + intercept
Step 1: Calculate each contribution
Contribution 1: f({A}) - f({})
- f({A}) = coeff_A × A + coeff_B × mean(B) + coeff_C × mean(C) + intercept
- f({}) = coeff_A × mean(A) + coeff_B × mean(B) + coeff_C × mean(C) + intercept
- Difference = coeff_A × A - coeff_A × mean(A) = coeff_A × (A - mean(A))
Contribution 2: f({A,B}) - f({B})
- f({A,B}) = coeff_A × A + coeff_B × B + coeff_C × mean(C) + intercept
- f({B}) = coeff_A × mean(A) + coeff_B × B + coeff_C × mean(C) + intercept
- Difference = coeff_A × A - coeff_A × mean(A) = coeff_A × (A - mean(A))
Contribution 3: f({A,C}) - f({C})
- f({A,C}) = coeff_A × A + coeff_B × mean(B) + coeff_C × C + intercept
- f({C}) = coeff_A × mean(A) + coeff_B × mean(B) + coeff_C × C + intercept
- Difference = coeff_A × A - coeff_A × mean(A) = coeff_A × (A - mean(A))
Contribution 4: f({A,B,C}) - f({B,C})
- f({A,B,C}) = coeff_A × A + coeff_B × B + coeff_C × C + intercept
- f({B,C}) = coeff_A × mean(A) + coeff_B × B + coeff_C × C + intercept
- Difference = coeff_A × A - coeff_A × mean(A) = coeff_A × (A - mean(A))
Step 2: Average the contributions
A_contribution = (1/4) × [coeff_A × (A - mean(A)) + coeff_A × (A - mean(A)) + coeff_A × (A - mean(A)) + coeff_A × (A - mean(A))]
= (1/4) × [4 × coeff_A × (A - mean(A))]
= coeff_A × (A - mean(A))
🎯 Amazing! All 4 contributions are identical for linear models, so the average is just coeff_A × (A - mean(A))! (The exact Shapley weighting gives the same answer here, since every term being averaged is identical.)
Why All 4 Contributions Are The Same
In every case, we're computing:
- "Model with A" minus "Model without A"
- The only difference is A vs mean(A)
- Everything else (B, C terms) cancels out perfectly!
- So we always get: coeff_A × (A - mean(A))
This is why linear models are special: Feature A's contribution doesn't depend on what other features are present!
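A quick numeric check of this claim, with arbitrary made-up coefficients (feature A is index 0):

```python
import numpy as np

coef = np.array([2.0, -1.0, 0.5])        # arbitrary coeff_A, coeff_B, coeff_C
intercept = 10.0
means = np.array([1.0, 4.0, -2.0])       # made-up training means
x = np.array([3.0, 0.0, 5.0])            # the point being explained

def with_coalition(present):
    """Evaluate the linear model with present features at x, the rest at means."""
    z = means.copy()
    for i in present:
        z[i] = x[i]
    return coef @ z + intercept

# The four marginal contributions of feature A:
diffs = [with_coalition(S | {0}) - with_coalition(S)
         for S in [set(), {1}, {2}, {1, 2}]]
print(diffs)                              # four identical numbers
print(coef[0] * (x[0] - means[0]))        # the closed-form value: 4.0
```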
Step-by-Step Linear Example
Model: Price = 50 × bedrooms + 100 × bathrooms + 200 × garage + 300
Training Data Averages:
- mean(bedrooms) = 2.5
- mean(bathrooms) = 1.8
- mean(garage) = 0.6
House to Explain:
- bedrooms = 3
- bathrooms = 2
- garage = 1
Model Prediction:
Price = 50×3 + 100×2 + 200×1 + 300 = 150 + 200 + 200 + 300 = 850
SHAP Breakdown:
- Baseline: 50×2.5 + 100×1.8 + 200×0.6 + 300 = 125 + 180 + 120 + 300 = 725
- Bedrooms contribution: 50×(3-2.5) = 50×0.5 = 25
- Bathrooms contribution: 100×(2-1.8) = 100×0.2 = 20
- Garage contribution: 200×(1-0.6) = 200×0.4 = 80
Verification:
725 + 25 + 20 + 80 = 850 ✓
Interpretation:
- "The model predicts 725 for an average house"
- "Having 3 bedrooms instead of 2.5 adds 25 to the price"
- "Having 2 bathrooms instead of 1.8 adds 20 to the price"
- "Having a garage instead of 0.6 probability adds 80 to the price"
- "Total: 725 + 25 + 20 + 80 = 850"
💡 Key Insight: For linear models, SHAP is just coefficient × (actual_value - average_value). The coefficient tells you the rate of change, and the distance from average tells you how much to apply it!
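The whole house example fits in a few lines of NumPy, with the additivity check at the end:

```python
import numpy as np

coef = np.array([50.0, 100.0, 200.0])     # bedrooms, bathrooms, garage
intercept = 300.0
means = np.array([2.5, 1.8, 0.6])         # training averages
house = np.array([3.0, 2.0, 1.0])         # the house to explain

baseline = coef @ means + intercept       # 725.0
contrib = coef * (house - means)          # [25., 20., 80.]
prediction = coef @ house + intercept     # 850.0
assert np.isclose(baseline + contrib.sum(), prediction)
print(baseline, contrib, prediction)
```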
When computing f({A}), what do we do with B and C?
We have 3 options:
- Option 1: Set B=0, C=0
- Option 2: Set B=mean(B), C=mean(C)
- Option 3: Average over all possible B,C values from training data
CRITICAL: Whatever option you choose, use it consistently for ALL calculations! In practice, standard SHAP implementations use Option 3, averaging over a background dataset; for a linear model, Options 2 and 3 agree, because averaging a linear function over B and C is the same as plugging in their means.
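A sketch of Option 3 (the function and argument names are illustrative):

```python
import numpy as np

def f_coalition(model, x, present, X_background):
    """Features in `present` are fixed to x's values; the rest vary over
    the background rows and get averaged out (Option 3)."""
    Z = X_background.copy()
    for i in present:
        Z[:, i] = x[i]                    # overwrite this feature in every row
    return model.predict(Z).mean()
```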
Application to Different Models
Linear Models
Model: f(x) = coeff_A × A + coeff_B × B + intercept
SHAP Calculation (Simple!):
- Baseline: coeff_A × mean(A) + coeff_B × mean(B) + intercept
- A contribution: coeff_A × (A - mean(A))
- B contribution: coeff_B × (B - mean(B))
Example:
Model: Price = 3×bedrooms + 50×sqft + 100
Training averages: bedrooms=2.5, sqft=15
House: 3 bedrooms, 20 sqft
Prediction: 3×3 + 50×20 + 100 = 1109
SHAP breakdown:
- Baseline: 3×2.5 + 50×15 + 100 = 857.5
- Bedrooms contribution: 3×(3-2.5) = 1.5
- Sqft contribution: 50×(20-15) = 250
- Check: 857.5 + 1.5 + 250 = 1109 ✓
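In practice the shap package does this bookkeeping for you. A sketch of its LinearExplainer on made-up data (so the numbers will differ from the example above; API details can vary between shap versions):

```python
import numpy as np
import shap
from sklearn.linear_model import LinearRegression

rng = np.random.default_rng(0)
X_train = rng.normal(size=(200, 2))                 # toy bedrooms/sqft stand-ins
y_train = 3 * X_train[:, 0] + 50 * X_train[:, 1] + 100

model = LinearRegression().fit(X_train, y_train)
explainer = shap.LinearExplainer(model, X_train)    # background data sets the baseline
sv = explainer.shap_values(X_train[:1])             # one row in, one value per feature out

# Additivity: baseline + contributions recovers the model's prediction
print(explainer.expected_value + sv.sum())
print(model.predict(X_train[:1])[0])
```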
Nonlinear Models
Model: Any complex model (trees, neural networks, etc.)
SHAP Calculation (Complex!):
Use the step-by-step process from Step 4 above, via specialized algorithms (exact where the model structure allows, approximate otherwise):
- TreeSHAP: Fast and exact for tree-based models
- KernelSHAP: Sampling approximation for any model
- DeepSHAP: For neural networks
Example with Random Forest:
Model: Complex ensemble of decision trees
House: 3 bedrooms, 20 sqft
Prediction: $1,200 (from complex tree logic)
SHAP breakdown:
- Baseline: $950 (average of all training predictions)
- Bedrooms contribution: +$180 (calculated via TreeSHAP)
- Sqft contribution: +$70 (calculated via TreeSHAP)
- Check: $950 + $180 + $70 = $1,200 ✓
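The same additivity check with a real random forest, using the shap package's TreeExplainer on synthetic data (so the dollar figures above are not reproduced):

```python
import numpy as np
import shap
from sklearn.ensemble import RandomForestRegressor

rng = np.random.default_rng(0)
X = rng.normal(size=(500, 2))                        # toy features
y = X[:, 0] * X[:, 1] + X[:, 1] ** 2                 # deliberately nonlinear target

model = RandomForestRegressor(n_estimators=50, random_state=0).fit(X, y)
explainer = shap.TreeExplainer(model)                # exact TreeSHAP for tree ensembles
sv = explainer.shap_values(X[:1])

# baseline + per-feature contributions recovers the prediction (up to float error)
print(explainer.expected_value + sv.sum())
print(model.predict(X[:1])[0])
```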
Key Differences Between Linear and Nonlinear
Linear Models
- ✅ Simple formula: A_contribution = coefficient × (A - mean(A))
- ✅ Exact calculation: No approximation needed
- ✅ Fast: Instant computation
- ✅ Interpretable: Direct relationship to model coefficients
Nonlinear Models
- ⚠️ Complex calculation: Need to try all feature combinations
- ⚠️ Approximation: Often uses sampling or other tricks
- ⚠️ Slower: Can be computationally expensive
- ✅ Flexible: Works with any model type
The Universal SHAP Process
For ANY model type:
- Train your model (linear, tree, neural network, etc.)
- Calculate the baseline: average all training predictions
- For each data point to explain:
  - Calculate A_contribution using the method above
  - Calculate B_contribution using the method above
  - Calculate C_contribution using the method above
- Verify: baseline + A_contribution + B_contribution + C_contribution = actual prediction
🎯 Bottom Line: SHAP tells you exactly how much each feature A, B, C contributed to each prediction!
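For a model with no specialized explainer, KernelSHAP runs this same process by sampling coalitions. A sketch (the model is a stand-in; any callable that maps feature rows to predictions works):

```python
import numpy as np
import shap
from sklearn.ensemble import RandomForestRegressor

rng = np.random.default_rng(0)
X = rng.normal(size=(300, 2))
y = X[:, 0] * X[:, 1]                                # nonlinear toy target
model = RandomForestRegressor(n_estimators=30, random_state=0).fit(X, y)

background = shap.sample(X, 50)                      # small background keeps it tractable
explainer = shap.KernelExplainer(model.predict, background)
sv = explainer.shap_values(X[:1], nsamples=200)      # sampled coalitions: approximate
print(explainer.expected_value + sv.sum())           # ≈ the prediction below
print(model.predict(X[:1])[0])
```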
What SHAP Values Tell You
What you can learn:
- ✅ Which feature matters most: Is A_contribution bigger than B_contribution?
- ✅ Direction of impact: Is A_contribution positive (helps) or negative (hurts)?
- ✅ Exact magnitude: A_contribution = +$500 means feature A adds $500 to the prediction
- ✅ Feature ranking: Sort features by absolute contribution size (see the snippet below)
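Ranking is one line once you have the values. Using the house example's contributions:

```python
import numpy as np

sv = np.array([25.0, 20.0, 80.0])                    # SHAP values from the house example
names = ["bedrooms", "bathrooms", "garage"]
for i in np.argsort(-np.abs(sv)):                    # sort by absolute contribution
    print(f"{names[i]}: {sv[i]:+.1f}")
# garage: +80.0, bedrooms: +25.0, bathrooms: +20.0
```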
What you cannot learn:
- ❌ Causation: SHAP describes the model's behavior, not real-world cause and effect
- ❌ Global importance: each explanation is specific to one prediction (averaging absolute SHAP values over many predictions is a separate, global analysis)
- ❌ Feature interactions: a single SHAP value per feature folds interaction effects into the participating features instead of reporting them separately
💬 Remember: SHAP gives you baseline + A_contribution + B_contribution + C_contribution = prediction. One baseline + one value per feature per data point!