Understanding SHAP: A Complete Guide
What is SHAP?
SHAP (SHapley Additive exPlanations) explains machine learning predictions by decomposing them into contributions from each feature. It is based on Shapley values from cooperative game theory.
Step-by-Step Understanding
Step 1: The Basic Question
Given a model prediction, SHAP answers: "How much did each feature contribute to this specific prediction?"
Step 2: The Decomposition
Every prediction is broken down as:
f(x) = baseline + A_contribution + B_contribution + C_contribution
Where:
- f(x) = actual model prediction
- baseline = same for all predictions
- A_contribution = how much feature A helps/hurts
- B_contribution = how much feature B helps/hurts
- C_contribution = how much feature C helps/hurts
Step 3: The Baseline
The baseline is the arithmetic mean of all training predictions:
baseline = (1/n) × sum of all model predictions on training data
This is what the model predicts "on average".
💡 Key Insight: The baseline is the same for all explanations. Only the feature contributions change from prediction to prediction.
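A minimal sketch of this step in Python (the model and data here are made up for illustration):

```python
import numpy as np
from sklearn.linear_model import LinearRegression

rng = np.random.default_rng(0)
X_train = rng.normal(size=(100, 3))                       # toy training features
y_train = X_train @ np.array([50.0, 100.0, 200.0]) + 300  # toy target

model = LinearRegression().fit(X_train, y_train)

# The baseline ("expected value") is the mean prediction over the training data.
baseline = np.mean(model.predict(X_train))
print(baseline)
```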
Step 4: How to Get SHAP Values for Each Feature
To get the SHAP value for feature A:
- A alone: f({A}) - f({}) = "How much does A help vs nothing?"
- A with B: f({A,B}) - f({B}) = "How much does A help when B is present?"
- A with C: f({A,C}) - f({C}) = "How much does A help when C is present?"
- A with B,C: f({A,B,C}) - f({B,C}) = "How much does A help when B,C are present?"
- Average these 4 numbers (with the Shapley weights described below) = SHAP value for A
To get the SHAP value for feature B:
- B alone: f({B}) - f({}) = "How much does B help vs nothing?"
- B with A: f({A,B}) - f({A}) = "How much does B help when A is present?"
- B with C: f({B,C}) - f({C}) = "How much does B help when C is present?"
- B with A,C: f({A,B,C}) - f({A,C}) = "How much does B help when A,C are present?"
- Average these 4 numbers (same weights) = SHAP value for B
To get the SHAP value for feature C:
- C alone: f({C}) - f({}) = "How much does C help vs nothing?"
- C with A: f({A,C}) - f({A}) = "How much does C help when A is present?"
- C with B: f({B,C}) - f({B}) = "How much does C help when B is present?"
- C with A,B: f({A,B,C}) - f({A,B}) = "How much does C help when A,B are present?"
- Average these 4 numbers (same weights) = SHAP value for C
Note: strictly, the Shapley value is a weighted average over all orderings of the features, not a plain mean. With 3 features, the empty and two-feature coalitions each carry weight 1/3 and the single-feature coalitions weight 1/6. For the linear models below the distinction vanishes, because all 4 numbers turn out to be identical. A brute-force implementation follows.
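Here is a brute-force sketch of that computation with the exact Shapley weights, written from scratch for illustration. It masks absent features with their training means (one simple choice; see the masking discussion later in this guide), and its cost grows as 2^n, so it is only practical for a handful of features:

```python
import math
from itertools import combinations
import numpy as np

def shapley_values(f, x, background_means):
    """Exact Shapley values of f at x; absent features are set to their means."""
    n = len(x)

    def f_coalition(present):
        z = background_means.copy()
        for i in present:
            z[i] = x[i]           # features in the coalition keep their real values
        return f(z)

    phi = np.zeros(n)
    for i in range(n):
        others = [j for j in range(n) if j != i]
        for size in range(n):
            # Shapley weight for a coalition of this size
            w = math.factorial(size) * math.factorial(n - size - 1) / math.factorial(n)
            for S in combinations(others, size):
                phi[i] += w * (f_coalition(S + (i,)) - f_coalition(S))
    return phi

# Toy nonlinear model over 3 features
f = lambda z: z[0] * z[1] + z[2] ** 2
x = np.array([1.0, 2.0, 3.0])
means = np.array([0.5, 0.5, 0.5])
phi = shapley_values(f, x, means)
print(phi, phi.sum(), f(x) - f(means))   # the contributions sum to f(x) - baseline
```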
Linear Models: The Simple Case
Why Linear Models Are Special
For linear models, we don't need the complex averaging process! There's a direct formula.
Linear Model:
f(x) = coeff_A × A + coeff_B × B + coeff_C × C + intercept
SHAP Decomposition:
- Baseline: coeff_A × mean(A) + coeff_B × mean(B) + coeff_C × mean(C) + intercept
- A contribution: coeff_A × (A - mean(A))
- B contribution: coeff_B × (B - mean(B))
- C contribution: coeff_C × (C - mean(C))
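In code, the whole decomposition is a couple of lines of arithmetic. A sketch, where `coef`, `intercept`, `X_train`, and `x` are illustrative names for a fitted linear model's parameters, its training data, and the point being explained:

```python
import numpy as np

def linear_shap(coef, intercept, X_train, x):
    """Closed-form SHAP for a linear model f(x) = coef @ x + intercept."""
    means = X_train.mean(axis=0)
    baseline = coef @ means + intercept
    contributions = coef * (x - means)        # one SHAP value per feature
    # Additivity check: baseline + contributions recovers the prediction.
    assert np.isclose(baseline + contributions.sum(), coef @ x + intercept)
    return baseline, contributions
```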
How We Get the Linear Formula: Step by Step
Remember the general SHAP process? For feature A, we need to calculate:
- f({A}) - f({})
- f({A,B}) - f({B})
- f({A,C}) - f({C})
- f({A,B,C}) - f({B,C})
- Average these 4 contributions
Let's work this out for a linear model!
Model: f(A,B,C) = coeff_A × A + coeff_B × B + coeff_C × C + intercept
Step 1: Calculate each contribution
Contribution 1: f({A}) - f({})
- f({A}) = coeff_A × A + coeff_B × mean(B) + coeff_C × mean(C) + intercept
- f({}) = coeff_A × mean(A) + coeff_B × mean(B) + coeff_C × mean(C) + intercept
- Difference = coeff_A × A - coeff_A × mean(A) = coeff_A × (A - mean(A))
Contribution 2: f({A,B}) - f({B})
- f({A,B}) = coeff_A × A + coeff_B × B + coeff_C × mean(C) + intercept
- f({B}) = coeff_A × mean(A) + coeff_B × B + coeff_C × mean(C) + intercept
- Difference = coeff_A × A - coeff_A × mean(A) = coeff_A × (A - mean(A))
Contribution 3: f({A,C}) - f({C})
- f({A,C}) = coeff_A × A + coeff_B × mean(B) + coeff_C × C + intercept
- f({C}) = coeff_A × mean(A) + coeff_B × mean(B) + coeff_C × C + intercept
- Difference = coeff_A × A - coeff_A × mean(A) = coeff_A × (A - mean(A))
Contribution 4: f({A,B,C}) - f({B,C})
- f({A,B,C}) = coeff_A × A + coeff_B × B + coeff_C × C + intercept
- f({B,C}) = coeff_A × mean(A) + coeff_B × B + coeff_C × C + intercept
- Difference = coeff_A × A - coeff_A × mean(A) = coeff_A × (A - mean(A))
Step 2: Average the contributions
A_contribution = (1/4) × [coeff_A × (A - mean(A)) + coeff_A × (A - mean(A)) + coeff_A × (A - mean(A)) + coeff_A × (A - mean(A))]
= (1/4) × [4 × coeff_A × (A - mean(A))]
= coeff_A × (A - mean(A))
🎯 Amazing! All 4 contributions are identical for linear models, so the average is just coeff_A × (A - mean(A))! (The exact Shapley weighting gives the same answer here, since every term being averaged is identical.)
Why All 4 Contributions Are The Same
In every case, we're computing:
- "Model with A" minus "Model without A"
- The only difference is A vs mean(A)
- Everything else (B, C terms) cancels out perfectly!
- So we always get: coeff_A × (A - mean(A))
This is why linear models are special: Feature A's contribution doesn't depend on what other features are present!
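A quick numeric check of this claim, with arbitrary made-up coefficients (feature A is index 0):

```python
import numpy as np

coef = np.array([2.0, -1.0, 0.5])        # arbitrary coeff_A, coeff_B, coeff_C
intercept = 10.0
means = np.array([1.0, 4.0, -2.0])       # made-up training means
x = np.array([3.0, 0.0, 5.0])            # the point being explained

def with_coalition(present):
    """Evaluate the linear model with present features at x, the rest at means."""
    z = means.copy()
    for i in present:
        z[i] = x[i]
    return coef @ z + intercept

# The four marginal contributions of feature A:
diffs = [with_coalition(S | {0}) - with_coalition(S)
         for S in [set(), {1}, {2}, {1, 2}]]
print(diffs)                              # four identical numbers
print(coef[0] * (x[0] - means[0]))        # the closed-form value: 4.0
```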
Step-by-Step Linear Example
Model: Price = 50 × bedrooms + 100 × bathrooms + 200 × garage + 300
Training Data Averages:
- mean(bedrooms) = 2.5
- mean(bathrooms) = 1.8
- mean(garage) = 0.6
House to Explain:
- bedrooms = 3
- bathrooms = 2
- garage = 1
Model Prediction:
Price = 50×3 + 100×2 + 200×1 + 300 = 150 + 200 + 200 + 300 = 850
SHAP Breakdown:
- Baseline: 50×2.5 + 100×1.8 + 200×0.6 + 300 = 125 + 180 + 120 + 300 = 725
- Bedrooms contribution: 50×(3-2.5) = 50×0.5 = 25
- Bathrooms contribution: 100×(2-1.8) = 100×0.2 = 20
- Garage contribution: 200×(1-0.6) = 200×0.4 = 80
Verification:
725 + 25 + 20 + 80 = 850 ✓
Interpretation:
- "The model predicts 725 for an average house"
- "Having 3 bedrooms instead of 2.5 adds 25 to the price"
- "Having 2 bathrooms instead of 1.8 adds 20 to the price"
- "Having a garage instead of 0.6 probability adds 80 to the price"
- "Total: 725 + 25 + 20 + 80 = 850"
💡 Key Insight: For linear models, SHAP is just coefficient × (actual_value - average_value). The coefficient tells you the rate of change, and the distance from average tells you how much to apply it!
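The whole house example fits in a few lines of NumPy, with the additivity check at the end:

```python
import numpy as np

coef = np.array([50.0, 100.0, 200.0])     # bedrooms, bathrooms, garage
intercept = 300.0
means = np.array([2.5, 1.8, 0.6])         # training averages
house = np.array([3.0, 2.0, 1.0])         # the house to explain

baseline = coef @ means + intercept       # 725.0
contrib = coef * (house - means)          # [25., 20., 80.]
prediction = coef @ house + intercept     # 850.0
assert np.isclose(baseline + contrib.sum(), prediction)
print(baseline, contrib, prediction)
```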
When computing f({A}), what do we do with B and C?
We have 3 options:
- Option 1: Set B=0, C=0
- Option 2: Set B=mean(B), C=mean(C)
- Option 3: Average over all possible B,C values from training data
CRITICAL: Whatever option you choose, use it consistently for ALL calculations! In practice, standard SHAP implementations use Option 3, averaging over a background dataset; for a linear model, Options 2 and 3 agree, because averaging a linear function over B and C is the same as plugging in their means.
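A sketch of Option 3 (the function and argument names are illustrative):

```python
import numpy as np

def f_coalition(model, x, present, X_background):
    """Features in `present` are fixed to x's values; the rest vary over
    the background rows and get averaged out (Option 3)."""
    Z = X_background.copy()
    for i in present:
        Z[:, i] = x[i]                    # overwrite this feature in every row
    return model.predict(Z).mean()
```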
Application to Different Models
Linear Models
Model: f(x) = coeff_A × A + coeff_B × B + intercept
SHAP Calculation (Simple!):
- Baseline: coeff_A × mean(A) + coeff_B × mean(B) + intercept
- A contribution: coeff_A × (A - mean(A))
- B contribution: coeff_B × (B - mean(B))
Example:
Model: Price = 3×bedrooms + 50×sqft + 100
Training averages: bedrooms=2.5, sqft=15
House: 3 bedrooms, 20 sqft
Prediction: 3×3 + 50×20 + 100 = 1109
SHAP breakdown:
- Baseline: 3×2.5 + 50×15 + 100 = 857.5
- Bedrooms contribution: 3×(3-2.5) = 1.5
- Sqft contribution: 50×(20-15) = 250
- Check: 857.5 + 1.5 + 250 = 1109 ✓
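In practice the shap package does this bookkeeping for you. A sketch of its LinearExplainer on made-up data (so the numbers will differ from the example above; API details can vary between shap versions):

```python
import numpy as np
import shap
from sklearn.linear_model import LinearRegression

rng = np.random.default_rng(0)
X_train = rng.normal(size=(200, 2))                 # toy bedrooms/sqft stand-ins
y_train = 3 * X_train[:, 0] + 50 * X_train[:, 1] + 100

model = LinearRegression().fit(X_train, y_train)
explainer = shap.LinearExplainer(model, X_train)    # background data sets the baseline
sv = explainer.shap_values(X_train[:1])             # one row in, one value per feature out

# Additivity: baseline + contributions recovers the model's prediction
print(explainer.expected_value + sv.sum())
print(model.predict(X_train[:1])[0])
```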
Nonlinear Models
Model: Any complex model (trees, neural networks, etc.)
SHAP Calculation (Complex!):
Use the step-by-step process from Step 4 above, via specialized algorithms (exact where the model structure allows, approximate otherwise):
- TreeSHAP: Fast and exact for tree-based models
- KernelSHAP: Sampling approximation for any model
- DeepSHAP: For neural networks
Example with Random Forest:
Model: Complex ensemble of decision trees
House: 3 bedrooms, 20 sqft
Prediction: $1,200 (from complex tree logic)
SHAP breakdown:
- Baseline: $950 (average of all training predictions)
- Bedrooms contribution: +$180 (calculated via TreeSHAP)
- Sqft contribution: +$70 (calculated via TreeSHAP)
- Check: $950 + $180 + $70 = $1,200 ✓
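The same additivity check with a real random forest, using the shap package's TreeExplainer on synthetic data (so the dollar figures above are not reproduced):

```python
import numpy as np
import shap
from sklearn.ensemble import RandomForestRegressor

rng = np.random.default_rng(0)
X = rng.normal(size=(500, 2))                        # toy features
y = X[:, 0] * X[:, 1] + X[:, 1] ** 2                 # deliberately nonlinear target

model = RandomForestRegressor(n_estimators=50, random_state=0).fit(X, y)
explainer = shap.TreeExplainer(model)                # exact TreeSHAP for tree ensembles
sv = explainer.shap_values(X[:1])

# baseline + per-feature contributions recovers the prediction (up to float error)
print(explainer.expected_value + sv.sum())
print(model.predict(X[:1])[0])
```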
Key Differences Between Linear and Nonlinear
Linear Models
- ✅ Simple formula: A_contribution = coefficient × (A - mean(A))
- ✅ Exact calculation: No approximation needed
- ✅ Fast: Instant computation
- ✅ Interpretable: Direct relationship to model coefficients
Nonlinear Models
- ⚠️ Complex calculation: Need to try all feature combinations
- ⚠️ Approximation: Often uses sampling or other tricks
- ⚠️ Slower: Can be computationally expensive
- ✅ Flexible: Works with any model type
The Universal SHAP Process
For ANY model type:
- Train your model (linear, tree, neural network, etc.)
- Calculate the baseline: average all training predictions
- For each data point to explain:
  - Calculate A_contribution using the method above
  - Calculate B_contribution using the method above
  - Calculate C_contribution using the method above
- Verify: baseline + A_contribution + B_contribution + C_contribution = actual prediction
🎯 Bottom Line: SHAP tells you exactly how much each feature A, B, C contributed to each prediction!
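For a model with no specialized explainer, KernelSHAP runs this same process by sampling coalitions. A sketch (the model is a stand-in; any callable that maps feature rows to predictions works):

```python
import numpy as np
import shap
from sklearn.ensemble import RandomForestRegressor

rng = np.random.default_rng(0)
X = rng.normal(size=(300, 2))
y = X[:, 0] * X[:, 1]                                # nonlinear toy target
model = RandomForestRegressor(n_estimators=30, random_state=0).fit(X, y)

background = shap.sample(X, 50)                      # small background keeps it tractable
explainer = shap.KernelExplainer(model.predict, background)
sv = explainer.shap_values(X[:1], nsamples=200)      # sampled coalitions: approximate
print(explainer.expected_value + sv.sum())           # ≈ the prediction below
print(model.predict(X[:1])[0])
```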
What SHAP Values Tell You
What you can learn:
- ✅ Which feature matters most: Is A_contribution bigger than B_contribution?
- ✅ Direction of impact: Is A_contribution positive (helps) or negative (hurts)?
- ✅ Exact magnitude: A_contribution = +$500 means feature A adds $500 to the prediction
- ✅ Feature ranking: Sort features by absolute contribution size (see the snippet below)
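Ranking is one line once you have the values. Using the house example's contributions:

```python
import numpy as np

sv = np.array([25.0, 20.0, 80.0])                    # SHAP values from the house example
names = ["bedrooms", "bathrooms", "garage"]
for i in np.argsort(-np.abs(sv)):                    # sort by absolute contribution
    print(f"{names[i]}: {sv[i]:+.1f}")
# garage: +80.0, bedrooms: +25.0, bathrooms: +20.0
```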
What you cannot learn:
- ❌ Causation: SHAP describes the model's behavior, not real-world cause and effect
- ❌ Global importance: each explanation is specific to one prediction (averaging absolute SHAP values over many predictions is a separate, global analysis)
- ❌ Feature interactions: a single SHAP value per feature folds interaction effects into the participating features instead of reporting them separately
💬 Remember: SHAP gives you baseline + A_contribution + B_contribution + C_contribution = prediction. One baseline + one value per feature per data point!