📋 Overview
Goal: Detect different "regimes" (states) in a time series where each regime has different statistical properties.
- HIGH regime: Mean accuracy ≈ 0.83
- MED regime: Mean accuracy ≈ 0.65
- LOW regime: Mean accuracy ≈ 0.46
The challenge: We don't observe which regime each time point belongs to - it's "hidden"! We only see the accuracy values.
❓ The Problem Setup
What We Have:
- Time series of values: [0.82, 0.85, 0.83, 0.45, 0.48, 0.44, 0.81, 0.84, ...]
- We want to find n regimes (e.g., n=3 for HIGH, MED, LOW)
What We Need to Learn:
| Parameter | Meaning |
|---|---|
| μ (mu) | Mean value for each regime |
| σ² (sigma²) | Variance for each regime (how spread out) |
| Transitions | Probability of switching from one regime to another |
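In code, these parameters are naturally held as NumPy arrays. The numbers below are illustrative placeholders, not learned values:

```python
import numpy as np

# Illustrative (not learned) parameters for a 3-regime Gaussian HMM
n_regimes = 3
mu = np.array([0.83, 0.65, 0.46])        # mean accuracy per regime (HIGH, MED, LOW)
sigma2 = np.array([0.01, 0.02, 0.01])    # variance per regime
# trans[i, j] = P(regime j at t+1 | regime i at t); each row sums to 1
trans = np.array([[0.90, 0.05, 0.05],
                  [0.10, 0.80, 0.10],
                  [0.05, 0.05, 0.90]])
```

The row-stochastic constraint on `trans` (rows sum to 1) is what makes it a valid transition matrix.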
🔄 The EM Algorithm (How It Works)
Initialize (Random) → E-Step (Calculate Probs) → M-Step (Update Params) → Repeat (Until Converge)
Step 1: Initialization
Start with random guesses for all parameters:
- Random μ for each regime (e.g., μ_HIGH = 0.75, μ_MED = 0.60, μ_LOW = 0.50)
- Random σ² for each regime
- Random transition probabilities
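A minimal random-initialization sketch (the function name `random_init` is hypothetical; spreading the means across the observed range is a common trick to keep initial regimes distinct):

```python
import numpy as np

rng = np.random.default_rng(0)

def random_init(values, n_regimes):
    """Random starting parameters for EM."""
    lo, hi = values.min(), values.max()
    mu = rng.uniform(lo, hi, size=n_regimes)        # random means in the data range
    sigma2 = np.full(n_regimes, values.var())       # start from the overall variance
    trans = rng.uniform(size=(n_regimes, n_regimes))
    trans /= trans.sum(axis=1, keepdims=True)       # normalize rows into probabilities
    return mu, sigma2, trans

values = np.array([0.82, 0.85, 0.83, 0.45, 0.48, 0.44, 0.81, 0.84])
mu0, sigma2_0, trans0 = random_init(values, 3)
```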
Step 2: E-Step (Expectation)
Calculate the probability that each value belongs to each regime. First, evaluate each regime's Gaussian density at the value (these are densities, so they can exceed 1):
- P(0.82 | HIGH with μ=0.80, σ²=0.01) ≈ 3.91 (close to the mean → high!)
- P(0.82 | MED with μ=0.65, σ²=0.02) ≈ 1.37 (medium distance)
- P(0.82 | LOW with μ=0.45, σ²=0.01) ≈ 0.004 (far from the mean → low!)
Then normalize by their sum (3.91 + 1.37 + 0.004 ≈ 5.28):
- P(HIGH | 0.82) = 3.91 / 5.28 ≈ 74%
- P(MED | 0.82) = 1.37 / 5.28 ≈ 26%
- P(LOW | 0.82) = 0.004 / 5.28 ≈ 0.1%
Do this for every value in the time series!
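The single-value E-step calculation above can be reproduced directly. (Note this is the simplified per-point version; a full HMM E-step also folds in the transition probabilities via the forward-backward algorithm.)

```python
import numpy as np

def gaussian_pdf(x, mu, sigma2):
    """Gaussian density: can exceed 1 for small variances."""
    return np.exp(-(x - mu) ** 2 / (2 * sigma2)) / np.sqrt(2 * np.pi * sigma2)

mu = np.array([0.80, 0.65, 0.45])      # HIGH, MED, LOW
sigma2 = np.array([0.01, 0.02, 0.01])

x = 0.82
dens = gaussian_pdf(x, mu, sigma2)     # densities, ≈ [3.91, 1.37, 0.004]
resp = dens / dens.sum()               # normalized responsibilities, ≈ [0.74, 0.26, 0.001]
```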
Step 3: M-Step (Maximization)
Update parameters using the probabilities (responsibilities) from the E-step. For example, the new μ_HIGH is a weighted average of all values, weighted by each point's probability of being HIGH (the percentages below are illustrative):
Time 0 (0.82): 40% HIGH → contributes 0.82 × 0.40 = 0.328
Time 1 (0.85): 50% HIGH → contributes 0.85 × 0.50 = 0.425
Time 2 (0.45): 10% HIGH → contributes 0.45 × 0.10 = 0.045
Time 3 (0.48): 20% HIGH → contributes 0.48 × 0.20 = 0.096
Time 4 (0.81): 60% HIGH → contributes 0.81 × 0.60 = 0.486
μ_HIGH = (0.328 + 0.425 + 0.045 + 0.096 + 0.486) / (0.40 + 0.50 + 0.10 + 0.20 + 0.60)
= 1.38 / 1.80 = 0.767
Similarly update σ² and transition probabilities.
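The weighted-average update above, written out with the same toy numbers (the variance update uses the same responsibilities):

```python
import numpy as np

# Toy numbers from the walkthrough: values and their P(HIGH) responsibilities
values = np.array([0.82, 0.85, 0.45, 0.48, 0.81])
gamma_high = np.array([0.40, 0.50, 0.10, 0.20, 0.60])

# Weighted mean: sum of (value x responsibility) / sum of responsibilities
mu_high = np.sum(gamma_high * values) / np.sum(gamma_high)     # 1.38 / 1.80 ≈ 0.767

# Weighted variance uses the same responsibilities as weights
sigma2_high = np.sum(gamma_high * (values - mu_high) ** 2) / np.sum(gamma_high)
```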
Step 4: Repeat Until Convergence
Go back to the E-step with the new parameters and repeat:
- Early iterations: Big changes in probabilities
- Later iterations: Smaller and smaller changes
- Convergence: Probabilities stabilize (barely change)
When EM converges, you end up with:
- Final stable μ and σ² for each regime
- Final transition probabilities
- Sharp probability assignments like: Time 0 is 98% HIGH, Time 3 is 95% LOW
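Putting the steps together, here is a minimal EM loop sketch. It uses the simplified per-point E-step (ignoring transitions; a real HMM fit would run forward-backward here), and the function name `em_fit` is hypothetical:

```python
import numpy as np

def em_fit(values, n_regimes, max_iter=200, tol=1e-6, seed=0):
    """Simplified EM: Gaussian responsibilities per point, no transition matrix."""
    rng = np.random.default_rng(seed)
    mu = rng.uniform(values.min(), values.max(), n_regimes)
    sigma2 = np.full(n_regimes, values.var() + 1e-6)
    prev_ll = -np.inf
    for _ in range(max_iter):
        # E-step: density of each value under each regime, normalized per row
        dens = (np.exp(-(values[:, None] - mu) ** 2 / (2 * sigma2))
                / np.sqrt(2 * np.pi * sigma2))
        gamma = dens / dens.sum(axis=1, keepdims=True)
        # M-step: responsibility-weighted means and variances
        nk = gamma.sum(axis=0) + 1e-12
        mu = (gamma * values[:, None]).sum(axis=0) / nk
        sigma2 = (gamma * (values[:, None] - mu) ** 2).sum(axis=0) / nk + 1e-6
        # Convergence: stop when the log-likelihood barely improves
        ll = np.log(dens.sum(axis=1)).sum()
        if ll - prev_ll < tol:
            break
        prev_ll = ll
    return mu, sigma2, ll

values = np.array([0.82, 0.85, 0.83, 0.45, 0.48, 0.44, 0.81, 0.84])
mu, sigma2, ll = em_fit(values, 2)
```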
📊 Model Selection with BIC
How do we know if 3 regimes is better than 2 or 4? Use BIC (Bayesian Information Criterion)!
Two Parts:
| Part | What It Measures | Effect |
|---|---|---|
| -2 × log_likelihood | How well model fits data | Lower = better fit (good!) |
| n_params × log(n_samples) | Model complexity | More params = higher penalty (bad!) |
Example with illustrative values:
- n=2: BIC=150 (underfits, too simple)
- n=3: BIC=124 ✅ BEST! (good balance)
- n=4: BIC=135 (overfits, too complex)
- n=5: BIC=148 (definitely overfitting)
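The BIC formula is easy to compute once you count the model's free parameters. The log-likelihoods below are made-up numbers purely to show the selection logic (they don't match the figures above), and the parameter count assumes a 1-D Gaussian HMM:

```python
import numpy as np

def bic(log_likelihood, n_params, n_samples):
    """BIC = -2*logL + n_params*log(n_samples); lower is better."""
    return -2.0 * log_likelihood + n_params * np.log(n_samples)

def n_hmm_params(n):
    """Free parameters of a 1-D Gaussian HMM with n regimes:
    n means + n variances + n*(n-1) free transition entries
    + (n-1) free initial-state probabilities."""
    return 2 * n + n * (n - 1) + (n - 1)

# Hypothetical log-likelihoods for n = 2..5 (illustrative only)
lls = {2: -70.0, 3: -45.0, 4: -42.0, 5: -41.0}
scores = {n: bic(ll, n_hmm_params(n), n_samples=100) for n, ll in lls.items()}
best_n = min(scores, key=scores.get)   # lowest BIC wins
```

Note how the fit barely improves past n=3 while the penalty keeps growing, so n=3 gets the lowest score.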
✅ Validation Checks
Even with good BIC, the solution might be garbage! Check:
Check 1: Too Many Regime Switches?
If regimes switch more than 10% of the time → INVALID
Check 2: Are Regimes Distinct?
If two regime means are closer than 0.05 → INVALID
These are basically the same regime!
💻 Code Mapping
Where Each Step Happens:
🔹 Initialization (Random Start)
🔹 EM Loop (E-step + M-step repeated)
🔹 Calculate BIC
🔹 Detect Regimes (After Fitting)
🎯 Summary
Time Series → Random Init → E-step + M-step (repeat) → Check switches & distinctness → Pick lowest valid BIC
- HMM learns hidden regime structure from data
- EM algorithm alternates between E-step (calculate probs) and M-step (update params)
- BIC helps choose the right number of regimes
- Validation ensures solutions are meaningful, not garbage
- Final model can classify new data into regimes