Lab 09: Forward Algorithm in HMMs
In this lab, we dive into the Forward Algorithm, a core component of probabilistic models—especially Hidden Markov Models (HMMs). The algorithm helps compute the probability of an observed sequence, given a model’s parameters. This is crucial in applications like speech recognition, bioinformatics, and NLP.
Introduction
Hidden Markov Models (HMMs) are statistical models where the system being modeled is assumed to be a Markov process with hidden states. The Forward Algorithm efficiently calculates the total probability of an observed sequence by summing over all possible hidden state sequences.
Problem Setup
Let:
- : number of hidden states
- : length of observation sequence
- : state transition probabilities
- : observation/emission probabilities
- : initial state distribution
- : sequence of observations
The goal is to compute:
This is efficiently done using dynamic programming.
Algorithm Overview
Steps:
-
Initialization
-
Recursion
-
Termination
Implementation
import numpy as np
def forward_algorithm(A, B, pi, observations):
N = A.shape[0] # Number of states
T = len(observations) # Length of observation sequence
alpha = np.zeros((T, N))
# Initialization
alpha[0, :] = pi * B[:, observations[0]]
# Recursion
for t in range(1, T):
for j in range(N):
alpha[t, j] = np.sum(alpha[t - 1] * A[:, j]) * B[j, observations[t]]
# Termination
probability = np.sum(alpha[T - 1, :])
return probability, alpha
Example Usage
Click to expand
# Define model parameters
A = np.array([[0.7, 0.3],
[0.4, 0.6]])
B = np.array([[0.1, 0.4, 0.5],
[0.6, 0.3, 0.1]])
pi = np.array([0.6, 0.4])
# Observations (coded as indices)
observations = [0, 1, 2] # e.g., observation symbols could be ['H', 'T', 'H']
# Run the forward algorithm
probability, alpha_matrix = forward_algorithm(A, B, pi, observations)
print("Probability of observing sequence:", probability)
print("Alpha matrix:\n", alpha_matrix)
Visualization
Click to expand
import seaborn as sns
import matplotlib.pyplot as plt
def visualize_alpha(alpha):
plt.figure(figsize=(10, 4))
sns.heatmap(alpha, annot=True, fmt=".4f", cmap="Blues")
plt.title("Forward Probabilities (Alpha Matrix)")
plt.xlabel("States")
plt.ylabel("Time Step")
plt.show()
visualize_alpha(alpha_matrix)
Time and Space Complexity
Step | Time Complexity | Space Complexity |
---|---|---|
Initialization | ||
Recursion | ||
Termination |
- : Number of states
- : Length of observation sequence
Applications
- Speech Recognition: Predicting spoken words from audio input.
- Gene Prediction: Modeling DNA sequences.
- POS Tagging: Inferring grammatical structures from sentence tokens.
- Time Series Analysis: Anomaly detection in sensor data.
Conclusion
The Forward Algorithm is a foundational method in probabilistic modeling for sequences. It elegantly leverages dynamic programming to solve a problem that would otherwise require exponential time. Mastering this algorithm is essential for understanding more advanced HMM tasks like training and decoding.
References
-
Rabiner, L. R.
A Tutorial on Hidden Markov Models and Selected Applications in Speech Recognition, Proceedings of the IEEE.