EM algorithm [1/2]

Let's say we want to estimate parameters θ based on data X according to some statistical model f. But assume that actually

f is f(X, Z; θ)

i.e., there are some unobserved variables Z which influence X.

The Expectation-Maximization (EM) algorithm roughly repeats the following steps:

1. (E-step) Based on the current estimate of θ compute the expected value of Z.

2. (M-step) Obtain a new estimate of θ based on f where Z is replaced with the expected value from step 1.

0-fold cross-validation@0foldcv@mathstodon.xyzEM algorithm [2/2]

Here is the formal general statement.

Let \(X\) be the observed data, and let \(Z\) be the unobserved data. Let \(l(\theta; X, Z)\) be the log-likelihood of the complete data \((X, Z)\) where \(\theta\) is the parameter vector of interest.

With initial guess \(\hat{\theta}^0\) repeat until convergence:

1. E-step: Compute

\[Q(\theta, \hat{\theta}^j) = E(l(\theta; X, Z) | X, \hat{\theta}^j).\]

2. M-step:

\[\hat{\theta}^{j+1} = \arg\!\max_\theta Q(\theta, \hat{\theta}^j).\]