EM algorithm [1/2]

Let's say we want to estimate parameters θ from data X under some statistical model f. But assume that f is in fact
f = f(X, Z; θ),
i.e., there are unobserved (latent) variables Z which influence X.

The Expectation-Maximization (EM) algorithm roughly repeats the following steps:

1. (E-step) Based on the current estimate of θ, compute the expected value of Z (more precisely, the expected complete-data log-likelihood).
2. (M-step) Obtain a new estimate of θ by maximizing over f, with Z replaced by the expectation from step 1.

EM algorithm [2/2]

Here is the formal general statement.

Let \(X\) be the observed data, and let \(Z\) be the unobserved data. Let \(l(\theta; X, Z)\) be the log-likelihood of the complete data \((X, Z)\) where \(\theta\) is the parameter vector of interest.

With an initial guess \(\hat{\theta}^0\), repeat until convergence:

1. E-step: Compute
\[Q(\theta, \hat{\theta}^j) = E(l(\theta; X, Z) | X, \hat{\theta}^j).\]
2. M-step:
\[\hat{\theta}^{j+1} = \arg\!\max_\theta Q(\theta, \hat{\theta}^j).\]
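The two steps above can be sketched concretely for a classic latent-variable problem: a two-component 1D Gaussian mixture, where Z is each point's unobserved component label. This is a minimal illustration, not taken from the post; the function name `em_gmm`, the initialization, and the fixed iteration count are all my own choices for the sketch.

```python
import math
import random

def em_gmm(xs, n_iter=50):
    """EM for a two-component 1D Gaussian mixture (sketch)."""
    # Initial guess theta^0: means at the data extremes, broad equal variances.
    mu1, mu2 = min(xs), max(xs)
    var1 = var2 = (max(xs) - min(xs)) ** 2 / 4
    pi1 = 0.5  # mixing weight of component 1

    def pdf(x, mu, var):
        return math.exp(-(x - mu) ** 2 / (2 * var)) / math.sqrt(2 * math.pi * var)

    for _ in range(n_iter):
        # E-step: responsibilities gamma_i = P(Z_i = 1 | x_i, theta^j).
        gamma = []
        for x in xs:
            p1 = pi1 * pdf(x, mu1, var1)
            p2 = (1 - pi1) * pdf(x, mu2, var2)
            gamma.append(p1 / (p1 + p2))

        # M-step: maximize Q(theta, theta^j) -- here weighted maximum
        # likelihood, with the unobserved labels replaced by gamma.
        n1 = sum(gamma)
        n2 = len(xs) - n1
        mu1 = sum(g * x for g, x in zip(gamma, xs)) / n1
        mu2 = sum((1 - g) * x for g, x in zip(gamma, xs)) / n2
        var1 = sum(g * (x - mu1) ** 2 for g, x in zip(gamma, xs)) / n1 + 1e-9
        var2 = sum((1 - g) * (x - mu2) ** 2 for g, x in zip(gamma, xs)) / n2 + 1e-9
        pi1 = n1 / len(xs)

    return mu1, mu2, pi1

# Synthetic data: two clusters around -2 and 3.
random.seed(0)
data = ([random.gauss(-2.0, 0.7) for _ in range(200)]
        + [random.gauss(3.0, 0.9) for _ in range(200)])
m1, m2, w = em_gmm(data)
print(round(m1, 1), round(m2, 1))
```

Because component 1 is initialized at the data minimum, it converges to the lower cluster; the recovered means should land near -2 and 3.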
