EM algorithm [1/2]

Suppose we want to estimate parameters θ from observed data X under some statistical model f. Assume, however, that the model actually has the form
f(X, Z; θ),
i.e., there are unobserved (latent) variables Z that influence X.

The Expectation-Maximization (EM) algorithm roughly repeats the following steps:

1. (E-step) Based on the current estimate of θ, compute the expected value of Z.
2. (M-step) Obtain a new estimate of θ based on f where Z is replaced with the expected value from step 1.

EM algorithm [2/2]

Here is the formal general statement.

Let $$X$$ be the observed data, and let $$Z$$ be the unobserved data. Let $$l(\theta; X, Z)$$ be the log-likelihood of the complete data $$(X, Z)$$ where $$\theta$$ is the parameter vector of interest.

With initial guess $$\hat{\theta}^0$$ repeat until convergence:

1. E-step: Compute
$Q(\theta, \hat{\theta}^j) = E(l(\theta; X, Z) | X, \hat{\theta}^j).$
2. M-step:
$\hat{\theta}^{j+1} = \arg\!\max_\theta Q(\theta, \hat{\theta}^j).$
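As a concrete illustration of the two steps above, here is a minimal sketch of EM for a two-component 1D Gaussian mixture, where Z is the unobserved component label of each observation and θ = (π, μ₁, μ₂, σ₁, σ₂). The function name and all implementation details are illustrative choices, not part of the slides; for this model the E-step reduces to computing responsibilities E(Z_i | X, θ̂^j), and the M-step maximizing Q has closed-form updates.

```python
import numpy as np

def em_gmm_1d(x, n_iter=50, seed=0):
    """Illustrative EM for a two-component 1D Gaussian mixture."""
    rng = np.random.default_rng(seed)
    # Initial guess theta^0: equal weights, two random data points as means
    pi = 0.5
    mu1, mu2 = rng.choice(x, size=2, replace=False)
    s1 = s2 = np.std(x)

    def normal_pdf(v, mu, s):
        return np.exp(-0.5 * ((v - mu) / s) ** 2) / (s * np.sqrt(2 * np.pi))

    for _ in range(n_iter):
        # E-step: responsibilities gamma_i = E(Z_i | X, theta^j)
        p1 = pi * normal_pdf(x, mu1, s1)
        p2 = (1 - pi) * normal_pdf(x, mu2, s2)
        gamma = p1 / (p1 + p2)

        # M-step: argmax of Q has closed-form updates for this model
        pi = gamma.mean()
        mu1 = np.sum(gamma * x) / np.sum(gamma)
        mu2 = np.sum((1 - gamma) * x) / np.sum(1 - gamma)
        s1 = np.sqrt(np.sum(gamma * (x - mu1) ** 2) / np.sum(gamma))
        s2 = np.sqrt(np.sum((1 - gamma) * (x - mu2) ** 2) / np.sum(1 - gamma))
    return pi, mu1, mu2, s1, s2
```

On simulated data with two well-separated components, e.g. `x = np.concatenate([rng.normal(0, 1, 500), rng.normal(5, 1, 500)])`, the estimated means should converge near 0 and 5.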
