For binary classification, a sigmoid classifier and a two-class softmax classifier learn equivalent linear decision boundaries.

Let $\sigma(wx + b) = 0.5$ be the decision boundary for a sigmoid classifier. Since $\sigma(z) = 0.5$ exactly when $z = 0$, this boundary is $wx + b = 0$.

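Then, for a two-class softmax, $\exp(w_1 x + b_1) / [\exp(w_1 x + b_1) + \exp(w_2 x + b_2)] = 0.5$ implies $\exp(w_1 x + b_1) = \exp(w_2 x + b_2)$, which implies $w_1 x + b_1 = w_2 x + b_2$, which implies $(w_1 - w_2)x + (b_1 - b_2) = 0$. This has the same form as the sigmoid boundary $wx + b = 0$, with $w = w_1 - w_2$ and $b = b_1 - b_2$.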
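
The algebra above can be checked numerically. The following is a minimal sketch, not part of the original argument: the helper names `sigmoid` and `softmax_class1`, the random scalar parameters, and the grid of test points are all assumptions chosen for illustration. It compares the class-1 softmax probability against a sigmoid with $w = w_1 - w_2$ and $b = b_1 - b_2$, and evaluates the softmax at the predicted boundary point.

```python
import math
import random

def sigmoid(z):
    # Logistic function sigma(z) = 1 / (1 + exp(-z)).
    return 1.0 / (1.0 + math.exp(-z))

def softmax_class1(z1, z2):
    # Probability of class 1 under a two-class softmax.
    # Subtracting the max logit keeps exp() from overflowing.
    m = max(z1, z2)
    e1, e2 = math.exp(z1 - m), math.exp(z2 - m)
    return e1 / (e1 + e2)

# Hypothetical scalar parameters, drawn at random for the check.
random.seed(0)
w1, b1, w2, b2 = (random.gauss(0.0, 1.0) for _ in range(4))

# Largest difference between the two models on a grid of inputs in [-5, 5].
xs = [i / 10.0 for i in range(-50, 51)]
max_gap = max(
    abs(softmax_class1(w1 * x + b1, w2 * x + b2)
        - sigmoid((w1 - w2) * x + (b1 - b2)))
    for x in xs
)

# The boundary solves (w1 - w2) x + (b1 - b2) = 0.
x_star = -(b1 - b2) / (w1 - w2)
p_at_boundary = softmax_class1(w1 * x_star + b1, w2 * x_star + b2)

print(max_gap)        # expected to be at floating-point noise level
print(p_at_boundary)  # expected to be very close to 0.5
```

The two probabilities agree at every input, not only on the boundary, because dividing the numerator and denominator of the softmax by $\exp(w_1 x + b_1)$ yields exactly $\sigma((w_1 - w_2)x + (b_1 - b_2))$.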
