Seriously bad data in Google's GoEmotions dataset (58K Reddit comments categorized by affect):, via

Opinions in the post and comments vary on why the categorization was so inaccurate: the raters lacked context, the work was farmed out to poorly paid workers in countries where people are less likely to be familiar with the specific idioms used in the comments, or maybe it's just a hard problem.

