#ACMPrize
#2024ACMPrize
#ACMTuringAward
» #ReinforcementLearning
An Introduction
1998
standard reference ... cited over 75,000 times
...
prominent example of #RL
#AlphaGo victory
over best human #Go players
in 2016 and 2017
...
recently has been the development of the chatbot #ChatGPT
...
large language model #LLM trained in two phases ...employs a technique called
reinforcement learning from human feedback #RLHF «
aka cheap labor unnamed in papers
https://awards.acm.org/about/2024-turing
2/2
