@jt Even rereading "Attention Is All You Need" feels like a trip through time now...
@dlzv That one is on my follow-up list haha. I was focusing on Bayesian models because they seem easy to understand and implement without industrial amounts of data.
@jt I'm not very familiar with Bayesian models for NLP, do you have some pointers? For what it's worth, there are pretrained models that can be used with very little data now, if you're interested. Of course, understanding them deeply is another issue...
@jt There are plenty of pretrained models for French! I think it's probably one of the best-studied languages: https://huggingface.co/models?language=fr&sort=downloads
Thanks for the refs!
@dlzv Thanks, I'm going to give them a try :)