mathstodon.xyz is one of the many independent Mastodon servers you can use to participate in the fediverse.
A Mastodon instance for maths people. We have LaTeX rendering in the web interface!

Watch a 2-layer neural network learn to separate two classes to the left and right

Matt Henderson
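A minimal sketch of the kind of setup being animated here: a 2-layer network trained to separate a "left" class from a "right" class. This is not the author's code; the data layout, architecture (16 tanh hidden units), and hyperparameters are all assumptions chosen for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy data (assumed layout): two Gaussian blobs, class 0 left, class 1 right.
n = 200
X = np.vstack([rng.normal([-2.0, 0.0], 1.0, (n, 2)),
               rng.normal([+2.0, 0.0], 1.0, (n, 2))])
y = np.concatenate([np.zeros(n), np.ones(n)])

# Two-layer network: 2 inputs -> 16 hidden tanh units -> 1 sigmoid output.
W1 = rng.normal(0.0, 0.5, (2, 16)); b1 = np.zeros(16)
W2 = rng.normal(0.0, 0.5, (16, 1)); b2 = np.zeros(1)

def forward(X):
    h = np.tanh(X @ W1 + b1)                  # hidden activations
    p = 1.0 / (1.0 + np.exp(-(h @ W2 + b2)))  # P(class 1)
    return h, p.ravel()

# Plain gradient descent with momentum (full-batch here for simplicity).
lr, mom = 0.1, 0.9
params = [W1, b1, W2, b2]
vels = [np.zeros_like(w) for w in params]
for step in range(2000):
    h, p = forward(X)
    d_out = (p - y)[:, None] / len(X)      # BCE-through-sigmoid gradient
    gW2 = h.T @ d_out
    gb2 = d_out.sum(0)
    d_h = (d_out @ W2.T) * (1.0 - h**2)    # backprop through tanh
    gW1 = X.T @ d_h
    gb1 = d_h.sum(0)
    for w, g, v in zip(params, [gW1, gb1, gW2, gb2], vels):
        v *= mom
        v += g
        w -= lr * v                        # in-place momentum update

_, p = forward(X)
acc = float(((p > 0.5) == y).mean())
print(f"train accuracy: {acc:.2f}")
```

Animating the hidden activations `h` (or the decision boundary) at each step is what produces the kind of visualization shown in the thread.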

That was with SGD+momentum. Here's what it looks like with the Adam optimizer.
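For reference, the difference between the two optimizers being compared: momentum smooths the raw gradient, while Adam additionally rescales each parameter by a running estimate of the squared gradient. A one-parameter sketch of the standard Adam update (the toy objective f(x) = x² is my own choice for illustration):

```python
import numpy as np

def adam_step(param, grad, m, v, t, lr=1e-3, b1=0.9, b2=0.999, eps=1e-8):
    """One Adam update: momentum on the gradient (m) plus a
    per-parameter scale from the squared gradient (v)."""
    m = b1 * m + (1 - b1) * grad           # first-moment estimate
    v = b2 * v + (1 - b2) * grad**2        # second-moment estimate
    m_hat = m / (1 - b1**t)                # bias-corrected moments
    v_hat = v / (1 - b2**t)
    param = param - lr * m_hat / (np.sqrt(v_hat) + eps)
    return param, m, v

# Minimize f(x) = x^2, gradient 2x, starting from x = 3.
x, m, v = 3.0, 0.0, 0.0
for t in range(1, 2001):
    x, m, v = adam_step(x, 2.0 * x, m, v, t, lr=0.05)
print(round(x, 4))
```

The per-parameter rescaling is why Adam often looks much less jittery than SGD+momentum in animations like this one.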

@matthen2 That’s supercool! Makes you appreciate Adam haha!

@matthen2 What does the y dimension represent? Great viz!

@matthen2 Is the training loss as shaky as the point movement, or is it smooth as silk? How does the batch size compare to the dataset size? Are these points train or held out?
Too many questions, but very cool visualisation idea!

@lb loss is also jittery, but I do use smaller, procedurally sampled random batches. In picking hyperparams I optimized a bit for producing a nice animation rather than for learning efficiently
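"Procedurally sampled" batches might look something like the following: each step draws fresh points from the class distributions rather than iterating over a fixed dataset. This is a hypothetical sketch of that idea, with an assumed left/right data layout, not the author's actual sampler.

```python
import numpy as np

rng = np.random.default_rng(1)

def sample_batch(batch_size=64):
    """Draw a fresh random batch each training step:
    class 0 centered on the left, class 1 on the right."""
    y = rng.integers(0, 2, batch_size)
    centers = np.where(y[:, None] == 1, [2.0, 0.0], [-2.0, 0.0])
    X = centers + rng.normal(0.0, 1.0, (batch_size, 2))
    return X, y

X, y = sample_batch()
print(X.shape, y.shape)  # every call yields new points, so the loss stays jittery
```

With this scheme there is no fixed train set to hold out from; every batch is effectively unseen data.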

@matthen2 thanks.

The real reason for my questions is that I was wondering whether this or something similar could be used to gain intuition about the effects of both architecture components and optimizer components.

@lb definitely! The type of non-linearity also shows interesting differences in the animation. This one is ReLU
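The visual difference between nonlinearities comes down to their shapes and gradients: ReLU hidden units switch sharply between "off" and linear, whereas tanh saturates smoothly. A small sketch of ReLU and its gradient (my own illustration, not from the thread):

```python
import numpy as np

def relu(z):
    return np.maximum(0.0, z)

def relu_grad(z):
    # Piecewise-constant gradient: each hidden unit is either fully "on"
    # or "off", which gives the animation a crisp, folded look compared
    # with the smooth saturation of tanh.
    return (z > 0).astype(float)

z = np.linspace(-2, 2, 5)   # [-2, -1, 0, 1, 2]
print(relu(z))       # [0. 0. 0. 1. 2.]
print(relu_grad(z))  # [0. 0. 0. 1. 1.]
```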

@matthen2 Comparing different norm layers would be very interesting, though it's unclear how to factor out the effect of the learning rate.

Do you plan on publishing the code at some point? I want to know whether I need to start working on a repro or can just wait :-)

@matthen2 that's fun, it's like it's being stretched under extreme tension

@matthen2 Hi Matt! You probably get asked this a lot, but can you share some of the code for this animation?