this has “draw the rest of the fucking owl” vibes to it. especially step 3
It’s a “push as much data as a baby gets to train its NN” step, which is several orders of magnitude more, and more focused, than any training dataset in existence right now.
Even with diminishing returns, it’s bound to get better results.
that’s not how asymptotes work.
That’s not how watching the video or reading the paper works either.
Whatever.