Neural Networks Need Data to Learn. Even If It’s Fake.

babelspace@kbin.social · 2 years ago

babelspace@kbin.social · 2 years ago

Yeah, you’d better have a thorough way to check whether there are any systematic distortions that could have an adverse effect on its operation. I do get the privacy rationale for using synthesized data, though.

𝕊𝕚𝕤𝕪𝕡𝕙𝕖𝕒𝕟 · edit-2 2 years ago

I guess if they pretrain the model using the synthetic dataset and then in a separate training phase “align” it using real data, it could work. Just like how ChatGPT was pretrained on an internet dataset and then had an RLHF phase to make it behave like an assistant rather than a generic text completion model. (Not sure if I’m using the correct terms.)

Neural Networks Need Data to Learn. Even If It’s Fake. | Quanta Magazine