Neural Networks Need Data to Learn. Even If It’s Fake.

babelspace@kbin.social · 2 years ago

𝕊𝕚𝕤𝕪𝕡𝕙𝕖𝕒𝕟 · 2 years ago

Synthetic data was used here with impressive results: https://programming.dev/post/133153

There is a lot of potential in this approach, but the idea of using it to train AI systems for MRI/CT/etc. diagnostics, as mentioned in the article, is a bit scary to me.

babelspace@kbin.social · 2 years ago

Yeah, you’d better have a thorough way to check for systematic distortions that could adversely affect its operation. I do get the privacy rationale for using synthesized data, though.

𝕊𝕚𝕤𝕪𝕡𝕙𝕖𝕒𝕟 · edit-2 2 years ago

I guess if they pretrain the model using the synthetic dataset and then in a separate training phase “align” it using real data, it could work. Just like how ChatGPT was pretrained on an internet dataset and then had an RLHF phase to make it behave like an assistant rather than a generic text completion model. (Not sure if I’m using the correct terms.)
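To make the two-phase idea concrete, here is a toy sketch, not anything taken from the article. It assumes a made-up "real" process (y = 2x + 1) and a synthetic generator with a deliberate systematic distortion (offset 1.5 instead of 1.0). It pretrains a one-variable linear model on plenty of synthetic data, then fine-tunes on a handful of real samples. All names and numbers are hypothetical, and plain supervised fine-tuning stands in for the "align" phase; it is not RLHF.

```python
import random

def sgd_fit(data, w, b, lr, epochs):
    """Fit y ~ w*x + b with plain stochastic gradient descent on squared error."""
    for _ in range(epochs):
        for x, y in data:
            err = (w * x + b) - y
            w -= lr * err * x
            b -= lr * err
    return w, b

def mse(data, w, b):
    """Mean squared error of the model on a dataset."""
    return sum(((w * x + b) - y) ** 2 for x, y in data) / len(data)

rng = random.Random(0)  # fixed seed so the run is repeatable

def real_sample():
    # Hypothetical "real" process: y = 2x + 1 plus a little noise.
    x = rng.uniform(-1, 1)
    return x, 2.0 * x + 1.0 + rng.gauss(0, 0.05)

def synthetic_sample():
    # Hypothetical synthetic generator with a systematic distortion:
    # the offset is 1.5 instead of the real 1.0.
    x = rng.uniform(-1, 1)
    return x, 2.0 * x + 1.5 + rng.gauss(0, 0.05)

synthetic = [synthetic_sample() for _ in range(500)]   # cheap and plentiful
real_train = [real_sample() for _ in range(20)]        # scarce real data
real_test = [real_sample() for _ in range(200)]        # held-out real data

# Phase 1: pretrain on synthetic data only.
w, b = sgd_fit(synthetic, 0.0, 0.0, lr=0.05, epochs=20)
pretrained_error = mse(real_test, w, b)

# Phase 2: continue training from the pretrained weights on real data.
w, b = sgd_fit(real_train, w, b, lr=0.05, epochs=50)
finetuned_error = mse(real_test, w, b)

print("error on real data after pretraining:", pretrained_error)
print("error on real data after fine-tuning:", finetuned_error)
```

In this toy setup the held-out real set is what exposes the distortion: the error on it is noticeably higher before fine-tuning than after. Whether that carries over to something like medical imaging is a separate question.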

Neural Networks Need Data to Learn. Even If It’s Fake. | Quanta Magazine