I was always pretty convinced that this was at the root of a lot of those “ChatGPT can pass the bar exam” claims and similar feats that would require capabilities it clearly doesn’t possess. I actually tried making up similar-style math problems and asking it, and it failed horribly. Clearly I should have written up my results in a paper and been hailed as a visionary.
Perhaps it’s not exactly equivalent since this is an LLM, but from what I learnt in my undergrad machine learning course, shouldn’t the test data be kept separate from the training data?
The train-test (or train-validate-test) split was one of the first few things we learnt to do.
Otherwise, the model can easily score 100% accuracy (or whatever the relevant metric is) simply by regurgitating training data, which looks like what’s happening here.
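For anyone who hasn’t seen it, this is roughly what the split looks like in practice. A minimal sketch using scikit-learn’s `train_test_split` (the dataset and feature names here are made up for illustration):

```python
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score
import numpy as np

# Hypothetical dataset: 1000 examples, 10 features, binary labels.
X = np.random.rand(1000, 10)
y = np.random.randint(0, 2, size=1000)

# Hold out 20% of the data; the model never sees it during training.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42
)

model = LogisticRegression()
model.fit(X_train, y_train)

# Evaluating on training data rewards memorisation; only the held-out
# test set tells you anything about generalisation.
print("train accuracy:", accuracy_score(y_train, model.predict(X_train)))
print("test accuracy:", accuracy_score(y_test, model.predict(X_test)))
```

The whole point is that the test score means nothing if the test examples (or near-duplicates of them) were in the training set, which is exactly the contamination problem being discussed.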
but that won’t trick investors into funding more of it