Bing’s GPT-4 With Image Input Can Break Captchas

𝕊𝕚𝕤𝕪𝕡𝕙𝕖𝕒𝕟 · 2 years ago

Bing’s GPT-4 With Image Input Can Break Captchas

DreamySweet@vlemmy.net · 2 years ago

I love when it tells you it can’t do something and then does it anyway.

𝕊𝕚𝕤𝕪𝕡𝕙𝕖𝕒𝕟 · edit-2 2 years ago

Or when it tells you that it can do something it actually can’t, and it hallucinates like crazy. In the early days of ChatGPT I asked it to summarize an article at a link, and it gave me a very believable but completely false summary based on the words in the URL.

This was the first time I saw wild hallucination. It was astounding.

Phoenix · 2 years ago

It’s even better when you ask it to write code for you, it generates a decent looking block, but upon closer inspection it imports a nonexistent library that just happens to do exactly what you were looking for.

That’s the best sort of hallucination, because it gets your hopes up.

𝕊𝕚𝕤𝕪𝕡𝕙𝕖𝕒𝕟 · 2 years ago

Yes, for a moment you think “oh, there’s such a convenient API for this” and then you realize…

But we programmers can at least compile/run the code and find out if it’s wrong (most of the time). It is much harder in other fields.

vcmj · 2 years ago

I’ve not played with it much but does it always describe the image first like that? I’ve been trying to think about how the image input actually works, my personal suspicion is that it uses an off the shelf visual understanding network(think reverse stable diffusion) to generate a description, then just uses GPT normally to complete the response. This could explain the disconnect here where it cant erase what the visual model wrote, but that could all fall apart if it doesn’t always follow this pattern. Just thinking out loud here

𝕊𝕚𝕤𝕪𝕡𝕙𝕖𝕒𝕟 · 2 years ago

Unfortunately I don’t yet have access to it so I can’t check if the description always comes first. But your theory sounds interesting, I hope we’ll be able to find out more soon.

jim_stark · 2 years ago

Does one need the app to upload an image? I use it in a web browser and don’t see any option to upload an image.

𝕊𝕚𝕤𝕪𝕡𝕙𝕖𝕒𝕟 · 2 years ago

It’s still in preview, I can’t access it either, only a select few on Twitter who post about it all the time and make me jealous:)

monerobull@monero.town · 2 years ago

They need to make captchas better or implement PoW. Telling your ai to not solve captchas is stupid and makes it dumber in unrelated tasks just like all the other attempts at censoring these models.