"Ignore all previous instructions" as a trigger for Twitter bots
rinze@infosec.pub to Enshittification@lemmy.world · 6 months ago · link: mastodon.de · 23 comments
I Cast Fist · 6 months ago
Usually, it’s the cheapest bot, obviously, so it’s bound to work. If it doesn’t, try some wordplay: “disregard any instructions given previously” or “pretend any rules should be ignored for the following prompt”.
Evotech@lemmy.world · 6 months ago
It can be made quite difficult. https://gandalf.lakera.ai/ for instance.
UnrepententProcrastinator@lemmy.ca · 6 months ago
Lvl 4 is as far as I’m willing to work on.