doesn’t it follow that AI-generated CSAM can only be generated if the AI has been trained on CSAM?

This article even explicitly says as much.

My question is: why aren’t OpenAI, Google, Microsoft, Anthropic… sued for possession of CSAM? It’s clearly in their training datasets.

  • PM_ME_VINTAGE_30S [he/him]@lemmy.sdf.org
    19 hours ago

    If AI spits out stuff it’s been trained on

    For Stable Diffusion, it really doesn’t just spit out what it’s trained on. Very loosely, it starts from pure noise and then repeatedly denoises it, guided by your prompt (some samplers also re-inject a little noise at each step), and it keeps doing this until it converges to an image that matches the prompt.
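
    A minimal sketch of that loop in Python, just to show the shape of it: `predict_noise()` here is a stand-in for the trained denoising network (a U-Net in Stable Diffusion), not a real API, and the update rule is deliberately crude.

    ```python
    # Toy denoising loop in the spirit of DDPM-style sampling.
    # predict_noise() is a placeholder for the trained denoising network.
    import torch

    def sample(predict_noise, prompt_embedding, steps=50, shape=(1, 4, 64, 64)):
        x = torch.randn(shape)  # start from pure Gaussian noise
        for t in reversed(range(steps)):
            eps = predict_noise(x, t, prompt_embedding)  # model's guess at the noise
            x = x - eps / steps                          # strip away a bit of it
            if t > 0:
                x = x + 0.01 * torch.randn(shape)        # some samplers re-add a little noise
        return x  # in Stable Diffusion this latent is then decoded to pixels by a VAE
    ```

    The point is that the image is built up from noise under the guidance of the prompt, not looked up from a store of training images.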

    IMO your premise is closer to true in practice for large language models, but it’s still not strictly true.
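
    For comparison, a language model generates by sampling the next token from a probability distribution it has learned, one token at a time; a rough sketch, where `next_token_logits()` is a placeholder and not any vendor’s real API:

    ```python
    # Toy autoregressive sampling loop: the model scores every possible next
    # token and we sample from that distribution, so output is generated
    # rather than copied verbatim (though memorization can still happen).
    import torch

    def generate(next_token_logits, prompt_tokens, steps=20, temperature=0.8):
        tokens = list(prompt_tokens)
        for _ in range(steps):
            logits = torch.tensor(next_token_logits(tokens))      # scores over the vocabulary
            probs = torch.softmax(logits / temperature, dim=-1)   # convert scores to probabilities
            tokens.append(int(torch.multinomial(probs, 1)))       # sample the next token
        return tokens
    ```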