• elliot_crane@lemmy.world
    21 upvotes, 1 downvote · 10 months ago

    The tagline is really poorly written IMO. From reading the README, this doesn’t outwardly appear to be a tool for bypassing an artist’s choice to use something like Nightshade, but rather it seems to detect if such a tool has been used.

    I’m assuming the use case is to avoid training on Nightshaded images, which would actually respect the original artist’s decision?

    • tyler
      2 upvotes, 3 downvotes · 10 months ago

      I read the whole thing. I understand it’s for detecting use of Nightshade, not bypassing it. What other even slightly ethical use is there for this besides making sure you don’t train on a poisoned image? The people training these models are clearly not asking for permission first, else they’d never need to do this, so they’re just taking an image, assuming they’re allowed to use it, and then using this tool to detect if it’s going to poison their model.

      • elliot_crane@lemmy.world
        2 upvotes · 10 months ago

        I don’t think most people are collecting images by hand and saying “ah yes I’m just gonna yoink this and use it in my model”. There are a plethora of sites for sharing repositories of training data, and therefore it’s pretty easy for someone training a model to unknowingly pull down some data they don’t actually have permission to use. It’s completely infeasible to check licensing by hand on what could be millions of images, so this tool makes it easy to simply not train on images that have gone through Nightshade. I fail to see how that’s unethical, as not training on the image is the whole reason the original image was put through Nightshade in the first place.

        • tyler
          1 upvote, 1 downvote · 10 months ago

          “It’s completely infeasible”

          Then it shouldn’t be done. That’s the unethical part. Trying to just avoid the problem by continuing to scrape large data sets for images that you shouldn’t be using is the entire problem. Either get permission for each image or don’t build your image model. Doing otherwise is unethical.

          • elliot_crane@lemmy.world
            2 upvotes · 10 months ago

            Again, in many instances, folks training models are using repositories of images that have been publicly shared. In many cases the people who assembled the image repositories are not the same people using them. I agree that reckless scraping is not responsible, but if you’re using a repository of images that’s presented as OK to use for AI training, I’d argue it’s even more ethical to strip out the Nightshaded images, because the presence of Nightshade clearly means you shouldn’t use that one. I guess we’re just going to have to agree to disagree here, because I see this as a helpful tool specifically for avoiding training on images you shouldn’t be using.