• Mikina
    9 upvotes · edited · 2 days ago

    Does this mean that if I turn on their AI crawler block, it will still allow crawlers that simply paid them?

    I guess it’s time to finally set up a tarpit. Keep it behind Cloudflare, of course, so that not only are they getting shit data, they are also paying for it.

    EDIT: Of course, the article had the answer. You can choose to allow, block, or block unless paid. Also, you get a cut from the crawlers, so it’s not some Cloudflare scheme, but a way for people who don’t mind AI crawling their websites to get something out of it.

    Which makes it an interesting legal question. If I set up a tarpit of Markov-chain bullshit, and someone pays me for access to crawl it, is it fraud?
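
For anyone wondering what "Markov-chain bullshit" would look like in practice, here is a minimal sketch of a word-level Markov text generator, the kind of filler a tarpit might serve. The function names (`build_chain`, `generate`) and the sample corpus are made up for illustration; this is not taken from any particular tarpit project, and a real one would also need a web server and endless link generation.

```python
import random
from collections import defaultdict


def build_chain(text, order=1):
    """Map each run of `order` consecutive words to the words seen right after it."""
    words = text.split()
    chain = defaultdict(list)
    for i in range(len(words) - order):
        key = tuple(words[i:i + order])
        chain[key].append(words[i + order])
    return dict(chain)


def generate(chain, length=30, seed=None):
    """Random-walk the chain and return `length` words of plausible-looking nonsense."""
    if not chain:
        raise ValueError("chain is empty; the corpus was too short")
    rng = random.Random(seed)
    keys = sorted(chain)  # sorted so a fixed seed gives a repeatable result
    state = rng.choice(keys)
    out = list(state)
    while len(out) < length:
        followers = chain.get(state)
        if not followers:
            # Dead end (this word run only appeared at the end of the corpus):
            # jump to a random state and keep going.
            state = rng.choice(keys)
            out.extend(state)
            continue
        out.append(rng.choice(followers))
        state = tuple(out[-len(state):])
    return " ".join(out[:length])


if __name__ == "__main__":
    # Hypothetical sample corpus; a real tarpit would train on far more text.
    corpus = "the cat sat on the mat and the dog sat on the rug"
    print(generate(build_chain(corpus), length=20))
```

The output only ever reuses words and word transitions from the corpus, which is why it looks locally plausible but carries no meaning.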

    • CameronDev
      3 upvotes · 2 days ago

      Surely it’s not fraud if you are upfront about it? IANAL.

      Also, would love to see ChatGPT go to court and make the claim that “charging for computer generated human like sentences” is fraud.

    • Michal
      2 upvotes · 2 days ago

      It’s up to site owners to decide the price, and up to the crawler operators to decide whether the content is worth it.

      They won’t crawl your site if it has a steep price, and the few cents you will receive won’t make it worth your while. I wouldn’t call it fraud.

  • sobchak
    6 upvotes · 2 days ago

    Is this for “AI crawlers” or all crawlers? How would they even determine the difference? This could put up major roadblocks for researchers, open source search engine projects (like YaCy), or just individuals gathering statistics and such. Seems contrary to the idea of an open web.

    • refalo
      6 upvotes · edited · 2 days ago

      I think CF is already a major threat to the open web.

      I cannot even visit a large majority of the sites they protect. Most of the time I get infinite CAPTCHA loops; other times it just straight up says “you are blocked”, and that’s without any proxy or VPN. For people in areas where a proxy or VPN is a necessity, it threatens their access even more.