LLM scrapers are taking down FOSS projects’ infrastructure, and it’s getting worse.

  • sudo · 6 days ago

    Admins will always dial back bot management once it starts blocking end users. At that point you cough up the money for the extra bandwidth and investigate different solutions.

    • SkotchY@ieji.de · 6 days ago

      @sudo yeah, the bot problem is hard, especially for volunteers who run services to help others.

      https://nadeko.net/announcements/invidious-and-the-bot-problem/

      * They use a proof-of-work system called #Anubis to fix their #bot problem. I hope it works. #proofofwork

      The proof of work currently takes about 1 second on my phone, so I am happy with that.
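
      Roughly, such a hashcash-style challenge works like this (a sketch under my own assumptions, not Anubis's actual code; the difficulty value is made up): the server hands out a random challenge string, the browser brute-forces a nonce, and the server verifies with a single hash.

      ```python
      import hashlib
      import itertools

      def solve_challenge(challenge: str, difficulty_bits: int = 16) -> int:
          """Brute-force a nonce so SHA-256(challenge + nonce) starts with
          `difficulty_bits` zero bits (~2**difficulty_bits hashes expected)."""
          prefix = "0" * (difficulty_bits // 4)  # difficulty as hex digits
          for nonce in itertools.count():
              digest = hashlib.sha256(f"{challenge}{nonce}".encode()).hexdigest()
              if digest.startswith(prefix):
                  return nonce

      def verify(challenge: str, nonce: int, difficulty_bits: int = 16) -> bool:
          # Verification is a single hash, so it is nearly free for the server.
          digest = hashlib.sha256(f"{challenge}{nonce}".encode()).hexdigest()
          return digest.startswith("0" * (difficulty_bits // 4))
      ```

      At 16 bits that is ~65k hashes on average; whatever difficulty Anubis actually uses may differ.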

      Perhaps the biggest problem with bots is the sheer number of requests they issue, at a rate no normal human clicking through pages could ever match.
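
      If the request rate is the real tell, a per-client token bucket is the classic server-side counterpart to a PoW wall; a rough sketch, with thresholds I made up:

      ```python
      import time
      from collections import defaultdict

      class TokenBucket:
          """Allow bursts up to `capacity` requests per client, refilling at
          `rate` tokens/second. A human clicking around stays well under the
          limit; a crawler firing hundreds of requests drains it instantly."""

          def __init__(self, rate: float = 2.0, capacity: float = 20.0):
              self.rate = rate
              self.capacity = capacity
              self.state = defaultdict(lambda: (capacity, time.monotonic()))

          def allow(self, client: str) -> bool:
              tokens, last = self.state[client]
              now = time.monotonic()
              tokens = min(self.capacity, tokens + (now - last) * self.rate)
              allowed = tokens >= 1.0
              self.state[client] = (tokens - 1.0 if allowed else tokens, now)
              return allowed
      ```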

      • sudo · 4 days ago

        I’ve been criticizing Anubis and proof-of-work solutions in general. My speculation is that they mostly work by requiring you to execute JavaScript, not by being an actual burden on the bot’s CPU.
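
        The arithmetic behind that hunch, as a rough sketch (the 16-bit difficulty is my assumption): a native solver chews through in milliseconds a challenge that takes a phone’s JS engine around a second.

        ```python
        import hashlib
        import time

        # Measure native SHA-256 throughput, then estimate the cost of a
        # hashcash-style challenge with ~2**16 expected hashes (assumed).
        N = 1_000_000
        start = time.perf_counter()
        for i in range(N):
            hashlib.sha256(str(i).encode()).digest()
        rate = N / (time.perf_counter() - start)

        expected = 2 ** 16
        print(f"~{rate:,.0f} SHA-256/s natively")
        print(f"estimated challenge cost: {expected / rate * 1000:.1f} ms")
        ```

        If that holds, the deterrent is the JavaScript requirement itself, not the CPU cost.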

        • FuckBigTech347@lemmygrad.ml · 1 day ago

          I already stopped visiting 3 websites I used to frequent because I suddenly got redirected to Anubis and was told to enable JS. All my browsers have JS disabled by default and are configured to never keep any local storage/cookies out of principle. I won’t change that just to get to a website that is mostly static HTML. Even on the occasion where I bite the bullet it takes between 5 - 30 seconds to complete the PoW challenge on my main machine. I hope this trash doesn’t get popular. I really doubt that this makes much difference to more sophisticated crawlers that run on enterprise hardware.