LLM scrapers are taking down FOSS projects’ infrastructure, and it’s getting worse.

  • refalo
    8 hours ago

    I don’t like the approach of banning nonresidential IPs. I think it’s discriminatory and unfairly blocks out corporate/VPN users and others we might not even be thinking about. I realize there is a bot problem but I wish there was a better solution. Maybe purely proof-of-work solutions will get more popular or something.
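For readers unfamiliar with the idea, a proof-of-work challenge can be sketched roughly as below. This is a minimal hashcash-style illustration, not how any particular anti-bot product works: the server hands out a random challenge, the client searches for a nonce whose SHA-256 hash has enough leading zero bits, and the server verifies with a single hash. The function names and the difficulty value are assumptions made up for this example.

```python
import hashlib
from itertools import count


def leading_zero_bits(digest: bytes) -> int:
    """Count the zero bits at the start of a digest."""
    bits = 0
    for byte in digest:
        if byte == 0:
            bits += 8
            continue
        # bit_length() gives the position of the highest set bit in this byte.
        bits += 8 - byte.bit_length()
        break
    return bits


def solve(challenge: bytes, difficulty: int) -> int:
    """Client side: brute-force a nonce (expected cost grows as 2**difficulty hashes)."""
    for nonce in count():
        digest = hashlib.sha256(challenge + str(nonce).encode()).digest()
        if leading_zero_bits(digest) >= difficulty:
            return nonce


def verify(challenge: bytes, nonce: int, difficulty: int) -> bool:
    """Server side: a single hash is enough to check the client's work."""
    digest = hashlib.sha256(challenge + str(nonce).encode()).digest()
    return leading_zero_bits(digest) >= difficulty


if __name__ == "__main__":
    challenge = b"example-challenge"  # hypothetical; a real server would use a random value
    nonce = solve(challenge, 12)      # 12 bits is an arbitrary, deliberately low difficulty
    print(nonce, verify(challenge, nonce, 12))
```

The asymmetry is the point: solving takes many hashes, verifying takes one, and the difficulty parameter sets how much work each visitor has to burn.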

    • sudo
      3 hours ago

      Proof of Work is a terrible solution because it assumes computational costs are a significant expense for scrapers compared to proxy costs. The compute will never come close to costing as much as residential proxies, and meanwhile every smartphone user will be complaining about your website draining their battery.

      You can do something like only challenging data center IPs, but you’ll have to do better than Proof-of-Work. Canvas fingerprinting would work.
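The "only challenge data center IPs" part could look something like the sketch below. It only decides who gets challenged and says nothing about what the challenge is. The network list is a placeholder built from reserved documentation ranges, and `needs_challenge` is a hypothetical name; a real deployment would presumably load the ranges from whatever source it trusts.

```python
import ipaddress

# Placeholder ranges only (reserved documentation blocks), NOT real provider data.
DATACENTER_NETWORKS = [
    ipaddress.ip_network(cidr)
    for cidr in ("192.0.2.0/24", "198.51.100.0/24", "2001:db8::/32")
]


def needs_challenge(client_ip: str) -> bool:
    """Return True if the address falls inside any listed data center range."""
    addr = ipaddress.ip_address(client_ip)
    # An IPv4 address tested against an IPv6 network (or vice versa) is simply False.
    return any(addr in network for network in DATACENTER_NETWORKS)


if __name__ == "__main__":
    for ip in ("192.0.2.10", "203.0.113.5", "2001:db8::1"):
        print(ip, needs_challenge(ip))
```

Everyone outside the listed ranges skips the challenge entirely, so ordinary residential and mobile visitors pay nothing.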