I am working on a simple static website that gives visitors basic information about myself and the work I do. I want this as a way to introduce myself to potential clients, collaborators, etc., rather than relying solely on LinkedIn as my visiting card.

This may sound rather oxymoronic given that I am literally going to be placing (some relevant) details about myself and my work on the internet, but I want to limit the website’s exposure to bots, web scraping, and content collection for LLMs.

Is this a realistic expectation?
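
To make the question concrete, here is a minimal sketch of the most basic kind of limit I have in mind: a robots.txt that opts out of some known AI crawlers, checked with Python's standard `urllib.robotparser`. GPTBot and CCBot are published crawler names, but the choice of which bots to list, the `example.com` URL, and the `is_allowed` helper are assumptions for illustration only.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt for the site: opt out of two known AI-related
# crawlers and leave everything else allowed. The bot list is an assumption.
ROBOTS_TXT = """\
User-agent: GPTBot
Disallow: /

User-agent: CCBot
Disallow: /

User-agent: *
Allow: /
"""


def is_allowed(user_agent: str, url: str = "https://example.com/about") -> bool:
    """Return True if the rules above permit this user agent to fetch the URL."""
    parser = RobotFileParser()
    # Parse the rules from the string instead of fetching them over the network.
    parser.parse(ROBOTS_TXT.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    for agent in ("GPTBot", "CCBot", "Mozilla/5.0"):
        print(agent, is_allowed(agent))
```

As far as I understand, this is purely advisory: it only affects crawlers that choose to read and honor robots.txt, so it does nothing against a scraper that ignores the file or misreports its user agent.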

Also, any suggestions on privacy-respecting yet inexpensive domains that I can purchase in Europe would be a great help.

  • GBU_28@lemm.ee
    6 months ago

    Any attempts to mangle the body of the pages or obscure it in JS are moot. Most competent stealthy scrapers have vision AI as a fallback, so even if you reduce the ability to programmatically parse the page body, the bot can just grab an image of the page and OCR the contents.

    • refalo
      6 months ago

      What scrapers actually go to such lengths? I’ve never heard of any.