cm0002@lemmy.world to Programmer Humor · 6 months ago

APIs vs Web Scrapers

lemmy.ml

10 comments · 415 upvotes

  • HappyFrog@lemmy.blahaj.zone · 46 upvotes · 6 months ago

    As long as the scrapers follow robots.txt

    • Jankatarch@lemmy.world · 37 upvotes · 6 months ago

      It’s equivalent to “the code.”

      • kautau@lemmy.world · 23 upvotes · 6 months ago

      • dejected_warp_core@lemmy.world · 2 upvotes · 6 months ago

        It really should be “parley.txt”.

  • TropicalDingdong@lemmy.world · 26 upvotes · 6 months ago

    beautiful soup

  • mspencer712 · 15 upvotes · 6 months ago

    I feel like there should be a third box with Wall Street raider types, for scrapers that use Selenium browser automation.

    I don’t think it’s entirely unblockable - AdSense seems to know to only serve unmonetized PSA ads - but I think it’s very difficult to discriminate between “this is a real browser controlled by an end user” and “this is a real browser being controlled by automated test software”.

    • erytau · English · 5 upvotes · 6 months ago

      A fourth panel as well, for those bots collecting data for AI training that don’t respect your robots.txt, change user agents, and overload your servers.

      • dejected_warp_core@lemmy.world · 1 upvote · 6 months ago

        War boys from Fury Road?

  • Kojichan@lemmy.world · 2 upvotes · 6 months ago

    I just saw a Python scraper in my server logs earlier today. Strangest thing to see.

  • shiroininja@lemmy.world · 1 upvote · 6 months ago

    Love me some Scrapy spiders
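
Several comments above turn on whether a scraper follows robots.txt, so here is a rough sketch of what that check can look like with Python's standard-library `urllib.robotparser`. Everything specific in it is made up for illustration: the robots.txt contents, the bot names `MyScraper` and `ExampleAIBot`, and the example.com URLs. The rules are inlined so it runs without touching the network.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt, inlined so the sketch runs offline.
ROBOTS_TXT = """\
User-agent: ExampleAIBot
Disallow: /

User-agent: *
Disallow: /private/
"""


def build_parser(robots_txt: str) -> RobotFileParser:
    """Parse robots.txt text into a parser object."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser


def allowed(parser: RobotFileParser, user_agent: str, url: str) -> bool:
    """Return True if the rules permit this user agent to fetch the URL."""
    return parser.can_fetch(user_agent, url)


parser = build_parser(ROBOTS_TXT)

# A well-behaved scraper asks before every request.
polite = allowed(parser, "MyScraper/1.0", "https://example.com/posts/1")
private = allowed(parser, "MyScraper/1.0", "https://example.com/private/x")

# The site owner has opted this (made-up) AI crawler out entirely.
ai_bot = allowed(parser, "ExampleAIBot/2.0", "https://example.com/posts/1")

# Same crawler, but reporting a browser-like user agent instead.
disguised = allowed(parser, "Mozilla/5.0", "https://example.com/posts/1")

print(polite, private, ai_bot, disguised)
```

The last check is the case erytau describes: rules are matched against whatever user agent the client chooses to report, so robots.txt only constrains scrapers that decide to honor it.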
