When I’m writing web scrapers, I mostly just switch between Selenium (when the website is too “fancy” and definitely needs a browser) and plain requests calls, both in conjunction with bs4.

But when I read about scrapers, Scrapy is often the first Python package mentioned. What am I missing out on by not using it?

  • @qwertyasdef
    11 year ago

    Oh shit that sounds useful. I just did a project where I implemented a custom stream class to chain together calls to requests and beautifulsoup.

    • @Wats0ns
      21 year ago

      Yep, try Scrapy. It also handles the concurrency of your pipeline items for you, configuration for every part,…