Top JavaScript libraries for web scraping

serpapi.com

1 points by fullofdev 2 years ago · 2 comments

gajus 2 years ago

I am obviously biased, but Surgeon is by far the best abstraction for data extraction I've ever seen or written.

https://github.com/gajus/surgeon

For context, it was created to support my previous business, Applaudience. We had to build scrapers for (literally) thousands of cinema websites. At some point we migrated from custom scripts to Surgeon routines and reduced overall codebase size by 70% (in LoC). It was a huge time saver, both for writing new integrations and for debugging when things went wrong.

The reality is that data extraction is a highly specialized task and you need specialized software to do it well. Tools like Surgeon can abstract a ton of complexity, but they have a steeper learning curve.

I still use it whenever I need to scrape anything, and you can combine it with anything that outputs HTML, e.g. Playwright.

Ultimately the business died (covid was brutal) and I moved on to even more exciting things, but this remains one of those technologies that I wish had received more adoption.

  • fullofdevOP 2 years ago

    Will take a look at Surgeon. Thanks for sharing your experience. The syntax is very interesting.
