Show HN: Which is faster? Puppeteer, Playwright or Selenium
colab.research.google.com
Hey everyone, I just ran a [rather silly] race between Puppeteer (JS), Playwright (Python) and Selenium (Python) to see which one would be fastest on a simple scrape (using Google Colab so you can also run it).
Far from a comprehensive benchmark, this race is 100% free from advanced configurations, multi-threading or anything complicated. It just opens Wallapop (a second-hand marketplace in Spain) and times how long it takes to extract the first 2000 results of a search.
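For anyone who wants to picture the race without opening the notebook, here is a rough sketch of a timed scroll-and-count loop. It is not the notebook's actual code. The function name `timed_scrape`, the selector, `SEARCH_URL`, the scroll distance and the pause length are all assumptions for illustration; `page` is expected to behave like a Playwright sync `Page`.

```python
import time


def timed_scrape(page, item_selector, target=2000, pause_ms=500, max_scrolls=1000):
    """Scroll until `target` items are in the DOM; return (item_count, seconds).

    `page` is assumed to expose the Playwright sync Page methods used below.
    """
    start = time.perf_counter()
    items = page.locator(item_selector)
    count = items.count()
    scrolls = 0
    while count < target and scrolls < max_scrolls:
        page.mouse.wheel(0, 10_000)      # scroll down to trigger lazy loading
        page.wait_for_timeout(pause_ms)  # fixed pause between scrolls (assumed value)
        count = items.count()
        scrolls += 1
    return count, time.perf_counter() - start


# Possible usage with Playwright's sync API.
# SEARCH_URL and "a.item-card" are placeholders, not Wallapop's real values:
#
# from playwright.sync_api import sync_playwright
# with sync_playwright() as p:
#     browser = p.chromium.launch()
#     page = browser.new_page()
#     page.goto(SEARCH_URL)
#     print(timed_scrape(page, "a.item-card"))
#     browser.close()
```

The `max_scrolls` cap is only there so the loop ends if the search has fewer results than the target.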
If you like this simple format, have any ideas on how to improve a race like this or have a strong urge to prove Ward Cunningham wright, let me know in the comments!

- Language Choice (JS vs Python): Puppeteer in JS and Playwright in Python showed near-identical performance on an AWS c5.large instance. This removed the need to test Puppeteer and Playwright in the same language for this comparison.
- Playwright Scrolling: To emulate a user experience, all three tools employed infinite scrolling. This was necessary because Wallapop doesn't have pagination; you have to scroll to get results.
- Explicit Timeouts: Used for greater stability, especially when contending with network inconsistencies. Initially, I used API response events to trigger scrolls, but this approach was less reliable.
- Evaluate vs. Locators & Click: My initial tests indicated `evaluate` was marginally faster than `locators` and `click`.
I appreciate the scrutiny and I might include a JS vs Python comparison in a future test.

I think the approach misses the point. Playwright's auto-wait with `locators` is what makes it worth adopting because it means you don't need to use fixed waits. Auto-waiting removes much of the idle time that fixed waits add.

Man I love puppeteer

Lots of questionable design choices here.
> Puppeteer (JS), Playwright (Python) and Selenium (Python)
- Why not all JS or all Python?
- Why is the Playwright code scrolling?
- Why is the Playwright code using explicit timeouts?
- Why is the Playwright code using `evaluate` rather than `locators` and `click`?
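To make the timeout question concrete, here is a hedged sketch of what waiting on a locator condition could look like instead of sleeping for a fixed time after each scroll. It is an illustration, not the benchmark's code. The function name `scroll_until_loaded`, the scroll distance and the timeout are assumptions; `page` is expected to behave like a Playwright sync `Page`.

```python
def scroll_until_loaded(page, item_selector, target=2000, timeout_ms=5000):
    """Scroll until at least `target` items exist, waiting on the DOM rather than a clock.

    After each scroll this waits for the item at index `count` (one past the
    items already seen) to be attached, so it resumes as soon as new results
    arrive. If nothing new loads within `timeout_ms`, the wait raises
    Playwright's TimeoutError, which is left to propagate to the caller.
    """
    items = page.locator(item_selector)
    count = items.count()
    while count < target:
        page.mouse.wheel(0, 10_000)  # scroll down to trigger lazy loading
        items.nth(count).wait_for(state="attached", timeout=timeout_ms)
        count = items.count()
    return count
```

Strictly speaking, `wait_for` is an explicit condition wait rather than the actionability auto-wait that `click` performs, but it serves the same goal of not idling longer than the page needs.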