Show HN: SiteOne Crawler – Single-binary CI/CD gate for web quality regressions

github.com

3 points by janreges 3 months ago · 2 comments

seovisible 3 months ago

Like it, well done! Good luck with it!

janreges (OP) 3 months ago

Hi HN, I'm the author. I originally built SiteOne Crawler in PHP+Swoole back in 2023. Last year I rewrote it entirely in Rust — 25% faster execution, 30% lower memory, and a single native binary with zero runtime dependencies.

The feature I'm most excited about is CI/CD quality gating. The idea is simple: crawl your entire website after deploy and block the pipeline if quality regresses.

Example:

   siteone-crawler --url=https://example.com --ci \
      --ci-min-score=7.5 \
      --ci-max-404=0 \
      --ci-max-redirects=5

Install:

   # Debian/Ubuntu repo setup:
   curl -1sLf 'https://dl.cloudsmith.io/public/janreges/siteone-crawler/setup.deb.sh' | sudo -E bash

   brew install janreges/tap/siteone-crawler    # macOS / Linux
   sudo apt-get install siteone-crawler         # Debian/Ubuntu (after adding repo)
   sudo dnf install siteone-crawler             # Fedora/RHEL (after adding repo)
   cargo install siteone-crawler                # from source, any platform

   # Windows: https://github.com/janreges/siteone-crawler/releases

The example command above crawls every page, scores it across 5 categories (Security, Performance, SEO, Accessibility, Best Practices) on a 0–10 scale, and exits with code 10 if any threshold is breached. Drop it into GitHub Actions, GitLab CI, or any other pipeline as a single binary: no Docker, no Node, no runtime needed.
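
A minimal sketch of how a pipeline step might react to that exit code. The helper name `describe_gate_result` and its labels are hypothetical. Only exit code 10 (threshold breached) comes from the description above. Treating 0 as a pass and every other code as a tool error is an assumption, not documented behavior.

```shell
#!/usr/bin/env bash
# Hypothetical helper: map the crawler's exit status to a short label
# that a pipeline step can log or branch on.
# Only code 10 is taken from the post; the other two branches are assumptions.
describe_gate_result() {
  case "$1" in
    0)  echo "pass" ;;                 # assumed: all thresholds met
    10) echo "quality-gate-failed" ;;  # from the post: a threshold was breached
    *)  echo "crawler-error" ;;        # assumed: anything else is a tool or network failure
  esac
}

# Usage in a deploy job (flags copied from the example in the post):
#   siteone-crawler --url=https://example.com --ci --ci-min-score=7.5
#   status=$?
#   echo "gate: $(describe_gate_result "$status")"
#   exit "$status"
```

Passing the original status through with `exit "$status"` keeps the pipeline red on a breach while the label makes the log easier to scan.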

Beyond CI/CD, it also does:

- Offline website archiving with a built-in HTTP server for self-hosting
- Full-site markdown export with deduplicated content (great for feeding to LLMs)
- Interactive HTML audit reports you can email via built-in SMTP
- Sitemap generation

Sample HTML report: https://crawler.siteone.io/html/2024-08-23/forever/cl8xw4r-f...

GitHub: https://github.com/janreges/siteone-crawler

I'd love to hear your feedback — especially if you're already doing something similar in your CI/CD pipelines. What thresholds would you find useful?