Show HN: Live VNC for web agents – debugging native captcha on Cloud Run

rtrvr.ai

12 points by quarkcarbon279 a month ago · 1 comment · 1 min read

Hi HN, Bhavani here (rtrvr.ai).

We build DOM-native web agents (no screenshot-based vision, no CDP/Playwright debugger-port control). We handle captchas natively, including Google reCAPTCHA image challenges, by traversing cross-origin iframes and shadow DOM. Latency on those image challenges is currently high.

The problem: when debugging image-selection captchas ("select all images with traffic lights"), logs don't tell you why the agent clicked the wrong tiles. I found myself staring at execution logs thinking "did it even see the grid correctly?" and realized I just wanted to watch it work.

So we built live VNC view + takeover for serverless Chrome workers on Cloud Run.

Key learnings:

1. Session affinity is best-effort, so a viewer that tries to "attach later" can be routed to a different instance than the one running the browser

2. A separate relay service that pairs viewer↔runner by short-lived tokens makes attach deterministic

3. Runner stays clean: concurrency=1, one browser per container, no mixed traffic
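
The token pairing in point 2 could be sketched roughly as below. This is a minimal in-memory illustration, not our actual relay: the names `RelayRegistry`, `register_runner`, and `attach_viewer` are hypothetical, the 60-second TTL and single-use tokens are assumptions, and a real relay would also have to proxy the VNC byte stream between the two sides.

```python
import secrets
import time
from dataclasses import dataclass
from typing import Callable, Dict, Optional


@dataclass
class Pairing:
    runner_id: str
    expires_at: float
    viewer_id: Optional[str] = None


class RelayRegistry:
    """Hypothetical pairing table: maps short-lived tokens to runner ids."""

    def __init__(self, ttl_seconds: float = 60.0,
                 clock: Callable[[], float] = time.monotonic):
        self._ttl = ttl_seconds
        self._clock = clock  # injectable so expiry can be tested
        self._pairings: Dict[str, Pairing] = {}

    def register_runner(self, runner_id: str) -> str:
        """Called by a runner when its session starts; returns a token."""
        token = secrets.token_urlsafe(16)
        self._pairings[token] = Pairing(runner_id, self._clock() + self._ttl)
        return token

    def attach_viewer(self, token: str, viewer_id: str) -> Optional[str]:
        """Return the runner id, or None if the token is unknown,
        expired, or already claimed (assumption: tokens are single-use)."""
        pairing = self._pairings.get(token)
        if pairing is None:
            return None
        if self._clock() >= pairing.expires_at:
            del self._pairings[token]
            return None
        if pairing.viewer_id is not None:
            return None
        pairing.viewer_id = viewer_id
        return pairing.runner_id


if __name__ == "__main__":
    registry = RelayRegistry(ttl_seconds=60)
    token = registry.register_runner("runner-a")
    print(registry.attach_viewer(token, "viewer-1"))  # runner-a
    print(registry.attach_viewer(token, "viewer-2"))  # None
```

The point of the sketch is that the viewer finds its runner by token lookup in the relay rather than by whichever instance the load balancer happens to pick, which is what makes attach deterministic.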

Would love feedback from folks who've shipped similar:

1. What replaced VNC for you (WebRTC, etc.) and why?

2. Best approach for recording/replay without huge storage?

3. How do you handle "attach later" safely in serverless?
