In the beginning...
The year was 2017. Manish Goregaokar's article "Fearless Concurrency in Firefox Quantum" came out, I switched from Chrome to Firefox, and I never looked back. I've been a loyal Firefox user ever since. I know the network tab inside Firefox well. It's how I think about XHR. Then a week ago (May 1, 2026) I started building this macOS browser, and all of a sudden I saw myself leaving Firefox. But "on the dark side" is in the title of this article for a few reasons: 1. this song, 2. my all-time favorite Firefox extension, darkreader, is a big part of this story, and 3. anytime you start to build your own browser, that's a long dark path.
To be clear, this is not a browser engine project. I am not writing a JavaScript runtime. I am not touching V8, SpiderMonkey, Gecko, or Blink. WebKit and JavaScriptCore handle the actual HTML, CSS, JavaScript, layout, and page execution. The weird discovery is how much browser still remains after that.
Why a new browser
I found myself fighting with Playwright inside the Claude Code and Codex AI coding tools. Sometimes I could get it to use my logged-in Firefox profile and sometimes not. Sometimes it would use Chrome. And I had this vision. What if I made:
http://localhost:9001/api/v1/cookies
What if my browser just ran a web server and exposed nice endpoints for exactly what I want? That endpoint isn't named perfectly: it returns cookies, sessions, and local storage, but that's everything a coding agent needs!
And then I made:
http://localhost:9001/api/v1/screenshot
And I added buttons to the top of the browser toolbar for desktop, large mobile (700px), and small mobile (390px), so it's super easy to switch between them. And I was hooked. This is great! It's my own browser, and I'm going to make it do anything I want.
I even added my own right-click context menu with just the stuff I need. You might notice my Hacker News look and feel is dark, and yes, this is how it looks in Firefox too because of darkreader.
Porting over darkreader
If this new browser of mine is going to actually be the browser I use day to day, it HAS to support darkreader. But it's not like this little embedded WebKit is going to be able to just use the darkreader source. I'll just move over the important parts. Famous last words. I ended up moving over almost ALL the logic and wow, I have a newfound respect for the work darkreader has done. It's super important IMO. My eyes just can't take an all-white background. And many websites just don't offer a way to switch to dark. I have to be able to override any website's CSS.
The first version was the obvious thing everybody thinks of first. Add a dark background, make the text lighter, invert a few things that need inverting, and walk the inline styles looking for bright colors. That gets you maybe ten percent there and it feels amazing for about five minutes. Then you hit a page with real CSS and the whole thing falls apart. Inline styles are not enough. The page has stylesheets. The stylesheets have variables. The variables point at other variables. The variables have fallbacks. Some of the colors are not even colors until the browser computes them.
So the simple forced dark script turned into a dynamic theme system. I had to inject the
Dark Reader style layers in the right order: fallback first so the page doesn't flash white,
then the user-agent-ish defaults, then invert rules, then inline overrides, then variables,
then root variables, then the generated stylesheet overrides, then site fixes. And because
pages are pages, they move stuff around. So even the order of my own injected
<style> tags needed watchers to put them back where they belonged.
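The layer ordering above can be sketched as a small check-and-repair step. This is only an illustration: the layer names are hypothetical labels for the layers listed in the paragraph, and a real watcher would work on actual style nodes in the document rather than on strings.

```typescript
// Hypothetical layer names, listed in the order described above.
const LAYER_ORDER = [
  "fallback",
  "user-agent",
  "invert",
  "inline",
  "variables",
  "root-variables",
  "overrides",
  "site-fixes",
] as const;

type LayerName = (typeof LAYER_ORDER)[number];

// True when the layers that are present still appear in the expected relative order.
function layersInOrder(present: LayerName[]): boolean {
  let last = -1;
  for (const layer of present) {
    const index = LAYER_ORDER.indexOf(layer);
    if (index < last) return false;
    last = index;
  }
  return true;
}

// Returns the present layers sorted back into the expected order.
function restoreOrder(present: LayerName[]): LayerName[] {
  return [...present].sort(
    (a, b) => LAYER_ORDER.indexOf(a) - LAYER_ORDER.indexOf(b),
  );
}
```

A watcher could call layersInOrder after each mutation batch and only touch the DOM when it returns false, which avoids fighting the page on every change.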
Stylesheets were the first big hole. You can't just slap one global rule on the document.
You have to find every readable <style> and stylesheet link, read
cssRules, walk the rules, rewrite color declarations, and place a generated
style node next to the original. Then you have to notice when the original stylesheet
changes. Then you have to notice when the page removes it. Then you have to not explode
when cssRules throws because the sheet is cross-origin, still loading, broken,
or just unavailable in WebKit at that exact moment. I ended up adding loading fallbacks,
load and error listeners, timeouts, stale manager cleanup, and the idea that an unreadable
sheet should not keep the whole page in fallback mode forever.
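The "don't explode when cssRules throws" part looks roughly like this. It is a minimal sketch, not the actual code: SheetLike is a hypothetical stand-in for CSSStyleSheet so the example runs outside a browser, and the result type is my own invention for illustration.

```typescript
// Hypothetical stand-in for CSSStyleSheet. In a browser, reading cssRules
// on a cross-origin sheet throws a SecurityError.
interface SheetLike {
  readonly cssRules: ArrayLike<{ cssText: string }>;
}

type ReadResult =
  | { kind: "ok"; rules: string[] }
  | { kind: "unreadable"; reason: string };

// Reads every rule's text, turning a thrown error into a value the caller
// can act on (retry later, or give up and leave fallback mode).
function readRules(sheet: SheetLike): ReadResult {
  try {
    const list = sheet.cssRules; // the access itself may throw
    const rules: string[] = [];
    for (let i = 0; i < list.length; i++) {
      const rule = list[i];
      if (rule) rules.push(rule.cssText);
    }
    return { kind: "ok", rules };
  } catch (err) {
    const reason = err instanceof Error ? err.message : String(err);
    return { kind: "unreadable", reason };
  }
}
```

Returning a value instead of throwing makes it easier to treat "unreadable" as a normal state with its own timeout, rather than as a crash.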
CSS variables were the part where I really started to appreciate the Dark Reader code.
A variable is not a background variable or a text variable by itself. It becomes one when
it is used in background-color, or color, or border,
or inside a gradient, or inside another variable that later becomes one of those things.
So I had to build enough of a variables graph to push usage types backward to the owner
variable and forward to the things it references. Then for one real token I might need
wrapped dark versions for background, text, border, and background image.
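Here is a simplified sketch of the propagation idea, in one direction only: from a variable to the variables its value references. The input shapes (refs and direct) are assumptions made up for this example, not the structures Dark Reader or my port actually use.

```typescript
type Usage = "background" | "text" | "border" | "background-image";

// refs["--a"] = ["--b"] means the value of --a contains var(--b).
// direct["--a"] lists how --a is used directly in declarations.
function propagateUsage(
  refs: Record<string, string[]>,
  direct: Record<string, Usage[]>,
): Record<string, Usage[]> {
  const result: Record<string, Usage[]> = {};

  // Adds a usage type to a variable; returns true only if it was new.
  const add = (variable: string, usage: Usage): boolean => {
    const list = result[variable] || (result[variable] = []);
    if (list.indexOf(usage) !== -1) return false;
    list.push(usage);
    return true;
  };

  const queue: string[] = [];
  for (const variable of Object.keys(direct)) {
    for (const usage of direct[variable] || []) add(variable, usage);
    queue.push(variable);
  }

  // Push usage types along references until nothing changes.
  // Cycles terminate because add() only reports genuinely new types.
  while (queue.length > 0) {
    const variable = queue.shift() as string;
    for (const target of refs[variable] || []) {
      let changed = false;
      for (const usage of result[variable] || []) {
        if (add(target, usage)) changed = true;
      }
      if (changed) queue.push(target);
    }
  }
  return result;
}
```

With this, a raw token that is never used in a declaration itself still ends up knowing it needs both a background and a border variant, because the variables that reference it do.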
Color parsing became its own project too. At first you think hex and rgb are enough.
Then sites use named colors, hsl, hsla, hwb, raw rgb triples inside variables,
color-mix(), light-dark(), lab, lch, oklab, and oklch.
I had one bug class where a regex would grab only part of a nested color function and
turn it into nonsense. I had another where unsupported color-* functions
fell through the generic parser and became bogus black values. None of that is visible
as a browser feature. It just shows up as "why is this one div unreadable?"
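The nested-function bug is the kind of thing a parenthesis counter avoids. This is a sketch of that idea, not my actual parser: the function list is an assumption for illustration, and it only extracts the outermost color function text without interpreting it.

```typescript
// Illustrative list of function names to treat as color functions.
const COLOR_FUNCTIONS = [
  "rgb", "rgba", "hsl", "hsla", "hwb", "lab", "lch",
  "oklab", "oklch", "color", "color-mix", "light-dark",
];

// Returns each outermost color function call in a CSS value, matched by
// counting parentheses so nested calls are never cut in half.
function extractColorFunctions(value: string): string[] {
  const found: string[] = [];
  const opener = /([a-z-]+)\(/gi;
  let match: RegExpExecArray | null;
  while ((match = opener.exec(value)) !== null) {
    const fnName = (match[1] || "").toLowerCase();
    // Not a color function: keep scanning inside it (e.g. linear-gradient).
    if (COLOR_FUNCTIONS.indexOf(fnName) === -1) continue;
    let depth = 0;
    let end = -1;
    for (let i = match.index + match[0].length - 1; i < value.length; i++) {
      const ch = value[i];
      if (ch === "(") depth++;
      else if (ch === ")") {
        depth--;
        if (depth === 0) {
          end = i;
          break;
        }
      }
    }
    if (end === -1) break; // unbalanced input: stop instead of guessing
    found.push(value.slice(match.index, end + 1));
    opener.lastIndex = end + 1; // skip past the nested calls we just consumed
  }
  return found;
}
```

The important design choice is that a color-mix() containing an rgb() comes back as one unit, so whatever converts it later sees the whole expression or nothing.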
SPAs made the whole thing noisier. Reddit has been the main stress case because it is constantly adding nodes, changing styles, and using modern CSS in places that look simple from the outside. If every mutation causes a full stylesheet conversion, the browser feels bad immediately. So the watcher code had to get smarter: dirty root compression, skipping descendant roots when the ancestor is already queued, collapsing very noisy batches into one document pass, and not forcing a full stylesheet sync every time an inline DOM pass runs. This is the kind of thing Dark Reader has spent years getting right.
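Dirty root compression is easier to show than to describe. This is a minimal sketch under stated assumptions: TreeNode is a hypothetical stand-in for a DOM node with a parent pointer, and the maxRoots threshold is a made-up number, not one I tuned.

```typescript
// Hypothetical stand-in for a DOM node; only the parent link matters here.
interface TreeNode {
  id: string;
  parent: TreeNode | null;
}

// Drops duplicates and any node whose ancestor is already queued, since
// processing the ancestor will cover the descendant anyway.
function compressDirtyRoots(dirty: TreeNode[]): TreeNode[] {
  const queued = new Set<TreeNode>(dirty);
  return dirty.filter((node, index) => {
    if (dirty.indexOf(node) !== index) return false; // duplicate entry
    for (let p = node.parent; p !== null; p = p.parent) {
      if (queued.has(p)) return false; // covered by a queued ancestor
    }
    return true;
  });
}

// Collapses a very noisy batch into a single pass over the whole document.
function planPass(
  dirty: TreeNode[],
  documentRoot: TreeNode,
  maxRoots = 50, // hypothetical threshold
): TreeNode[] {
  const roots = compressDirtyRoots(dirty);
  return roots.length > maxRoots ? [documentRoot] : roots;
}
```

The ancestor walk costs a little per node, but it is much cheaper than converting the same subtree several times in one batch.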
The uncomfortable part is that this is all running as WKUserScript-injected page JavaScript. It is not the same as having Dark Reader's extension-world isolation. That means the script is patching page-world prototypes while it is active: stylesheets, adopted stylesheets, custom elements, shadow roots, CSS declarations. So cleanup had to become real cleanup, not just "stop doing work." The original descriptors need to be remembered and restored. Otherwise a site can keep living with my patched world after dark mode is removed.
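"Real cleanup" means keeping the original property descriptor and putting it back. Here is a minimal sketch of that pattern. patchMethod and FakeSheet are hypothetical names for this example; FakeSheet stands in for a page-world prototype such as CSSStyleSheet.prototype so the code runs outside a browser.

```typescript
type AnyFn = (...args: any[]) => any;

// Replaces an own method on target and returns a function that restores
// the exact original descriptor (not just the original function).
function patchMethod(
  target: object,
  key: string,
  wrap: (original: AnyFn) => AnyFn,
): () => void {
  const original = Object.getOwnPropertyDescriptor(target, key);
  if (!original || typeof original.value !== "function") {
    throw new Error(key + " is not an own method of the target");
  }
  const originalFn: AnyFn = original.value;
  Object.defineProperty(target, key, { ...original, value: wrap(originalFn) });
  return () => {
    Object.defineProperty(target, key, original);
  };
}

// Stand-in for a page-world prototype.
class FakeSheet {
  insertRule(rule: string): number {
    return rule.length;
  }
}

const seen: string[] = [];
const restore = patchMethod(FakeSheet.prototype, "insertRule", (original) =>
  function (this: unknown, ...args: any[]) {
    seen.push(String(args[0])); // observe the call, then defer to the original
    return original.apply(this, args);
  },
);

new FakeSheet().insertRule("a { color: red }"); // recorded while patched
restore();
new FakeSheet().insertRule("b { color: blue }"); // not recorded after restore
```

Restoring the whole descriptor matters because flags like enumerable and writable are part of what the page can observe, not just the function value.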
So when I say I ported darkreader, that is not really true yet. I moved over a lot of the logic that makes the dynamic theme feel possible in this little WebKit browser. But the real Dark Reader still has a much deeper color pipeline, palette caches, image analysis, a huge site-fix corpus, better extension isolation, and years of weird website knowledge. What I have now is close enough that I can use the browser, and far enough that I can see why this extension is one of my favorite pieces of browser software.
Browser state everywhere
WebKit gives you a web view. It does not give you firefox. The second I wanted more than one page open I needed a real tab model: ids, titles, loading state, active tab state, per-tab web views, and the little titlebar strip at the top. Then tabs need to close with command-w, drag around, remember what was open when the app comes back, and switch without reloading the page or letting random page JavaScript steal focus just because a web view became visible again.
The weird part is how every toolbar value becomes per-tab. The current URL, back button, forward button, progress bar, screenshot target, XHR tracker, dark mode state, title, cookies, identity, error view, and bookmark state all have to mean "for the active tab" and not "whatever web view happened to talk last." A browser is a bunch of state machines pretending to be one window.
I also ended up with browser-y details I did not think about up front. Empty tabs need
a start state. Failed pages need a real error state and a retry. Links that are not
http or https should open in the system, not die inside my app.
Back and forward gestures need to work. Reload has to mean reload the current page if
there is one, or try the address bar again if the first load failed. All small things,
but together they are the difference between a WebKit demo and something I can live in.
The address bar
The URL field was way more work than I expected. A normal SwiftUI text field did not feel like a browser address bar, so it became an AppKit text field wrapped in SwiftUI. It needed command-l select all, mouse click select all, escape cancel, enter load, tab and right arrow to accept completion, up and down to move through suggestions, and careful focus state so switching tabs did not accidentally focus the address bar. I even had to hide the field editor insertion point in some cases because macOS would show a caret for a field I did not actually want editing yet.
Then the address bar needed to be smart. It has the committed URL and the draft text, and those are not always the same thing. It has local history, bookmarks, search suggestions, inline completion, and the boring but important question of whether some text is a URL, a localhost thing, or a search query. This is the kind of browser work nobody thinks about until the text field feels wrong.
The resolver became its own little browser rulebook. If you type an explicit
https:// URL, load it. If you type words with spaces, search DuckDuckGo.
If you type localhost:3000, use http. If you type one clean
word, maybe try adding .com. If https://example.com fails,
try the fallback shapes like https://www.example.com and then
http. And since many sites care way too much about browser sniffing,
the WebKit view needed a Safari-ish user agent too.
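The rulebook above, written out as a sketch. The app itself is not TypeScript, and the details here are assumptions: the regular expressions, the choice to append a search as the last resort for a single word, and the function names are all made up for illustration. The function returns candidates in the order the browser would try them.

```typescript
const SEARCH_URL = "https://duckduckgo.com/?q=";

// The fallback shapes for a host: https, then https with www, then http.
function fallbackShapes(host: string, rest: string): string[] {
  const shapes = ["https://" + host + rest];
  if (!/^www\./i.test(host)) shapes.push("https://www." + host + rest);
  shapes.push("http://" + host + rest);
  return shapes;
}

// Turns address bar text into an ordered list of URLs to try.
function resolveInput(input: string): string[] {
  const text = input.trim();
  if (text === "") return [];
  if (/^https?:\/\//i.test(text)) return [text]; // explicit URL: load as typed
  const search = SEARCH_URL + encodeURIComponent(text);
  if (/\s/.test(text)) return [search]; // words with spaces: search
  if (/^(localhost|127\.0\.0\.1)(:\d+)?(\/.*)?$/i.test(text)) {
    return ["http://" + text]; // local dev servers rarely speak https
  }
  const slash = text.indexOf("/");
  const host = slash === -1 ? text : text.slice(0, slash);
  const rest = slash === -1 ? "" : text.slice(slash);
  if (/^[a-z0-9-]+$/i.test(host)) {
    // One clean word: try it as a .com, then fall back to searching for it.
    return [...fallbackShapes(host + ".com", rest), search];
  }
  if (host.indexOf(".") !== -1 && /^[a-z0-9.-]+(:\d+)?$/i.test(host)) {
    return fallbackShapes(host, rest);
  }
  return [search];
}
```

For example, resolveInput("localhost:3000") yields a single http URL, while resolveInput("example.com") yields the three fallback shapes in order.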
History, bookmarks, favicons
History and bookmarks sound optional until the address bar needs suggestions. Then they become part of navigation. The app stores recent history, normalizes bookmarks, lets me toggle the current page as bookmarked, and keeps the menu/order stable. The URL suggestion list mixes direct visits, search suggestions, bookmarks, and history, with little icons so it scans like a browser instead of a command palette.
Favicons are the same kind of tiny thing. Without them tabs feel dead. So the titlebar tabs fetch and cache favicons, fall back to a known favicon URL shape, and avoid blurry icons. It is not a browser engine feature. It is still browser.
Sessions and identities
Switch Identity started because I wanted the thing coding agents always need: a clean login state without destroying my real one. Now a site can have the default identity plus fresh identities, and the context menu lets me open a fresh one or switch between them. Underneath that means per-site identity ids, active identity settings, separate cookie persistence, and replacing the current web view so the site reloads into the right cookie world.
Cookies turned into a lot of work by themselves. I needed to restore cookies when a profile starts, save them when they change, save all profiles when the app quits, clear cookies for the current domain, and not mix cookies between identities. WebKit has a cookie store, sure, but I still had to coordinate when it is attached, when restore is finished, when saves are delayed, and when a previous store should be flushed before switching away. The commits about Google cookies and session logs came from this area. You don't notice cookies when they work. You notice them instantly when they don't.
The password manager came from the same right-click-menu itch. I wanted to point at a username field, point at a password field, save that shape for the current site, and fill it later. Then identity made it more interesting because the same site can have multiple accounts. So saved logins are keyed by site and identity, and filling a login has to find the right controls again instead of just dumping text into whatever input happens to be focused.
A little DevTools
The local API turned into another big chunk of browser work. I started with
/api/v1/cookies and /api/v1/screenshot, but a really useful view of
browser state needs more than that. Now there are endpoints for the current page,
visible DOM summary, links and forms, console messages, domain resources, XHR/fetch
calls, cookies, localStorage, sessionStorage, and screenshots. The server only accepts
loopback connections and has to parse enough HTTP to be useful without becoming its
own web framework.
Screenshots were not just "call snapshot." The app keeps a fresh PNG around, marks it dirty when the page loads or the viewport changes, waits briefly if a render is in progress, and handles the boring failures like no page loaded, a hidden web view, a timed out render, or PNG encoding failing. The desktop, 700px, and 390px viewport buttons feed directly into that, which is why the screenshot endpoint is actually useful for checking responsive layouts.
The DOM endpoint is also not raw HTML. It is a cleaned-up view of the page: visible text, headings, buttons, links, inputs, forms, tables, ARIA labels, roles, important attributes, active element, and element rectangles. That's the stuff I actually want when I am trying to understand a page. Raw HTML is too much and often the wrong layer.
XHR and console
I knew I wanted the network tab feeling from firefox, so I added my own small version.
The injected script wraps fetch and XMLHttpRequest, records
methods, URLs, request headers, status codes, response URLs, byte counts, and JSON shape.
The JSON shape part matters because most of the time I don't need the whole response yet.
I need to know "this endpoint returns an array of objects with id, name, and status."
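A JSON shape summary can be surprisingly small. This is a sketch of the idea, not the tracker's real output format: the string notation, the depth cap of 3, and sampling only the first array element are all assumptions made for the example.

```typescript
// Summarizes the structure of a parsed JSON value without its contents.
// Arrays are described by their first element only, which is a shortcut:
// mixed arrays will be described incompletely.
function describeShape(value: unknown, depth = 0): string {
  if (value === null) return "null";
  if (Array.isArray(value)) {
    if (value.length === 0) return "array(empty)";
    return "array<" + describeShape(value[0], depth + 1) + ">";
  }
  if (typeof value === "object") {
    if (depth >= 3) return "object"; // stop descending into deep payloads
    const obj = value as Record<string, unknown>;
    const fields = Object.keys(obj).map(
      (key) => key + ": " + describeShape(obj[key], depth + 1),
    );
    return "{" + fields.join(", ") + "}";
  }
  return typeof value; // "string", "number", or "boolean" for JSON input
}
```

Fed a response like `[{"id":1,"name":"a","status":"ok"}]`, it returns `array<{id: number, name: string, status: string}>`, which is usually enough to decide whether an endpoint is worth replaying.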
Then the context menu got an XHR submenu. Pick one of the recent requests and the browser can replay it using the current cookies, then show the JSON. That sounds like a developer tool because it is. But for this browser, developer tools are part of the product.
Console capture had the same shape. WKWebView does not hand you a nice firefox console,
so the page script wraps console.log, warn, error,
info, and debug, plus console.assert, window
errors, unhandled promise rejections, and CSP violations. The app keeps the recent messages
and exposes them through the API. It is another one of those invisible browser systems that
only feels important when you are debugging why a page is broken.
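The console wrapping has roughly this shape. It is a sketch under assumptions: ConsoleLike is a hypothetical interface so the example can run against a fake console, the message format is simplified to joined strings, and it covers only the five plain logging methods, not assert, window errors, rejections, or CSP violations.

```typescript
type Level = "log" | "warn" | "error" | "info" | "debug";

interface CapturedMessage {
  level: Level;
  text: string;
}

type ConsoleLike = Record<Level, (...args: unknown[]) => void>;

// Wraps each logging method so calls are recorded in a bounded buffer and
// still forwarded to the original. Returns the buffer and an undo function.
function captureConsole(
  target: ConsoleLike,
  limit: number,
): { messages: CapturedMessage[]; restore: () => void } {
  const levels: Level[] = ["log", "warn", "error", "info", "debug"];
  const messages: CapturedMessage[] = [];
  const originals: Partial<ConsoleLike> = {};

  for (const level of levels) {
    const original = target[level];
    originals[level] = original;
    target[level] = (...args: unknown[]) => {
      messages.push({ level, text: args.map(String).join(" ") });
      if (messages.length > limit) messages.shift(); // keep only recent ones
      original.apply(target, args);
    };
  }

  const restore = () => {
    for (const level of levels) {
      const original = originals[level];
      if (original) target[level] = original;
    }
  };
  return { messages, restore };
}
```

The bounded buffer is the part that matters for a long-lived tab: a chatty page should cost a fixed amount of memory, not grow until the tab closes.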
Find, files, and JSON
Find text on page also looks tiny from the outside. It's just command-f, right? But then you need a find bar, previous and next buttons, "Found" and "No results", escape behavior, selected text, focus that stays in the find box, and a way to clear the WebKit find state when the field is empty. It is a small feature, but if it feels off you notice immediately because every browser user has muscle memory for it.
File upload was another reminder that WebKit is not the whole browser. A website can
have <input type="file">, but the app still has to implement the open
panel delegate. That meant showing the macOS sheet, allowing multiple selection when
the page asks for it, allowing folder selection when the page asks for directories,
and returning the selected URLs back to WebKit. Tiny feature. Absolutely required.
JavaScript dialogs are in the same category. If a site calls alert(),
confirm(), or prompt(), WebKit asks the app what to do.
So now there are macOS dialogs for those too, including a text field for prompt. Nobody
celebrates this feature, but a browser that ignores alerts and prompts feels broken.
And because I kept opening API responses, JSON needed help. So there is a JSON viewer script that detects JSON documents, parses them, and replaces the plain text blob with a tree/list view, row counts, compact previews, inferred columns, links for URL strings, and search/filter behavior. Again, not the engine. Still browser.
Write it out
One last very practical thing: I added a context menu action to write the current page
files to /tmp. It saves console.json, dom.json,
and screenshot.png, then puts the directory path on the clipboard. That is
not glamorous, but it is exactly the kind of thing you want when the browser is also a
debugging tool. The page is no longer just pixels on screen. It is a bundle of state I can
inspect, hand to another tool, or compare later.
Not alone
Once you start building a browser, you find out very quickly you are not the only person who got pulled into this. There are a lot of nearby projects, but they are not all trying to do the same thing. The split that makes the most sense to me is: wkdomains is trying to be a human-controlled inspection browser. Some of these are native Mac browsers. Some are automation browsers. Some are full agent frameworks. They are all useful to study, but the things to borrow are different.
The macOS and WebKit-ish ones:
- aslan-browser is the closest cousin, but it is pointed at a different target. It is a native macOS WKWebView automation browser with a socket API, Python SDK, stable element refs, sessions, batch calls, and real actions like navigate, click, fill, evaluate JavaScript, scroll, and manage tabs. Aslan is better if an agent owns the browser. wkdomains is better if the human owns the browser and the tools inspect the page the human is already using.
- nook-browser is a native macOS browser with built-in AI tooling. It is much closer to the "browser as workspace" idea than a tiny WebKit shell, and it is worth watching for how it blends normal browsing with assistant features.
- ora is a native macOS WebKit browser foundation with more privacy and content-blocking energy. That is the part wkdomains has barely touched yet. I am mostly focused on inspection and sessions right now, not the privacy stack.
- nuance is another experimental macOS WebKit AI browser, with local MLX model work. It is interesting because it stays native instead of immediately reaching for Chromium.
- bowl is a minimal hackable developer browser with overlay/plugin ideas. I like that direction because a developer browser should be easy to bend instead of feeling like a locked product.
- illuminate is an early Arc/Zen-style macOS WebKit browser. Different goal, but it is another example of how much app-level browser work exists above WebKit.
- wiblaze is older and iOS-focused, but still part of the WKWebView browser family. Even small mobile browsers run into the same "engine is not the browser" wall.
Then there are the non-Mac or not-really-a-browser-browser options:
- Lightpanda is a Zig headless browser engine with Chrome DevTools Protocol support. It is useful to think about for cheap public-page crawling and extraction, not as a replacement for a logged-in human WebKit session.
- Camoufox is a Firefox fork focused on anti-detect and Playwright compatibility. That is not the wkdomains goal, but it is a reminder that "real browser state" and "automation-looking browser state" are not the same thing.
- agent-browser is a Rust browser automation CLI for AI agents. The thing to study there is the control surface: give tools stable snapshots and action handles instead of making them guess from screenshots.
- steel-browser is browser API and Chrome session infrastructure. It is more platform than local app, but it shows what serious hosted browser sessions start to need.
- pinchtab is a local-first Go HTTP control plane for Chrome. The lesson there is security posture: local browser control is privileged, so read-only endpoints, action endpoints, cookies, storage, and replay need obvious boundaries.
- playwriter connects a Chrome extension and CLI so tools can drive the user's existing browser through Playwright/CDP. That is close to the "use the real human session" idea, just through Chrome instead of a custom macOS app.
- dev-browser is a sandboxed Playwright scripting CLI. It is another example of a browser built for developer tasks instead of general browsing.
- Stagehand, Browser Use, and Skyvern are bigger automation frameworks. I don't want wkdomains to become one of those. The useful parts are observe bundles, stable element refs, workflow recording, and better page understanding.
- BrowserOS is the big Chromium-fork version of this world: browser UI, local model options, workflows, and agentic browsing. It is much broader than wkdomains. The useful lesson is that the browser UI and the assistant state need to be designed together.