Last week Google generously deigned to drop a 4GB Nano model onto our machines. So I naturally created a browser extension that uses that local model to translate LinkedIn posts into simple, mostly dumb, occasionally heartfelt statements.
Why
I read some complaints that Chrome was eating 4GB of disk for AI nobody asked for. I had a learning project in mind and figured I might as well make it “useful.” I’m normally a Firefox user, but I was curious what this LLM could do, and also how it would be exposed via Web APIs.
I also guessed that a dumb local model would be a good fit for a dumb local project. This was tinkering. I wasn’t trying to get anywhere in particular. I had more pressing things to work on. But I did this instead.
How you call it
The model is called Gemini Nano. Extensions and frontend code call it through the Prompt API. This stuff was initially introduced over a year ago. It’s only making news lately because of the model download.
The API is small. Check that the model is there, create a session with a system prompt, call prompt():
if ((await LanguageModel.availability()) === "unavailable") return;
const session = await LanguageModel.create({
  initialPrompts: [{ role: "system", content: SYSTEM_PROMPT }],
});
const reply = await session.prompt(postText);
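Here is a slightly fuller sketch of the same three steps, wrapped in a function. It is an illustration, not the extension's actual code: the name translatePost and the lm parameter are made up for this example, with lm standing in for the global LanguageModel so the function can be exercised with a fake object outside Chrome. It treats any availability value other than "unavailable" as good enough to call create().

```javascript
// Sketch only: translatePost is a hypothetical helper, and `lm` stands in
// for the global LanguageModel object so it can be swapped for a fake.
async function translatePost(lm, systemPrompt, postText) {
  const status = await lm.availability();
  if (status === "unavailable") {
    return null; // this browser or device cannot run the model
  }

  // For any other status, create() resolves once the model is ready to use.
  const session = await lm.create({
    initialPrompts: [{ role: "system", content: systemPrompt }],
  });

  try {
    return await session.prompt(postText);
  } finally {
    session.destroy(); // free the session whether or not prompt() succeeded
  }
}
```

In the extension this would be called as translatePost(LanguageModel, SYSTEM_PROMPT, postText). Returning null for the unavailable case is just one choice; throwing would work as well.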
There’s also a clone() for forking a session, so each post gets a clean slate:
const fresh = await session.clone();
const reply = await fresh.prompt(postText);
fresh.destroy();
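Applied to a whole feed, the clone-per-post pattern might look like the sketch below. The function name translateAll is hypothetical, and running posts one at a time is an assumption made here for simplicity, not something the API requires.

```javascript
// Sketch only: translateAll is a hypothetical helper, not the extension's code.
// `baseSession` is a session already created with the system prompt.
async function translateAll(baseSession, posts) {
  const replies = [];
  for (const postText of posts) {
    // Each post gets its own fork, so earlier posts never leak into later ones.
    const fresh = await baseSession.clone();
    try {
      replies.push(await fresh.prompt(postText));
    } finally {
      fresh.destroy(); // clean up the fork even if prompt() throws
    }
  }
  return replies;
}
```

The try/finally is there so a failed prompt on one post does not leave a forked session hanging around.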
If the model hasn’t been downloaded yet, create() triggers the download and resolves once it’s done. You can hook a monitor to get progress events:
const session = await LanguageModel.create({
  initialPrompts: [{ role: "system", content: SYSTEM_PROMPT }],
  monitor(m) {
    m.addEventListener("downloadprogress", e => {
      console.log(`download: ${(e.loaded * 100).toFixed(1)}%`);
    });
  },
});
My extension does nothing with these events. If the model isn’t there, the first call just awaits while a few gigabytes come down in the background. An enterprising 10x developer on a productive day would put a little progress bar in the UI. Apparently I am not that developer, and today is definitely not that day.
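For the record, the little progress bar does not need much. The sketch below is not in the extension; formatProgress and makeMonitor are hypothetical names, and it assumes, like the snippet above, that e.loaded is a fraction between 0 and 1.

```javascript
// Sketch only: formatProgress and makeMonitor are hypothetical helpers.
// Assumes e.loaded is a fraction between 0 and 1.
function formatProgress(loaded, width = 10) {
  const fraction = Math.min(1, Math.max(0, loaded)); // clamp to [0, 1]
  const filled = Math.round(fraction * width);
  const bar = "#".repeat(filled) + ".".repeat(width - filled);
  return `[${bar}] ${(fraction * 100).toFixed(1)}%`;
}

// Returns a function that can be passed as the `monitor` option of create().
// `render` receives the bar text and decides where to show it.
function makeMonitor(render) {
  return (m) => {
    m.addEventListener("downloadprogress", (e) => render(formatProgress(e.loaded)));
  };
}
```

It would be wired up with something like monitor: makeMonitor(text => { statusEl.textContent = text; }), where statusEl is whatever element the UI uses for status (also hypothetical).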
The system prompt itself is mostly me haranguing the model to be dumb. The full prompt is on GitHub.
Follow me for more prompt engineering best practices.
What I learned
Local inference is exciting
This was the first time I’d really spent with a modern, small, good model. I’d understood the advantages in theory: it’s local, private, fast, and cheap. But actually playing with it shifted my perspective. From least to most important:
- Surprisingly fast. My M4 MacBook is not low-end hardware, but I still didn’t expect it to zip through a feed of LinkedIn posts in seconds.
- Low stakes. No rate limits, no bills to worry about. Costs me microcents of electricity and a bit of patience.
- So, so simple. No keys, no auth, no workers, gateways, or backends, no logs some company might look at. For a private, low-stakes thing like rewriting your own feed, this is a near-ideal fit.
- Creative constraints. This is the part I find most interesting. A 4GB model is too limited to be a great conversation partner. You can’t be friends with Nano. It’s not going to write a good essay. But those limitations point me to use cases I find exciting: small inference jobs that smooth out our experiences. Tight local tools instead of big cloud agents.
Simpler language reveals humanity
Among the B2B sales tips, I spotted authentic posts with genuine sentiments, and at times even some vulnerability. People lose jobs, or ask for help finding jobs. Or they want to talk about colleagues who have passed, who they miss.
I was relieved to see that when I applied the model to posts by people who care and are trying to express something genuine, the result didn’t come across as glib or mean. Seeing people’s thoughts in super simple form is surprisingly sweet and gentle.
I can hear the inference
My computer’s fans spin up in a distinctive way when the extension is looping through the batch of posts, running the model on each. The first time I heard it, I laughed because it reminded me of the way I sometimes debugged in college using printf("\a") bells. Computation you can hear!
I am not sure I like this in a browser API
Any website can call it. The model is sitting on my machine, causing it to make weird little noises. Google decided that it’s callable by anyone whose JS you happen to execute. It should probably be a permission, like camera or microphone. The API is in Origin Trial, which apparently means it’s already callable on real users’ machines by any site with a registered trial token. Stable is supposed to ship sometime in late 2026 or early 2027. Maybe there will be a permission gate before then? But the way it looks right now, Google is quietly slipping a few GB of model onto everyone’s machine with the idea that any site can eventually invoke it. I’m using it because it’s there and it’s interesting. I’d be more comfortable if the site asked the user first.
Terseness is unnatural for models
I’ve seen this before actually, and this was another case of it. “Brevity is the soul of wit” when there’s no wit.
It was hard to make the model just do the dumb straightforward thing. Nano really wanted to be helpful with a tidy summary. Getting it to just say “Some people made a thing that is good” took more prompt-tuning than I expected, and it still isn’t quite what I was aiming for.
Scraping DOMs sucks
The selectors are… best effort. It took a lot of fiddling with LinkedIn’s obfuscated DOM to find the parts to translate. Something I want to come back to: could a local model dynamically identify semantically meaningful text, instead of relying on brittle CSS selectors?
Perhaps this is our GPUtiful future:
#root section [prompt≈"insipid thought leadership content"]
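Until that selector ships, a crude version of the idea can be faked with plain prompts. The sketch below is speculative and untested against real LinkedIn markup: findPostLikeText is a hypothetical helper, the yes/no wording is a guess, and it assumes the caller has already collected candidate text blocks from the page somehow.

```javascript
// Sketch only: findPostLikeText is a hypothetical helper and the prompt
// wording is a guess. `session` is any session exposing clone() and prompt().
async function findPostLikeText(session, candidates) {
  const kept = [];
  for (const text of candidates) {
    const probe = await session.clone(); // fresh context for every candidate
    try {
      const answer = await probe.prompt(
        'Is the following text the body of a social media post? ' +
          'Answer only "yes" or "no".\n\n' + text
      );
      // Accept anything that starts with "yes", ignoring case and punctuation.
      if (/^\s*yes\b/i.test(answer)) kept.push(text);
    } finally {
      probe.destroy();
    }
  }
  return kept;
}
```

This costs one inference per candidate, so it would likely be much slower than a CSS selector, and a small model's yes/no answers may not be reliable enough to trust without checking.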
Is this OK?
It’s a learning project, open source, MIT licensed, all local, not going in the Chrome Web Store. But these days, who knows? It’s possible this is against someone’s terms.
If the serious people prefer we don’t do things like this, they can drop me a friendly note.
Try it
repo: mono-koto/linkedin-translator
Warning: It’s dumb. It will certainly break.
But if you want to try it, get the zip. You’ll need to sideload it. Directions in the README.
Then: set some Chrome flags, wait for the model to download, set extensions to dev mode, sideload the unpacked extension, open your favorite professional networking website, voilà.