Show HN: Pinbot – An extension to privately search one's browser history with AI

getpinbot.com

98 points by klavinski 3 years ago · 40 comments · 1 min read

Hello HN, I’m Kamil.

The past months have been filled with news about ChatGPT, Bard, etc. Thankfully, there are some heroic attempts to bring that power to the users.

I wanted to contribute to that effort with my side project, an extension for Chrome: it makes searching the history by meaning – instead of the exact words – possible.

This is only a proof of concept, building on the excellent transformers.js[0], and running entirely in the browser. My goal here is to explore the possibilities unlocked by a client-side AI.

I would love to have your feedback, to know which direction this project should take!

[0] https://xenova.github.io/transformers.js
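
One way to picture "searching by meaning" is as a ranking of stored sentence embeddings by cosine similarity to the embedding of the query. The sketch below assumes that setup; the post does not say how Pinbot actually ranks results. `IndexedSentence`, `cosineSimilarity`, and `searchByMeaning` are hypothetical names, and the embeddings are expected to come from a model such as one loaded through transformers.js.

```typescript
// Hypothetical record shape: one stored sentence with its embedding vector.
interface IndexedSentence {
  text: string;
  url: string;
  embedding: number[];
}

// Cosine similarity between two vectors of equal length.
function cosineSimilarity(a: number[], b: number[]): number {
  if (a.length !== b.length) {
    throw new Error("dimension mismatch");
  }
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  const denom = Math.sqrt(normA) * Math.sqrt(normB);
  // A zero vector has no direction; treat it as unrelated to everything.
  return denom === 0 ? 0 : dot / denom;
}

// Return the topK stored sentences closest in meaning to the query embedding.
function searchByMeaning(
  queryEmbedding: number[],
  index: IndexedSentence[],
  topK: number,
): IndexedSentence[] {
  return index
    .map((entry) => ({
      entry,
      score: cosineSimilarity(queryEmbedding, entry.embedding),
    }))
    .sort((x, y) => y.score - x.score)
    .slice(0, topK)
    .map((scored) => scored.entry);
}
```

A linear scan like this is only reasonable for a small index; a larger history would likely need an approximate nearest-neighbour structure instead.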

kej 3 years ago

Any plans for a Firefox version? There are dozens of us.

  • klavinskiOP 3 years ago

    I targeted Chrome for the proof of concept because of its market share, but to be honest, I use and prefer Firefox as my default browser.

    Manifest v3 is a remarkably hostile development environment: Google knows that people block advertisements with extensions and wants to limit their scope. I had to adapt a lot of code, and as such, making it work cross-browser would require more than a few changes. But I understand your point and want Pinbot to run on more browsers, especially Firefox!

havaloc 3 years ago

Update: unfortunately, I noticed my Mac getting hot. When I checked Activity Monitor, the energy impact was several thousand (whatever units Apple uses). Typically the energy impact of Chrome is 120 or so. I removed the plugin (which also crashed once), and as soon as I did, the energy use dropped back to normal.

Freebytes 3 years ago

I would like something similar to VoidTools Everything that allows a person to search their own computer using AI to generate answers instead. (But it should not send any information to third parties such as OpenAI.)

  • klavinskiOP 3 years ago

    VoidTools Everything is truly an excellent tool! I have indeed considered making a desktop version. It is more complex than the current proof of concept, but a private one-stop shop for AI search is definitely a great vision!

scetron 3 years ago

Sadly not related to the 1990 NES Game (https://en.wikipedia.org/wiki/Pin_Bot_(video_game))

  • classichasclass 3 years ago

    Or the Williams original, which was the first pin I ever played (at the local roller rink, no less).

havaloc 3 years ago

I think you can only search pages that you visited after you installed the plugin, is that correct?

  • klavinskiOP 3 years ago

    Currently, yes. Allowing a user to crawl his recent history just after installing the extension is a great idea! I added it on the Discord server.

visarga 3 years ago

Does it work for PDFs too? It would be amazing to find any paragraph.

I noticed it doesn't remember tweets directly, just twitter.com as the URL if you check the feed. It's hard to find a tweet again at that location.

  • klavinskiOP 3 years ago

    It does not work yet for PDFs, but I agree it would be amazing. Full-text search for one's library and documents!

    Thank you for reporting the bug regarding Twitter. I will investigate.

    • squarefoot 3 years ago

      Probably a separate application, extendable with plugins to search different media, would be a better solution than a browser-based one. Say a user wants to find a song suitable for a certain mood while reading a book on a certain subject; then one day someone could add a plugin that connects to the house IoT network to choose the best lighting and aromas based on the same information used by the above plugins, etc. The point is that the user input might not be a line of text but a combination of it plus other data obtained by sensors: weather, temperature, heart rate, pretty much everything.

jimmySixDOF 3 years ago

Hey, nice idea. I like the concept, as someone who does sometimes go back through old history files to find that site I was on last week/month and knows how frustrating that experience can be.

One question I have is about the persistence of this extension when I'm not using it and its influence on other browsing loads. For example, if I visit a WebGPU-heavy shader site, will having Pinbot installed drop my available framerate?

Otherwise it's a great idea and I will definitely put it on at least one machine I use, so thanks for putting this out into the world and good luck with it!

  • klavinskiOP 3 years ago

    The extension uses an SQLite database in the Origin Private File System[0]. Disabling the extension keeps the database, while removing it deletes the database.

    Regarding performance, here is how it works: the extension accumulates page changes (thanks to a Mutation Observer[1], so I do not have to regularly read and compare the page) for some time, then checks if the sentences are in the database. Only unknown sentences are converted to embeddings.

    The extension is currently CPU-only (WebGPU support has not been merged into transformers.js yet), so it may be slow. I understand your concern: while this is a proof of concept, I consider good performance to be vital to a good user experience.

    [0] https://developer.chrome.com/blog/sqlite-wasm-in-the-browser...
    [1] https://developer.mozilla.org/en-US/docs/Web/API/MutationObs...
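
The "only unknown sentences are converted to embeddings" step can be sketched as a filter that runs over the text accumulated from page changes. This is a minimal illustration under stated assumptions, not the extension's code: `splitSentences` is a naive hypothetical splitter, `selectUnknownSentences` is a hypothetical name, and the `known` set stands in for a lookup against the SQLite database.

```typescript
// Naive sentence splitter (an assumption; real segmentation may differ).
function splitSentences(text: string): string[] {
  return text
    .split(/(?<=[.!?])\s+/)
    .map((s) => s.trim())
    .filter((s) => s.length > 0);
}

// Given text chunks buffered from page mutations, return each sentence that
// is not yet known, once, in first-seen order. Only these need embedding.
function selectUnknownSentences(
  bufferedChunks: string[],
  known: Set<string>, // stand-in for a database lookup
): string[] {
  const pending = new Set<string>();
  for (const chunk of bufferedChunks) {
    for (const sentence of splitSentences(chunk)) {
      if (!known.has(sentence)) {
        pending.add(sentence); // a Set drops duplicates within the batch
      }
    }
  }
  return Array.from(pending);
}
```

In a page, the buffered chunks would come from a `MutationObserver` callback and be flushed on a timer, so a burst of DOM changes costs one batch of embedding work rather than one per change.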

  • dublinben 3 years ago

    Firefox is much better at resurfacing sites that you've been to before. There's even a built-in address bar search shortcut of '^' which searches within just your history.

    Chrome is obviously incentivized to push you to making a Google search anytime you're trying to find something, instead of looking within your browser.

int_19h 3 years ago

This is a great idea, but one thing that I immediately wished for is that it's not tied to a specific browser instance or even a browser. I'd much prefer some kind of central indexing server that maintains the embeddings along with metadata from all the various sources and allows querying, setting retention periods etc, and with extensions like this one transparently feeding data into it.

mrwnmonm 3 years ago

How is that private?

You are fetching the browsing data, pushing it to your server to be fed to the AI, then receiving queries, right?

  • klavinskiOP 3 years ago

    No, that is what I find exciting! The AI model runs entirely on your device, and your data is never sent anywhere. You can inspect the Developer tools of the extension if you are interested: it works offline!

havaloc 3 years ago

I like this a lot, but I would like it to be easier to:

Prevent it from indexing sites I specify, such as my banking website. I think that should be a top priority. A strong privacy policy on the site would also help (I know you say it's local, etc.). That said, this is pretty great and I am already enjoying it.

  • klavinskiOP 3 years ago

    Thank you for the idea! I will add "allow a user-defined forbidden list of websites" to the ideas on the Discord server.

    Regarding the privacy policy, you have a point: I did not put one on the website, as everything works offline, but people may indeed look for one.
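
A user-defined forbidden list could be checked before a page is indexed. The sketch below is one possible shape for that check, not a feature of the extension; `isBlocked` and the example host names are hypothetical. It matches a blocked host and all of its subdomains, and refuses to index anything whose URL cannot be parsed.

```typescript
// Return true if the page must not be indexed.
// blockedHosts holds bare host names such as "mybank.example".
function isBlocked(pageUrl: string, blockedHosts: string[]): boolean {
  let hostname: string;
  try {
    hostname = new URL(pageUrl).hostname.toLowerCase();
  } catch {
    // Fail closed: if the URL cannot be parsed, do not index the page.
    return true;
  }
  return blockedHosts.some((blocked) => {
    const host = blocked.toLowerCase();
    // Match the host itself or any subdomain, but not look-alike names.
    return hostname === host || hostname.endsWith("." + host);
  });
}
```

Comparing on a dot boundary keeps `notmybank.example` from being caught by a rule for `mybank.example`.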

quickthrower2 3 years ago

That is very useful. It seems to only work on sites you have visited since installing. It would be nice if it could index your current history at installation time, even though it won't have access to the contents of those pages (probably).

luke-stanley 3 years ago

This seems cool, but to aid trust in these sorts of things, perhaps a userscript version with easy-to-read code, rather than a packaged Chrome extension, would be an easier ask? Tampermonkey, Greasemonkey, etc. still work.

  • klavinskiOP 3 years ago

    I often prototype in Greasemonkey myself, so I understand and agree with your point. However, there are many requirements that made more sense in an extension: the AI model weighs 90 MB; I run it in a sandboxed iframe because it uses `eval` and I want to guarantee it would not do something bad; initialising the iframe and loading the model on every page would be quite cumbersome; etc.

visarga 3 years ago

Does it keep history only for 2 weeks?

> The current extension keeps your history for only two weeks. Accounts keep all your history and synchronize it across your devices, while maintaining your privacy. Upcoming in a future version!

  • klavinskiOP 3 years ago

    Yes, for the proof of concept. It is an arbitrary limit, as I am not completely sure how people would use the extension: it may fill the users' storage too quickly. In the future, I may consider adding a counter instead (removing websites which have not been visited/searched for X days).
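
The "remove websites which have not been visited/searched for X days" idea amounts to a cutoff on a last-used timestamp. A minimal sketch, assuming each stored page carries such a timestamp; `PageRecord` and `pruneStale` are hypothetical names, not part of the extension.

```typescript
// Hypothetical stored record: a page and when it was last visited or searched.
interface PageRecord {
  url: string;
  lastUsedMs: number; // milliseconds since the Unix epoch
}

const MS_PER_DAY = 24 * 60 * 60 * 1000;

// Keep only records used within the last maxAgeDays days.
function pruneStale(
  records: PageRecord[],
  nowMs: number,
  maxAgeDays: number,
): PageRecord[] {
  const cutoffMs = nowMs - maxAgeDays * MS_PER_DAY;
  return records.filter((record) => record.lastUsedMs >= cutoffMs);
}
```

Updating `lastUsedMs` whenever a page appears in search results, not only when it is visited, would keep frequently searched pages alive past the fixed two-week window.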

    • visarga 3 years ago

      After an hour my laptop was about to crash, so I uninstalled it. It was using 8 GB of RAM out of 8.

      • klavinskiOP 3 years ago

        That information is very important for me, thank you.

        Currently, to avoid computing the embedding of a sentence twice, I put them in a JS Map as a cache. I will find a way to empty the cache.
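
One common way to keep such a cache from growing without limit is to cap its size and evict the least recently used entry. The sketch below relies on the fact that a JS `Map` iterates in insertion order; `LruCache` and `maxEntries` are hypothetical names, and this is an illustration of the technique rather than the fix the author shipped.

```typescript
// A small least-recently-used cache built on Map's insertion order.
class LruCache<K, V> {
  private readonly entries = new Map<K, V>();
  private readonly maxEntries: number;

  constructor(maxEntries: number) {
    if (maxEntries < 1) {
      throw new Error("maxEntries must be at least 1");
    }
    this.maxEntries = maxEntries;
  }

  get(key: K): V | undefined {
    if (!this.entries.has(key)) {
      return undefined;
    }
    const value = this.entries.get(key) as V;
    // Re-insert so this key becomes the most recently used.
    this.entries.delete(key);
    this.entries.set(key, value);
    return value;
  }

  set(key: K, value: V): void {
    if (this.entries.has(key)) {
      this.entries.delete(key);
    }
    this.entries.set(key, value);
    if (this.entries.size > this.maxEntries) {
      // The first key in iteration order is the least recently used.
      const oldest = this.entries.keys().next().value as K;
      this.entries.delete(oldest);
    }
  }

  get size(): number {
    return this.entries.size;
  }
}
```

With sentence strings as keys and embeddings as values, memory use is bounded by `maxEntries` times the embedding size, at the cost of recomputing embeddings for sentences that were evicted.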

swarfield 3 years ago

Did anyone else think Pinball?

super256 3 years ago

Would be great if you could create this for bookmarks.

  • klavinskiOP 3 years ago

    How would you want to do it? Among the fields, have a checkbox "search among the bookmarks"?
