Ask HN: How do you save and browse external interesting URLs?
As a curious developer, my knowledge is scattered across the many external resources I have consumed and want to keep at my fingertips: blog posts I read, YouTube videos I watched, Stack Overflow answers I read, GitHub repos I follow, etc. My knowledge is NOT the notes I took, but these external resources I consumed and loved.
But over time, I forget. I don't know what I know, and as soon as I need something, I google it. For example, it could be the 10th time I google "efficient logging with Python". I may come across a link I already clicked, or not.
To me, it would be much more efficient to search only among the external resources I have already read and decided to keep: that set is limited to quality content I have already filtered, and because I have read it before, my memory kicks in when I read it again.
At that point, you could tell me to use bookmarks. And that's what I do. Then 6 months later, I end up with 200 bookmarks I will never sort. And even if they were sorted, without tagging I would be too slow to find anything in them, and I would use Google anyway.
In an ideal world, it would be easy to save and tag external resources (one click from the browser), and then browse and find them again easily.
Do you have this feeling too, or is it just me? If so, what do you use for this?
I use Wallabag[0]. It is open source and self-hostable, and there is also a paid option[1]. It has _just worked_ for me. It has browser extensions and mobile apps, tagging and searching, and in-app site rendering with links out to the original URL. I'm not sure this gets frequent development time (the last iOS update was about 6 months ago), but everything feels feature complete for my setup using Firefox and Android on Linux and macOS. I haven't had any desire to switch.
I save everything I consider useful as an MHTML file (HTML/CSS/images in one file), which is native to Chromium browsers. This has several benefits: the page is not split into separate files as with the normal save page, it doesn't break when the file is renamed like the normal save page does (on Windows at least), it removes the need for bookmarks which go dead over time, and it preserves the original source URL in the file for later reference. I've saved over 10k such files in this manner. With practice it becomes second nature to categorize and tag them just using the filename, which makes them findable within seconds. These have become like my own private search engine, without the issue of not being able to find answers to queries you know exist (which has increasingly become an issue with online search engines).
In Chromium, saving to MHTML is enabled by launching the browser with the CLI argument `--save-page-as-mhtml` (the Vivaldi browser enables this by default without any arguments needed). Firefox used to support it via excellent addons up until the Quantum update, but they haven't been supported since, which is a dealbreaker for my daily use of it.
MHTML is almost as old as Google and was first implemented by Internet Explorer in 2001. The files are identical to .eml files too (it's plain old MIME), so you can load them in your email client if you're so inclined.
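Since MHTML is plain MIME, the point about the source URL being preserved in the file can be checked with Python's standard `email` package. The following is only a minimal sketch: the function name `mhtml_summary` and the sample data are made up for illustration, and the `Snapshot-Content-Location` header is an assumption about what Chromium-based browsers write at the top of the file. If that header is absent, the sketch falls back to the `Content-Location` of the first HTML part.

```python
from email import message_from_bytes, policy


def _text(value):
    """Convert a header value to a plain str, keeping None as None."""
    return None if value is None else str(value)


def mhtml_summary(raw: bytes) -> dict:
    """Parse an MHTML snapshot as a MIME message; return URL, title and parts."""
    msg = message_from_bytes(raw, policy=policy.default)
    # Assumption: Chromium-style files keep the page URL in this top-level header.
    url = _text(msg.get("Snapshot-Content-Location"))
    parts = []
    for part in msg.walk():
        if part.is_multipart():
            continue  # skip the multipart/related container itself
        location = _text(part.get("Content-Location"))
        parts.append((part.get_content_type(), location))
        # Fallback: use the location of the first HTML part as the page URL.
        if url is None and part.get_content_type() == "text/html":
            url = location
    return {"url": url, "title": _text(msg.get("Subject")), "parts": parts}


# Hypothetical, hand-written sample shaped like a saved page.
SAMPLE = b"""From: <Saved by Blink>
Snapshot-Content-Location: https://example.com/logging
Subject: Efficient logging with Python
MIME-Version: 1.0
Content-Type: multipart/related; type="text/html"; boundary="----sketch-boundary"

------sketch-boundary
Content-Type: text/html
Content-Location: https://example.com/logging

<html><body>Hello</body></html>
------sketch-boundary
Content-Type: text/css
Content-Location: https://example.com/style.css

body { color: black; }
------sketch-boundary--
"""

if __name__ == "__main__":
    print(mhtml_summary(SAMPLE))
```

Run over a folder of saved files, something like this could build a filename-to-URL listing, which would complement the filename-based tagging described above.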
Not MHTML I guess, and not native to the browser, but in Firefox you can use the popular SingleFile addon to save any web page into a single HTML file.
Seems like a decent project, which has come up before. The sibling commenter ramraj07 would likely be interested in what seems like a new feature listed to automatically save pages once loaded, which according to that link works on Firefox for Android (mobile). Back when Firefox supported the UnMHT addon, it also worked on Firefox for Android and had the nice feature of letting you customize which replacement Unicode character was used for illegal filename characters (instead of just using underscores like Chromium does and SingleFile says it does). Btw, the comparison table of SingleFile mentions that MHTML can't be 'unzipped' to extract the resources, but there is an open source program on Windows called ExtractMHT[1] which can extract them, and which later became bundled with Universal Extractor.
MHTML looks interesting. To avoid "Save as" into multiple files, I've been capturing websites as screenshots to preserve the layout and printing them as PDFs to preserve the content, and honestly it works so-so...
I want this but done automatically (save every webpage I spend more than 10 seconds on) and also working on my phone. Which is impossible, I suppose.
There are some 'save everything' programs/extensions I have seen mentioned in the past (and in this topic), but I have wondered how users deal with the signal-to-noise ratio and later filtering for desired content, or whether they find the storage requirements worth it. Often when searching for a specific thing, be it technical queries, useful reviews, troubleshooting, etc., it can take many, many searches and looking through pages before finding something worthwhile, at which point I'll save it (and name it with keywords for future me) since the effort required to even find it is not worth repeating.
I'm not sure how such save-everything utilities would make that part easier when I wish to return to the relevant page in the future (i.e., finding the relevant page among all the pages that aren't relevant when they share so many similar internal keywords).
I used to think like that, and stored everything in Pinboard, tagged properly. I also used nvalt (https://brettterpstra.com/projects/nvalt/) for that, as it had good search and I didn't have to switch to other tabs to search Pinboard. It felt good to "catalog" all this knowledge, but in reality I never went back to it, just like bookmarks, and I realized that if something is important enough I'll always be able to re-find or re-download almost everything I ever found. Not feeling like you have to catalog and store everything in personal knowledge management apps is very liberating.
That's a very good point. I love the "the best solution to your problem is to realise you don't have a problem" mindset.
I use a system of personal knowledge management (PKM), which I've posted about before:
1. Capture: every interesting idea that I think up or read is immediately stored in Google Keep (on mobile or laptop). It can be very rough at this point; the goal is simply to not forget.
2. Transcribe & Organize: every weekend, I go through the notes I accumulated during the week. It tends to be between 10 and 30 notes. Sometimes the note is "read this article" or "catch up on all newsletters", so understanding a single note can take over an hour. On some tough weekends the process takes an entire day, but that is invariably a day where I feel like I learned a ton. Once the note is cleaned up (transcribed), I feel like I understand it. At this point I rarely forget it: it has been absorbed into my brain. The final step here is "categorizing" the note. I classify it using OneNote with tabs like "Clinical psychology" (nested under "Psychology") or "Investment management" (nested under "Finance") or "Math" or "Physics".
This way, in the future, I don't have a million notes scattered around, but one clear place where I know to look. On average, this process takes 2-4 hours per weekend. To prevent existential dread, I never let bookmarks, Google Keep notes or unread emails accumulate for more than a week.
3. Revisit: generally, people recommend you revisit your notes from time to time. I almost never do this. But if I am ever thinking about "Marketing" or "Sociology", I have an immense, high-SNR repository of everything I've ever found valuable on the topic. I've done this for software interviews and it's been incredibly helpful.
Overall, I credit this system with making me much smarter. It has been an invaluable investment.
The thing is, your system requires a lot of discipline. If I had discipline, I would almost not need a tool :/
A very low effort way to do something like this is the following:
1) For any article, I save it to read later. I used Pocket for many years, but recently switched to the new Readwise Reader. This makes it easy to search later.
2) If it’s a website or service/repo I want to be able to find later, I often screenshot it on the iPhone, and that’s it.
The only downside is that you end up with a lot of screenshots in your camera roll. But now it’s quite easy to search for any text, and the new iPhone automated OCR works incredibly well. I also sync all these images to Google Photos, which also lets you search for text in the images.
My system is very much like this. I capture most things to the Drafts app with a “to process” or “to read” tag, and then review and handle them at least once a week. If it’s a resource that I want to refer back to (like how to do something or a tool I found interesting), I keep it in Drafts in a resource workspace, appropriately tagged. If it’s a resource I keep and share a lot, I move it into Obsidian and write text around it to make sharing easy.
If it’s a longer item, say a paper that’s building my knowledge in a subject area, I move it into DEVONThink and annotate. I do need to recall and reuse things, and I find that having different tools and workflows for different kinds of information helps me.
For those interested in PKM, that's also how I manage mine, but I follow something closer to Tiago Forte's "Building a second brain" method. I use Logseq, and I have a sandbox for the particular day. Using tagging, I can have #inbox, #interest, or any other sort of label, and then create queries for pages where I process that reading when it's tied to a project. A lot of the items never end up being read, and as they go further down the list, they become less important to me.
Our solutions are very close; I put mine into a GitHub repository. I don't like to depend on any software, and vscode is good enough for me. The key problem is this: there isn't that much valuable information worth recording each week. Just saving doesn't solve the problem; it requires understanding and thinking, as well as regular review.
I created a section on my website for shower thoughts and link reminders.
I'm building it as I go and as I require new features. So far it's a title/description plus tags that I can search and filter. I'm planning to add a random reminder every day and some sort of "check back in X months/years to see whether what I predicted turned out to be true".
I can't remember the name of the addon I used in the 2000s to create bookmarks with tags rather than folders, but it was the best organization tool I ever had (as a very disorganized person), and I'm trying to recreate what I remember from it. Oh, and I'm also working on an extension to quick-add links, with an AI summary of what's on the page plus tagging. But that's far off in the future!
I've never done something like that, but after reading your post and the comments, I'm tempted to start doing so! I'm thinking about a script that receives the URL and saves its main content (e.g., using Mozilla's Readability) in a text file. It would also store the URL on the first line, and maybe send a request to Archive.org to take a snapshot of the page and add the snapshot's URL to the second line of the file. Then, whenever I need something, I can search the content of those files (I use the Silver Searcher) and find what I'm looking for. If the main content stored in the file is not enough, I can open the original URL or the Archive's snapshot. I think I won't need to categorize or tag them; searching seems enough to me. The only difficulty I can think of is that extracting the main content of pages is not easy, and, for example, Mozilla's Readability doesn't work well all the time. A manual process for copying and pasting the data may be required.
For me, a Pinboard account with pin-via-bookmarklet set in the browser, and the app on my phone. Then an Inoreader account loaded with feeds, with mark-as-read via scroll enabled. I fly through things a couple of times a week and open interesting things in a new tab. Once I've hit a critical mass of interesting articles open, I check 'em out and hit the bookmarklet as needed to save permanently. Then I visit Pinboard or its feed mostly as those topics come back up again in conversation or work.
I treat it as kind of a short-term cache and don't worry about organizing them: most things are interesting but maybe not necessary to revisit, yet I feel good having them available. Every once in a while, though, it captures a gem I level up from, keep coming back to, or want to share out, and that's what it's about.
I take structured notes with Obsidian, then add a backlink to the resource. No bookmarking anymore, because there's a probability of exactly 0% that I ever click on a bookmarked resource again. Google indexes content, so you should index your own with Obsidian and add semantic search on top.
Sometimes I'll spend 15 minutes going through my (unordered) bookmarks and put them into folders, in the context of the things I'm working on. Some also get deleted. There's a higher likelihood of them being deleted if they can't fit into a folder shared by other bookmarks.
I use Pocket[1] for this. There’s a browser extension and an iOS app; both are optimized for saving quickly. Later, I tag each item for easier search.
Do you use the paid version? I considered buying it but then read some reviews saying that search in the paid version isn't that good.
One solution is to continuously record web history and then just search it. The advantages are:
1. You often don't know what resources you will really "value" in the future, so no more "to save or not to save, this is the question".
2. Tagging, to be effective, requires discipline (thinking about, then sticking to, an agile system). So we just replace it with search, preferably NLP/AI (so you don't have to remember the exact keywords).
Apps do exist, from the expensive [1] to the experimental [2].
I run an instance of YaCy (in non-distributed Robinson mode). I add any interesting links to my index (with a crawl depth of 0, since I only want that page, not the whole site). This gives me a good search history, plus it automatically creates a cache of the page!
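The "add one page to my own index, keep a cached copy, search it later" idea can be sketched in a few lines of plain Python. This is not YaCy and does not use its API; it is a toy in-memory inverted index meant only to show the shape of the technique. The class name `PersonalIndex` and its methods are hypothetical, and fetching the page is left out: the text is passed in directly.

```python
from __future__ import annotations

import re
from collections import defaultdict


class PersonalIndex:
    """Toy index: URL -> cached text, and word -> set of URLs containing it."""

    def __init__(self) -> None:
        self.cache = {}                   # url -> full text (the page "cache")
        self.postings = defaultdict(set)  # word -> urls containing that word

    @staticmethod
    def _words(text: str) -> set[str]:
        """Lowercase the text and split it into alphanumeric words."""
        return set(re.findall(r"[a-z0-9]+", text.lower()))

    def add(self, url: str, text: str) -> None:
        """Index a single page only (the 'crawl depth 0' case) and cache its text."""
        self.cache[url] = text
        for word in self._words(text):
            self.postings[word].add(url)

    def search(self, query: str) -> list[str]:
        """Return the URLs whose text contains every word of the query, sorted."""
        words = self._words(query)
        if not words:
            return []
        hits = set.intersection(*(self.postings.get(w, set()) for w in words))
        return sorted(hits)


if __name__ == "__main__":
    index = PersonalIndex()
    index.add("https://example.com/logging", "Efficient logging with Python")
    index.add("https://example.com/rust", "Logging in Rust")
    print(index.search("python logging"))
```

A real setup would persist the index to disk and rank results; the sketch only shows why searching a small, hand-picked set of pages tends to be quick and low-noise.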
I just set up a YaCy instance myself last week after being inspired by your post on Hacker News from ~7 months ago. Thanks for sharing that. I found some other helpful resources online for this too. I wrote about my configuration in case it might be helpful for anyone else who'd like to play with YaCy for bookmarking. I haven't had it up long enough to know how useful it will be for me yet, but it's been fun to play with. I also wrote an Android app to help me index specific pages from my mobile device with just a few taps. https://www.richardosgood.com/posts/yacy-personal-search-eng...
Yes, I can't believe how few good complete solutions there are for this that don't involve me curating anything.
Currently I have Obsidian notes for different libs and technologies, and really useful stuff or things I plan on reading goes there, but search of the page itself is non-existent.
For sites I go to again and again, because Chrome's search sucks, I tag my bookmarks in the title with an underscore (e.g. _python _docs) and stuff them all in a folder. The underscore actually works, and you can combine tags to quickly find stuff.
I don't highlight much, but for highlights I use Hypothesis because it looked promising; honestly, though, it's been very slow with any management-related features.
I also run a local ArchiveBox for pages I don't want to lose. It has search but doesn't show you where the term matched.
And I've been keeping an eye on Spyglass, which is a local search engine with the concept of custom "lenses" that you can create yourself or get from the community. It can also index local files and bookmarks. It recently fixed the shortcut issue I had on Linux, so I'm properly trying it out and it seems very promising. I hope to be able to hook it up to all those different services. Need to clean my older bookmarks first...
In Firefox you can add tags to bookmarks and then search in bookmarks by typing '*' as the first symbol in the address bar.
This way I can always find anything I saw and tagged appropriately. Though now I'm switching to saving bookmarks locally as folders (sometimes with cached content) named with tags, and using custom scripts to search them (the fzf tool is amazing).
I use a mix of tab groups and semi-organized bookmarks. The tab group is useful because if something is no longer relevant, e.g., if I’ve mastered it, closing the tab removes it from the group. While tab groups are not searchable (at least not on iOS Safari - hint, hint, Apple), bookmarks are. They take a bit more to maintain, and I find the trick is to organize them thematically, and to either prune them periodically or forget about them. My “favorites” are things I use regularly, with groups for work-related, interesting tech, hobbies, entertainment, etc., but nothing adhered to too slavishly. Bookmarks that are not favorites are basically a hoarder’s stash of things I cannot be bothered to prune but don’t feel like deleting. For pruning bookmarks, I find swiping on iOS a slightly friendlier/easier approach than clicking on the desktop. I’ve tried a few tools over time, and none really beats periodic manual curation. Save as much as you want, thematically, then clean occasionally.
Hi, I have the same problem and I found a method that works for me. Every week, I bookmark or jot into my phone's memo app any interesting or useful links I see: articles, software, frameworks, code repositories, cool websites, etc. Every Saturday I take a few hours to review and organize these resources, and write a markdown document. I've been at it for 24 weeks and am sure I can keep doing it. Reorganizing and thinking is the more rewarding part. When I need to find a resource but can't remember it, I turn on the computer and use vscode's fuzzy search (an online search page might be more convenient, but this is enough for now).
For reference: https://github.com/theseazhang/weekly_news To me, not a lot of stuff is worth including. Every week I feel that there is not much worth writing down. By reviewing by date, I can easily see the process of my growth or changing interests.
I save interesting stuff to Notion in a “My Links” database for general stuff, “Recipes”, or a “Shopping” database for consumerism. On some items I go back and write notes if I’m in the mood; the draw is quick capture plus being able to link between items or organize/summarize/synthesize them more later, all in the same app. This works okay from all my devices. The more focused “Recipes” and “Shopping” DBs have really upped my food and interior design game respectively. I’ve now built up a good library of things I love to make, plus my special adaptations, that I can sort by effort level and tastiness. With shopping, I tend to slowly build up products for some purpose over months/years, and when I’m ready to spend I can quickly organize all my best options, sort by price, and review them with my partner to make the choice.
I keep everything in Evernote. The clipper is so good that I only have to clip the part of the page I care about, whether it's comments, pictures or the whole article. It has a fantastic search function with OCR. I have about 5,000 notes in Evernote. I haven't found a good alternative so far.
This is actually the reason why I started a newsletter (weeklyrobotics.com). I was finding things on the Internet that were interesting, but then I could never find them again. I discovered that limiting search to my own domain made it much easier. I’m using raindrop.io to save all the bookmarks in one place, and then while working on an issue I go through the top of the unsorted bookmarks and select the ones I will feature in the newsletter. This still doesn’t solve the problem of heaps of unsorted links (over 1.4k after around 3 years).
If I were to do something like what you describe, I would probably look into dedicating some time to organizing your bookmarks, and to take it to the next level I would consider using TriliumNotes for making notes and categorizing the knowledge.
Thanks! I took a quick look at Raindrop.io; it seems powerful and full of features. At this point I'm wondering why they did not solve the problem of unsorted links. Since there is a browser extension, a simple solution would be to let the user add optional tags when bookmarking a link.
When you add a link to raindrop.io you can add it to any of your folders. Tags are yet another thing they support, but I don’t use them extensively.
I don't really have a centralised place where I store this stuff. I basically use the native facilities of whatever app I'm using.
In Chrome, I frequently use auto-complete instead of bookmarks. I don't need to use deep links very often; I just hit the front page and then sign in. In the event that I do want a deep link, I use a simple bookmark. I really don't need too many of them, but I do like my Bookmark Bar for quick and easy links to stuff, more so at work than for personal accounts.
I spend a lot of time on YouTube, and so I have a ton of playlists across two accounts. I have thousands of liked videos and a healthy history of viewed videos as well as searches. I have subscriptions for the best channels I watch all the time. So YouTube's organisation schemes are well-leveraged for me, in a way that can't be achieved with bookmarking or downloading locally.
On Wikipedia, of course I use watchlists for stuff. I really should have some sort of calendar or TODO list, because there are a lot of tasks I abandon or forget after a short while. I am largely a reactive editor who follows up on other people editing an article with my own changes, or contributing to an active discussion.
I used to use a chat bot that was able to watch for other editors' edits, but that facility was unfortunately discontinued. (Yes, it was a bit like stalking, but it is 100% acceptable to monitor and verify edits if we expect disruption; my usage was always in the best interest of the project.)
I also use folders and favorites and stuff on Google Drive to organise, and if I need to jot an electronic note, I don't use Keep, I just use Docs. I have some offline Sheets and Docs that document general stuff about my life, for instance a grocery inventory and household measurements.
I used to use Evernote and it was fantastic. I totally agree with the poster upthread who mentioned its clipping ability. Evernote could totally bring your personal knowledge all together in one place, with offline capabilities. In fact, I briefly, fearlessly, used Evernote as a rudimentary account/password manager before I got a real app that encrypted the stuff!
I use a self-hosted instance of the Shaarli bookmark manager: https://shaarli.readthedocs.io/en/master/
JabRef (the browser extension and download function are especially useful) [1], Zettlr [2], Git and some discipline taking notes. I also email myself stuff to read later.
[1] https://www.jabref.org/ | https://github.com/JabRef/jabref
[2] https://www.zettlr.com/ | https://github.com/Zettlr/Zettlr
I see others have already mentioned software I use for categorisation, i.e. Google Keep, Obsidian.
If you like bookmarks and/or you want to organize and tag yours, I've recently found raindrop.io
In Linux you can use xsel to grab the URL and put it in a tagger using one shortcut, then add tags or comments. As you can see, the harder part is using what you have tagged. If you have a custom menu that you do searches with, it can show matches, and if you keep a history of searches then more frequently used links can be shown first.
You can do all sorts of things with tag hierarchies and other stuff that a browser won't do, so tagging is often saner than thinking of how to make hierarchies using folders in a browser.
I use the Google Keep browser extension. When I click on the extension's icon (or right-click > Save to Keep), a draft note pops up with the page link in the body of the note, as well as any text selected on the page prior to clicking the icon. From there I can add more text to the body, insert a title, and add or select a tag. Once the note is in Keep, more features are available, such as attaching an image or archiving the note to move it out of the way. And the note is accessible and searchable from any device.
Browser bookmarks with the Netscape Bookmarks format (yuck!) for transferring between browsers, for when I access stuff regularly and don't want to discover it through the search engine every time. A self-hosted shortener service for when I need to access URLs from any device in any location. So-called "url files", which the browser opens for me with every URL in a separate tab, for when there is a temporary high-volume inflow of URLs, e.g. researching a purchase, rental, or travel.
Tried a lot of different approaches, from markdown to Notion to native browser bookmarks. Ended up liking https://anybox.app/ for its great cataloging / sorting capabilities and plenty of keyboard control. I don’t have any vested interest in this company, no hidden connection. Just happened to find a nice solution in it.
Normally I use this one: https://microsoftedge.microsoft.com/addons/detail/pxlet-book... After installing it, when you browse a webpage, you can right-click and find the bookmark option in the menu.
I was a heavy Pinboard user but recently made the switch to a paid plan on https://raindrop.io/. I tested a few self-hosted OSS solutions but none of them felt right for me. Still too early to say anything meaningful about raindrop, but I’ve already found it useful a number of times.
I run into the same issue you do, and I just paste interesting links into Apple’s Notes app since it’s convenient to use across my devices. Personally I’ve found it useful to just scroll through it from time to time to refresh my memory on what I’ve read/watched/listened to. Maybe not the most ideal or fancy solution, but it works well for me, even as my note has gotten longer and longer.
Yep, I do the same. From time to time, I move all my bookmarks to my notes, adding a title as a description, and removing useless bookmarks. I use a classification.
It helps keep the bookmarks somewhere and kind of organised, but they are very hard to search.
I have a browser bookmark script that saves links to an Airtable table. Works like a charm, and I've saved almost 10k links in the four years I've been using it! I occasionally go back and search for a specific link from months to years ago (since I have a terrible memory), and it works great. I also built a custom Pocket-like interface which works great too!
If I feel a specific command is something I need often or is too tedious to figure out from scratch every time, I put it in Obsidian. For links I used to use pinboard.in with archiving, but I'm seriously considering moving to raindrop.io just because pinboard's only admin is a bit too flaky for my taste.
I wrote a small app to scan my comments and stories on Hacker News. It would dump them into a Postgres database and use the full text search to normalize the text of the articles, to make for easy searching at a later date when I inevitably would think “where did I read that thing about this?”
I have a "notes.txt" file that I keep at every job. I paste all the links and snippets into it, along with my personal notes. I use the same file to jot down things I need to do, and also use it as a scratchpad. I find the file to be very easily searchable, even if it has 10k+ lines.
I installed a wiki on my personal site and used that to create a sort of bookmark hierarchy with descriptions. Then Google Bookmarks came along, which was perfect, so I switched to using that. Then Google killed the project for some reason and I've never gotten back into the wiki habit.
I take notes in Joplin in general and also use the Joplin web clipper [1] to make sure that I have all the interesting articles I read or want to read in markdown.
My personal solution to this is YAGNI. In very rare situations I visit chrome://history and search, if it is only a couple of days old.
But I do sorta like the idea you hint at, of a 1-click "add to my web" with its own personal webcrawler. I'd give it a try.
This is basically what tools like Roam and Obsidian are for. When you want to capture a link, make a new note and link the external page. Write down any tags or thoughts so it is searchable. It’s basically my second brain and hugely helpful.
I'm using a self-hosted Shaarli.
I've been using Obsidian to store a "clips" folder with some front matter/tags/aliases.
My life is in Evernote, and it works well now. At work we use OneNote. It also works well.
That was the goal of the Memex proposed back in 1945, still unrealized. 8(
RSS. No need for bloated software as suggested by others.
Just make an RSS file and forget.
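For anyone who wants to try the "just an RSS file" approach, here is a minimal sketch using only Python's standard library. The function name `add_link`, the channel title and description, and the placeholder channel link are all assumptions made for illustration. The elements themselves (`channel`, `item`, `title`, `link`, `pubDate`, `category`) are ordinary RSS 2.0, with tags stored as `category` entries.

```python
from __future__ import annotations

import xml.etree.ElementTree as ET
from email.utils import formatdate
from pathlib import Path


def add_link(feed_path: Path, url: str, title: str, tags: list[str]) -> None:
    """Append one <item> to a local RSS 2.0 file, creating the file if needed."""
    if feed_path.exists():
        tree = ET.parse(str(feed_path))
        channel = tree.getroot().find("channel")
        if channel is None:
            raise ValueError(f"{feed_path} has no <channel> element")
    else:
        # Start a new feed; the channel metadata below is placeholder text.
        rss = ET.Element("rss", version="2.0")
        channel = ET.SubElement(rss, "channel")
        ET.SubElement(channel, "title").text = "Saved links"
        ET.SubElement(channel, "link").text = "https://example.invalid/"
        ET.SubElement(channel, "description").text = "Personal bookmarks"
        tree = ET.ElementTree(rss)

    item = ET.SubElement(channel, "item")
    ET.SubElement(item, "title").text = title
    ET.SubElement(item, "link").text = url
    ET.SubElement(item, "pubDate").text = formatdate(usegmt=True)  # RFC 822 style date
    for tag in tags:
        ET.SubElement(item, "category").text = tag

    tree.write(str(feed_path), encoding="utf-8", xml_declaration=True)


if __name__ == "__main__":
    add_link(Path("links.xml"), "https://example.com/logging",
             "Efficient logging with Python", ["python", "logging"])
```

Because the result is a plain feed, any feed reader pointed at the file should be able to browse it, and a text search tool can grep it directly.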