Ask HN: What features for an offline Hacker News reader?

57 points by ers35 6 years ago · 39 comments · 1 min read

HN is a treasure trove of information. The primary way I read HN is to bookmark stories and read the comments when the discussion is complete. Others actively participate. Some get jobs via "Who is hiring?". We all use HN differently.

There are ways to read HN other than this website. However, I have not found one that meets all of these requirements:

  - Self-hosted.
  - Offline access of data.
  - Query data via SQL.
  - Full text search of stories and comments.
  - Notification of replies to comments.

Some ideas:

1) A tool that maintains a copy of the HN API[1] in an SQLite database, with indexing and full-text search[2]. This supports the development of the other tools.

2) A command line tool that demonstrates how to use the database and supports the development of scripts. For example:

  #!/bin/sh
  # Run notify whenever someone replies to a comment.
  hn replies username | notify

3) A web UI for browsing and searching. This can be hosted locally or on a remote server.

What features interest you?

  [1]: https://github.com/HackerNews/API
  [2]: https://www.sqlite.org/fts5.html
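Idea 1 above could be sketched roughly as follows. This is a minimal illustration, not the author's tool: the table layout, the column name `author` (the API calls this field `by`), and the helper names `open_db`, `store_item`, and `search` are all assumptions, and it relies on the local SQLite build including the FTS5 extension.

```python
import sqlite3

def open_db(path=":memory:"):
    """Create the items table and a full-text index over title and text."""
    db = sqlite3.connect(path)
    db.executescript("""
        CREATE TABLE IF NOT EXISTS items (
            id     INTEGER PRIMARY KEY,
            type   TEXT,
            author TEXT,     -- the "by" field of the HN API
            time   INTEGER,
            parent INTEGER,
            title  TEXT,
            text   TEXT
        );
        CREATE INDEX IF NOT EXISTS items_parent ON items(parent);
        CREATE VIRTUAL TABLE IF NOT EXISTS items_fts USING fts5(title, text);
    """)
    return db

def store_item(db, item):
    """Insert or replace one item dict shaped like an HN API response."""
    row = (item["id"], item.get("type"), item.get("by"), item.get("time"),
           item.get("parent"), item.get("title"), item.get("text"))
    with db:  # commit both tables together
        db.execute("INSERT OR REPLACE INTO items VALUES (?, ?, ?, ?, ?, ?, ?)", row)
        # Drop any stale index entry before re-adding the current text.
        db.execute("DELETE FROM items_fts WHERE rowid = ?", (item["id"],))
        db.execute("INSERT INTO items_fts(rowid, title, text) VALUES (?, ?, ?)",
                   (item["id"], item.get("title"), item.get("text")))

def search(db, query):
    """Return ids of items matching an FTS5 query, best match first."""
    rows = db.execute(
        "SELECT rowid FROM items_fts WHERE items_fts MATCH ? ORDER BY rank",
        (query,))
    return [r[0] for r in rows]
```

Keeping the full-text index in a separate virtual table keyed by the item id leaves the main table free for ordinary SQL queries and indexes.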

markus_zhang 6 years ago

I'm thinking about the same thing, i.e. building a personal HN reader with Python and Qt.

I have never explored the API, and I'm not a professional programmer, but allow me to describe the ideas that came off the top of my head when I read your post:

Requirements:

1 - Self-hosted: So essentially, it's too heavy to keep the whole site offline. However, because I always read by topic (e.g. I'd type "SICP ycombinator" in Google and read the top pages), I think one approach is to let the user enter a topic, say "SICP", as well as the number of top-level stories to return, say 25, and invoke the API to return stories. The app will then dump the stories and their comments into a database with a data model suited to a forum.

2 - Offline access of data: Essentially what I mentioned above. The app should also allow the user to remove a story (and its children), modify topics, favorite a story and create new topics. I think those are the bare-bones requirements. The backend would interact with the database and do updates.

3 - Query data via SQL: I think it might be too much work to parse queries, and the easiest thing to do is to just access a string as query and pass that to the database engine. Or maybe only allow user certain actions and let the backend assemble the queries.

4 - Full text search of stories and comments: Not sure what to do as my programming knowledge is very limited. I saw the second link you provided and it is very interesting.

5 - Notification of replies to comments: Maybe give user a button to update all his favorite stories.
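
Requirement 3, passing the user's string straight to the database engine, could look something like the sketch below. It assumes SQLite as the engine; the function name `run_user_query` and the row limit are illustrative. `PRAGMA query_only` is a real SQLite pragma that rejects writes on the connection while it is on, which is one way to keep a free-form query box from modifying the local copy.

```python
import sqlite3

def run_user_query(db, sql, limit=100):
    """Run one user-supplied SQL statement with writes disabled.

    Returns at most `limit` rows. Write attempts raise sqlite3.Error.
    """
    db.execute("PRAGMA query_only = ON")
    try:
        # execute() accepts a single statement only, so the user cannot
        # chain a second statement after switching the pragma off.
        return db.execute(sql).fetchmany(limit)
    finally:
        db.execute("PRAGMA query_only = OFF")
```

For example, `run_user_query(db, "SELECT title FROM items")` returns rows as usual, while a `DELETE` or `INSERT` is refused by SQLite itself, so the application does not have to parse the query.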

TBH I really want to see what other people's implementations will be.

  • ers35OP 6 years ago

    I prototyped this in Qt 2 years ago: https://ers35.com/files/hackernews-qt.png

    1) It is possible to keep the whole site offline. A database from 2017 is 9 GB: https://archive.org/details/hackernews-2017-05-18.db I think a 2020 DB could be less than 20 GB.

    2) My focus is on reading, not writing. Local favorites make sense. Maybe with importing public favorites. A user can set their name without logging in.

    3/4) SQLite does the heavy lifting here.

  • zzo38computer 6 years ago

    "I think it might be too much work to parse queries, and the easiest thing to do is to just access a string as query and pass that to the database engine." -- You could also use SQLite virtual table; this allows combining it with other data in the same query; SQLite will then automatically convert your query into a list of constraints, ORDER BY clause, etc, and pass the stuff to your xBestIndex method, which can decide which things the server is capable of using and pass those to the back end, letting SQLite handle the rest of the query itself.

  • stevenicr 6 years ago

    This sounds like an rss reader in a way - subscribe to topics, it fetches them at set whatever intervals, can fave / save..

    Option to download a copy of the article the HN discussion is about and create a thumbnail..

geoah 6 years ago

I'd be very interested in something that can cache the actual article/page from the main HN link as well as a reading list.

There are times when I add HN articles to mobile chrome's "read later" list, and when I have time go through them. Being able to read them later when not having internet access would be amazing (London's tube is annoying when it comes to internet, internet only exists in the stations).

  • TripleFFF 6 years ago

    I believe squid can do this, I experimented using it on a raspberry pi to cache HN locally so it could be accessed offline, it managed to get the front page ok but I honestly don't know enough to get it working, and the hardware I was using wasn't up to the task

  • timdeve 6 years ago

    That is basically why I built my side project: https://github.com/TimDeve/rasasa

  • ers35OP 6 years ago

    Great idea. This could also be useful to post a mirror for other users if the site ends up going down.

Tomte 6 years ago

A kill file. De-emphasizing comments I have already seen. Notifying me of replies to my comments. Notifying me of my submissions that got re-upped while I was sleeping, so I can see the discussion.
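
With a local database, reply notification reduces to a self-join. The sketch below assumes a simplified `items` table (the `SCHEMA` shown is hypothetical, as is the function name `new_replies`) and a stored timestamp of the last check.

```python
import sqlite3

# Hypothetical, simplified schema: one row per story or comment.
SCHEMA = """
CREATE TABLE IF NOT EXISTS items (
    id     INTEGER PRIMARY KEY,
    author TEXT,
    time   INTEGER,   -- Unix timestamp
    parent INTEGER,   -- id of the item being replied to, NULL for stories
    text   TEXT
);
"""

def new_replies(db, username, since):
    """Return (id, author, text) of replies to `username` posted after `since`."""
    return db.execute(
        """
        SELECT reply.id, reply.author, reply.text
        FROM items AS reply
        JOIN items AS mine ON reply.parent = mine.id
        WHERE mine.author = ?
          AND reply.author != ?   -- ignore replies to yourself
          AND reply.time > ?
        ORDER BY reply.time
        """,
        (username, username, since),
    ).fetchall()
```

A script could call this after each sync and pass the rows to whatever notifier the user prefers.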

  • specialist 6 years ago

    I've been missing twit filters for 30 years.

    Source: Ran hub for a BBS network. All the offline readers, like SLMR, had twit filters.

tylerjwilk00 6 years ago

I recently built an HN client [1] to scratch an itch and try out the HN API. It caches data to SQL and redis.

It has never been shared and I only use it personally so expect lots of bugs.

I may open source the code after I clean it up.

[1] http://app.hackerdelivery.com/

mlang23 6 years ago

If up/downvoting would work in Lynx, I wouldn't actually need a CLI tool to read HN. Well, I guess you can't have everything.

I wouldn't need offline mode, but a CLI tool to read HN would be very much welcome.

stevenicr 6 years ago

I'd like to be able to move a discussion thread to the bottom, or collapse a discussion thread. Perhaps an option to sort the comments by timestamp as well.

Sometimes the scroll gets real and I want to get past a discussion to see other angles of whatever subject - scrolling trying to find where thread 2 ends and a different subject comes up is challenging at times - so I often go to a new tab on a different story and don't come back.

Even click/tap to draw a line from comment 2 to all sub comments so I can find where comment 3 starts would be helpful.

saxelsen 6 years ago

I use the Android app Materialistic, which I really enjoy.

In particular I like the fact that I can bookmark posts and comments with it.

I very often jump to the discussion to read opinions - there's often a treasure trove of stuff there of someone who knows something more than what the article states.

A really annoying missing feature for the discussion/comments parts is that the threads are extremely difficult to tell apart and traverse. On a small screen, the threads are very close together and only distinguishable through color differences. Scrolling through them is a nightmare, whereas I've really come to love the Reddit app's feature with a button that allows you to scroll to the next sibling comment, e.g. if you want to quickly see the next comment to the OP, instead of scrolling endlessly and not knowing how far along you are.

rcarmo 6 years ago

Definitely sane, responsive thread layouts with more readable text. HN has, let us say, a very “traditional” approach to HTML, and it is _very_ hard to read on some devices (I wish that someone revised the markup accordingly).

Which is why I usually read it on https://hackerweb.app/ instead.

Forking https://hackerweb.app and adding offline support would be a good starting point, IMHO.

gitgud 6 years ago

The query api at: https://hn.algolia.com is pretty amazing, searches all stories.

The biggest qualms for me would be: notifications of comments and searching my previous comments.

Other than that Hacker News is actually highly reliable and available and I've never personally needed an off-line version (except maybe on a plane)

  • jillesvangurp 6 years ago

    Actually would love to tune the ranking of search as it is biased towards ancient stuff and I always need to switch to sort by date to find the article I actually vaguely recalled seeing recently. In Elasticsearch I would probably add some function scoring on dates, points, and number of comments. Also comment search is indeed not really a thing currently and there are some nice opportunities to find related stories, likely duplicates, etc. that might help people when posting new stories.
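
The kind of rescoring described here (boosting by date, points, and comment count) can be illustrated outside any search engine. The sketch below is not Elasticsearch's function score API; the function name `rescore`, the one-year half-life, and the weights are arbitrary assumptions chosen only to show the shape of the idea.

```python
import math
import time

def rescore(text_score, points, comments, posted_at, now=None, half_life_days=365.0):
    """Combine a text relevance score with recency and popularity.

    The score halves every `half_life_days` and grows logarithmically
    with points and comment count.
    """
    now = time.time() if now is None else now
    age_days = max(0.0, (now - posted_at) / 86400.0)
    recency = 0.5 ** (age_days / half_life_days)
    popularity = 1.0 + math.log1p(max(points, 0)) + 0.5 * math.log1p(max(comments, 0))
    return text_score * recency * popularity
```

With a decay like this, two stories that match the query equally well are ordered by age and engagement instead of by text match alone, so a recent, well-discussed story outranks an ancient one.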

    IMHO the default HN ui has many quite obvious UX issues on mobile with e.g. tiny click targets that are positioned close together, long threads that are hard to navigate, low contrast colors (grey on grey), etc. It's clearly not been designed with mobile in mind and even on desktop it's not all that good. Brutalist/retro is the best you can say about it and that looks like it's intentional to me.

    If you read HN on the way to work in an area that doesn't have great mobile coverage in e.g. subway tunnels, etc. offline support would be pretty nice.

    There are various Android clients that do some interesting things with the UX (e.g. thread navigation, swipe to upvote, etc.) and offline usage. However, most of them fall a bit short when it comes to things like commenting or other features. Most of the clients I tried did not cover all the features.

  • ers35OP 6 years ago

folmar 6 years ago

I doubt we need another reader program. What I would really like to see is a two-way translation to NNTP, so the user could use any client and addons they want - it's old and popular, you get pretty much everything possible off the shelf, and it quite matches the semantics of HN (minus the voting).

rchaud 6 years ago

One thing I'd find useful:

- save individual posts as "favorites", and allow users to tag them somehow (e.g. 'finance', or 'front-end') for easy searchability. The saved quote should be available for offline reading, and should link back to the original topic. The original topic doesn't need to be available offline.

SkyMarshal 6 years ago

Don’t forget to include the basics - upvote, downvote, unvote, comment, submit, save submission, save comment - and have all that sync when reconnecting to internet.

  • ers35OP 6 years ago

    I've been focusing on reading, but it looks like there is interest in writing too. The HN API doesn't support writing, so the username and password has to be stored. It's important that reading keeps working even if HN changes their backend in a way that breaks writing. I'll think about it.

anotheryou 6 years ago

- rip links (at least article's text)

- mark links and comments as seen

- hide seen

- fold comments to only show top level

- favorite links (preferably to a list that is easily dumped in to e.g. pocket with a few buttons for that)

caseyf7 6 years ago

Show only top-level comments so you can skip long off-topic discussions and drill-down on the discussions you are interested in.

contingencies 6 years ago

Suggest considering implementing a bridge to NNTP, or SMTP mail spools. Then use offline NNTP or mail readers.

  • zzo38computer 6 years ago

    Yes, I think a bridge to NNTP would also be helpful. The ability to post through NNTP might be good too.

    This would provide some of the listed features when using a suitable NNTP client; my own NNTP client called "bystand" supports SQL (it stores all data in a SQLite database; if you use the SQLite FTS extension then full text search is available), but other clients may have different features, so may be more suitable for your use possibly.

    (I think that all web forums and mailing lists should be NNTP instead. It is OK to have a web interface too as long as the NNTP server and message ID can be read from the web interface (even if JavaScript and CSS are disabled), so that if you are given links to the web interface then you can access the NNTP if you want to. It would also be possible to provide an email interface too if wanted.)

  • ers35OP 6 years ago

    That reminds me of how https://forum.dlang.org works: https://github.com/CyberShadow/DFeed It also uses SQLite.

kwhitefoot 6 years ago

Sounds interesting. Here are a couple of extra ideas:

- an Android app embodying the same features,

- The ability to comment while offline.

andrei_says_ 6 years ago

All posts and comments I’ve upvoted and bookmarked are searchable.

floatingatoll 6 years ago

To clarify, why is “self-hosted” an important criterion to you?

  • ers35OP 6 years ago

    There have been many HN related websites posted over the years, but a lot end up as dead links. A self-hosted version does not depend on a third party. Another reason is to minimize the round trip latency of contacting a central server. Consider users without a good connection to the server.

    • floatingatoll 6 years ago

      What is your use case for this server that makes optimizing for LAN vs WAN latency a valuable outcome? I’ve never really noticed that latency in email clients or when using the HN website and so I’m curious what is unique to your specific scenario that makes it a priority.

      (This isn’t criticism, but I definitely don’t understand why it’s a criterion in your case yet.)

      • ers35OP 6 years ago

        HN itself is fast, but the comparison I'm making is with readers that use the HN API. See how getting each comment requires an additional API request: https://hacker-news.firebaseio.com/v0/item/22481199.json?pri...

        It's more about the consistency of operating on data local to the user. For example, see this comment referencing how HN paginates threads at 250 comments for performance reasons: https://news.ycombinator.com/item?id=22231055 A local database does not have that issue.
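
The per-comment request cost can be made concrete with a small walker over the item endpoint. This is a sketch: the item URL pattern and the `kids` field come from the public HN API, while the function names and the injectable `fetch` parameter are assumptions added so the traversal can be exercised without network access.

```python
import json
from urllib.request import urlopen

API = "https://hacker-news.firebaseio.com/v0/item/{}.json"

def fetch_item(item_id):
    """Fetch one item from the HN API (one HTTP request per item)."""
    with urlopen(API.format(item_id)) as resp:
        return json.load(resp)

def fetch_thread(root_id, fetch=fetch_item):
    """Walk a thread depth-first and return (items, number of fetch calls)."""
    items, pending, calls = [], [root_id], 0
    while pending:
        item = fetch(pending.pop())
        calls += 1
        if item is None:  # nothing came back for this id; skip it
            continue
        items.append(item)
        # "kids" lists the ids of direct replies; reverse so they pop in order.
        pending.extend(reversed(item.get("kids", [])))
    return items, calls
```

Every story or comment in the thread costs one call, so a discussion with a thousand comments needs about a thousand round trips, whereas a local database answers the same question with one query.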

        • floatingatoll 6 years ago

          Ah, okay, so it's more about having to initiate thousands of requests for a single page than so much about the latency of any single request in those thousands — because even if they were all to a local server, that's still terribly inefficient, and with latency it's even worse. Thanks!

        • dang 6 years ago

          In case it's helpful: we're going to make a new HN API that will be much easier to use. The idea is that adding something like "json/" to any HN URL will return a JSON version of that page.

          Curiously enough, we should also be able to eliminate the pagination of comments by then as well. Both changes are waiting on some software work that we expect to improve performance significantly. I don't know when this will all be done, but I hope it's this year.
