Ask HN: Anyone interested in building a Hacker News with tags?
Could anyone help me build a Hacker News with tags? I am asking only those who would like to have it as well, because my budget only covers the hosting.
The point is to be able to search through the whole archive using tags/keywords.
Examples of tags:
'security'
'crm'
'a/b testing'
'optimization'
'http', 'ssl', 'domain name'
'scala', 'c++', 'php', etc
'lua'
'sql'
'marketing'
'website'
'landing page'
=> get all posts that relate to each tag (and combinations of tags) sorted by points of individual posts/comments.
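A minimal sketch of that lookup, assuming the posts are already tagged and fit in memory. The `Item` class and the `build_index` and `search` functions are hypothetical names for illustration, not part of any existing Hacker News API. A combination of tags is treated here as AND (set intersection), which is only one possible reading of "combinations".

```python
from dataclasses import dataclass, field


@dataclass
class Item:
    """A post or comment; the fields are assumptions for this sketch."""
    id: int
    title: str
    points: int
    tags: set = field(default_factory=set)


def build_index(items):
    """Build an inverted index mapping each tag to the ids of items carrying it."""
    index = {}
    for item in items:
        for tag in item.tags:
            index.setdefault(tag, set()).add(item.id)
    return index


def search(items_by_id, index, tags):
    """Return items carrying ALL requested tags, highest points first."""
    if not tags:
        return []
    id_sets = [index.get(tag, set()) for tag in tags]
    matching = set.intersection(*id_sets)
    # Sort by points descending; break ties by id so the order is stable.
    return sorted(
        (items_by_id[i] for i in matching),
        key=lambda item: (-item.points, item.id),
    )


items = [
    Item(1, "Hardening SSL on nginx", 120, {"security", "ssl", "http"}),
    Item(2, "SQL injection basics", 300, {"security", "sql"}),
    Item(3, "Landing page A/B tests", 80, {"a/b testing", "landing page"}),
]
items_by_id = {item.id: item for item in items}
index = build_index(items)
top_security = [item.id for item in search(items_by_id, index, ["security"])]
```

In a real deployment the index would more likely live in a database table or a search engine; the in-memory dictionary only shows the shape of the query.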
To do list:
1. Import the whole Hacker News database.
2. Insert tags for all posts/comments into the database, using an algorithm similar to the Kaggle Keyword Extraction algo (https://www.kaggle.com/c/facebook-recruiting-iii-keyword-extraction), which will need to be refined.
3. Create a great user interface to the new database.
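As a starting point for step 2, here is a rough sketch of a much simpler tagger than the Kaggle approach: it just matches each text against a fixed tag vocabulary. This is a stand-in, not the Kaggle algorithm. The vocabulary is taken from the example tags above, and `compile_tag_patterns` and `extract_tags` are hypothetical helper names.

```python
import re

# Vocabulary taken from the example tags in the post.
TAG_VOCABULARY = [
    "security", "crm", "a/b testing", "optimization", "http", "ssl",
    "domain name", "scala", "c++", "php", "lua", "sql", "marketing",
    "website", "landing page",
]


def compile_tag_patterns(vocabulary):
    """Compile one regex per tag that only matches the tag as a whole token.

    The lookarounds reject matches glued to word characters, '+' or '#',
    so 'sql' does not fire inside 'mysql' and 'http' not inside 'https'.
    """
    patterns = {}
    for tag in vocabulary:
        patterns[tag] = re.compile(
            r"(?<![\w+#])" + re.escape(tag) + r"(?![\w+#])"
        )
    return patterns


def extract_tags(text, patterns):
    """Return the set of vocabulary tags that appear in the text."""
    lowered = text.lower()
    return {tag for tag, pattern in patterns.items() if pattern.search(lowered)}


patterns = compile_tag_patterns(TAG_VOCABULARY)
tags = extract_tags("Scala vs C++ for MySQL-free SSL proxies", patterns)
```

Exact matching like this misses synonyms and posts that are about a topic without naming it, which is where a learned keyword extractor would have to take over.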
-------
Or, if no one has the time, could anyone advise me on how to download the whole Hacker News database?

1. You can download the dataset using http://hn.algolia.com/api. Mind the rate limits, though.
2. This has already been done quite a few times by various apps, most prominently here: http://algorithmia.com/demo/hn (http://blog.algorithmia.com/post/86295023534/algorithmic-tag...)

http://hn.algolia.com/api
Thanks. How many requests would that require? http://algorithmia.com/demo/hn does not work for me.

The first link states at the bottom: "We are limiting the number of API requests from a single IP to 10,000 per hour." The hn/tag demo works one time out of five refreshes for me, so keep trying. Here's a screenshot in case that doesn't work: http://imgur.com/yPF0hkn

Thank you. I appreciate the screenshot. Interesting; I get the feeling the tag algorithm is crucial to making it work. I'm not sure it's as clear a no-brainer as I first imagined...

See the new API too:

Can't you just search the keywords?

I wonder how useful it would be, given that the information (tech articles, such as rails2.1, best features in jQuery1.0,...) will go out of date as time goes on. I think what stays useful is the various tools, if they are still alive. That's why I want to build a toolbox that collects all the useful tools.

Your idea might be better. I could help if you need support. Your contact?

Check my profile for my Twitter id.

Hi, you can email me at: timothee . henry @ gmail.com
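On the "how many requests" question and the 10,000 requests per hour limit quoted earlier in the thread: a rough sketch of the arithmetic and of a client-side throttle. The item count and page size are placeholders to fill in, since neither is given in the thread, and `fetch_page` is a hypothetical callable standing in for whatever API call is actually used.

```python
import math
import time

REQUESTS_PER_HOUR = 10_000  # limit quoted in the thread


def min_interval_seconds(requests_per_hour=REQUESTS_PER_HOUR):
    """Smallest gap between requests that stays under the hourly limit."""
    return 3600.0 / requests_per_hour


def estimate_requests(total_items, items_per_request):
    """Number of requests needed; both arguments are assumptions to supply."""
    return math.ceil(total_items / items_per_request)


def estimate_hours(total_items, items_per_request,
                   requests_per_hour=REQUESTS_PER_HOUR):
    """Lower bound on download time when running exactly at the limit."""
    return estimate_requests(total_items, items_per_request) / requests_per_hour


def throttled(fetch_page, pages, interval=None,
              sleep=time.sleep, clock=time.monotonic):
    """Call fetch_page for each page, waiting so calls are spaced by interval."""
    interval = min_interval_seconds() if interval is None else interval
    last = None
    for page in pages:
        if last is not None:
            wait = interval - (clock() - last)
            if wait > 0:
                sleep(wait)
        last = clock()
        yield fetch_page(page)
```

For example, `estimate_hours(total_items, 1)` gives the worst case of one request per item; any batching the API supports divides that figure accordingly.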