Ask HN: Simple projects to implement machine learning
I have been working through Tom Mitchell's book on machine learning and I want to supplement the theory with a practical project.
Does anyone have experience with small projects that helped them in a similar situation? The best I have come up with is building a news classifier that uses the Feedzilla and NY Times APIs, but if anyone has other good ideas, please let me know.
I have reasonable programming experience with Python and have access to a Linode, so I will be using these to implement the project.
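
For anyone curious what the news classifier idea could look like at its smallest, here is a rough sketch of a multinomial naive Bayes text classifier in plain Python. It does not call the Feedzilla or NY Times APIs; the four training headlines and the "business"/"sports" labels are made up for illustration, and the whitespace tokenizer and add-one smoothing are assumptions rather than anything decided in this thread.

```python
import math
from collections import Counter, defaultdict


def tokenize(text):
    # Assumption: lowercase + whitespace split is enough for a toy example.
    return text.lower().split()


class NaiveBayes:
    def __init__(self):
        self.doc_counts = Counter()               # label -> number of documents
        self.word_counts = defaultdict(Counter)   # label -> word -> count
        self.vocab = set()

    def train(self, text, label):
        self.doc_counts[label] += 1
        for word in tokenize(text):
            self.word_counts[label][word] += 1
            self.vocab.add(word)

    def classify(self, text):
        total_docs = sum(self.doc_counts.values())
        best_label, best_score = None, float("-inf")
        for label, n_docs in self.doc_counts.items():
            # log prior + sum of log likelihoods with add-one smoothing
            score = math.log(n_docs / total_docs)
            total_words = sum(self.word_counts[label].values())
            for word in tokenize(text):
                count = self.word_counts[label][word]
                score += math.log((count + 1) / (total_words + len(self.vocab)))
            if score > best_score:
                best_label, best_score = label, score
        return best_label


# Hypothetical hand-labelled headlines standing in for real feed data.
classifier = NaiveBayes()
classifier.train("stocks rally as markets close higher", "business")
classifier.train("central bank raises interest rates", "business")
classifier.train("team wins championship final in overtime", "sports")
classifier.train("striker scores twice in league match", "sports")

print(classifier.classify("bank cuts interest rates"))
print(classifier.classify("striker wins the match"))
```

With real feeds, the `train` calls would be replaced by whatever labelled articles the APIs return, and the tokenizer would probably need punctuation and stop-word handling.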
Thank you.
I used Lua to create a simple AI. This AI can learn words and store them in a dictionary. It can link one word with another that has some resemblance or relationship to it. I also implemented an "emotional link" between them: one word can be in either a GOOD or a BAD relationship with another. For example, the word "rat" will have a BAD relationship with "cat". It is based on the meaning of words, rather than acting like a mere vocabulary processor or crawler!
This AI can communicate with me: it asks me questions that it generates from meaningful concepts. Of course, it is very basic right now and makes errors, but it will grow.
The bottom line is that you must implement "from the roots". Perception and recognition are built on these roots. I hope you get an idea of what I am attempting to build.
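
To make the word dictionary with GOOD/BAD links a bit more concrete, here is a minimal sketch in Python (not the Lua from the comment, and not the commenter's actual code). The `WordMemory` class, its method names, the symmetric links, and the question wording are all assumptions made up for illustration.

```python
GOOD, BAD = "GOOD", "BAD"


class WordMemory:
    """Toy store of known words and the 'emotional links' between them."""

    def __init__(self):
        self.links = {}  # word -> {other word -> GOOD or BAD}

    def learn(self, word):
        # Remember a word even if it has no links yet.
        self.links.setdefault(word.lower(), {})

    def link(self, a, b, feeling):
        if feeling not in (GOOD, BAD):
            raise ValueError("feeling must be GOOD or BAD")
        a, b = a.lower(), b.lower()
        self.learn(a)
        self.learn(b)
        # Assumption: the relationship is stored in both directions.
        self.links[a][b] = feeling
        self.links[b][a] = feeling

    def relation(self, a, b):
        # Returns GOOD, BAD, or None when no link is known.
        return self.links.get(a.lower(), {}).get(b.lower())

    def questions(self, word):
        # Ask about every known word that has no link to `word` yet.
        word = word.lower()
        known = self.links.get(word, {})
        return [
            f'Is "{word}" in a GOOD or BAD relationship with "{other}"?'
            for other in sorted(self.links)
            if other != word and other not in known
        ]


memory = WordMemory()
memory.link("rat", "cat", BAD)
memory.learn("cheese")

print(memory.relation("cat", "rat"))
for question in memory.questions("rat"):
    print(question)
```

Answers to the generated questions could be fed back through `link`, which is one simple way such a store might "grow" over time.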
One thing most required right now is adding a solid grammar engine to it. If I were not a lazy programmer, I would have done it already. :D
I built an image interpretation app once that was really interesting from a machine learning perspective. Using ImageMagick I extracted shapes, and based on the layout I was able to feed that into a "composition" algorithm that judged whether the image layout was good or not. I also implemented a "plant detection" feature to find out whether a plant was in the image or not. I think images are an area that could still benefit a lot from machine learning, e.g. facial recognition and image search (extracting meaning from images). Those are just some of my ideas. Machine learning is awesome. I wish you all the best!!
Thank you, that sounds very interesting. I assume you would use known images with plants in them as the training set.
How about creating a recommendation engine for HN articles? It should recommend articles upvoted by users who have similar likes/dislikes to mine. I would also like to browse HN articles/comments by topic. For example, right now if I want to find HN posts related to "machine learning" I use HN Search. Can you make it better by using machine learning techniques, so that I can get all relevant HN articles on a given topic, sortable by upvotes and time?
Thank you, that sounds like a good idea. I may start looking into the HN API. I have been working on a search engine for reddit using their API for a week now (it is not very mature; I have only started to build a database of tags for the posts), but I believe this idea will work here as well.
May I know which books/courses/blogs you used to learn machine learning? I'm learning it myself and I have just finished reading "Programming Collective Intelligence".
Sure, I started off with Andrew Ng's course on Coursera. Then I started on the book Machine Learning by Tom Mitchell.
I also have the PCI book to supplement Mitchell's book with code examples. I got Bishop's book too, but to be honest I'm finding it a little harder to follow than the others.
I'm almost through Andrew Ng's course. Did you do all the programming activities to reinforce the lectures? I've been keeping pace with the lectures, but hadn't done the homework/programming. Now I'm going back and completing them one by one. I was thinking about watching Tom Mitchell's CMU course online next. Have you checked that out? PS. Is it just me or is Andrew Ng incredible? I thought I had good professors in college, but he is on another level.
No, I am in exactly the same place. I missed the programming activities (I was going to use NumPy instead of Octave and it turned out to be too much of an effort). I have re-enrolled in the course that is running now, so I am going to do the assignments this time. Also, thanks for mentioning Tom Mitchell's course; I didn't know about it. I will be sure to check it out. And yes, I agree Andrew is a great teacher. It became even more obvious when I tried to read up on areas he had not covered in the course.
How about analyzing a database of approx. 400k car parts requests? Analyze keywords, year-make-model, geographic location, etc. I'm currently doing this to generate a choropleth map for PartsLine.com. https://skitch.com/tzmartin/eydmm/partsline.com-rfq-to-fips-...
Why not sign up to Kaggle and do some of their challenges? That way you can also benchmark yourself against others.
Thanks for the suggestion. I actually did sign up to Kaggle a couple of days ago and am working on some of their problems. I was looking for something more internet related, as getting data in from the internet and grouping or ranking it feels more immediately fun to me.
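
Going back to the HN recommendation engine suggested earlier in the thread (recommend articles upvoted by users with similar tastes), a very small version of that idea is user-based collaborative filtering. The sketch below is only an illustration: the usernames, article ids, and upvote sets are invented, it assumes only upvotes are available (no dislikes), and Jaccard overlap is just one possible similarity measure.

```python
def jaccard(a, b):
    # Overlap between two sets of upvoted articles, from 0.0 to 1.0.
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def recommend(upvotes, user, top_n=3):
    """Rank articles the user has not upvoted by the similarity of the
    users who did upvote them."""
    mine = upvotes[user]
    scores = {}
    for other, theirs in upvotes.items():
        if other == user:
            continue
        similarity = jaccard(mine, theirs)
        if similarity == 0:
            continue  # ignore users with nothing in common
        for article in theirs - mine:
            scores[article] = scores.get(article, 0.0) + similarity
    # Highest score first; ties broken alphabetically for stable output.
    ranked = sorted(scores.items(), key=lambda item: (-item[1], item[0]))
    return [article for article, _ in ranked[:top_n]]


# Hypothetical upvote data: user -> set of article ids.
upvotes = {
    "alice": {"ml-intro", "kaggle-tips", "numpy-guide"},
    "bob": {"ml-intro", "kaggle-tips", "svm-explained"},
    "carol": {"ml-intro", "go-concurrency"},
    "dave": {"rust-release", "go-concurrency"},
}

print(recommend(upvotes, "alice"))
```

Here "bob" shares two upvotes with "alice", so his other article ranks above the one from "carol", who shares only one; "dave" has no overlap and contributes nothing.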