Ask HN: Resources for image pattern recognition algorithms
I am interested in extracting a set of objects from a large, noisy image. For instance: if I had a high-resolution picture of a seating section at a stadium, I might want to extract each baseball cap into individual images given a second picture of a prototypical baseball cap.
There is obviously a lot of research published on this subject (by subject I mean image pattern recognition as a whole). I've been poring through the IEEE, as well as just searching via Google, for a couple of days.
My problem is that this is so far from my field of expertise, I feel like I'm wasting an awful lot of time reading research papers that are not applicable or are outdated.
So my question is, if anyone has any experience in this field or a related field, is there a set of known problem definitions I could narrow my research down to? What about known classes of algorithms (and by that I mean more specific than "machine learning" or "neural networks")?
Any advice is mightily appreciated.

You are right, you're wasting time. Image processing is not the sort of thing you can pick up on your own "to get work done"; it's extremely demanding and actually fun. Just hire someone. If you're prone to falling into "hack mode", learning this stuff will not help you one bit. The problems are far too interesting and encompass wide areas of research that are guaranteed to please everybody: low-level bit manipulation, file formats, numerical methods, signal processing stuff with filters and sampling, wavelets, edge detection, rank, Laplacians, convolution, dithering, ray tracing, morphology, neural networks and other genetic learning algorithms, heaps of inner-loop and vector optimization, scene detection stuff that makes use of ray tracing plus interesting data structures like octrees, statistics, information theory ... In a nutshell, it's something to give up work, wife and kids for. Don't ever let curiosity drag you into that tar pit; hire someone. I could consult our very own pixcavator:
http://news.ycombinator.com/user?id=pixcavator
http://inperc.com/wiki/index.php?title=User%27s_introduction

I wish I could give some "positive" advice, but the best I can say is that what you describe is a tough, tough problem. There is no off-the-shelf solution, and if you are new at this, you are in for a lot of pain with a slim chance of success. Sorry, can’t be more helpful.

I suppose it depends on the definition of "success". Learning about the state of the art and writing some enlightening code, no matter its ultimate usefulness, would be plenty successful for me. I'm mostly surprised at the lack of papers I'm able to find that have been published after the mid '90s. I will give major kudos to you though, sir. I've spent the last hour or so reading through the wiki. Please, keep up the great work.

I am quite interested to learn more about object recognition as well; unfortunately I don't have any direct answers to your question. May I suggest you search for "Content Based Image Retrieval" for research papers in that field.

Yes, I've done a bit of searching on CBIR actually. There are a few papers out there about querying at multiple resolutions (like http://grail.cs.washington.edu/projects/query/mrquery.pdf), which will definitely be helpful. The next step, for me, is to use a single image as the set to be queried, without knowing at what scale the target objects appear in the image.

I am not a vision specialist, but I work around a lot of them. Look into the OpenCV image processing library. I haven't used it, but it seems to implement a lot of basic functionality to get you off the ground. If you can't find anything good from the last ten years, then look at Yann LeCun's recent papers. (Google wanted him to be head of research, but he preferred academia.) In particular, investigate his convolutional networks. The work of Rob Fergus is more applied, and should lead to good recent pointers. Look for works experimenting with the NORB dataset.
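
On the question raised earlier in the thread (querying a single image when the scale of the target is unknown), one simple baseline is to resize the prototype to several candidate scales and slide each version over the image, scoring every window with normalized cross-correlation. The sketch below is only an illustration in plain Python on small grayscale grids (lists of lists); the function names, the nearest-neighbour resize and the list of scales are assumptions made for this example, not something taken from the papers or libraries mentioned here, and it is far too slow and too brittle for a real stadium photo.

```python
from math import sqrt

def resize_nearest(img, scale):
    """Nearest-neighbour resize of a 2D list of numbers (hypothetical helper)."""
    h, w = len(img), len(img[0])
    nh, nw = max(1, round(h * scale)), max(1, round(w * scale))
    return [[img[min(h - 1, int(y / scale))][min(w - 1, int(x / scale))]
             for x in range(nw)] for y in range(nh)]

def ncc(image, tmpl, top, left):
    """Normalized cross-correlation of tmpl with the image window at (top, left)."""
    th, tw = len(tmpl), len(tmpl[0])
    win = [image[top + y][left + x] for y in range(th) for x in range(tw)]
    tpl = [v for row in tmpl for v in row]
    mw, mt = sum(win) / len(win), sum(tpl) / len(tpl)
    num = sum((a - mw) * (b - mt) for a, b in zip(win, tpl))
    den = sqrt(sum((a - mw) ** 2 for a in win) * sum((b - mt) ** 2 for b in tpl))
    # A flat window or flat template has no contrast to correlate; score it 0.
    return num / den if den else 0.0

def best_match(image, tmpl, scales=(0.5, 1.0, 2.0)):
    """Return (score, scale, top, left) of the best-scoring window over all scales."""
    best = (-1.0, None, None, None)
    for s in scales:
        t = resize_nearest(tmpl, s)
        th, tw = len(t), len(t[0])
        for top in range(len(image) - th + 1):
            for left in range(len(image[0]) - tw + 1):
                score = ncc(image, t, top, left)
                if score > best[0]:
                    best = (score, s, top, left)
    return best
```

To find every object rather than just the best one, you would keep all windows above a score threshold and suppress overlapping hits; that part is left out here.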
You might want to check out the OpenCV library. They have this thing called Haar classifier training. You train it to recognize an object, and then it can look for that object in other images. Here is one example of how it was used to recognize sign language: http://sandarenu.blogspot.com/2008/06/opencv-computer-vision...
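
For a feel of what a Haar classifier computes under the hood: the Viola-Jones approach that OpenCV's Haar cascades are based on scores windows with "Haar-like" features, which are differences between sums of pixels in adjacent rectangles, evaluated quickly through an integral image. The sketch below is plain Python for illustration only; it is not OpenCV's API, the function names are made up for this example, and it shows a single feature rather than a trained cascade.

```python
def integral_image(img):
    """Build ii where ii[y][x] is the sum of img over rows < y and columns < x."""
    h, w = len(img), len(img[0])
    ii = [[0] * (w + 1) for _ in range(h + 1)]
    for y in range(h):
        row_sum = 0
        for x in range(w):
            row_sum += img[y][x]
            ii[y + 1][x + 1] = ii[y][x + 1] + row_sum
    return ii

def rect_sum(ii, top, left, height, width):
    """Sum of the pixels in a rectangle, using four lookups in the integral image."""
    bottom, right = top + height, left + width
    return ii[bottom][right] - ii[top][right] - ii[bottom][left] + ii[top][left]

def two_rect_feature(ii, top, left, height, width):
    """Left half minus right half of a window (width assumed even).

    A large absolute value suggests a vertical edge inside the window.
    """
    half = width // 2
    return (rect_sum(ii, top, left, height, half)
            - rect_sum(ii, top, left + half, height, half))

# Tiny example: bright left half, dark right half.
img = [[10, 10, 0, 0] for _ in range(4)]
ii = integral_image(img)
feature = two_rect_feature(ii, 0, 0, 4, 4)  # 80 - 0 = 80
```

A trained cascade combines many such features with learned thresholds and runs them over windows at many positions and scales, rejecting most windows after only a few cheap tests.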