Creating Shazam in Java

redcode.nl

295 points by freefrag 13 years ago · 48 comments

dang 2 years ago

https://web.archive.org/web/20101005164234/http://www.redcod...

willvarfar 13 years ago

Update: I posted a link to some source code that implements the Shazam algorithm:

https://news.ycombinator.com/item?id=5724442

About the patent lawsuit thing:

As I understand it, Shazam sold their patent to Landmark Digital Services, which is part of BMI, the performing rights organization. They kept an exclusive license to make Shazam-like software for phones.

You can imagine BMI wanting it to make money from how a service such as Youtube fingerprints and detects copyright infringement...

And it was this BMI company that was trying to get this blog post explaining the patented algorithm removed from the internet.

One message from the BMI lawyers to Roy in the Netherlands was a particularly broad piece of bullying:

> Mr. Van Rijn,

> The two example patent numbers that I sent you are U.S. patents, but each of these patents has also been filed as patent applications in the Netherlands. Also, as I'm sure you are aware, your blogpost may be viewed internationally. As a result, you may contribute to someone infringing our patents in any part of the world.

> While we trust your good intentions, yes, we would like you to refrain from releasing the code at all and to remove the blogpost explaining the algorithm.

> Thank you for your understanding.

> Best regards,

> Darren

> P. Briggs

> Vice President &

> Chief Technical Officer

> Landmark Digital Services, LLC

Roy gave a great talk at Devoxx about this: http://www.redcode.nl/blog/2012/03/devoxx-2011-talk-freely-a...

I think I heard that Shazam recently got the patent back. I speculate BMI found no-one to license their fingerprinting tech for copyright infringement.

  • raverbashing 13 years ago

    " you may contribute to someone infringing our patents in any part of the world."

    Oh really? What a dolt.

    Patents are, by definition, public. (not before they are accepted, though)

    • Nitramp 13 years ago

          Patents are, by definition, public.
      
      That's actually the telling sign of the dysfunctional patent system. Companies want to use patents to prevent everybody else from doing something similar, and in this case, even from just talking about it (which is obviously ridiculous).

      Patents used to be a framework for sharing technological progress without giving up ownership, i.e. making it easier for everybody else to build on others' progress - that's long gone.

ww520 13 years ago

This is very cool. A minimal, clear implementation of the algorithm that replicates the effect of Shazam. It's refreshing to see a blog post with actual code samples get voted up instead of all the press releases.

genevoronkov 13 years ago

I mirrored this implementation a while ago since the full source isn't available. It was not nearly as successful as the blogger portrays. For example, if I used a high-quality mono WAV file to create a fingerprint, it would have a hard time identifying the same track as an MP3. It seems the maxima actually get shifted and merged by compression. In other words, there's a reason Shazam uses entropy-based anchor points to help it pick hashing values.

  • bmohlenhoff 13 years ago

    I'm wondering if they bound the fingerprint search to human audible frequencies. MP3 compression, as a lossy codec, works by discarding information in the input signal that corresponds to inaudible frequencies. I believe this could be mirrored in the implementation by running the frequency domain peak-pick algorithm only over specific bin ranges.

    • genevoronkov 13 years ago

      I don't recall if the paper specifies the frequency ranges used, but my implementation was bound to audible frequencies. I was going to use a hill-climbing search to find optimal frequency ranges but came to the conclusion that my implementation was too flawed regardless. If I looked at the two graphs side by side (compressed vs. uncompressed), they looked nothing alike. For example, the peak might be in the same region but it would be shifted.
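
For anyone who wants to experiment with the band-limited peak picking discussed in this subthread, here is a minimal Java sketch. The band edges in `BAND_EDGES` and the `fuzz` rounding in `quantize` are illustrative assumptions, not values taken from the paper or from either implementation mentioned above. It picks the strongest FFT bin in each band and optionally rounds the bin index so that a peak shifted by a bin can still land on the same value.

```java
final class BandPeaks {
    // Hypothetical band edges, expressed as FFT bin indices (an assumption for illustration).
    static final int[] BAND_EDGES = {40, 80, 120, 180, 300};

    /** For each band [edges[i], edges[i+1]) returns the index of the strongest bin. */
    static int[] pickPeaks(double[] magnitudes, int[] edges) {
        if (magnitudes.length < edges[edges.length - 1]) {
            throw new IllegalArgumentException("spectrum is shorter than the last band edge");
        }
        int[] peaks = new int[edges.length - 1];
        for (int b = 0; b < peaks.length; b++) {
            int best = edges[b];
            for (int bin = edges[b] + 1; bin < edges[b + 1]; bin++) {
                if (magnitudes[bin] > magnitudes[best]) {
                    best = bin;
                }
            }
            peaks[b] = best;
        }
        return peaks;
    }

    /** Rounds a bin index down to a multiple of fuzz, trading precision for shift tolerance. */
    static int quantize(int bin, int fuzz) {
        return bin - (bin % fuzz);
    }
}
```

Rounding only helps when the original and the shifted peak fall into the same bucket; peaks that straddle a bucket boundary still produce different values, which is consistent with the mismatch described above.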

devingoldfish 13 years ago

For those interested in more about the algorithm, one of the guys who created Shazam released a whitepaper on it. http://www.ee.columbia.edu/~dpwe/papers/Wang03-shazam.pdf

bmohlenhoff 13 years ago

After using Shazam, I was kind of hoping there was more to it than just a time-windowed, frequency-domain peak-pick algorithm. The algorithm itself is pretty basic from a signal processing perspective, but I think the key insight here was that the results are unique enough to store off and compare other samples against at some later point in time.
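
To make "time-windowed frequency-domain peak pick" concrete, here is a rough Java sketch under simplifying assumptions: non-overlapping windows, no window function, and a naive O(n^2) DFT instead of an FFT. The class and method names are made up for illustration and are not from the article.

```java
import java.util.Arrays;

final class WindowSpectrum {
    /** Magnitudes of bins 0..n/2 for one window, computed with a naive DFT. */
    static double[] magnitudes(double[] window) {
        int n = window.length;
        double[] mags = new double[n / 2 + 1];
        for (int k = 0; k < mags.length; k++) {
            double re = 0, im = 0;
            for (int t = 0; t < n; t++) {
                double angle = 2 * Math.PI * k * t / n;
                re += window[t] * Math.cos(angle);
                im -= window[t] * Math.sin(angle);
            }
            mags[k] = Math.hypot(re, im);
        }
        return mags;
    }

    /** Index of the largest magnitude (the first one wins on ties). */
    static int strongestBin(double[] mags) {
        int best = 0;
        for (int k = 1; k < mags.length; k++) {
            if (mags[k] > mags[best]) {
                best = k;
            }
        }
        return best;
    }

    /** Cuts the samples into consecutive windows and returns the strongest bin of each. */
    static int[] peakPerWindow(double[] samples, int windowSize) {
        if (windowSize <= 0) {
            throw new IllegalArgumentException("windowSize must be positive");
        }
        int[] peaks = new int[samples.length / windowSize];
        for (int w = 0; w < peaks.length; w++) {
            double[] window = Arrays.copyOfRange(samples, w * windowSize, (w + 1) * windowSize);
            peaks[w] = strongestBin(magnitudes(window));
        }
        return peaks;
    }
}
```

The resulting sequence of per-window peaks is the kind of compact signature that can be stored and compared against later samples.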

  • regularfry 13 years ago

    Yeah, the magic (if there is any) is doing the match across a silly amount of songs in a relatively short time. Not groundbreaking exactly, but operationally quite interesting.

    • Jamiecon 13 years ago

      I actually remember first using this by dialling 2580 about 10 years ago. At the time it felt truly magical.

  • jessedhillon 13 years ago

    Are there other uses for this algorithm/technique, when applied to signals other than audio? I mean apart from identifying a source from a small clip.

    • bmohlenhoff 13 years ago

      This type of analysis is commonly used in tons of things, like communications systems, image processing, radar, etc. I used a similar technique when trying to identify an underutilized wifi channel in the vicinity of my apartment.

    • IanChiles 13 years ago

      IIRC there have been a number of papers on using a similar technique with speech to text applications.

  • kenshiro_o 13 years ago

    Well, I'm sure they must be using a few tricks in their implementation. I've always been interested in knowing how Shazam actually works and had in mind that they must somehow split a song into intervals and "hash" every interval, then store them in some kind of indexed database for fast retrieval. Seems I was not too far off:)

  • widdershins 13 years ago

    Yeah, this is the obvious implementation. As he said in his follow-up post:

    >And second, I’d like to know which patents are in play. Because I just couldn’t think that something this easy (music-fingerprint is a hash, and we do a lookup) can be patented.. Maybe in the States, but in Europe?
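
The "fingerprint is a hash, and we do a lookup" idea from this subthread can be sketched in a few lines of Java. This is a toy under stated assumptions, not Shazam's or the article's actual code: the class `FingerprintIndex`, the 10-bit packing in `hash`, and the in-memory `HashMap` are all illustrative choices. Matches are scored by voting on the time offset between the stored window and the sample window, so hits only count together when they line up in time.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

final class FingerprintIndex {
    /** One stored occurrence of a hash: which song, and which time window in it. */
    static final class Hit {
        final int songId;
        final int window;
        Hit(int songId, int window) { this.songId = songId; this.window = window; }
    }

    private final Map<Long, List<Hit>> index = new HashMap<>();

    /** Packs per-band peak bins into one key; assumes each bin < 1024 and at most 6 bands. */
    static long hash(int[] peakBins) {
        long h = 0;
        for (int bin : peakBins) {
            h = (h << 10) | (bin & 0x3FF);
        }
        return h;
    }

    /** Indexes every window of a song under the hash of its peak bins. */
    void add(int songId, int[][] peaksPerWindow) {
        for (int w = 0; w < peaksPerWindow.length; w++) {
            index.computeIfAbsent(hash(peaksPerWindow[w]), k -> new ArrayList<>())
                 .add(new Hit(songId, w));
        }
    }

    /** Returns the song with the most hits sharing one time offset, or -1 if nothing matched. */
    int bestMatch(int[][] samplePeaks) {
        Map<Long, Integer> votes = new HashMap<>();
        int bestSong = -1;
        int bestVotes = 0;
        for (int w = 0; w < samplePeaks.length; w++) {
            List<Hit> hits = index.get(hash(samplePeaks[w]));
            if (hits == null) {
                continue;
            }
            for (Hit hit : hits) {
                int offset = hit.window - w;
                // Key combines the song and the offset, so only time-aligned hits add up.
                long key = ((long) hit.songId << 32) | (offset & 0xFFFFFFFFL);
                int v = votes.merge(key, 1, Integer::sum);
                if (v > bestVotes) {
                    bestVotes = v;
                    bestSong = hit.songId;
                }
            }
        }
        return bestSong;
    }
}
```

A real service would need a persistent, sharded store rather than a `HashMap`, which is probably where the operationally interesting part mentioned above comes in.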

johnx123-up 13 years ago

Can someone please compare it to other fingerprinting approaches http://en.wikipedia.org/wiki/Acoustic_fingerprint ?

dsirijus 13 years ago

So, the patent infringement story ended up with "Good luck."?

raverbashing 13 years ago

This is interesting

I wonder how the work is split between client/server in (actual) Shazam. (I suppose only the key points are sent to the server, but I may be wrong - Siri for example sends the server a compressed audio file of the recorded sound)

  • regularfry 13 years ago

    You can phone Shazam up and have it make the identification using what it hears live. There's no client processing at all for that.

normalfaults 13 years ago

Google Cache: https://webcache.googleusercontent.com/search?q=cache:http:/...

_smaugh 13 years ago

The first time I used Shazam, I was so amazed. I had to download the original paper, but still couldn't understand how it worked well enough to code it. Now let's get to work on it.

Great article, thank you

coob 13 years ago

Good article, title should have [2010] in it.

jordan_clark 13 years ago

One possible way to solve the legal troubles is to just remove any references to the product name 'Shazam'. You could title the blog post "Algorithm in Java that identifies music similar to other commercial products" (too long.. but use your imagination)

  • jerf 13 years ago

    That wouldn't do a thing. Patents cover the code, not the names. (Well, the "embodiment", but since all there is here is code, it is clearly covering the code.) That would only help a trademark infringement, and there isn't one here.

zerr 13 years ago

Where does Shazam get the content for its fingerprint database?

I mean, did they buy/rent MP3s?

  • corford 13 years ago

    I haven't had a chance to google for a source so take this as anecdotal but I vaguely remember reading an interview with the people behind it (when Shazam first launched in the UK) and in it they said they were ripping thousands of CDs a day/week (can't remember which) and running each track through their algo. Can't remember if they bought the CDs or had some deal in place with the record labels.

    • zerr 13 years ago

      Interesting. So as with many other "innovative" startups - the content is the crucial thing.

      As already pointed out here, audio fingerprinting is not a new thing. Although, they might have added some twists in order to be able to patent it.

zayd 13 years ago

We had a 'build your own Shazam' as a lab for Berkeley's Intro. Signals & Systems class this semester. Super cool to see it working and quite an interesting application of Signals & Systems

datashaman 13 years ago

http://boingboing.net/2010/07/08/patent-holders-legal.html

chuable 13 years ago

What ever happened to the "patent infringement" issue?

rhapsodyv 13 years ago

Are there any code changes that you can make to not conflict with the patent?

tygv5ug 13 years ago

For the first time I'm surprised that one of the first comments isn't "why was it written in Java, bla, bla bla". Those were getting really annoying.

ilanco 13 years ago

I wonder if they would have bothered you if you had named the post: "Creating Google Ears in Java"

vitorarins 13 years ago

Whoever wrote that article could have said which external libraries (s)he used.
