Note the commit hash

github.com

86 points by Xlab 12 years ago · 31 comments

avar 12 years ago

Back in 2011 I submitted a patch, purely for discussion, that added a feature to Git to allow vanity commit hashes. It generated some interesting discussion: http://lists-archives.com/git/756392-choosing-the-sha1-prefi...

Unlike the linked commit, it didn't alter the commit date; instead it altered the Git code itself to add a new custom header to the commit object.

Git handles custom headers just fine since it ignores unknown headers for future compatibility. This commit still lives in the main Git repo at work without any issues:

    commit 313375d995e6f8b7773c6ed1ee165e5a9e15690b
    tree c9bebc99c05dfe61cccf02ebdf442945c8ff8b3c
    parent 0dce2d45a79d26a593f0e12301cdfeb7eb23c17a
    author Ævar Arnfjörð Bjarmason <avar@booking.com> 1319042708 +0200
    committer Ævar Arnfjörð Bjarmason <avar@booking.com> 1319042708 +0200
    lulz 697889
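
A rough sketch of the nonce-in-a-header idea described above, in Python rather than as a patch to Git. It is not the code from the 2011 patch: the function names, the tree hash (Git's well-known empty tree), the author line, and the two-character target prefix are all assumptions chosen so the search finishes quickly.

```python
import hashlib


def commit_id(body: bytes) -> str:
    """SHA-1 of a commit object: 'commit <len>\\0' followed by the body."""
    return hashlib.sha1(b"commit %d\0" % len(body) + body).hexdigest()


def find_nonce(headers: str, message: str, prefix: str,
               header_name: str = "lulz", max_tries: int = 1_000_000):
    """Try integer nonces in an extra header until the hash starts with prefix.

    Returns (nonce, hex_digest), or None if max_tries is exhausted.
    """
    for nonce in range(max_tries):
        body = f"{headers}{header_name} {nonce}\n\n{message}".encode("utf-8")
        digest = commit_id(body)
        if digest.startswith(prefix):
            return nonce, digest
    return None


# Hypothetical commit contents, for illustration only.
headers = (
    "tree 4b825dc642cb6eb9a060e54bf8d69288fbee4904\n"
    "author A U Thor <author@example.com> 1319042708 +0200\n"
    "committer A U Thor <author@example.com> 1319042708 +0200\n"
)
result = find_nonce(headers, "Vanity test commit\n", prefix="13")
print(result)
```

Each extra hex character in the prefix multiplies the expected number of tries by 16, so a seven-character prefix is far slower in pure Python than this two-character demo.
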
  • syncerr 12 years ago

    Testing it out:

      > git commit -F message
      Try 0/4000000 to get a 1337 commit = 3650e08c9e1ecbbeec83daf7a959e3edcf15bd4f
      Try 100000/4000000 to get a 1337 commit = 3952f7d5035f5e88f66aa5c70e5cc11fdd734852
      Try 200000/4000000 to get a 1337 commit = 51c910f5d535c515a04796eb7c7a70cbd2325599
      Try 300000/4000000 to get a 1337 commit = d70c3e64b1d963461a6ee2f518c613483b979d68
      ...
      commit id = 313378458f8c4fb53c808f4b0bae5bf71ba5e23b
      [master 3133784] 1337 Test Commit
       1 file changed, 60 insertions(+), 35 deletions(-)
    
    https://github.com/spence/git/commit/313378458f8c4fb53c808f4...
    • avar 12 years ago

      Right, and now you can use "git show --pretty=raw" to see the commit header:

          $ git show --pretty=raw 313378458f8c4fb53c808f4b0bae5bf71ba5e23b | head -n 10
          commit 313378458f8c4fb53c808f4b0bae5bf71ba5e23b
          tree 7e93df01bfc9c187d58a0b96e756dd8ac0031c82
          parent e4eef26d985177e4bdd32bf58b6ae40e7ae67289
          author Spencer Creasey <screasey@monetate.com> 1396872901 -0400
          committer Spencer Creasey <screasey@monetate.com> 1396872901 -0400
          lulz 843475
          
              1337 Test Commit
              
              http://lists-archives.com/git/756394-choosing-the-sha1-prefix-of-your-commits.html
      
      There are replies in that thread where the naïve technique I was using was improved a lot.
poopicus 12 years ago

If you're wondering how he did it, Brad wrote a tool called 'gitbrute'[1] that (as the name suggests) brute-forces the prefix of the hash to whatever you like.

[1]: https://github.com/bradfitz/gitbrute

  • lambda 12 years ago

    This is a fairly popular thing to do with bitcoin addresses as well; keep generating keys until you get one with a recognizable prefix in the address.

    • adrianN 12 years ago

      Tor hidden services also do this. (The address of a hidden service is a hash)

      • gwern 12 years ago

        Not just for vanity, either - it makes phishing a hidden service harder: if users know the .onion for Agora starts with 'agora', then a phisher has to invest weeks of compute-time just to get a plausible .onion to start his phish with, rather than generate any old .onion in a millisecond.

    • Omnipresent 12 years ago

      Neat. I remember Stripe did something similar with their CTF challenge

      • pushrax 12 years ago

        Yep! There were a bunch of us who wrote GPU "miners" (mostly OpenCL). I was intending to open-source mine at one point but never got around to it.

        At the end of the challenge, the network was hashing ten digits (i.e. 000000000 prefix) in just a few seconds. Here's one of the rounds: https://github.com/pushrax/round660/commits/master

  • spb 12 years ago

    I wrote this a little over a year ago, only (for the lulz) in Node rather than Go:

    https://github.com/stuartpb/lhc

    Mine allows using a custom word list in the commit message for the nonce.

    Judging by this commit, I'm guessing gitbrute uses minuscule variations in the commit time instead.

    EDIT: yep: https://github.com/bradfitz/gitbrute/blob/master/gitbrute.go

rspeer 12 years ago

It is almost certain that there are a quarter-billion published Git commits by now. (GitHub says there were 150 million pushes in 2013 alone, and many of those have multiple commits.)

This means it is likely that, purely by coincidence, someone has at some point had their commit labeled as (badc0de).

  • gwern 12 years ago

    GitHub doesn't seem to allow searching revision hashes, but presumably Google has most of GitHub crawled by now, and I didn't spot any commits with that as a label: https://encrypted.google.com/search?num=100&q=badc0de%20site... (more hits than I expected though).

    • tlrobinson 12 years ago

      Google doesn't appear to match partial hashes. I searched for a random GitHub commit hash, and it showed up, then removed one character, and it didn't.

    • maxerickson 12 years ago

      If you switch to verbatim there are only about a dozen.

      I think the odds of an accidental collision with badc0de are pretty low.

      • sexmonad 12 years ago

        Each hex character represents 4 bits. That means that a 7-character string is 28 bits. That's about 268 million possibilities. On average, it would take around 268 million commits to get one that started with "badc0de", since each commit has a 1 in 2^28 chance of matching.

  • Strilanc 12 years ago

    Interesting

        1 - (1 - 1/16^7)^250000000 ~= 61%
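
To check that figure, here is a small Python sketch that evaluates the same expression and compares it with the usual 1 - e^(-n/N) approximation. The variable names are mine; the 250 million commit count is the estimate from the parent comment.

```python
import math

space = 16 ** 7          # number of distinct 7-hex-digit prefixes (2^28)
commits = 250_000_000    # estimated number of published commits

# Probability that at least one commit has a given 7-character prefix.
p_exact = 1 - (1 - 1 / space) ** commits

# Same quantity via the exponential approximation (1 - 1/N)^n ≈ e^(-n/N).
p_approx = 1 - math.exp(-commits / space)

print(f"exact: {p_exact:.1%}, approximation: {p_approx:.1%}")
```
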
cdelsolar 12 years ago

I thought the commit hashes had to do with the actual code being checked in. There's a random component to them as well?

  • bri3d 12 years ago

    Not random.

    A Git commit is one kind of "object" in git. Objects in Git are hashed like so:

    SHA1("[objecttype] [objectlen]\0[objectdata]")

    and a commit object looks like this (blatantly stolen from the Stripe V3 CTF):

        tree #{tree}
        parent #{parent}
        author CTF user <me@example.com> #{timestamp} +0000
        committer CTF user <me@example.com> #{timestamp} +0000
    
        Give me a Gitcoin
    
    The "tree" in the commit is the hash tree reference that actually points to your code.
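
The object-hashing rule above can be tried out with a few lines of Python. This is only a sketch: `git_object_id` is a made-up helper name, and it assumes SHA-1 object IDs as in the formula.

```python
import hashlib


def git_object_id(obj_type: str, data: bytes) -> str:
    """Hash data as Git would: SHA1("<type> <length>\\0<data>")."""
    header = f"{obj_type} {len(data)}".encode("ascii") + b"\0"
    return hashlib.sha1(header + data).hexdigest()


# A blob containing "hello\n"; the result should match
# `echo hello | git hash-object --stdin`.
print(git_object_id("blob", b"hello\n"))
```

A commit is hashed the same way with type "commit" and the text shown above as the data, so changing any byte of it (a timestamp, the message, an extra header) changes the commit ID.
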
  • ben0x539 12 years ago

    You can probably cycle through a couple million milliseconds in the various timestamps involved to get a large selection of hashes to pick from without making your commit stand out.

    • nhaehnle 12 years ago

      Iterating through the author timestamp gives you a bit more than 16 bits for one day. Then you may wonder how much freedom you want to take with the committer timestamp; it should be later than the author timestamp, but if it's off by a few hours to a day it might still be fine, which gives you another ~14 bits, and you're already at 30 bits. Throw in flippable punctuation (i.e., in the commit message, do you write deadbeef as a single word or not? Do you add a full stop or not?) and perhaps some other variations of the commit message and file contents, and 32 bits of nonce is actually fairly easy to get.

      It's still a very cool demonstration of why you really need to compare every single bit of your hashes.
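
The bit counts in that estimate can be reproduced with a quick Python calculation. It assumes one candidate per second, since Git commit timestamps have one-second resolution; the variable names are only illustrative.

```python
import math

# One day of candidate author timestamps, one per second.
author_bits = math.log2(24 * 60 * 60)

# 14 bits of committer timestamp correspond to this many hours of slack.
committer_bits = 14
committer_window_hours = 2 ** committer_bits / 3600

total_bits = author_bits + committer_bits
print(f"author: {author_bits:.1f} bits")
print(f"committer: {committer_bits} bits ({committer_window_hours:.2f} h window)")
print(f"total: {total_bits:.1f} bits")
```
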

  • havardk 12 years ago

    No, but the hash also covers the commit message and metadata. Here, the commit and author dates were fiddled with to get the desired effect.

    A write-up of how it was done, with a pointer to the code that was used: https://plus.google.com/115863474911002159675/posts/RT2Tvb1w...

  • rweir 12 years ago

    ?

    they're hashes of the /commit/, which is the code + metadata including time + previous commit(s).

mschuster91 12 years ago

Is this something like the Bitcoin vanity addresses? Partial sha1 collision?

chetanahuja 12 years ago

And thus began gitcoin.
