Settings

Theme

Reversing an integer hash function

taxicat1.github.io

87 points by flebron 4 years ago · 18 comments (17 loaded)

Reader

rurban 4 years ago

I'm collecting such inverters at https://github.com/rurban/smhasher/tree/inverse/inverse

But only 3 so far.

superjan 4 years ago

Is there a category of hash functions that hash a 64/32 bit input to exactly 64/32 bits output, such that all inputs are uniquely preserved? This could be an interesting property for a hash table of integers, because a hash match implies a key match.

  • ninkendo 4 years ago

    Even if you used such a bijective hash, your hash table would have to have 2^32 available buckets in order for the hash match to be all you need for lookup. Or 2^64 for 64-bit… which is why nobody really does this. (And if you did this, why even hash the input? The input integer could just be the key, and you’re basically using a really big sparse array.)

  • JD557 4 years ago

    I think the word you are looking for is "permutation".

    However, if you can use a permutation as a hash function, you might be fine with just using the identity function.

    • dragontamer 4 years ago

      > However, if you can use a permutation as a hash function, you might be fine with just using the identity function.

      This certainly isn't true for 8-bit characters using ASCII.

      The top-bit of all ASCII strings is always zero, its effectively a wasted bit. Permuting all ASCII characters to randomly use all 8-bits means a better distribution in virtually any hash-based data-structure. (ex: 64 slot hashtable will have fewer collisions after you permute the 7-bit ASCII into an 8-bit random permutation)

  • judofyr 4 years ago

    This would be a “bijective function” or a “perfect hash function”, although the latter is usually used when the input is much smaller than the output.

  • hinkley 4 years ago

    Linear congruential generators in some cases can do that, though more often they are used to convert n bits of input data into a uniform distribution over a range from 0-m. For instance simulating chance in a game (dice rolls, or % probability).

  • praptak 4 years ago

    Basically every cipher works like this.

  • namibj 4 years ago

    For 64bit: (triple)DES should do the trick.

  • omegalulw 4 years ago

    A hash is typically used in the context of mapping arbitrary size input to fixed size output.

  • Retr0id 4 years ago

    XXHASH64 is bijective for 64-bit input strings.

blastonico 4 years ago

> otherwise you could simply trace backwards and generate an input that produces a specific hash

This is when I know for sure that I'm not included in that "you".

clon 4 years ago

Excellent tutorial for bitwise arithmetic this is. The key is the motivation you receive from the prospect of being able to do something that seems "l33t".

Keyboard Shortcuts

j
Next item
k
Previous item
o / Enter
Open selected item
?
Show this help
Esc
Close modal / clear selection