Ask HN: How should I create a unique id for entries that aren't incremental?

2 points by tim_nuwin 11 years ago · 14 comments · 1 min read


For example, right now when I create a board (agile), its ID is simply the previous ID plus one (n + 1).

What is an efficient way of creating an ID where there won't be any collision even if there are 1 billion+ entries?

This ID will be used in a URL.

Thanks, Tim

smt88 11 years ago

"where there won't be any collision even if there are 1 billion+ entries"

This is a really complicated topic, and there are multiple ways to handle what you're doing. It really depends on your read/write ratios, typical volume, growth rate, and the underlying DB software you're using.

Because there are so many considerations that require knowing real-world use cases, it's a premature optimization. Are you going to have more than 1 billion records in the next few years? If not, don't worry about this.

However, there are other reasons to use non-incremental IDs (security, for one).

To answer your question as asked though, check this out: http://www.postgresql.org/docs/8.3/static/datatype-uuid.html

  • tim_nuwinOP 11 years ago

    Hmm, well right now users are complaining it's too easy to view other people's boards because the url is https://www.taskfort.com/view/10

    The only way to keep someone from viewing a board is to make it private. Some services use page IDs that are only 7 or so characters long and very compact; the UUID you're referencing seems kind of ugly.

    I would still keep my incremental ID in the table as the PK, but maybe I could generate a new value per row to use as a public URL ID. That public URL ID could be based on the PK, but I don't know the best way to generate a short URL ID with the PK as the key.

  • iancarroll 11 years ago

    > However, there are other reasons to use non-incremental IDs (security, for one).

    That's just security by obscurity. With proper authorization checking, it doesn't matter.

    • smt88 11 years ago

      Security doesn't always mean "seeing something you're not supposed to see". He's saying that the boards are public, so people are able to just change the number at the end of the URL to find them all.

      You can have the same issue with scrapers. It's much easier for scrapers to get all your pages if you use sequential numbers for unique IDs.

      Yes, a search engine could index the pages, but the big engines will obey your robots.txt, and the small engines will most likely never know that you exist.

      So s/he's not trying to "secure" anything as much as just hide it.

    • dagw 11 years ago

      Security by obscurity isn't a bad practice, it just shouldn't be your main practice. Absolutely design your system with the assumption that the attacker has complete access to all information about your setup, but it's still reasonable to try to obscure as many of those details as possible. Your setup will have flaws and you will make mistakes, so you want to try to minimize the damage those mistakes might cause and increase the time/effort needed to exploit them.
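For the approach tim_nuwin describes above (keep the incremental PK, add a separate short public ID per row), one possible sketch is a random 7-character Base62 string. Everything here is an assumption for illustration: the function names are hypothetical, and the `existing_ids` set merely stands in for a UNIQUE column on the boards table.

```python
import secrets
import string

# Base62 alphabet: digits plus upper- and lower-case ASCII letters.
ALPHABET = string.digits + string.ascii_letters


def new_public_id(length: int = 7) -> str:
    """Return a random ID built from a cryptographically secure RNG."""
    return "".join(secrets.choice(ALPHABET) for _ in range(length))


def assign_public_id(existing_ids: set, length: int = 7) -> str:
    """Pick an ID that is not already in use, retrying on a collision.

    `existing_ids` is a hypothetical stand-in for a UNIQUE column; in a
    real database the retry would follow a constraint violation.
    """
    while True:
        candidate = new_public_id(length)
        if candidate not in existing_ids:
            existing_ids.add(candidate)
            return candidate


if __name__ == "__main__":
    used = set()
    print(assign_public_id(used))  # a random 7-character string
```

Seven Base62 characters give 62^7, about 3.5 trillion, possible values, so guessing a valid URL becomes much harder than incrementing a number. As the replies above point out, this hides boards rather than securing them; authorization checks are still needed for anything private.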

iancarroll 11 years ago

Incremental IDs work best, but if you want, you can hash a UUID, which will work for your use case:

% uuidgen

B14818B6-4219-43BD-82EF-8421EC1AFBCF

% echo "B14818B6-4219-43BD-82EF-8421EC1AFBCF" | shasum -a 256

00ea501d47789ac5eb559f10d631b3f6df8f82b5cba9c1f9d234b705d89f1704
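The same idea can be sketched in Python with only the standard library. The function name `hashed_uuid` is hypothetical. One difference from the shell commands: `echo` appends a newline that gets hashed too, while this sketch hashes the UUID text alone, so the digest for the same UUID would not match the one shown.

```python
import hashlib
import uuid


def hashed_uuid() -> str:
    """Generate a random (version 4) UUID and return the SHA-256 hex digest of its text."""
    raw = str(uuid.uuid4()).upper()  # upper case, matching the uuidgen output above
    return hashlib.sha256(raw.encode("ascii")).hexdigest()


if __name__ == "__main__":
    print(hashed_uuid())  # 64 hexadecimal characters
```

Note that the digest is 64 characters long, while the UUID itself is 36, so hashing makes the ID longer without making it any more unique.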

  • tim_nuwinOP 11 years ago

    Those URLs are kind of ugly. If this helps at all, maybe I could create a public URL ID based on the incremental PK IDs.
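One way a public ID could be derived from the incremental PK, offered only as a rough sketch: scramble the PK with a reversible multiplication modulo 2^32, then Base62-encode the result. The multiplier, the 32-bit limit, and all function names are assumptions chosen for illustration, not anything proposed in the thread. This is obfuscation only; anyone who learns the multiplier can reverse it.

```python
import string

ALPHABET = string.digits + string.ascii_uppercase + string.ascii_lowercase  # 62 symbols
MODULUS = 2 ** 32                       # assumes primary keys stay below 2**32
MULTIPLIER = 2654435761                 # odd, so it is invertible modulo 2**32
INVERSE = pow(MULTIPLIER, -1, MODULUS)  # modular inverse (Python 3.8+)


def base62_encode(n: int) -> str:
    """Encode a non-negative integer as a Base62 string."""
    if n == 0:
        return ALPHABET[0]
    digits = []
    while n > 0:
        n, rem = divmod(n, 62)
        digits.append(ALPHABET[rem])
    return "".join(reversed(digits))


def base62_decode(s: str) -> int:
    """Decode a Base62 string back to an integer."""
    n = 0
    for ch in s:
        n = n * 62 + ALPHABET.index(ch)
    return n


def pk_to_public_id(pk: int) -> str:
    """Scramble the PK with a modular multiplication, then Base62-encode it."""
    if not 0 <= pk < MODULUS:
        raise ValueError("pk out of range for this sketch")
    return base62_encode((pk * MULTIPLIER) % MODULUS)


def public_id_to_pk(public_id: str) -> int:
    """Invert pk_to_public_id using the modular inverse of the multiplier."""
    return (base62_decode(public_id) * INVERSE) % MODULUS


if __name__ == "__main__":
    for pk in (10, 11, 12):
        print(pk, pk_to_public_id(pk))
```

Because the mapping is a permutation of the 32-bit range, two different PKs can never produce the same public ID, and no extra column or uniqueness check is needed. Plain Base62 without the scrambling step would be shorter to write, but consecutive PKs would still produce consecutive-looking IDs.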

sjs382 11 years ago

http://hashids.org/

Rainb 11 years ago

How about hashing the incremental ID? Now I wonder how IDs like Imgur's or YouTube's work.

  • Jeremy1026 11 years ago

    Base62 encoded incremental IDs

    • smt88 11 years ago

      That's still incremental though. It just looks different.

      • Jeremy1026 11 years ago

        A bigint gives you about 9.2 quintillion values before you run out and hit a collision, which is obviously not forever-proof but is certainly future-proof.

        • smt88 11 years ago

          OP isn't asking for a data type that can hold more than a billion records.

          OP needs a way to generate a random ID without checking whether the ID has already been used -- s/he wants a UUID. Bigint is FAR too small to do that a billion times without a collision.
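The size of that collision risk can be estimated with the standard birthday approximation, p ≈ 1 - exp(-n(n-1) / (2N)), where n is the number of IDs drawn and N the number of possible values. The sketch below assumes IDs are drawn uniformly at random; the function name is hypothetical, and 122 is the number of random bits in a version 4 UUID.

```python
import math


def collision_probability(n: int, bits: int) -> float:
    """Approximate chance that n uniformly random `bits`-bit IDs contain a duplicate.

    Birthday approximation: p = 1 - exp(-n(n-1) / (2 * 2**bits)).
    """
    space = 2 ** bits
    exponent = -n * (n - 1) / (2 * space)
    return -math.expm1(exponent)  # 1 - e^x, accurate even when x is tiny


n = 1_000_000_000
p_bigint = collision_probability(n, 64)   # random 64-bit integer
p_uuid4 = collision_probability(n, 122)   # version 4 UUID: 122 random bits

if __name__ == "__main__":
    print(f"64-bit random ID: {p_bigint:.4f}")
    print(f"UUID v4:          {p_uuid4:.2e}")
```

Under these assumptions, a billion random 64-bit IDs have roughly a 2.7 percent chance of containing at least one duplicate, which matters for a system that never checks. For version 4 UUIDs the figure is on the order of 10^-19.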
