Using Pseudo-URIs with Microservices

62 points by pcalcado 9 years ago · 21 comments

What's wrong with using UUIDs for microservice architectures?

kuschku 9 years ago

As mentioned in the article, they make it hard to identify the type of resource.
- AdieuToLogic 9 years ago
  URNs[0] can support resource categorization so long as the Namespace Identifier (NID) does not collide with the reserved ones (and I'd throw in those which are well-known in the industry as well).
  So, for example, a URN which uses UUIDs to satisfy uniqueness yet still allows for categorization could look like:
  urn:some-company:resource-type:abcdef012-3456-...
  For complete technical conformance, the "some-company" NID would need to be requested/registered from the IETF.
  In practice, my opinion is that the registration step is largely not required if the URNs are used strictly as an implementation detail within a system. However, those publicized and/or having a reasonable chance of being persisted by an external actor should have a sufficiently unique NID such that it can be registered and, more importantly, does not have a coincidental collision with other systems generating URNs.
  0 - https://tools.ietf.org/html/rfc2648
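
A minimal sketch of the URN shape described above, assuming the unregistered `some-company` NID from the example. The helper names `make_urn` and `parse_urn` are hypothetical, and the parser only checks the four colon-separated parts rather than the full URN grammar.

```python
import uuid
from typing import Optional, Tuple

# Assumption: an unregistered NID used only inside one system.
NID = "some-company"


def make_urn(resource_type: str, resource_id: Optional[uuid.UUID] = None) -> str:
    """Build urn:<NID>:<resource-type>:<uuid>, generating a random UUID if none is given."""
    resource_id = resource_id or uuid.uuid4()
    return f"urn:{NID}:{resource_type}:{resource_id}"


def parse_urn(urn: str) -> Tuple[str, uuid.UUID]:
    """Split a URN built by make_urn back into (resource_type, uuid)."""
    scheme, nid, resource_type, raw_id = urn.split(":", 3)
    if scheme.lower() != "urn" or nid.lower() != NID:
        raise ValueError(f"not a {NID} URN: {urn!r}")
    # uuid.UUID raises ValueError if the last part is not a valid UUID.
    return resource_type, uuid.UUID(raw_id)
```

With this, the resource type can be read straight off the identifier, e.g. `parse_urn(make_urn("product"))[0]` gives back `"product"`.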
  - bpicolo 9 years ago
    
    Imo, this is a great combination of strategies when you want to keep people away from iterating. It also gives you the basis for a decent ACL system. Yeah, you use a few more bytes per item, but modern DBs can handle that.
    Probably worth having an autoincrementing ID on tables using UUIDs as well, though. Not 100% sure on that, but I've definitely hit batch jobs and frameworks that require autoincrementing integer IDs, and getting stuck is a pain. (Adding an autoincrementing key after the fact is doable, but at least on Postgres it involves a rewrite of the entire table, so you kill the heck out of replication.)
    
    AdieuToLogic 9 years ago
    
    > Probably worth having an autoincrementing ID on tables using UUIDs as well, though.
    Quite true, IMHO. I usually define both an auto-incrementing integral primary key (PK) as well as a unique string column for the URN. The PK is used for deletes/updates/joins/referential integrity and is never exposed beyond the server processes.
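
The two-column layout described above might look roughly like this. It is a sketch only: it uses an in-memory SQLite database for self-containment (the thread talks about Postgres), and the `products` table, its columns, and the helper names are all hypothetical.

```python
import sqlite3
import uuid

conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE products (
        id   INTEGER PRIMARY KEY AUTOINCREMENT,  -- internal key, never exposed
        urn  TEXT NOT NULL UNIQUE,               -- public identifier
        name TEXT NOT NULL
    )
    """
)


def insert_product(name: str) -> str:
    """Insert a row and return the URN that callers outside the server would see."""
    urn = f"urn:some-company:product:{uuid.uuid4()}"
    conn.execute("INSERT INTO products (urn, name) VALUES (?, ?)", (urn, name))
    return urn


def find_by_urn(urn: str):
    """Look a row up by its public URN; returns (id, name) or None."""
    return conn.execute(
        "SELECT id, name FROM products WHERE urn = ?", (urn,)
    ).fetchone()
```

Joins and foreign keys would reference `id`, while API responses and logs would carry only `urn`.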
  - jamiethompson 9 years ago
    
    How would one best use such URNs in a REST API?
    Would an endpoint such as /products/:product-id
    then be something like this?
    /products/some-company:products:abcdef012-3456...
    Seems a little clunky

bpicolo 9 years ago

> What you need is to provide a set of functions able to map between these formats and whatever optimal way you want to store them in

One important thing to remember when taking this approach is that you can work yourself into a huge bind if you store the mapped versions in logs or other persistent storage and end up needing to change formats later. Say the original mapping version is based on 32-bit integers, or is cryptographically signed and your keys get exposed. If rolling the functions means you can no longer identify objects from logs or in databases, that's a huge problem.

That means things like access logs will no longer map to their respective current versions unless you keep around two separate encoder/decoders and are able to guess which version you need all the time.

It can become a big, unfun problem pretty fast.
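
One way to avoid having to guess is to tag every stored identifier with the version of the mapping that produced it. The following is only a sketch of that idea under assumed formats: `v1` stands in for a 32-bit encoding, `v2` for a wider replacement, and all names here are hypothetical.

```python
import base64


def _encode_v1(n: int) -> str:
    # v1 (assumed legacy format): 32-bit big-endian integer, URL-safe base64.
    return base64.urlsafe_b64encode(n.to_bytes(4, "big")).decode("ascii")


def _encode_v2(n: int) -> str:
    # v2 (assumed replacement): 64-bit big-endian integer, URL-safe base64.
    return base64.urlsafe_b64encode(n.to_bytes(8, "big")).decode("ascii")


def _decode_any_width(payload: str) -> int:
    return int.from_bytes(base64.urlsafe_b64decode(payload), "big")


# Every codec that has ever been used stays registered so old logs remain readable.
CODECS = {
    "v1": (_encode_v1, _decode_any_width),
    "v2": (_encode_v2, _decode_any_width),
}
CURRENT_VERSION = "v2"


def encode(n: int, version: str = CURRENT_VERSION) -> str:
    """Return '<version>.<payload>' so the decoder never has to guess the format."""
    return f"{version}.{CODECS[version][0](n)}"


def decode(token: str) -> int:
    """Pick the decoder from the version prefix stored with the token."""
    version, _, payload = token.partition(".")
    return CODECS[version][1](payload)
```

This does not help if a signing key leaks, but it does keep identifiers found in old access logs decodable after the format changes.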

pcalcadoOP 9 years ago

That's a great point, thanks.

NathanKP 9 years ago

Personally I've always favored a JSON HATEOAS approach with my APIs, where the JSON response contains the actual URLs of the resources referenced, not just a URN. Example from one API that I maintain:

https://changelogs.md/api/recently-crawled/

One major advantage of this is that it allows clients to be built which have no hardcoded URLs at all (other than perhaps the top-level URI that gives them the list of URLs that the API exposes). This allows the API maintainer to adjust resource paths retroactively, in addition to just exposing an API that is easy to explore.
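
A rough sketch of such a client, assuming a made-up API at `api.example.com` whose responses are canned in a dictionary instead of fetched over HTTP. The response shapes (`links`, `items`, `url`) are illustrative assumptions, not the format of the API linked above.

```python
import json

# Hypothetical canned responses standing in for HTTP GET requests.
FAKE_SERVER = {
    "https://api.example.com/": json.dumps(
        {"links": {"products": "https://api.example.com/v2/items"}}
    ),
    "https://api.example.com/v2/items": json.dumps(
        {"items": [{"name": "widget", "url": "https://api.example.com/v2/items/1"}]}
    ),
    "https://api.example.com/v2/items/1": json.dumps({"name": "widget", "price": 5}),
}

# The only URL the client hardcodes.
ROOT_URL = "https://api.example.com/"


def fetch(url: str) -> dict:
    """Stand-in for an HTTP GET that returns parsed JSON."""
    return json.loads(FAKE_SERVER[url])


def first_product() -> dict:
    """Reach a product by following links from the root, never by building paths."""
    root = fetch(ROOT_URL)
    listing = fetch(root["links"]["products"])
    return fetch(listing["items"][0]["url"])
```

Because the client only follows URLs it was handed, the server side could move `/v2/items` elsewhere without the client changing.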

DorothySim 9 years ago

> As we iterated on our approach, we have decided to follow more recent recommendations and not limit our identifiers to the deprecated concept of URN.

I was not aware URN was deprecated... Is there a reference somewhere to these recommendations?

dragonwriter 9 years ago

The use of the name URN in the broad sense of "a URI that specifies a name" is deprecated in favor of the general term URI (similar to the way that the term URL is deprecated in the same source) per RFC 3986.
The use of URN for a specific URI scheme that provides names, for which there is a global registry of namespaces to ensure uniqueness, etc. (which is what the article discusses) is in no way deprecated. The author seems to be ill-informed on this point, which apparently is the only stated reason for not using the internet standard that directly applies to the use case.
- dragonwriter 9 years ago
  
  It occurs to me that there is another possible interpretation: the reference to URN being deprecated may have been a distorted reference to the fact that the IETF has identified a number of issues with the URN spec, some existing registrations, and other related matters, and has an active workgroup and working draft on an updated spec. It's hard to tell if that's what was intended, though, since the dismissal was so terse; if so, a discussion of the issues with the existing URN spec that are specifically problematic for the use in question would be nice, as would more description of why you aren't using actual URIs with a custom scheme rather than pseudo-URIs (since real URIs, whether URNs or not, mean that tools supporting the standards can be used, rather than building custom tooling and libraries for your almost-but-not-quite-URI setup).
  Whatever the URN reference is intended to mean, this seems to be custom-over-standard with less clear justification than I would want for that choice.
  - pcalcadoOP 9 years ago
    
    Author here.
    The deprecation comment refers to https://tools.ietf.org/html/rfc3986#section-1.1.3 (we had been using URNs and URLs the old way).
    But in any case, the fact that we're having this conversation and I had to dig up an RFC from 2005 reflects the actual reason for not following any specific standard: I perceive them and their specifications to be confusing and full of historical context that has changed over time. Assuming that there are no other benefits to using the URN scheme specifically (maybe there are and I am not aware of them?), I'd rather use a simplified URI and custom schemes.
- DorothySim 9 years ago
  
  > The author seems to be ill-informed on the point which apparently is the only stated reason for not using the internet standard that directly applies to the use case.
  That's what I also suspected. Thanks!

awelynant 9 years ago

Curious if tag URI was considered https://en.m.wikipedia.org/wiki/Tag_URI_scheme

DorothySim 9 years ago

Is there a benefit of using tag URI instead of a regular old URL? E.g. tag:blogger.com,1999:blog-555 vs https://blogger.com/1999/blog-555 The only difference I see is that URL should point to something (can be referenced in a browser) which may or may not be an additional benefit.
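
For reference, a tag URI has the shape `tag:<authority>,<date>:<specific>`, optionally followed by `#<fragment>`. A simplified parser, assuming that shape and not attempting full validation of the authority (domain name or email address), could look like this; `TagURI` and `parse_tag` are hypothetical names.

```python
import re
from typing import NamedTuple, Optional


class TagURI(NamedTuple):
    authority: str           # domain name or email address
    date: str                # YYYY, YYYY-MM, or YYYY-MM-DD
    specific: str            # identifier chosen by the authority
    fragment: Optional[str]  # text after '#', if any


# Simplified pattern: does not check that the authority is a well-formed
# domain or email address, only that it contains no ',' or ':'.
TAG_RE = re.compile(
    r"^tag:(?P<authority>[^,:]+),"
    r"(?P<date>\d{4}(?:-\d{2}(?:-\d{2})?)?):"
    r"(?P<specific>[^#]*)"
    r"(?:#(?P<fragment>.*))?$"
)


def parse_tag(uri: str) -> TagURI:
    """Split a tag URI into its parts, raising ValueError if it does not match."""
    match = TAG_RE.match(uri)
    if match is None:
        raise ValueError(f"not a tag URI: {uri!r}")
    return TagURI(**match.groupdict())
```

Applied to the example above, `parse_tag("tag:blogger.com,1999:blog-555")` separates the authority `blogger.com`, the date `1999`, and the specific part `blog-555`.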
- pacaro 9 years ago
  
  I worked on a project that used "regular old URLs" just like you suggest, for contract and service identifiers, which needed to be readable, writable, and generatable by humans.
  Tag URIs would have been better because:
  a) not everyone owns a domain, but tags allow email address as authority
  b) it's confusing to many people to overload http URIs this way
  c) as a contract identifier the URI doesn't need to point to anything, but this creates cognitive dissonance — this is probably part of b)
  d) too damn long — tag URIs might suffer from this too. We were using these all over the place and there's no good way to truncate them
  - DorothySim 9 years ago
    
    a) is particularly interesting to me. I thought about giving people ability to create their own namespaces and used https://user.example.com or https://example.com/user as a namespace but tag URI looks cleaner.
    By the way why did you need human readable IDs? I'm asking out of curiosity because there is certain charm to just using UUIDs everywhere (and urn:uuid).
    
    pacaro 9 years ago
    
    In some places, where only machines would interact with them, we used urn:uuid. But we also had UI, code, log files, etc., where developers needed to interact with them too.
- dragonwriter 9 years ago
  
  > Is there a benefit of using tag URI instead of a regular old URL?
  You don't need to control a domain name to issue tags.
  Tags issued under an authority (domain name or not) are associated with a time, so they remain valid and unique even if you later abandon or lose the domain name (or whatever else provided the authority).
  Because tag URIs are explicitly not a scheme that provides location information, the resource does not have to be accessible by a particular protocol for the tag to be meaningful and accurate, and the URI doesn't communicate false expectations to readers familiar with the scheme when the resource is not accessible. (And, for similar reasons, tags don't risk conflicting with actual locators in the future.)

zeveb 9 years ago

That looks like exactly the right solution to the author's problem.
