How we used Go 1.18 generics when designing our Identifiers

10 points by DomBlack 4 years ago · 3 comments


jerf 4 years ago

This isn't a bad solution. It's probably what I would do.

However, it's still worth addressing a bit of an issue I see in a lot of developers' heads, which is that they get so stuck on using one solution, even if it doesn't work in their language, that they forget about other ones. Prior to generics, the solution to this wouldn't have been to throw your hands up in the air and cry that there was no way to get the id generation code in one place. The solution would have been:

    func serializeXID(xID xid.ID, prefix string) string {
        return fmt.Sprintf("%s_%s", prefix, xID.String())
    }

    type MySpecificID struct {
        id xid.ID

        // whatever else we need
    }

    func (msID MySpecificID) String() string { return serializeXID(msID.id, "mytype") }

It is perfectly valid, in any language, to simply factor commonly-used bits of functionality out into functions. Go still has plenty of other places where this will be necessary for other reasons. (Go is far from a "functional language", but, at the same time, it kinda borrows that "simple functions are really really important" idea, and even post-generics, it doesn't have a bajillion things that boil down to functions wrapped in some language concept. It mostly just has functions. As the FP languages show, this is still pretty useful.)

As boilerplate goes, if you look at it, it isn't even that much more. Interfaces can still be used to ensure that the correct methods are guaranteed to be implemented. (In practice, such a basic interface can't be missed, and even if it is for some period of time, well, first, you learned something about your program you may not have expected, and second, the compiler will tell you exactly what the problem is.)

In completely other news, 'fmt.Sprintf("%s_%s", x, y)' is a dangerous pattern that I've been bitten by other developers deploying before. Underscores are too common in identifiers. In this particular case one of the identifiers is a nasty string that probably can't be mistaken but it still reflexively makes me nervous to see it, and there will still be some tricky considerations around splitting on the proper underscore if a resource type ends up with an underscore in it. You really ought to either use a character that can't be used in a resourceID... and check that... or escape the prefix so it's unambiguous. You can use any of the many existing mechanisms for doing that, or it can be as simple as replacing backslash with two backslashes and then underscore with backslash underscore. Then it will be possible to unambiguously split the resulting id.

DomBlackOP 4 years ago

I agree; there are entirely different ways to achieve this same goal without generics. I mentioned using a struct and including the data from the struct in the post. However, what I wanted to show within the post is how generics can make this problem easier for us.
> As boilerplate goes, if you look at it, it isn't even that much more. Interfaces can still be used to ensure that the correct methods are guaranteed to be implemented.
Yeap, we implemented many different interfaces on our ID types, and the resource type also implements a couple of interfaces (such as methods to get the Postgres type information). Unfortunately, this means we still need to create receiver methods for each of these interfaces so we can use the ID types with those underlying libraries (such as `JSON` or the `SQL` scanners) which ends up being 14 methods per resource type (and we have quite a number of resource types). We would have ended up code generating these, and generics just saved us that work.
> Underscores are too common in identifiers.
We picked underscores, as they are not included in the base32 encoding of XID. They're also safe to put in URLs without needing to perform any encoding. Our static analysis tools during CI check that all code generation has been completed and prevent the prefixes from having underscores. That means we don't need to worry about escaping as we can always unambiguously split it. However, even if your prefixes allowed underscores, you can always just split the string at the last underscore; as the XID portion of the string cannot contain it.
I'd be interested to know how you were bitten by it in the past?
- jerf 4 years ago
  
  For context, like I said, your solution is good and I probably would use something like it, perhaps I even will in the future. Where my comments come from is that as I read people's comments around using Go generics now that they are here, I'm beginning to wonder if a great deal of the mismatch between the Go community's attitudes towards them and the external community's attitudes towards them is that a lot of people bounced off of X thinking "it just can't be done", when in fact it could, perhaps just not as well, but not necessarily "not as well" as people thought, either.
  I don't have 14 instances, but I do have a few places where I have 6 or 7, and they mostly ended up copy&pasted blocks, and I sweat about that a lot less than I used to because in the end the question isn't whether you end up with repeated code at scale but how. Your mileage may legitimately vary. My larger concern is more trying to figure out how to help people understand how to do things in general, because even if the technique isn't useful enough here it's still the answer to a lot of Go problems. (And, for that matter, a lot of FP problems. They use a lot of "just functions" over there.)
  "I'd be interested to know how you were bitten by it in the past?"
  I mean it as a general warning, because your specific case is fine.
  But I've been bitten multiple times by people thinking this is a great encoding scheme to use in key/value stores where they also use underscores in the values being encoded. Bear in mind the person writing the "%s_%s" may not be the person choosing the other values being used as keys, and may have simply been using the key/value store as a magic box that stored things for years.
  So if someone writes some code with the key "msg_value_by_author"... where exactly does it break down? Even as a human, stripped of context, you don't really know. The real fun is when multiple bits of the code decide "oh, it's just a string, I can parse this myself, we don't need library code" and then of course the bit of the system that sent in "msg_value" writes a workaround for their own parsing, that other bits of the system don't get.
  Is it a mess? Heck yes. Should you not write that mess in the first place? Of course. Has it at times been a persistent thorn in my side anyhow? Oh yes.
  As you have it written now, no, you really shouldn't have a problem, and even if you do, it'll be brief. But it's still good to know you've got code that is, shall we say, "problem adjacent" and it doesn't take much for an intern to come in, see that, and make a "small tweak" that turns out to be a long-term problem. And also, I'd like to put that idea out there for other people so they can also keep an eye on it.
  If you instead consider the format string as an encoding scheme into which you must encode the two values in an unambiguous representation, and it so happens that the XIDs are always unambiguously encoded by performing the null transform on them, you're better off than thinking about them as strings that can always be separated by a random other string safely. Done correctly, this is quite safe and you can jam quite a few strings into something like a Redis key without having to worry about what they may contain.
