Zürich, Schweiz
I’ve been thinking about errors in Go quite a bit lately and what has been bothering me about the practice of error design and usage in the community. The critique starts with this code: `if err != nil { return err }`
It’s such a common piece of code that it has become a trope thanks to folks
feeling like the apparent verbosity is an affront to their humanity. The
verbosity’s not what I am going to be writing about today, nor is it
something I have frankly ever given two shits about. What I care more about
is the meaning of that code. Let’s contextualize that code again in func F:
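A minimal sketch of that shape, assuming a hypothetical func G that always fails with a made-up sentinel ErrG (both are placeholders for illustration, not names from any real package):

```go
package main

import (
	"errors"
	"fmt"
)

// ErrG is a hypothetical error that belongs to G's domain.
var ErrG = errors.New("g: something domain specific")

// G stands in for any callee that can fail.
func G() error {
	return ErrG
}

// F hands whatever G returned to its own caller, unchanged.
func F() error {
	if err := G(); err != nil {
		return err
	}
	return nil
}

func main() {
	err := F()
	// F's caller is now holding a value that came from G.
	fmt.Println(err == ErrG) // true
}
```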
What really happens when someone calls func F and func F returns an error?
The answer is not all that complex: the error from func G is propagated to
func F’s caller. This might appear somewhat innocuous at first, but it may
not be all that innocent.
The Problem
Let’s reframe that code in Java[1] to see what I mean. Suppose the code
starts out with a method f that does not yet call method g, so nothing about
g’s exceptions appears in f’s signature.
Then later the method f is to call method g. In order for that to work
correctly in the language[2], the signature of f has to be adapted so that
its throws clause also declares the exception that g throws.
Note: Method f’s throws declaration gained
DomainSpecificGException[3] from method g.
That’s essentially what is happening in the Go code above with func F in the
example, and it invites an important question: is it appropriate for whatever
errors func G returns to also be emitted by func F? The thing is,
there is no single answer to this question. The answer is situationally
specific and will have to be adjudicated each and every time you confront a
situation where an inward error could be propagated outwards.
Sounds a bit annoying, right? Well, it’s not entirely Go’s fault here; this same problem applies in other languages[4], too. So what is this problem, then?
Ultimately it’s about error domains. func G may have a different domain
than func F, and it could be inappropriate for func F to return func G’s
errors unadulterated to its callers.
A good example of this is some layering in an application. Consider the case of a policy engine that determines the privileges of users based on records stored somewhere:
- policy engine: (*policy.Authorizer).MayPerform(ctx context.Context, user string, p Privilege) (bool, error)
- storage engine: (*policystore.Store).LoadGrants(ctx context.Context, user string) ([]Grant, error)
- database library: (*sql.DB).ExecContext(ctx context.Context, query string, args ...any) (Result, error)
We can easily imagine method (*policy.Authorizer).MayPerform calling method
(*policystore.Store).LoadGrants to see what types of permissions have been
granted to the user.
We can also imagine method (*policystore.Store).LoadGrants loading these
permissions from a database using method (*sql.DB).ExecContext.
So let’s imagine that the database library returns an error in that code flow.
Would it be appropriate for the storage engine to return the raw SQL error
from package sql to the policy engine? It’s likely not, because what is
package policy going to do with a specific error returned from package sql
by way of package policystore? Not very much if it cares about maintaining
API stability[5]. Instead, the storage engine’s package policystore might
have its own domain-specific errors into which it translates the local errors
from package sql.
Note: Go tends to be conducive to multiple design disciplines. That said, the community generally values eschewing extraneous abstraction, and I am a big subscriber to that value myself. It’s perfectly reasonable to imagine some Go developers authoring the policy engine without a policy storage engine in between, i.e., interacting directly with SQL.
To me, letting the policy engine accept multiple storage engines sounds like a reasonable design choice and not too much of a premature abstraction. You might, for instance, want an in-memory or empty store for testing or for certain production run modes.
The point of saying this is that what I described is a rhetorical example, not a canonical solution. There are multiple ways of slicing the metaphorical onion.
The Solution
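Let’s try reframing that code in Java again in case it helps illuminate a potential solution to the original problem. Instead of declaring g’s exception in its own throws clause, method f catches it and throws an exception from its own domain.
In this reworking, DomainSpecificGException is recoded to FException. This
same thing can be done in Go: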
With error sentinels, it looks like this:

```go
func F() error {
	err := G()
	switch {
	case errors.Is(err, ErrSomethingDomainSpecificForG):
		return ErrFCondition
	case err == nil:
		return nil
	default:
		return fmt.Errorf("unknown: %v", err)
	}
}
```

With structured error values, it looks like this:

```go
func F() error {
	err := G()
	var errDomainSpecific *SomethingDomainSpecificForGError
	switch {
	case errors.As(err, &errDomainSpecific):
		return ErrFCondition
	case err == nil:
		return nil
	default:
		return fmt.Errorf("unknown: %v", err)
	}
}
```

With opaque errors:

```go
func F() error {
	if err := G(); err != nil {
		return fmt.Errorf("unknown: %v", err)
	}
	return nil
}
```
Now, this invites a fun discussion that is beyond the scope of this article:
- when should we create an error domain?
- when should we convert one error domain to another?
I’ll leave it to you as a practitioner to answer[6] these questions, but the error handling options are effectively these:
- Propagate the error along the call chain: `return err`
- Make the error domain opaque: `return fmt.Errorf("unknown: %v", err)`
- Convert the error from one domain to another: `return ErrSentinel` or `return &MyError{ /* omitted */ }`. This includes possibly interrogating the original error value with errors.Is and errors.As.
There’s no one-size-fits-all approach. Propagating the error along is super convenient for prototyping, but production-grade code must consider what is correct on a case-by-case basis.
Error Wrapping
Now, the astute reader among you may wonder: why has error wrapping not been discussed yet? Well, the truth of the matter is that error wrapping presents essentially the same problems that the original code did:
And that’s exactly what the blog post for error wrapping reminds the reader: a wrapped error becomes part of the outer function’s API.[7]
Closing Words
So, if your API does any of the following, you need to consider whether the returned error is situationally correct for the caller of your API to consume:
- returning the error raw

  ```go
  if err := G(); err != nil {
  	return err
  }
  ```

- wrapping the error with extra text

  ```go
  if err := G(); err != nil {
  	return fmt.Errorf("something: %w", err)
  }
  ```

- wrapping the error inside another error type that implements Unwrap() error and similar

  ```go
  if err := G(); err != nil {
  	return &MyError{
  		Err: err,
  	}
  }
  ```

  where *MyError implements

  ```go
  interface {
  	error
  	Unwrap() error // or Unwrap() []error
  }
  ```
If your API does any of those things above, you need to consider Hyrum’s Law and all of its attendant worries. It’s usually easier to go from an opaque or local sentinel or structured error type to a propagated error value or type (be it wrapped or not) later than it is the other way around.
I hope nobody comes away from this thinking that I am against raw error propagation or error wrapping. The only thing I am against is doing either rotely (thoughtlessly). Just make sure you understand the context in which you are developing and the error contract of the APIs you are using, and document your contract correctly. This is especially critical in APIs that consume or call user-implemented interface values or function values.
The act of maintaining situational awareness and making an active decision while programming is the very nature of software engineering. There are some things you can delegate to an IDE or other facility or tool to help you design or solve a small problem in isolation, but the act of design and considering constraints in totality requires deliberate human choice. Nothing can take that away.