Don’t push ifs up, put them as close to the source of data as possible

In late 2023, Alex Kladov published Push ifs up and fors down which essentially says the following (correct) statement:

If there’s an if condition inside a function, consider if it could be moved to the caller instead

Great advice, and I thoroughly agree. I’d even make a stronger argument to encode the check into the type itself. For example, if a number cannot be less than zero by the time it reaches a function, your type system should be able to prove it. But I digress.

The advice from Alex really only considers data that is entering a function, and not data that is exiting. If your function fetches data in some way, via a call to a database or some other external service, then you want to push your if checks down, not up.

Consider the following contrived example that makes a GET call over HTTP to presumably get some notion of a Student from a REST service.

public async Task<HttpResponseMessage> FetchStudentById(Guid studentId)
{
    return await _httpClient.GetAsync($"{studentsUrl}/{studentId}");
}

Now every caller that calls this method needs to:

Check that the response indicates success, and
deserialize the data into a stronger type

Now, code like this doesn’t really exist. Programmers often realize this issues ahead of time. The only problem is that they often inadequately address them. Consider this slightly more realistic scenario:

public async Task<Student?> FetchStudentById(Guid studentId)
{
    var response = await _httpClient.GetAsync($"{studentsUrl}/{studentId}");
    // ensure 200 response code
    if (response.StatusCode != StatusCode.OK)
        return null;
    // try to deserialize a type
    try
    {
        var jsonResponse = await response.Content.ReadAsStringAsync();
        return JsonConvert.DeserializeObject<Student>(jsonResponse);
    }
    catch (Exception e) // some deserialization exception
    {
        return null;
    }
}

This is a lot better. We’ve moved our if checks down into this function so they can be done in one place instead of many. Code like this can (and does) exist in perpetuity without a ton of issues. Consider the following questions, though:

Does the caller expect that a student with the provided Id might not exist or are they checking for null everywhere after?
If I wanted to write tests against this endpoint, how could I test success and failure scenarios?
- If we don’t receive a 200 back
- If deserialization fails because of e.g., ill-formatted JSON or even an unexpected schema change

The scenario where an Id wasn’t found is treated the same as all failure scenarios – return null.

Too often people turn to exceptions to differentiate between actual failure and simply a not-found Student:

public async Task<Student?> FetchStudentById(Guid studentId)
{
    var response = await _httpClient.GetAsync($"{studentsUrl}/{studentId}");
    
    // throws (bad)
   response.EnsureSuccessStatusCode();

    // check for no student found: endpoint returns a string that says "null"
    var jsonResponse = await response.Content.ReadAsStringAsync();
    if (jsonResponse == "null")
        return null; 

    // otherwise, try to deserialize, which throws upon failure (bad)
    return JsonConvert.DeserializeObject<Student>(jsonResponse);
}

While we can now write tests against this endpoint to differentiate between no student found and actual failure (good), we’ve created problems for all code that calls FetchStudentById:

Callers still must ask themselves if null means failure or just a missing student, because null is so ubiquitously used for both
- writing documentation that explains that yes, null, really means a missing student only goes so far; developers have to consult your ugly docs to find this out
All callers now need to wrap the call in a try...catch, unless they want the exception to crash the service
- This is sometimes kind of, sort of okay because most services wrap all operations in a top level try-catch which might just log it for developers to see, and then maybe surface a generic error to any end user

The first and easiest thing issue to address is the ambiguity of a null return. You can do this by exposing two functions, one returns null and one does not*:

public async Task<Student?> FetchStudentByIdOptional(Guid studentId)
{
    var response = await _httpClient.GetAsync($"{studentsUrl}/{studentId}");
    
    // throws (bad)
   response.EnsureSuccessStatusCode();

    // check for no student found: endpoint returns a string that says "null"
    var jsonResponse = await response.Content.ReadAsStringAsync();
    if (jsonResponse == "null")
        return null; 

    // otherwise, try to deserialize, which throws upon failure (bad)
    return JsonConvert.DeserializeObject<Student>(jsonResponse);
}

public async Task<Student> FetchStudentById(Guid studentId)
{
    var optionalStudent = FetchStudentByIdOptional(studentId);
    if (optionalStudent == null)
        throw new Exception($"No student with Id {studentId} found");
    return optionalStudent!;
}

(*Note: I’d actually recommend using a type that indicates the possibility of a Student or not, via a Optional type, but it’s not realistic to ask developers to pull in a third party library to support this, or alternatively write their own. If you’re already using a language that has a Maybe monad built in, you probably already came to all these conclusions yourself because the language forced you into it.)

This is better; we’ve strengthened the type of FetchStudentById so that callers know it will never return null, and as result your codebase won’t be littered with unnecessary null checks.

Callers also know that the ...Optional method’s null means no student was found. You might adopt LINQ’s FirstOrDefault or similar terminology to communicate this even better.

However, callers are still faced with the issue of exceptions being thrown.

That is a separate discussion and one that has been deliberated on in type-driven design and functional programming spaces in general.

Finally, I want to briefly return to the notion of testing:

What if I want to write more specific tests around expected response codes or additional information being returned by an endpoint?

It’s not uncommon for endpoint to return wrapped data that indicates metadata about the response, like a list of warnings regarding the request, deprecation, or even information about api versioning

Well, the answer to that is the same as the answer to exceptions; strengthen the type returned by the function, and let callers ignore scenarios they don’t care about!

Conclusion

I’ll leave you with a a restatement of this post’s title:

Push ifs toward the source of data