In this article, I will introduce the idea of “locking” the output, essentially asserting that data has not changed since the last test run. This is implemented in Pytest-Locker, a package that “locks” data and integrates with pytest. While this goes against the best practices of TDD, it is time-effective, provides clarity, and is very effective in text- or data-heavy scenarios — especially with volatile interfaces.
When developing code, people are always searching for the right balance between writing tests and writing code. Additionally, just getting your code coverage up is not the entire story. It is also important that the tests fail if any unexpected behavior occurs.
The Struggle: When Testing Is Tedious
For years, I have struggled to find the right depth of testing in two domains:
- Systems that involve a lot of text (e.g. chatbots).
- Systems with interfaces that share data in nested infrastructures.
This becomes even worse if one or both of the following applies:
- It is more likely that the requirements change than that the code breaks due to a developer’s error, which makes the time invested in testing harder to justify.
- You are interfacing with external services that you do not want to hit during testing (e.g. because you are charged for use).
In these scenarios, the main issue is that you need a lot of tests with an exceptional number of assertions to cover all the behavior.
Inspiration: Towards a Solution
Inspiration for a solution hit me when Brian Okken (from Python Bytes and Test & Code) said that his experience with testing often consists of simply running code and copying the code’s output into an assertion statement, which then becomes the test for the new code. With this workflow, you are effectively locking in behavior.
This goes against the Red/Green/Refactor mantra of TDD. But it is, in effect, what many software engineers do. Furthermore, while it doesn’t make sense when you have to implement a well-defined contract, it does make sense when you are trying stuff until “it works.” In the “it works” scenario, you just want to prevent it from not working anymore.
This try-copy-assert method still has some room for improvement, though:
- It would be great if it was clear to anyone reading the test that this pattern was used, indicating that the test is not based on a contract per se and that an assertion error could be interpreted as a notification of changed behavior.
- It still requires the manual steps of running, copying, and inserting the output into a test.
- In the scenarios mentioned above, you suddenly end up with a lot of tests or a nested dictionary cluttering up your test code.
The Solution: Pytest-Locker
My solution to automate the try-copy-assert method and alleviate some of its drawbacks is Pytest-Locker. The package is fairly simple: it exposes one fixture called locker, which, in turn, exposes a single method called lock.
Usage
- Add from pytest_locker import locker to your conftest.py file. This ensures that the locker fixture is available everywhere in your testing code. You can also just import it in the places where you use it.
- Call locker.lock on the data that you want to assert.
- Run pytest with -s or --capture=no. Just running pytest will result in a failure if the object to be locked has not been locked yet. -s allows Pytest-Locker to prompt the user to allow or deny the assertion of the test. When allowed, this result is stored, and -s will not be required until behavior changes.
- Commit the locked files (stored in .pytest-locker/). Not only is this required for the tests in CI/CD to succeed, but it is also a powerful way to check the impact of changes in the code’s behavior during code reviews.
One of the nice things about Pytest-Locker is that it doesn’t just fail if the given string doesn’t match the stored string. It shows a diff and, if possible, asks the user if it should accept the new string.
An example
In the scenario below, we have a template that we want to fill and some test code that fills it. We want to check the rendered template for the given input. In this example, the locker.lock(result) line results in pytest asserting that result matches the content of test.unit.test_template.1.txt. If that file does not exist yet, pytest will prompt the user to accept or deny result and create the file based on the response.
The example above is very simple, but Pytest-Locker has been invaluable for code with more complex template-selection and filling logic (e.g. chatbots).
Note: A more elaborate example can be found on GitHub.
Another scenario
Another scenario where I found Pytest-Locker exceptionally useful is when I have to call external systems using APIs where various fields have to be set simultaneously. Of course, you do not want to call these APIs while testing, but you do want to test the code that uses them. What I do instead is patch the method that calls the API using MagicMock and lock the parameters that would be sent to the API.
This way, I might not ensure that the API call would result in the desired behavior, but I do ensure that the API is called in an expected manner.
Additional benefits
Pytest-Locker brings a lot of clarity when using the Given-When-Then style of testing (aka Arrange-Act-Assert). In my experience, the “When” part of the test is usually a single line, and with Pytest-Locker, the “Then” part of the test will also often be a single line. This means most of your test is just setting up the initial state (the “Given” in the GWT model).
This means that arguably the most important part of the test, the “When,” is no longer hidden between two long chunks of code. It can be easily found as the second-last line of your test.
Drawbacks
- Code needs to be deterministic: While this can be mitigated by setting seeds in many scenarios, this might not always be possible and might be a larger hassle than writing standard assertions.
- You have to serialize the object that you want to store to a string. This is usually quite simple in Python. Better still, if you do this often, you can create another class that inherits from Locker, with a corresponding fixture that does the serialization automatically. However, this is still a problem, and I’ll address it further in the next section.
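As a sketch of that idea (the JsonLocker class and the json_locker fixture name are my own invention; instead of subclassing Locker, this wraps the existing locker fixture, which achieves the same effect):

```python
import json


class JsonLocker:
    """Wraps the plain locker so tests can lock arbitrary objects.

    Serializes deterministically to JSON first, then defers to
    pytest-locker's string-based lock().
    """

    def __init__(self, locker):
        self._locker = locker

    def lock(self, obj) -> None:
        # sort_keys keeps dict key order stable between runs, so the
        # stored lock file only changes when the data itself changes
        self._locker.lock(json.dumps(obj, indent=2, sort_keys=True))


# In conftest.py, you would expose it as a fixture next to the plain one:
#
#   @pytest.fixture
#   def json_locker(locker):
#       return JsonLocker(locker)
```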
Things I’d Like to Add
There is still room for improvement. Most of it comes down to better integration with pytest.
For instance, it would be great if you did not have to serialize the object manually. However, automatic serialization is only half the problem. After the object is serialized, you still want a clean, readable diff between the given object and the stored object. This is something pytest does very well when assertions fail. However, we want to see the diff before the test fails so that the user can accept the changes. As of yet, I have not found a way to do that.
Additionally, it would be nice to first see which tests fail, succeed, and still need attention (by accepting or denying changes to the locks) before the changes are reviewed. I do think this is possible in pytest, but it seems like a large, non-trivial task (contributions are welcome).