While unit tests have become a standard for ensuring that code behaves as expected, they focus on checking logic, not performance. Yet performance regressions can be just as dangerous as functional bugs, putting your whole software at risk. This is why performance should be checked early, while testing the code, the icing on the cake being to check it in your CI environment as well.
Benchmarking is a way of testing a block of code's performance. It is like a test case, but for performance: it executes the code and measures how long it takes to run.
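At its simplest, you could hand-roll such a measurement with nothing but the standard library. Here is a rough sketch, with a hypothetical my_fn that just sums a list, although as we'll see, a proper framework handles repetition, statistics and noise far better:
import time

def my_fn(numbers):
    # Hypothetical function under test: sum a list of numbers.
    return sum(numbers)

inputs = list(range(1_000_000))

# Run the code several times and keep the best time to reduce noise.
timings = []
for _ in range(10):
    start = time.perf_counter()
    my_fn(inputs)
    timings.append(time.perf_counter() - start)

print(f"best of 10 runs: {min(timings) * 1000:.2f} ms")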
Let's see how to implement that with pytest. First, install the pytest-codspeed library to enable the benchmark marker and fixture:
pip install pytest-codspeed
Then, you're ready to use the @pytest.mark.benchmark marker for measuring performance. You can use it directly on a single test:
import pytest

@pytest.mark.benchmark
def test_my_fn():
    inputs = gen_inputs()
    results = my_fn(inputs)
    assert results == "expected_result"
But you can also apply it at the module level by using the global pytestmark variable, effectively enabling performance testing on all the tests contained within it:
import pytest
pytestmark = pytest.mark.benchmark
# The rest of the test cases are now performance tests as well
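For instance, a benchmark test module could look like the following sketch, where parse_numbers is a hypothetical function standing in for your own code:
import pytest

# Every test collected from this module is also measured as a benchmark.
pytestmark = pytest.mark.benchmark

# Placeholder implementation so the sketch is self-contained; in a real
# project this would be imported from your own package.
def parse_numbers(text):
    return [int(part) for part in text.split(",")]

def test_parse_small():
    assert parse_numbers("1,2,3") == [1, 2, 3]

def test_parse_large():
    assert parse_numbers(",".join(["7"] * 10_000)) == [7] * 10_000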
You can then run the performance tests locally to ensure that everything works:
pytest tests/ --codspeed
While a good starting point, isolated local runs do not work for long-term performance tracking, and the risk of missing important performance changes is just too high. Regressions surface when you least expect them, so it's essential to automate these checks in your CI pipeline. This ensures that any performance degradation is caught automatically during the development cycle and builds a history of your codebase's performance.
Using the CodSpeed test runner makes the measurements extremely steady. Our runner relies on CPU simulation, which lets us isolate the precious workload you want to measure from noisy neighbours (other VMs, workloads, users).
A typical setup with the runner in GitHub Actions would be as simple as:
- uses: CodSpeedHQ/action@v3
  with:
    run: pytest tests/ --codspeed
This setup not only runs your tests but also uploads the results to CodSpeed, where you can track performance over time.
Sometimes, you want more granularity in what is measured. For example, you may not want to measure the time it takes to generate the inputs for the function or to run the assertions after getting the result, and instead focus only on the actual function call.
We can modify the unit test:
def test_my_fn(benchmark):
    inputs = gen_inputs()
    results = benchmark(my_fn, inputs)
    assert results == "expected_result"
This test uses the benchmark fixture to only measure the execution time of my_fn. The fixture makes it easy to focus on what matters: how long it takes your function to execute under test conditions. Using the benchmark fixture will automatically mark the test as a benchmark, without having to use the pytest.mark.benchmark marker.
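As an illustration, here is a sketch with a hypothetical JSON parsing workload: building the large payload happens outside the measured call, so only json.loads itself is timed, while the setup and the assertion stay out of the measurement:
import json

def test_parse_large_payload(benchmark):
    # Expensive setup: build a large JSON document. This part is not measured.
    payload = json.dumps({"values": list(range(100_000))})

    # Only this call is timed by the benchmark fixture.
    result = benchmark(json.loads, payload)

    # Regular assertions still run, outside the measured region.
    assert len(result["values"]) == 100_000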
When you encounter a performance regression, your next mission (if you accept it 🤵) is often to investigate how the issue was introduced in the first place: what changed, who changed it, when, and why.
This is where differential profiling comes in handy: it lets you compare two execution profiles to find exactly what changed between two separate measurements.
The good news is that CodSpeed automatically profiles your benchmarks' code while measuring performance, so if you spot a regression, you already have all the data you need to investigate.
Integrating performance testing into your development process with tools like pytest and CodSpeed fosters a culture of continuous improvement. It ensures that performance considerations are never an afterthought but a key component of your software development lifecycle from the ground up.
To see CodSpeed in action, you can check out open-source repositories using the tool on the explore page. A lot of them are actually using the pytest integration we just talked about, like pydantic and polars.
Last but certainly not least, shout out to patrick91, who pioneered this use case and whose contributions have made it significantly easier for developers to incorporate benchmarking into their existing unit tests.