While unit tests have become a standard for ensuring that code behaves as expected, they focus on checking logic, not performance. Yet performance regressions can be just as dangerous as functional bugs, putting your whole software at risk. This is why performance should be checked early, while testing the code, the icing on the cake being to check it in your CI environment as well.
Benchmarking is a way of testing a block of code's performance. It is like a test case, but for performance: it executes the code and measures how long it takes to run.
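At its simplest, you could hand-roll such a measurement with nothing but the standard library. Here is a rough sketch, with a hypothetical my_fn that just sums a list, although as we'll see, a proper framework handles repetition, statistics and noise far better:
import time

def my_fn(numbers):
    # Hypothetical function under test: sum a list of numbers.
    return sum(numbers)

inputs = list(range(1_000_000))

# Run the code several times and keep the best time to reduce noise.
timings = []
for _ in range(10):
    start = time.perf_counter()
    my_fn(inputs)
    timings.append(time.perf_counter() - start)

print(f"best of 10 runs: {min(timings) * 1000:.2f} ms")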
Let's see how to implement that with pytest. First, install the pytest-codspeed library to enable the benchmark marker and fixture:
pip install pytest-codspeed
Then, you're ready to use the @pytest.mark.benchmark marker for measuring performance. You can use it directly on a single test:
import pytest

@pytest.mark.benchmark
def test_my_fn():
    inputs = gen_inputs()
    results = my_fn(inputs)
    assert results == "expected_result"
But you can also apply it at the module level by using the global pytestmark variable, effectively enabling performance testing on all the tests contained within it:
import pytest
pytestmark = pytest.mark.benchmark
# The rest of the test cases are now performance tests as well
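For instance, a benchmark test module could look like the following sketch, where parse_numbers is a hypothetical function standing in for your own code:
import pytest

# Every test collected from this module is also measured as a benchmark.
pytestmark = pytest.mark.benchmark

# Placeholder implementation so the sketch is self-contained; in a real
# project this would be imported from your own package.
def parse_numbers(text):
    return [int(part) for part in text.split(",")]

def test_parse_small():
    assert parse_numbers("1,2,3") == [1, 2, 3]

def test_parse_large():
    assert parse_numbers(",".join(["7"] * 10_000)) == [7] * 10_000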
You can then run the performance tests locally to ensure that everything works:
pytest tests/ --codspeed
While a good starting point, isolated local runs do not work for long-term performance tracking, and the risk of missing important performance changes is just too high. Regressions surface when you least expect them, so it's essential to automate these checks in your CI pipeline. This ensures that any performance degradation is caught automatically during the development cycle and builds a history of your codebase's performance.
Using the CodSpeed test runner makes the measurements extremely steady. Our runner relies on CPU simulation, which lets us isolate the precious workload you want to measure from noisy neighbours (other VMs, workloads, users).
A typical setup with the runner in GitHub Actions would be as simple as:
- uses: CodSpeedHQ/action@v3
  with:
    run: pytest tests/ --codspeed
This setup not only runs your tests but also uploads the results to CodSpeed, where you can track performance over time.
Sometimes, you want more granularity in what is measured. For example, you may not want to measure the time it takes to generate the inputs for the function or to run the assertions after getting the result, and instead focus only on the actual function call.
We can modify the unit test:
def test_my_fn(benchmark):
    inputs = gen_inputs()
    results = benchmark(my_fn, inputs)
    assert results == "expected_result"
This test uses the benchmark fixture to only measure the execution time of my_fn. The fixture makes it easy to focus on what matters: how long it takes your function to execute under test conditions. Using the benchmark fixture will automatically mark the test as a benchmark, without having to use the pytest.mark.benchmark marker.
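As an illustration, here is a sketch with a hypothetical JSON parsing workload: building the large payload happens outside the measured call, so only json.loads itself is timed, while the setup and the assertion stay out of the measurement:
import json

def test_parse_large_payload(benchmark):
    # Expensive setup: build a large JSON document. This part is not measured.
    payload = json.dumps({"values": list(range(100_000))})

    # Only this call is timed by the benchmark fixture.
    result = benchmark(json.loads, payload)

    # Regular assertions still run, outside the measured region.
    assert len(result["values"]) == 100_000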
When you encounter a performance regression, your next mission (if you accept it 🤵) is often to investigate how the issue was introduced in the first place: what changed, who changed it, when, and why.
This is where differential profiling comes in handy: it lets you compare two execution profiles to find exactly what changed between two separate measurements.
The good news is that CodSpeed automatically profiles your benchmarks' code while measuring performance, so if you spot a regression, you already have all the data you need to investigate.
Integrating performance testing into your development process with tools like pytest and CodSpeed fosters a culture of continuous improvement. It ensures that performance considerations are never an afterthought but a key component of your software development lifecycle from the ground up.
To see CodSpeed in action, you can check out open-source repositories using the tool on the explore page. A lot of them are actually using the pytest integration we just talked about, like pydantic and polars.
Last but certainly not least, shout out to patrick91, who pioneered this use case and whose contributions have made it significantly easier for developers to incorporate benchmarking into their existing unit tests.