Variance Categories
Variance can be separated into categories, which helps in understanding and fixing regressions. The categories include:
- Compiler/Linker variance: Whenever the built binary changes, this can cause code to be executed differently.
- Cache variance: This describes variance caused by different cache behavior. In CI, each benchmark process typically runs once per commit, so cold-cache effects can influence results.
- State-dependent variance: This describes all the variance that is caused
by changing the underlying state of the system.
- Allocator variance: Allocators can execute different code paths depending on their current state. A change in memory fragmentation at an earlier point in time can cause variance in benchmarks that are executed later.
- Environment variance: Variance caused by the runtime environment.
- CPU variance: If code behaves differently based on the CPU, variance can be introduced. This happens in heavily optimized libraries/programs that might try to detect cache sizes, CPU features or the number of CPU cores.
- Kernel variance: Syscalls can cause variance in benchmarks, as the kernel might execute different code paths depending on the current state of the system.
Strategies
One benchmark, one binary
Most of these issues come from multiple benchmarks being written and run in the same binary. Seemingly unrelated changes to the code can cause ripple effects that are hard to track down. To fix this, we can compile each benchmark into its own binary. This removes variance from unrelated changes, as compilers (usually) produce the same binary when given the same input. The only downside to this approach is the increased compilation and linking overhead: for N benchmarks, we have to compile N binaries. We only recommend this approach for micro-benchmarks that exhibit a significant amount of variance.
How to implement in Rust
In Rust, this can be done by adding a feature flag for each benchmark in Cargo.toml, which allows us to compile each benchmark into its own binary.
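A minimal sketch of the feature-flag approach, using only the standard library rather than a specific benchmark framework. The feature names (`bench_sum`, `bench_sort`), the benchmark bodies, and the `measure` and `enabled_benchmarks` helpers are hypothetical and only illustrate the gating. The assumed Cargo.toml entries are shown as comments at the top of the block.

```rust
// Assumed Cargo.toml entries (hypothetical feature names):
//
// [features]
// bench_sum = []
// bench_sort = []
//
// Build and run one binary per benchmark, for example:
//   cargo run --release --features bench_sum
//   cargo run --release --features bench_sort

use std::hint::black_box;
use std::time::{Duration, Instant};

/// Runs `f` `iters` times and returns the total elapsed time.
fn measure<F: FnMut()>(mut f: F, iters: u32) -> Duration {
    let start = Instant::now();
    for _ in 0..iters {
        f();
    }
    start.elapsed()
}

// Only compiled when the `bench_sum` feature is enabled.
#[cfg(feature = "bench_sum")]
fn bench_sum() {
    let total: u64 = (0..black_box(1_000u64)).sum();
    black_box(total);
}

// Only compiled when the `bench_sort` feature is enabled.
#[cfg(feature = "bench_sort")]
fn bench_sort() {
    let mut values: Vec<u32> = (0..1_000).rev().collect();
    values.sort();
    black_box(values);
}

/// Collects the benchmarks that were compiled into this binary.
fn enabled_benchmarks() -> Vec<(&'static str, fn())> {
    #[allow(unused_mut)]
    let mut benches: Vec<(&'static str, fn())> = Vec::new();
    #[cfg(feature = "bench_sum")]
    benches.push(("sum", bench_sum as fn()));
    #[cfg(feature = "bench_sort")]
    benches.push(("sort", bench_sort as fn()));
    benches
}

fn main() {
    for (name, bench) in enabled_benchmarks() {
        let elapsed = measure(bench, 1_000);
        println!("{name}: {elapsed:?} for 1000 iterations");
    }
}
```

Building with exactly one feature enabled means the code of the other benchmarks is never compiled into the binary, so a change to `bench_sort` should not alter the binary built for `bench_sum`. With no feature enabled, the binary contains no benchmarks and prints nothing.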
How to implement in C++
When using C++, we can achieve this by wrapping each BENCHMARK() in a define.
This allows us to conditionally include/exclude benchmarks while building.
We’re actively exploring how to implement this in our integrations. If you have further questions, please reach out to us via Discord or email.