# CPU Simulation Instrument

> Learn how to use CodSpeed's CPU simulation instrument for consistent, hardware-agnostic performance measurement in your benchmarks

export const CIWorkflow = ({minimal = false, enableWorkflowDispatch = true, runsOn = "ubuntu-latest", highlight = [], mode, modes, submodules = false, preSteps = [], buildSteps = ["# ...", "# Setup your environment here:", "#  - Configure your Python/Rust/Node version", "#  - Install your dependencies", "#  - Build your benchmarks (if using a compiled language)", "# ..."], benchmarkCommand = ["<Insert your benchmark command here>"], jobName = "Run benchmarks", env = {}}) => {
  const modeList = modes || (mode ? [mode] : undefined);
  if (!modeList || modeList.length === 0) {
    throw new Error("mode or modes is required");
  }
  const indent = (lines, depth) => {
    const reindentedLines = lines.map(l => l.length === 0 ? l : (" ").repeat(depth) + l);
    return reindentedLines.join("\n");
  };
  const workflowDispatchSection = enableWorkflowDispatch ? "  # `workflow_dispatch` allows CodSpeed to trigger backtest\n" + "  # performance analysis in order to generate initial data.\n" + "  workflow_dispatch:\n" : "";
  let yaml = "";
  if (!minimal) {
    yaml += `
name: CodSpeed Benchmarks

on:
  push:
    branches:
      - "main" # or "master"
  pull_request:
`;
    yaml += workflowDispatchSection;
  }
  yaml += `
jobs:
  benchmarks:
    name: ${jobName}
    runs-on: ${runsOn}`;
  if (!minimal) {
    yaml += `
    permissions: # optional for public repositories
      contents: read # required for actions/checkout
      id-token: write # required for OIDC authentication with CodSpeed`;
  }
  if (preSteps.length > 0) yaml += "\n" + indent(preSteps, 4);
  yaml += `
    steps:
      - uses: actions/checkout@v5`;
  if (submodules) {
    const value = typeof submodules === "string" ? submodules : "true";
    yaml += `\n        with:\n          submodules: ${value}`;
  }
  yaml += "\n" + indent(buildSteps, 6);
  const modeValue = modeList.join(",");
  yaml += `
      - name: Run the benchmarks
        uses: CodSpeedHQ/action@v4
        with:
          mode: ${modeValue}`;
  if (benchmarkCommand.length > 0) {
    const indentedBenchCommand = benchmarkCommand.length > 1 ? benchmarkCommand[0] + "\n" + indent(benchmarkCommand.slice(1), 12) : benchmarkCommand[0];
    const runLine = indent(["run: "], 10) + indentedBenchCommand;
    yaml += `\n${runLine}`;
  }
  const envEntries = Object.entries(env);
  if (envEntries.length > 0) {
    const envLines = ["env:", ...envEntries.map(([k, v]) => `  ${k}: ${v}`)];
    yaml += "\n" + indent(envLines, 8);
  }
  return <CodeBlock language="yaml" highlight={JSON.stringify(highlight)} {...minimal || ({
    filename: ".github/workflows/codspeed.yml",
    icon: "github"
  })}>
      {yaml}
    </CodeBlock>;
};

## What is CPU Simulation?

CodSpeed instruments your benchmarks and measures the performance of your code
by simulating CPU behavior. Each benchmark is run **only once** under the
simulation. This keeps the measurement as accurate as possible, because it
accounts not only for the instructions executed but also for cache and memory
access patterns. The simulation yields an equivalent of CPU cycles that
includes the cost of cache and memory accesses.

### Estimating Cycles

The CPU simulation takes into account the following factors:

1. **Executed instructions** (`Ir`): the baseline cost of your code
2. **L1 cache misses**: data that must be fetched from the L2/L3 cache, which
   can take 10-40 cycles
3. **LL (Last-level) cache misses**: data that must be fetched from RAM, which
   can take 100+ cycles

The total number of cycles is calculated like this:

$$
\text{cycles} \approx \text{Ir} + (\text{L1 Misses} \times \text{L2/L3 Cost}) + (\text{LL Misses} \times \text{RAM Cost})
$$
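
As a rough illustration of the formula above, here is a minimal Python sketch
that computes a cycle estimate from the three counters. The per-miss costs
(`L1_MISS_COST`, `LL_MISS_COST`) are assumed round numbers picked from the
ranges quoted above, not CodSpeed's actual constants, and `estimate_cycles` is
a hypothetical helper, not part of any CodSpeed API.

```python
# Assumed per-miss penalties, chosen from the ranges quoted above.
# These are illustrative values, not CodSpeed's actual constants.
L1_MISS_COST = 10    # cycles to fetch from L2/L3 after an L1 miss
LL_MISS_COST = 100   # cycles to fetch from RAM after a last-level miss


def estimate_cycles(ir: int, l1_misses: int, ll_misses: int) -> int:
    """Estimate CPU cycles from instruction and cache-miss counts."""
    return ir + l1_misses * L1_MISS_COST + ll_misses * LL_MISS_COST


# Hypothetical counters for a single benchmark run.
cycles = estimate_cycles(ir=1_000_000, l1_misses=5_000, ll_misses=200)
print(cycles)  # 1070000
```

With these assumed costs, the 200 last-level misses cost as much as 20,000
instructions, which is why a small change in memory access patterns can move
the estimate noticeably.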

### Converting Cycles to Time

Once we have the number of cycles for a benchmark, we transform it into an
execution time measurement by using the following formula, where `FREQUENCY` is
a constant set to the clock frequency (number of cycles per second) of a real
CPU:

$$
\text{execution\_time} = \frac{\text{cycles}}{\text{FREQUENCY}}
$$

We then calculate the **execution speed** of the benchmark by taking the inverse
of the execution time:

$$
\text{speed} = \frac{1}{\text{execution\_time}}
$$

This is the displayed metric in the CodSpeed reports.
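
The two conversions above can be sketched in a few lines of Python. The value
of `FREQUENCY` here is an assumed 3 GHz clock chosen for illustration (the
actual constant used by CodSpeed is not stated on this page), and the cycle
counts are hypothetical.

```python
# Assumed clock frequency in cycles per second (3 GHz); illustrative only.
FREQUENCY = 3_000_000_000


def execution_time(cycles: int, frequency: int = FREQUENCY) -> float:
    """Convert a simulated cycle count into seconds."""
    return cycles / frequency


def speed(cycles: int, frequency: int = FREQUENCY) -> float:
    """Execution speed is the inverse of the execution time."""
    return 1 / execution_time(cycles, frequency)


baseline = 3_000_000   # hypothetical cycle estimate before a change
optimized = 1_500_000  # hypothetical cycle estimate after a change

print(execution_time(baseline))            # about 0.001 seconds
print(speed(optimized) / speed(baseline))  # about 2.0: half the cycles, twice the speed
```

Because `FREQUENCY` is a constant, it cancels out when comparing two runs: the
speed ratio depends only on the ratio of the cycle estimates.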

<Note>
  **Why choose execution speed over execution time?**

  A performance improvement increases a benchmark's execution speed, and a
  performance regression decreases it. If execution time were used instead, a
  performance improvement would show up as a decrease in the metric, which
  would be counter-intuitive.
</Note>

### System Calls

System calls play a critical role in the performance of software, but they
present unique challenges for accurate measurement. **A system call is a request
made by a program to the operating system's kernel**, typically for I/O
operations such as reading from or writing to files, communicating over a
network, or interacting with hardware devices.

Due to their nature, **system calls introduce variability in execution time**.
This variability is influenced by several factors, including system load,
network latency, and disk I/O performance. As a result, the execution time of
system calls can fluctuate significantly, making them the most inconsistent part
of a program's execution time.

To ensure that our execution speed measurements are as stable and reliable as
possible, **CodSpeed CPU Simulation mode does not include system calls in the
measurement**. Instead, we focus solely on the code executed within user space
(the code you wrote), excluding any time spent in system calls. This approach
allows us to provide a clear and consistent metric for the execution speed of
your code, independent of your hardware and the variability it can introduce.

<Tip>
  **Walltime measurement with CodSpeed Macro Runners**

  If the code you wish to optimize and measure relies heavily on system calls,
  you can use CodSpeed Macro Runners combined with our Walltime instrument.
  You can find more information in the [Walltime](/instruments/walltime) section
  of the docs.
</Tip>

Still, **the wall time spent on system calls is recorded and this data is
available in the trace view**, providing insight into how much time is consumed
by system interactions. While these times are not included in the overall
execution speed metric, they offer valuable information for performance
analysis.
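
To make the user-space versus system-call split concrete, here is a minimal
Python sketch of a function that mixes both kinds of work. The function names
and the input file are hypothetical, and the comments describe the behavior
explained in this section rather than output produced by CodSpeed.

```python
import os
import tempfile


def parse_numbers(text: str) -> int:
    """CPU-bound work with no I/O: this runs in user space."""
    return sum(int(line) for line in text.splitlines() if line)


def load_and_parse(path: str) -> int:
    """Mixes file I/O and computation in one function."""
    # open() and read() issue system calls; the time spent inside the
    # kernel for them is left out of the execution speed metric.
    with open(path, encoding="utf-8") as f:
        text = f.read()
    # The parsing itself is user-space code and counts towards the metric.
    return parse_numbers(text)


# Hypothetical input file created just for this example.
with tempfile.NamedTemporaryFile("w", suffix=".txt", delete=False) as tmp:
    tmp.write("1\n2\n3\n")

print(load_and_parse(tmp.name))  # 6
os.remove(tmp.name)
```

If a benchmark like `load_and_parse` is dominated by the file read rather than
the parsing, the Walltime instrument is likely the better fit.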

<Info>
  **Roadmap for system calls**

  In the future, we plan to enhance CodSpeed by emulating system calls. This will
  allow us to more accurately anticipate the performance impact of system calls,
  further improving the reliability and comprehensiveness of our performance
  measurements.
</Info>

## Legacy Terminology

Previously, this instrument was referred to as "instrumentation" or
"instrumentation mode".

This terminology is being phased out in favor of "CPU simulation" to better
reflect what the instrument does: simulating CPU behavior. While the old
`instrumentation` value is still accepted for backward compatibility, it will be
removed in a future release.

We recommend updating your configuration and your integration library to use
`simulation` instead.

<Note>
  Going forward, what we will refer to as **"instrumentation"** represents the
  generic overlay that CodSpeed applies to your benchmarks to collect
  performance data and profiling information.

  It applies to both CPU Simulation and Walltime instruments.
</Note>

## Automated Profiling

When using the CPU simulation instrument, CodSpeed automatically collects
[profiling data and generates flame graphs](/features/profiling) for each
benchmark when available. This allows you to quickly identify performance
bottlenecks and their root causes.

### Pre-requisites

To enable profiling with the CPU simulation instrument, ensure you meet the
following minimum version requirements:

* **Node.js**: Node 16 or higher, and one of the following integrations at the
  listed minimum version:
  * [`@codspeed/vitest-plugin>=2.3.1`](https://github.com/CodSpeedHQ/codspeed-node/releases/tag/v2.3.1)
  * [`@codspeed/tinybench-plugin>=2.2.0`](https://github.com/CodSpeedHQ/codspeed-node/releases/tag/v2.2.0)
  * [`@codspeed/benchmark.js-plugin>=2.2.0`](https://github.com/CodSpeedHQ/codspeed-node/releases/tag/v2.2.0)
* **Python**: Python 3.12 or higher and
  [`pytest-codspeed>=2.0.0`](https://github.com/CodSpeedHQ/pytest-codspeed/releases/tag/v2.0.0)
* **Rust**: Profiling is enabled by default with any version of the integration
  library
* **C++**: Profiling is enabled by default with any version of the integration
  library

### Inspector Metrics

When you hover over a span in the flame graph, the inspector displays CPU
simulation-specific metrics:

<img src="https://mintcdn.com/codspeed/J_9QffCKgr3Hbs6D/instruments/cpu/assets/tooltip-explanation.excalidraw.png?fit=max&auto=format&n=J_9QffCKgr3Hbs6D&q=85&s=207eebeeb34900a7d27eed74c14c9b20" alt="Flamegraph inspector" width="2383" height="1357" data-path="instruments/cpu/assets/tooltip-explanation.excalidraw.png" />

* **Self time**: The simulated execution time spent in the function body only,
  excluding time spent in child function calls.

* **Total time**: The simulated execution time spent in the function including
  all its children.

The time bars are broken down into components that show what is limiting
progress:

* **Instructions**: Time spent executing CPU instructions.
* **Cache**: Time spent due to CPU cache misses (L1, L2, L3).
* **Memory**: Time spent waiting for main memory access.

This breakdown helps you identify whether a function is instruction-bound,
cache-bound, or memory-bound, guiding your optimization efforts.
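
One way to reason about this breakdown is to ask which component accounts for
the largest share of a span's time. The sketch below assumes you have read the
three values off the inspector by hand; `dominant_component` is a hypothetical
helper, not a CodSpeed API, and the numbers are made up.

```python
def dominant_component(instructions: float, cache: float, memory: float) -> str:
    """Return which component accounts for the largest share of a span's time."""
    components = {
        "instruction-bound": instructions,
        "cache-bound": cache,
        "memory-bound": memory,
    }
    return max(components, key=components.get)


# Hypothetical breakdown of one span's self time, in microseconds.
print(dominant_component(instructions=120.0, cache=310.0, memory=45.0))
# cache-bound
```

A cache-bound result like this one would suggest looking at data layout and
access order before trying to reduce the instruction count.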

## Usage with GitHub Actions

To enable CPU Simulation in your GitHub Actions workflow, ensure you are using
`mode: simulation` in the CodSpeed Action configuration.

<CIWorkflow minimal mode="simulation" highlight={[19]} />

## Next Steps

<CardGroup cols={2}>
  <Card title="Profiling" href="/features/profiling" icon="bars-sort">
    Learn how to read flame graphs and use profiling to optimize your code
  </Card>

  <Card title="Unexpected Regressions" href="/instruments/cpu/regression-causes" icon="triangle-exclamation">
    Understand why benchmarks can regress without code changes
  </Card>

  <Card title="Walltime Instrument" href="/instruments/walltime" icon="stopwatch">
    Measure real-world execution time including system calls and I/O
  </Card>
</CardGroup>
