# How to Benchmark Java with JMH?

> Learn how to measure the performance of your Java code using JMH (Java Microbenchmark Harness) by writing and running benchmarks locally and continuously in CI to catch regressions.

## Why JMH?

This guide uses [JMH (Java Microbenchmark Harness)](https://github.com/openjdk/jmh), the standard benchmarking framework for the JVM. JMH is developed as part of the OpenJDK project by the same engineers who build the JVM itself, so it understands JVM internals like JIT compilation, dead code elimination, and constant folding that can silently invalidate naive benchmarks. It handles warmup, fork isolation, and statistical analysis out of the box so you can focus on writing the code you want to measure.

This guide covers [Maven](https://maven.apache.org/) and [Gradle](https://github.com/melix/jmh-gradle-plugin). JMH also works with [SBT](https://github.com/ktoso/sbt-jmh).

## Your First Benchmark

Let's start with the simplest possible JMH benchmark: a single method that measures how fast a recursive Fibonacci function runs.
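For contrast, here is a minimal sketch of the kind of hand-rolled timing loop that JMH replaces. It is plain Java with no JMH dependency, and the class and method names (`NaiveTiming`, `timeOnceNanos`) are hypothetical, chosen only for illustration. Numbers from the first few runs typically differ from later ones as the JIT compiler kicks in, and nothing here isolates runs from each other or summarizes the variance.

```java
public class NaiveTiming {

    // Same recursive Fibonacci used throughout this guide.
    static long fibonacci(int n) {
        if (n <= 1) return n;
        return fibonacci(n - 1) + fibonacci(n - 2);
    }

    // Times one call with System.nanoTime() and returns the elapsed nanoseconds.
    static long timeOnceNanos(int n) {
        long start = System.nanoTime();
        long result = fibonacci(n);
        long elapsed = System.nanoTime() - start;
        // Touch the result so the call is not trivially dead code.
        if (result < 0) throw new IllegalStateException("unexpected negative result");
        return elapsed;
    }

    public static void main(String[] args) {
        // No warmup phase, no fresh JVM per run, no statistics:
        // each printed number mixes JIT compilation effects with the real cost.
        for (int run = 1; run <= 10; run++) {
            System.out.printf("run %2d: %d ns%n", run, timeOnceNanos(30));
        }
    }
}
```

The rest of this guide shows how JMH takes over the warmup, isolation, and statistics that this sketch leaves out.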
### Project Setup

**Maven:** The recommended way to use JMH with Maven is through its archetype, which generates a project pre-configured with the annotation processor and uber-JAR packaging:

```sh title=terminal icon="square-terminal" theme={null}
mvn archetype:generate \
  -DinteractiveMode=false \
  -DarchetypeGroupId=org.openjdk.jmh \
  -DarchetypeArtifactId=jmh-java-benchmark-archetype \
  -DgroupId=com.example \
  -DartifactId=my-benchmarks \
  -Dversion=1.0
```

This creates a `my-benchmarks/` directory. The generated `pom.xml` includes `jmh-core` (the runtime library), `jmh-generator-annprocess` (the annotation processor that generates benchmark harness code at compile time), and `maven-shade-plugin` (which packages everything into a single executable `benchmarks.jar`).

**Gradle:** Create a new project directory and add the [`jmh-gradle-plugin`](https://github.com/melix/jmh-gradle-plugin):

```groovy build.gradle icon="java" theme={null}
plugins {
    id 'java'
    id 'me.champeau.jmh' version '0.7.3'
}

repositories {
    mavenCentral()
}

jmh {
    jmhVersion = '1.37'
}
```

Then create the benchmark source directory:

```sh title=terminal icon="square-terminal" theme={null}
mkdir -p src/jmh/java/com/example
```

The plugin handles the annotation processor and uber-JAR generation automatically.

Do not add `jmh-core` to an existing project without the annotation processor. JMH needs to generate synthetic benchmark code at compile time. The archetype (Maven) and plugin (Gradle) handle this correctly.

### Writing the Benchmark

The archetype generates a stub `MyBenchmark.java` with an empty `@Benchmark` method.
Open `src/main/java/com/example/MyBenchmark.java` (with the Gradle setup, create the file under `src/jmh/java/com/example/` instead) and replace its contents with:

```java src/main/java/com/example/MyBenchmark.java icon="java" theme={null}
package com.example;

import org.openjdk.jmh.annotations.Benchmark;

public class MyBenchmark {

    @Benchmark
    public long fibonacci() {
        return fibonacci(30);
    }

    static long fibonacci(int n) {
        if (n <= 1) return n;
        return fibonacci(n - 1) + fibonacci(n - 2);
    }
}
```

That's it. `@Benchmark` is the only annotation you need. JMH generates the measurement harness around it. The method **returns** its result, which prevents the JVM from eliminating the computation as dead code (more on this in [avoiding common pitfalls](#dead-code-elimination)).

### Building and Running

Build the uber-JAR and run the benchmark.

**Maven:**

```sh title=terminal icon="square-terminal" theme={null}
cd my-benchmarks
mvn clean verify
java -jar target/benchmarks.jar
```

**Gradle:**

```sh title=terminal icon="square-terminal" theme={null}
cd my-benchmarks
./gradlew jmh
```

**This will take about 8 minutes.** JMH defaults are thorough: 5 forked JVMs, each running 5 warmup + 5 measurement iterations of 10 seconds. For a faster first run, add flags to reduce the iteration count.

**Maven:**

```sh title=terminal icon="square-terminal" theme={null}
java -jar target/benchmarks.jar -f 1 -wi 3 -i 5 -w 1 -r 1
```

**Gradle:**

```sh title=terminal icon="square-terminal" theme={null}
./gradlew jmh -Pjmh.fork=1 -Pjmh.warmupIterations=3 -Pjmh.iterations=5 -Pjmh.warmup='1s' -Pjmh.timeOnIteration='1s'
```

These flags and annotations are explained in [Configuring Your Benchmark](#configuring-your-benchmark).
You should see output like this:

```shellsession title=terminal icon="square-terminal" theme={null}
# JMH version: 1.37
# VM version: JDK 17.0.18, OpenJDK 64-Bit Server VM, 17.0.18+8-Debian-1deb12u1
# Warmup: 5 iterations, 10 s each
# Measurement: 5 iterations, 10 s each
# Threads: 1 thread, will synchronize iterations
# Benchmark mode: Throughput, ops/time
# Benchmark: com.example.MyBenchmark.fibonacci

# Run progress: 0.00% complete, ETA 00:08:20
# Fork: 1 of 5
# Warmup Iteration 1: 320.348 ops/s
# Warmup Iteration 2: 321.605 ops/s
# Warmup Iteration 3: 323.393 ops/s
# Warmup Iteration 4: 323.038 ops/s
# Warmup Iteration 5: 321.964 ops/s
Iteration 1: 320.996 ops/s
Iteration 2: 320.143 ops/s
Iteration 3: 323.586 ops/s
Iteration 4: 322.946 ops/s
Iteration 5: 321.108 ops/s

# Run progress: 20.00% complete, ETA 00:06:40
# Fork: 2 of 5
...

Benchmark               Mode  Cnt    Score   Error  Units
MyBenchmark.fibonacci  thrpt   25  320.479 ± 1.013  ops/s
```

Without any configuration, JMH automatically warmed up the JIT compiler across 5 separate JVM processes, collected 25 measurement iterations (5 per fork), and computed a tight 99.9% confidence interval. The default mode is **Throughput** (`thrpt`), measured in operations per second.

**Understanding the results:**

* **Mode**: The benchmark mode (`thrpt` = throughput, operations per second).
* **Cnt**: Total measurement iterations across all forks (5 forks x 5 iterations = 25).
* **Score**: The measured value (higher is better for `thrpt`).
* **Error**: The 99.9% confidence interval margin. The true value lies within `Score ± Error` with 99.9% confidence.
* **Units**: `ops/s` = operations per second.

## Configuring Your Benchmark

The previous benchmark used all JMH defaults. In practice, you want to embed settings into your benchmark class using annotations. This makes benchmarks self-describing and reproducible regardless of how they are invoked.
Update `MyBenchmark.java`:

```java src/main/java/com/example/MyBenchmark.java icon="java" theme={null}
package com.example;

import org.openjdk.jmh.annotations.*;

import java.util.concurrent.TimeUnit;

@BenchmarkMode(Mode.AverageTime)
@OutputTimeUnit(TimeUnit.MILLISECONDS)
@State(Scope.Thread)
@Fork(1)
@Warmup(iterations = 3, time = 1)
@Measurement(iterations = 5, time = 1)
public class MyBenchmark {

    private int n = 30;

    @Benchmark
    public long fibonacci() {
        return fibonacci(n);
    }

    static long fibonacci(int n) {
        if (n <= 1) return n;
        return fibonacci(n - 1) + fibonacci(n - 2);
    }
}
```

Rebuild and run. No flags are needed; everything is in the annotations.

**Maven:**

```sh title=terminal icon="square-terminal" theme={null}
mvn clean verify
java -jar target/benchmarks.jar
```

**Gradle:**

```sh title=terminal icon="square-terminal" theme={null}
./gradlew jmh
```

```shellsession title=terminal icon="square-terminal" theme={null}
# Benchmark mode: Average time, time/op
# Benchmark: com.example.MyBenchmark.fibonacci

# Fork: 1 of 1
# Warmup Iteration 1: 3.117 ms/op
# Warmup Iteration 2: 3.131 ms/op
# Warmup Iteration 3: 3.088 ms/op
Iteration 1: 3.091 ms/op
Iteration 2: 3.097 ms/op
Iteration 3: 3.100 ms/op
Iteration 4: 3.096 ms/op
Iteration 5: 3.097 ms/op

Benchmark              Mode  Cnt  Score   Error  Units
MyBenchmark.fibonacci  avgt    5  3.096 ± 0.012  ms/op
```

The output now shows `avgt` (average time) in `ms/op`. A single fork completed in seconds instead of minutes. Computing `fibonacci(30)` takes about 3.1 milliseconds.

The following sections break down each annotation.

### Benchmark Mode

`@BenchmarkMode` controls what JMH measures. It can be placed on a class (applies to all methods) or on individual methods.

**`Mode.Throughput`**: Measures operations per second. Use this to quantify system capacity and compare throughput across implementations.

**`Mode.AverageTime`**: Measures average time per operation. The general-purpose choice for latency benchmarking when you care about typical performance.
**`Mode.SampleTime`**: Samples individual operation times and reports percentiles (p50, p90, p99, p99.9). Use this to understand tail latency, not just the average. Particularly useful because it reports percentiles directly:

```shellsession title=terminal icon="square-terminal" theme={null}
MyBenchmark.fibonacci           sample  177816    41.340 ± 0.936  ns/op
MyBenchmark.fibonacci:p0.50     sample            38.000          ns/op
MyBenchmark.fibonacci:p0.90     sample            44.000          ns/op
MyBenchmark.fibonacci:p0.99     sample            58.000          ns/op
MyBenchmark.fibonacci:p0.999    sample           279.183          ns/op
MyBenchmark.fibonacci:p0.9999   sample          3199.859          ns/op
```

This reveals that while the median latency is 38 ns, the p99.99 is 3.2 microseconds, an 84x spike. Percentile data like this is invaluable for understanding real-world latency characteristics.

**`Mode.SingleShotTime`**: Measures the time for a single invocation with no warmup. Use this to benchmark cold-start performance and one-shot initialization costs.

You can pass an array to run multiple modes in one benchmark run, e.g., `@BenchmarkMode({Mode.Throughput, Mode.AverageTime})`. Use `Mode.All` to run every mode at once, which is useful for exploratory benchmarking.

### State and Scope

`@State` marks a class as a holder for benchmark data. Without it, you cannot use instance fields in benchmark methods. The `Scope` parameter controls how state is shared:

* `Scope.Thread`: Creates one state instance per thread with no sharing between threads. The default choice for most benchmarks.
* `Scope.Benchmark`: Shares one state instance across all threads. Use this when measuring contention and thread-safety overhead.
* `Scope.Group`: Shares one state instance per thread group. Use this for asymmetric benchmarks (e.g., producer/consumer patterns).
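As a rough plain-Java analogy for the first two scopes (this is not JMH code, and `ScopeAnalogy`, `Counter`, and `distinctInstances` are hypothetical names used only for illustration), the sketch below counts how many distinct state objects a set of worker threads ends up seeing. A shared object plays the role of `Scope.Benchmark`, and a `ThreadLocal` plays the role of `Scope.Thread`. How JMH actually creates and injects state is an implementation detail this sketch does not try to reproduce.

```java
import java.util.Collections;
import java.util.IdentityHashMap;
import java.util.Set;
import java.util.function.Supplier;

public class ScopeAnalogy {

    // Stand-in for a @State class; hypothetical, not a JMH type.
    static class Counter {
        int value;
    }

    // Starts `threads` workers, asks each one for its state object, and returns
    // how many distinct instances were handed out in total.
    static int distinctInstances(int threads, Supplier<Counter> stateForThread) {
        Set<Counter> identitySet =
                Collections.newSetFromMap(new IdentityHashMap<Counter, Boolean>());
        Set<Counter> seen = Collections.synchronizedSet(identitySet);
        Thread[] workers = new Thread[threads];
        for (int i = 0; i < threads; i++) {
            workers[i] = new Thread(() -> seen.add(stateForThread.get()));
            workers[i].start();
        }
        for (Thread worker : workers) {
            try {
                worker.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException("interrupted while waiting for workers", e);
            }
        }
        return seen.size();
    }

    public static void main(String[] args) {
        // Like Scope.Benchmark: every thread receives the same instance.
        Counter shared = new Counter();
        System.out.println("shared:     " + distinctInstances(4, () -> shared));   // 1

        // Like Scope.Thread: every thread receives its own instance.
        ThreadLocal<Counter> perThread = ThreadLocal.withInitial(Counter::new);
        System.out.println("per-thread: " + distinctInstances(4, perThread::get)); // 4
    }
}
```

With a single shared instance, any mutation inside the benchmark method is contended across threads, which is exactly what you want to measure with `Scope.Benchmark` and exactly what you want to avoid with `Scope.Thread`.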
The benchmark class itself can be the state (as in our example), or you can define separate state classes:

```java title="Separate state class" icon="java" theme={null}
@State(Scope.Benchmark)
public static class SharedState {
    ConcurrentHashMap<String, String> map = new ConcurrentHashMap<>();
}

@Benchmark
public void concurrentPut(SharedState state) {
    state.map.put("key", "value");
}
```

### Fork, Warmup, Measurement, and Output Unit

These annotations control the execution strategy and output formatting:

**`@Fork`**: Controls how many separate JVM processes to run. Forks run **sequentially**, not in parallel. Each fork starts a fresh JVM, isolating profile-guided optimizations and JIT compilation state. Use `jvmArgs` to control heap size, GC settings, and other JVM flags. Use `jvmArgsPrepend` or `jvmArgsAppend` to add flags without replacing defaults.

```java theme={null}
// Two alternative configurations (use one per class or method)
@Fork(value = 3, jvmArgs = {"-Xms2G", "-Xmx2G"})
@Fork(value = 1, jvmArgsPrepend = {"-XX:+UseG1GC"})
```

**`@Warmup`**: Controls how many iterations run before measurement begins, giving the JIT compiler time to optimize your code to steady state. Parameters: `iterations`, `time`, `timeUnit`.

```java theme={null}
@Warmup(iterations = 5, time = 2, timeUnit = TimeUnit.SECONDS)
```

**`@Measurement`**: Controls how many iterations are recorded and included in the results. Accepts the same parameters as `@Warmup`: `iterations`, `time`, `timeUnit`.

```java theme={null}
@Measurement(iterations = 10, time = 5, timeUnit = TimeUnit.SECONDS)
```

**`@OutputTimeUnit`**: Controls the time unit displayed in results. Accepts any `java.util.concurrent.TimeUnit` value (e.g., `TimeUnit.NANOSECONDS`, `TimeUnit.MILLISECONDS`).
```java theme={null}
@OutputTimeUnit(TimeUnit.MICROSECONDS)
```

```java title="Configuration examples" icon="java" theme={null}
// Quick feedback during development
@Fork(1)
@Warmup(iterations = 3, time = 1, timeUnit = TimeUnit.SECONDS)
@Measurement(iterations = 5, time = 1, timeUnit = TimeUnit.SECONDS)

// Reliable measurements for CI
@Fork(value = 3, jvmArgs = {"-Xms2G", "-Xmx2G"})
@Warmup(iterations = 10, time = 5, timeUnit = TimeUnit.SECONDS)
@Measurement(iterations = 10, time = 5, timeUnit = TimeUnit.SECONDS)
```

### Threads

`@Threads` controls how many threads run the benchmark **concurrently**. The default is 1. Combined with [`Scope.Benchmark`](#param-scope-benchmark), this is how you measure contention:

```java title="Multi-threaded benchmark" icon="java" theme={null}
@Threads(4)
@State(Scope.Benchmark)
public class ConcurrencyBenchmark {

    private ConcurrentHashMap<Integer, Integer> map = new ConcurrentHashMap<>();

    @Benchmark
    public Integer concurrentPut() {
        return map.put(Thread.currentThread().hashCode(), 42);
    }
}
```

Use `@Threads(Threads.MAX)` to use all available processors.

## Benchmarking with Parameters

The previous examples all used a single input value (30). But what if you want to see how performance changes with different input sizes? This is where `@Param` comes in.
### Single Parameter

Add a new benchmark class to test multiple input sizes:

```java src/main/java/com/example/FibonacciParameterized.java icon="java" theme={null}
package com.example;

import org.openjdk.jmh.annotations.*;

import java.util.concurrent.TimeUnit;

@BenchmarkMode(Mode.AverageTime)
@OutputTimeUnit(TimeUnit.MICROSECONDS)
@State(Scope.Thread)
@Fork(1)
@Warmup(iterations = 3, time = 1)
@Measurement(iterations = 5, time = 1)
public class FibonacciParameterized {

    @Param({"5", "10", "15", "20", "30"})
    private int n;

    static long fibonacci(int n) {
        if (n <= 1) return n;
        return fibonacci(n - 1) + fibonacci(n - 2);
    }

    @Benchmark
    public long fibRecursive() {
        return fibonacci(n);
    }
}
```

`@Param` tells JMH to run the benchmark once for each value. Rebuild and run.

**Maven:**

```sh title=terminal icon="square-terminal" theme={null}
mvn clean verify
java -jar target/benchmarks.jar FibonacciParameterized
```

**Gradle:**

```sh title=terminal icon="square-terminal" theme={null}
./gradlew jmh -Pjmh.includes='FibonacciParameterized'
```

```shellsession title=terminal icon="square-terminal" theme={null}
Benchmark                            (n)  Mode  Cnt     Score    Error  Units
FibonacciParameterized.fibRecursive    5  avgt    5     0.013 ±  0.001  us/op
FibonacciParameterized.fibRecursive   10  avgt    5     0.202 ±  0.001  us/op
FibonacciParameterized.fibRecursive   15  avgt    5     2.277 ±  0.030  us/op
FibonacciParameterized.fibRecursive   20  avgt    5    25.213 ±  0.295  us/op
FibonacciParameterized.fibRecursive   30  avgt    5  3122.539 ± 55.028  us/op
```

The results clearly show the exponential O(2^n) growth of recursive Fibonacci: going from n=5 (13 nanoseconds) to n=30 (3.1 milliseconds), a factor of 240,000x.

You can override `@Param` values from the command line without recompiling:

```sh theme={null}
java -jar target/benchmarks.jar -p n=25,35
```

### Multiple Parameters

Each `@Param` annotation applies to a single field, but you can use multiple `@Param` fields to benchmark across several dimensions.
JMH runs all combinations automatically:

```java title="Multiple @Param fields" icon="java" theme={null}
@Param({"1000", "10000"})
private int size;

@Param({"ArrayList", "LinkedList"})
private String listType;
```

This produces four benchmark runs: `1000/ArrayList`, `1000/LinkedList`, `10000/ArrayList`, `10000/LinkedList`.

### Comparing Algorithms

Parameters are powerful for comparing different implementations side-by-side. Let's benchmark recursive vs. iterative Fibonacci:

```java src/main/java/com/example/AlgorithmComparison.java icon="java" theme={null}
package com.example;

import org.openjdk.jmh.annotations.*;

import java.util.concurrent.TimeUnit;

@BenchmarkMode(Mode.AverageTime)
@OutputTimeUnit(TimeUnit.MICROSECONDS)
@State(Scope.Thread)
@Fork(1)
@Warmup(iterations = 3, time = 1)
@Measurement(iterations = 5, time = 1)
public class AlgorithmComparison {

    @Param({"10", "20", "30"})
    private int n;

    static long fibRecursive(int n) {
        if (n <= 1) return n;
        return fibRecursive(n - 1) + fibRecursive(n - 2);
    }

    static long fibIterative(int n) {
        if (n <= 1) return n;
        long a = 0, b = 1;
        for (int i = 2; i <= n; i++) {
            long temp = a + b;
            a = b;
            b = temp;
        }
        return b;
    }

    @Benchmark
    public long recursive() {
        return fibRecursive(n);
    }

    @Benchmark
    public long iterative() {
        return fibIterative(n);
    }
}
```

```shellsession title=terminal icon="square-terminal" theme={null}
Benchmark                      (n)  Mode  Cnt     Score    Error  Units
AlgorithmComparison.iterative   10  avgt    5     0.003 ±  0.001  us/op
AlgorithmComparison.iterative   20  avgt    5     0.004 ±  0.001  us/op
AlgorithmComparison.iterative   30  avgt    5     0.006 ±  0.001  us/op
AlgorithmComparison.recursive   10  avgt    5     0.203 ±  0.001  us/op
AlgorithmComparison.recursive   20  avgt    5    25.596 ±  0.795  us/op
AlgorithmComparison.recursive   30  avgt    5  3122.110 ± 33.573  us/op
```

The iterative version computes `fibonacci(30)` in 6 nanoseconds while the recursive version takes 3.1 milliseconds: over **500,000x faster**.
This is the power of parameterized benchmarks: they make algorithmic trade-offs visible at a glance.

## Benchmarking Only What Matters

Sometimes you have expensive setup that should not be included in your benchmark measurements, such as generating test data or loading files. JMH provides `@Setup` and `@TearDown` annotations with different `Level` options to control when fixture methods run.

### Setup and Teardown

Let's benchmark an outlier detection algorithm where the dataset generation is expensive but should not be measured:

```java src/main/java/com/example/OutlierDetection.java icon="java" theme={null}
package com.example;

import org.openjdk.jmh.annotations.*;

import java.util.concurrent.TimeUnit;
import java.util.ArrayList;
import java.util.List;
import java.util.Random;

@BenchmarkMode(Mode.AverageTime)
@OutputTimeUnit(TimeUnit.MICROSECONDS)
@State(Scope.Thread)
@Fork(1)
@Warmup(iterations = 3, time = 1)
@Measurement(iterations = 5, time = 1)
public class OutlierDetection {

    @Param({"10000", "100000", "1000000"})
    private int size;

    private double[] data;

    @Setup(Level.Trial)
    public void setUp() {
        // NOT MEASURED: expensive data generation runs once before all iterations
        Random random = new Random(42);
        data = new double[size];
        for (int i = 0; i < size; i++) {
            if (random.nextDouble() < 0.95) {
                data[i] = 100.0 + random.nextGaussian() * 15.0;
            } else {
                data[i] = 200.0 + random.nextDouble() * 100.0;
            }
        }
    }

    public static List<Integer> detectOutliers(double[] data, double threshold) {
        double sum = 0;
        for (double v : data) sum += v;
        double mean = sum / data.length;

        double variance = 0;
        for (double v : data) variance += (v - mean) * (v - mean);
        variance /= data.length;
        double stdDev = Math.sqrt(variance);

        List<Integer> outliers = new ArrayList<>();
        for (int i = 0; i < data.length; i++) {
            double zScore = stdDev > 0 ? Math.abs((data[i] - mean) / stdDev) : 0;
            if (zScore > threshold) {
                outliers.add(i);
            }
        }
        return outliers;
    }

    @Benchmark
    public List<Integer> findOutliers() {
        // MEASURED: only the outlier detection algorithm
        return detectOutliers(data, 2.0);
    }
}
```

The `@Setup(Level.Trial)` method runs once before all measurement iterations. Only the `findOutliers()` method is timed:

```shellsession title=terminal icon="square-terminal" theme={null}
Benchmark                       (size)  Mode  Cnt     Score    Error  Units
OutlierDetection.findOutliers    10000  avgt    5    22.659 ±  0.322  us/op
OutlierDetection.findOutliers   100000  avgt    5   282.683 ±  2.461  us/op
OutlierDetection.findOutliers  1000000  avgt    5  3079.753 ± 49.886  us/op
```

### Fixture Levels

JMH offers three levels for `@Setup` and `@TearDown`:

* `Level.Trial`: Runs once per benchmark fork. Use this for loading files and building large datasets that are reused across all iterations.
* `Level.Iteration`: Runs before and after each measurement iteration. Use this to reset mutable state between iterations.
* `Level.Invocation`: Runs before and after each individual method call. Use sparingly - this adds overhead on every invocation.

`Level.Invocation` adds timing overhead on every call. Only use it when the benchmark method is slow enough (milliseconds or more) that the fixture cost is negligible in comparison.
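To get a feel for how differently the three levels scale, the following plain-Java sketch counts fixture runs for one fork of a single-threaded benchmark. It is illustrative arithmetic, not JMH code: the class `FixtureCounts` and its methods are hypothetical, it assumes fixtures also run around warmup iterations, and the figure of 100,000 invocations per iteration is an assumption (roughly a 10 µs operation in a 1 s iteration), not a measurement.

```java
public class FixtureCounts {

    // Level.Trial: once per trial, independent of the iteration counts.
    static long trialRuns() {
        return 1;
    }

    // Level.Iteration: once per iteration, warmup iterations included.
    static long iterationRuns(int warmupIterations, int measurementIterations) {
        return (long) warmupIterations + measurementIterations;
    }

    // Level.Invocation: once per benchmark method call.
    // invocationsPerIteration is an assumed, constant figure for illustration.
    static long invocationRuns(int warmupIterations, int measurementIterations,
                               long invocationsPerIteration) {
        return iterationRuns(warmupIterations, measurementIterations) * invocationsPerIteration;
    }

    public static void main(String[] args) {
        int warmup = 3;       // matches @Warmup(iterations = 3, time = 1)
        int measurement = 5;  // matches @Measurement(iterations = 5, time = 1)
        // Assumption: a 10 us operation fits about 100,000 times in a 1 s iteration.
        long invocationsPerIteration = 100_000;

        System.out.println("Level.Trial:      " + trialRuns());                        // 1
        System.out.println("Level.Iteration:  " + iterationRuns(warmup, measurement)); // 8
        System.out.println("Level.Invocation: "
                + invocationRuns(warmup, measurement, invocationsPerIteration));       // 800000
    }
}
```

Under these assumptions an invocation-level fixture runs hundreds of thousands of times per fork, which is why its cost matters far more for fast benchmark methods than for slow ones.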
Here is an example using `Level.Iteration` to provide fresh unsorted data for each iteration of a sorting benchmark:

```java src/main/java/com/example/SortBenchmark.java icon="java" theme={null}
package com.example;

import org.openjdk.jmh.annotations.*;

import java.util.concurrent.TimeUnit;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Random;

@BenchmarkMode(Mode.AverageTime)
@OutputTimeUnit(TimeUnit.MICROSECONDS)
@State(Scope.Thread)
@Fork(1)
@Warmup(iterations = 3, time = 1)
@Measurement(iterations = 5, time = 1)
public class SortBenchmark {

    @Param({"1000", "10000", "100000"})
    private int size;

    private List<Integer> data;

    @Setup(Level.Iteration)
    public void setUp() {
        // Regenerate unsorted data before each iteration
        Random random = new Random(42);
        data = new ArrayList<>(size);
        for (int i = 0; i < size; i++) {
            data.add(random.nextInt(size));
        }
    }

    @Benchmark
    public List<Integer> sortList() {
        List<Integer> copy = new ArrayList<>(data);
        Collections.sort(copy);
        return copy;
    }
}
```

## Running Benchmarks from the Command Line

The `benchmarks.jar` supports a rich set of command-line options. Here are the most useful ones:

### Filtering Benchmarks

Run only benchmarks matching a regex:

```sh title=terminal icon="square-terminal" theme={null}
java -jar target/benchmarks.jar "FibonacciParameterized"
```

Exclude benchmarks matching a pattern:

```sh title=terminal icon="square-terminal" theme={null}
java -jar target/benchmarks.jar -e ".*Slow.*"
```

### Overriding Parameters

Override [`@Param`](#single-parameter), [`@Fork`](#param-fork), [`@Warmup`](#param-warmup), and [`@Measurement`](#param-measurement) from the command line:

```sh title=terminal icon="square-terminal" theme={null}
java -jar target/benchmarks.jar -p n=25,35 -f 3 -wi 5 -i 10
```

* `-f`: Number of forks.
* `-t`: Number of threads.
* `-wi`: Warmup iterations.
* `-i`: Measurement iterations.
* `-w`: Warmup iteration time (e.g., `2s`).
* `-r`: Measurement iteration time.
* `-p`: Override `@Param` values.
* `-bm`: Override benchmark mode (`thrpt`, `avgt`, `sample`, `ss`).
* `-tu`: Override time unit (`ns`, `us`, `ms`, `s`).

### Exporting Results

JMH can export results in various formats for further analysis or visualization:

* `-rf`: Result format. One of `text`, `csv`, `scsv`, `json`, `latex`.
* `-rff`: Result file path. Where to write the output (e.g., `results.json`).

For example, to export JSON results:

```sh title=terminal icon="square-terminal" theme={null}
java -jar target/benchmarks.jar -rf json -rff results.json
```

### Using Profilers

JMH ships with built-in profilers. List them with:

```sh title=terminal icon="square-terminal" theme={null}
java -jar target/benchmarks.jar -lprof
```

The most useful profilers:

* `stack`: Samples hot methods and thread states to show where time is being spent.
* `gc`: Reports allocation rate, GC pressure, and bytes allocated per operation.
* `comp`: Reports JIT compilation activity during the measurement window.
* `perfnorm`: Reports per-operation hardware counters: cache misses, branch mispredictions, and CPI. Linux only.
* `async`: Generates CPU flamegraphs using [async-profiler](https://github.com/async-profiler/async-profiler).

For example, to measure allocation pressure:

```sh title=terminal icon="square-terminal" theme={null}
java -jar target/benchmarks.jar OutlierDetection -prof gc
```

This adds GC metrics to the output, showing bytes allocated per operation (`gc.alloc.rate.norm`) and GC event counts, which are essential for understanding allocation-heavy code.

## Avoiding Common Pitfalls

The JVM is a sophisticated optimizing runtime. Without care, it can silently eliminate or transform the code you are trying to measure, producing misleading results. JMH is designed to help, but you still need to follow certain patterns.
### Dead Code Elimination

If a computation's result is never used, the JIT compiler may eliminate it entirely:

```java title="Dead code elimination" icon="java" theme={null}
// BAD: result is discarded, JVM may eliminate the entire computation
@Benchmark
public void measureWrong() {
    Math.log(x);
}

// GOOD: returning the result prevents dead code elimination
@Benchmark
public double measureRight() {
    return Math.log(x);
}
```

JMH automatically consumes the return value of `@Benchmark` methods through an internal `Blackhole`, preventing elimination. Always return your computed result.

### Blackholes for Multiple Results

When you produce multiple results, you can only return one. Use `Blackhole.consume()` for the rest:

```java title="Blackhole usage" icon="java" theme={null}
@Benchmark
public void computeMultiple(Blackhole bh) {
    bh.consume(Math.log(x));
    bh.consume(Math.sqrt(x));
}
```

Import `Blackhole` from `org.openjdk.jmh.infra.Blackhole`. JMH injects it automatically as a method parameter.

### Constant Folding

If the JVM can determine a computation's inputs at compile time, it folds the entire computation into a constant:

```java title="Constant folding" icon="java" theme={null}
// BAD: the JVM knows wrongX is always Math.PI, result is precomputed
private final double wrongX = Math.PI;

@Benchmark
public double measureWrong() {
    return Math.log(wrongX);
}

// GOOD: non-final field prevents constant folding
private double x = Math.PI;

@Benchmark
public double measureRight() {
    return Math.log(x);
}
```

Your IDE may suggest making `x` final. Do not. Non-final `@State` fields are essential for preventing constant folding in benchmarks.

### Do Not Loop Manually

Never write manual loops inside benchmark methods. The JVM aggressively optimizes loops.
It unrolls, pipelines, and hoists invariant computations out of them, producing unrealistically low per-operation numbers:

```java title="Manual loops" icon="java" theme={null}
// BAD: JVM optimizes the loop, results are misleading
@Benchmark
public int measureWrong() {
    int sum = 0;
    for (int i = 0; i < 1000; i++) {
        sum += compute(i);
    }
    return sum;
}

// GOOD: let JMH control the iteration
@Benchmark
public int measureRight() {
    return compute(x);
}
```

JMH handles iteration internally with proper safeguards. Trust the framework.

## Best Practices

### Use Multiple Forks

The JVM is non-deterministic. Profile-guided optimizations, garbage collection, and thread scheduling vary between runs. A single fork can give misleading results. Use multiple forks (see [`@Fork`](#param-fork)) to capture this variance:

```java title="Fork configuration" icon="java" theme={null}
// For development, 1 fork is fine for fast feedback
@Fork(1)

// For reliable measurements, use 3-5 forks
@Fork(5)
```

Each fork starts a fresh JVM, isolating profile-guided optimizations and giving JMH enough data points to compute meaningful confidence intervals.

### Keep Benchmarks Deterministic

Use fixed seeds in your [`@Setup`](#param-level-trial) methods for random number generators:

```java title="Deterministic setup" icon="java" theme={null}
// BAD: different data every run, results are not reproducible
@Setup(Level.Trial)
public void setUp() {
    Random rng = new Random(); // non-deterministic seed
    // ...
}

// GOOD: fixed seed, results are reproducible
@Setup(Level.Trial)
public void setUp() {
    Random rng = new Random(42); // deterministic seed
    // ...
}
```

### Verify Correctness Alongside Performance

Include assertions in your setup or dedicated test methods to ensure you are benchmarking correct code, not broken code that happens to be fast:

```java title="Correctness check" icon="java" theme={null}
@Setup(Level.Trial)
public void setUp() {
    // Verify the algorithm is correct before measuring it
    if (fibonacci(10) != 55) {
        throw new IllegalStateException("fibonacci(10) should be 55");
    }
}
```

### Use Realistic Data

Sorted or regular data can exploit hardware optimizations like branch prediction and cache prefetching, giving misleadingly good results. Use representative data that matches your production workload.

### Benchmark Your Own Code

In real projects, organize your benchmarks alongside your source code, in a dedicated benchmark submodule. That submodule depends on your library and uses the JMH archetype setup. This keeps benchmark infrastructure separate from production code.

## Running Benchmarks Continuously with CodSpeed

So far, you've been running benchmarks locally. But local benchmarking has limitations:

* **Inconsistent hardware**: Different developers get different results
* **Manual process**: Easy to forget to run benchmarks before merging
* **No historical tracking**: Hard to spot gradual performance degradation
* **No PR context**: Can't see performance impact during code review

This is where **CodSpeed** comes in. It runs your benchmarks automatically in CI and provides:

* Automated performance regression detection in PRs
* Consistent metrics with reliable measurements across all runs
* Historical tracking to see performance over time with detailed charts
* Flamegraph profiles to see exactly what changed in your code's execution

### How to Set Up CodSpeed

Here's how to integrate CodSpeed with your JMH benchmarks:

CodSpeed integrates with JMH through a custom fork. Before configuring CI, follow the [Java integration reference](/benchmarks/java) to add the fork as a Maven or Gradle dependency.
Create a workflow file to run benchmarks on every push and pull request.

Once the workflow runs, your pull requests will receive a performance report comment.

After your benchmarks run in CI, head over to your CodSpeed dashboard to see detailed performance reports, historical trends, and flamegraph profiles for deeper analysis.

Profiling works out of the box, no extra configuration needed! [Learn more about flamegraphs and how to use them to optimize your code](/features/profiling).

## Next Steps

Check out these resources to continue your Java benchmarking journey:

* Sign up and start tracking your Java performance in CI
* Set up the CodSpeed JMH fork in your Maven or Gradle project ([Java integration reference](/benchmarks/java))
* Learn how to use flamegraphs to optimize your code ([profiling](/features/profiling))
* Dive into the JMH source and annotation reference ([JMH on GitHub](https://github.com/openjdk/jmh))