Build a Gin HTTP API, write Golang benchmarks, and run them with CodSpeed in consistent CI environments
This guide shows how to benchmark a Gin-based HTTP API using Go’s testing
package and CodSpeed. We’ll create a minimal API, design clean benchmarks
measuring what matters, and run them in CI with consistent results.
Let’s start with an API from the
official Gin tutorial.
If you have never used Gin before, following that tutorial is a great way to get started before we begin benchmarking.
We’ll organize the project so benchmarks target a library package while you
still have a runnable server for manual testing.
api.go
```go
package api

import (
	"net/http"

	"github.com/gin-gonic/gin"
)

// album represents data about a record album.
type album struct {
	ID     string  `json:"id"`
	Title  string  `json:"title"`
	Artist string  `json:"artist"`
	Price  float64 `json:"price"`
}

// albums slice to seed record album data.
var albums = []album{
	{ID: "1", Title: "Blue Train", Artist: "John Coltrane", Price: 56.99},
	{ID: "2", Title: "Jeru", Artist: "Gerry Mulligan", Price: 17.99},
	{ID: "3", Title: "Sarah Vaughan and Clifford Brown", Artist: "Sarah Vaughan", Price: 39.99},
}

func main() {
	router := gin.Default()
	router.GET("/albums", getAlbums)
	router.GET("/albums/:id", getAlbumByID)
	router.POST("/albums", postAlbums)

	router.Run("localhost:8080")
}

// getAlbums responds with the list of all albums as JSON.
func getAlbums(c *gin.Context) {
	c.IndentedJSON(http.StatusOK, albums)
}

// postAlbums adds an album from JSON received in the request body.
func postAlbums(c *gin.Context) {
	var newAlbum album

	if err := c.BindJSON(&newAlbum); err != nil {
		return
	}

	albums = append(albums, newAlbum)
	c.IndentedJSON(http.StatusCreated, newAlbum)
}

// getAlbumByID returns the album matching the provided id.
func getAlbumByID(c *gin.Context) {
	id := c.Param("id")

	for _, a := range albums {
		if a.ID == id {
			c.IndentedJSON(http.StatusOK, a)
			return
		}
	}

	c.IndentedJSON(http.StatusNotFound, gin.H{"message": "album not found"})
}
```
The only difference from the original code is that we now use the
api package name instead of main, so the benchmarks can target it as a library package.
As a quick recap, this small HTTP API stores music albums in memory and
exposes three routes:
GET /albums: Returns all albums
GET /albums/:id: Returns a specific album by ID
POST /albums: Creates a new album
Let’s run it to make sure it works:
```sh
$ go mod init github.com/your/repo       # initialize the module
$ go get github.com/gin-gonic/gin@latest # get the latest version of Gin
$ go mod tidy                            # tidy the module dependencies
$ go run api.go                          # run the server
[GIN-debug] [WARNING] Creating an Engine instance with the Logger and Recovery middleware already attached.
[GIN-debug] [WARNING] Running in "debug" mode. Switch to "release" mode in production.
 - using env:	export GIN_MODE=release
 - using code:	gin.SetMode(gin.ReleaseMode)
[GIN-debug] GET    /albums       --> go-gin-benchmarks-example/api.getAlbums (3 handlers)
[GIN-debug] GET    /albums/:id   --> go-gin-benchmarks-example/api.getAlbumByID (3 handlers)
[GIN-debug] POST   /albums       --> go-gin-benchmarks-example/api.postAlbums (3 handlers)
[GIN-debug] Listening and serving HTTP on localhost:8080
```
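To double-check that the routes respond as expected, you can query the running server from another terminal. This is just a quick manual sanity check against the seeded data shown above, not part of the benchmarks:

```sh
$ curl localhost:8080/albums    # returns the three seeded albums as JSON
$ curl localhost:8080/albums/2  # returns the "Jeru" album
```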
Now, let’s get started writing performance tests to actually measure the
performance of each route of this API.
First, we need to do a bit of refactoring to make it easier to write benchmarks.
In the initial code, the router is created and configured in the main
function. This is not ideal for tests or benchmarks because it makes the router configuration impossible to reuse.
Let’s isolate the router creation and configuration in a separate function:
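A minimal version of that refactor, based on the api.go code above, can look like this (main then simply calls the new helper):

```go
// setupRouter creates and configures the Gin engine so that tests and
// benchmarks can build the exact same router as the server.
func setupRouter() *gin.Engine {
	router := gin.Default()
	router.GET("/albums", getAlbums)
	router.GET("/albums/:id", getAlbumByID)
	router.POST("/albums", postAlbums)
	return router
}

func main() {
	router := setupRouter()
	router.Run("localhost:8080")
}
```

With the router creation isolated, we can write a first benchmark for the GET /albums route. Here is a sketch (the api_test.go file name is just a suggestion):

```go
// api_test.go
package api

import (
	"net/http"
	"net/http/httptest"
	"testing"
)

func BenchmarkGetAlbums(b *testing.B) {
	// Build the router once, outside of the measured loop.
	router := setupRouter()

	// Prepare the request and the response recorder up front.
	req, _ := http.NewRequest(http.MethodGet, "/albums", nil)
	w := httptest.NewRecorder()

	// b.Loop() (Go 1.24+) takes care of iteration counting and timing.
	for b.Loop() {
		router.ServeHTTP(w, req)
	}
}
```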
This benchmark creates a router, a request, and a response recorder, then
loops over the code under test using b.Loop(), measuring the time each
iteration takes.
Let’s run it:
```sh
$ go test -bench=.
[GIN-debug] [WARNING] Creating an Engine instance with the Logger and Recovery middleware already attached.
[GIN-debug] [WARNING] Running in "debug" mode. Switch to "release" mode in production.
 - using env:	export GIN_MODE=release
 - using code:	gin.SetMode(gin.ReleaseMode)
[GIN-debug] GET    /albums       --> go-gin-benchmarks-example.getAlbums (3 handlers)
[GIN-debug] GET    /albums/:id   --> go-gin-benchmarks-example.getAlbumByID (3 handlers)
[GIN-debug] POST   /albums       --> go-gin-benchmarks-example.postAlbums (3 handlers)
[GIN] 2025/09/19 - 17:56:36 | 200 | 1.959µs | GET "/albums"
[GIN] 2025/09/19 - 17:56:36 | 200 | 2.125µs | GET "/albums"
[GIN] 2025/09/19 - 17:56:36 | 200 | 2µs | GET "/albums"
[GIN] 2025/09/19 - 17:56:36 | 200 | 2.666µs | GET "/albums"
... A LOT OF THOSE LINES ...
[GIN] 2025/09/19 - 17:56:36 | 200 | 2.084µs | GET "/albums"
[GIN] 2025/09/19 - 17:56:36 | 200 | 2µs | GET "/albums"
[GIN] 2025/09/19 - 17:56:36 | 200 | 2.042µs | GET "/albums"
[GIN] 2025/09/19 - 17:56:36 | 200 | 1.958µs | GET "/albums"
goos: darwin
goarch: arm64
pkg: go-gin-benchmarks-example
cpu: Apple M1 Pro
BenchmarkGetAlbums-10    	   54296	     21328 ns/op
PASS
ok  	go-gin-benchmarks-example	1.460s
```
It works! The first benchmark runs, and the results are displayed.
Let’s dive into the numbers:
First, the [GIN] logs show that each request takes roughly 2µs on average.
This is an interesting reference point, but not the number we’ll use as the
source of truth for our benchmark results.
The benchmark name BenchmarkGetAlbums-10 has the -10 suffix, which means
it ran on 10 CPU cores.
It ran 54,296 times, taking an average of 21,328 ns per operation
(which translates to 21.328 µs per request in our case).
Overall, benchmarking this module took 1.460 seconds.
However, seeing the output of this first run, we can note a few things that are
not ideal:
There is significant overhead in our measurement: we end up measuring ~21µs
per request, while the router itself reports ~2µs per request. That means
roughly 90% of what we measure is not what’s happening in the router.
As mentioned in the logs, Gin is running in debug mode here: since we want to
measure something as close as possible to what happens in production, we should
run the router in release mode to get realistic performance numbers.
The router’s output is very verbose. While that’s convenient for integration
or unit tests, it’s harmful in benchmarks because we end up measuring stdout
writes as well.
To fix those issues, we can create a helper function to set up the router
specifically for benchmarking:
```go
func setupBenchmarkRouter() *gin.Engine {
	// Set Gin to release mode for benchmarks
	gin.SetMode(gin.ReleaseMode)

	// Discard all output during benchmarks to only preserve benchmark output
	gin.DefaultWriter = io.Discard

	return setupRouter()
}
```
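Note that gin.DefaultWriter = io.Discard requires importing the standard library io package. The benchmark itself only needs its setup line changed to use the new helper, along these lines:

```go
// In BenchmarkGetAlbums, build the silent, release-mode router instead:
router := setupBenchmarkRouter()
```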
And here we go:
```sh
$ go test -bench=.
goos: darwin
goarch: arm64
pkg: go-gin-benchmarks-example
cpu: Apple M1 Pro
BenchmarkGetAlbums-10    	  381921	      3060 ns/op
PASS
ok  	go-gin-benchmarks-example	1.495s
```
The output is now much cleaner, making it far easier to see what’s going on
during the benchmarking process!
The overhead is also much lower: the reported timing (3.060µs) is much closer
to the actual time spent in the router, putting us in a much better position to
make decisions about performance improvements.
Still, there’s one last thing we can do to improve the benchmark results.
Since the beginning, we have reused httptest.NewRecorder(), inspired by the Gin
testing example, to get a response writer. This is necessary because Gin’s
ServeHTTP method requires an http.ResponseWriter to handle the HTTP
response, and httptest.NewRecorder() provides a concrete implementation that
captures the response for inspection.
This is convenient, but it introduces extra costs that aren’t worth measuring in
our case:
Extra allocations we don’t need
Buffering of the JSON responses just to capture them
Extra processing to handle the captured responses
Let’s replace it with a dummy writer that discards all data:
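One possible implementation is a tiny type that satisfies http.ResponseWriter but drops everything it receives (this is a sketch; the exact writer used in the example project may differ):

```go
// discardWriter implements http.ResponseWriter but throws away headers,
// status codes, and body bytes, so the benchmark no longer pays for
// buffering responses it never inspects.
type discardWriter struct {
	headers http.Header
}

func (w *discardWriter) Header() http.Header {
	if w.headers == nil {
		w.headers = make(http.Header)
	}
	return w.headers
}

func (w *discardWriter) Write(p []byte) (int, error) { return len(p), nil }

func (w *discardWriter) WriteHeader(statusCode int) {}

func BenchmarkGetAlbums(b *testing.B) {
	router := setupBenchmarkRouter()
	req, _ := http.NewRequest(http.MethodGet, "/albums", nil)
	w := &discardWriter{}

	for b.Loop() {
		router.ServeHTTP(w, req)
	}
}
```

The other routes can be benchmarked in the same way. The benchmark names, IDs, and request body below are illustrative sketches rather than the exact benchmarks from the example repository (the POST payload is borrowed from the official Gin tutorial), and the bytes package needs to be imported:

```go
func BenchmarkGetAlbumByID(b *testing.B) {
	router := setupBenchmarkRouter()
	req, _ := http.NewRequest(http.MethodGet, "/albums/2", nil)
	w := &discardWriter{}

	for b.Loop() {
		router.ServeHTTP(w, req)
	}
}

func BenchmarkGetAlbumByIDNotFound(b *testing.B) {
	router := setupBenchmarkRouter()
	// Exercise the not-found branch of getAlbumByID.
	req, _ := http.NewRequest(http.MethodGet, "/albums/999", nil)
	w := &discardWriter{}

	for b.Loop() {
		router.ServeHTTP(w, req)
	}
}

func BenchmarkPostAlbums(b *testing.B) {
	router := setupBenchmarkRouter()
	body := []byte(`{"id": "4", "title": "The Modern Sound of Betty Carter", "artist": "Betty Carter", "price": 49.99}`)
	w := &discardWriter{}

	for b.Loop() {
		// Build a fresh request each iteration since the body reader is consumed.
		req, _ := http.NewRequest(http.MethodPost, "/albums", bytes.NewReader(body))
		req.Header.Set("Content-Type", "application/json")
		router.ServeHTTP(w, req)
	}
}
```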
We’re using the -benchtime=5s flag to run the benchmarks for 5 seconds each,
making sure we collect enough samples for a reliable estimate of the performance.
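Concretely, the local invocation becomes:

```sh
$ go test -bench=. -benchtime=5s
```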
And now, almost all branches of the API are covered by this set of benchmarks!
Local benchmarks are excellent for development iteration, but running benchmarks
in CI provides consistency and automation that local benchmarking can’t match:
Consistent hardware: CI runners eliminate the “works on my machine”
problem. Your laptop’s thermal throttling, background processes, and varying
load create noise that masks real performance changes.
Automated detection: Catch performance regressions before they reach
production. Every PR gets benchmarked automatically, making performance a
first-class concern like tests.
Historical tracking: Build a performance timeline across commits. Spot
trends, identify when regressions were introduced, and validate that
optimizations actually worked.
The CodSpeed GitHub Action will automatically run the benchmarks with
instrumentation and upload the results to CodSpeed. It mostly boils down to
this:
.github/workflows/codspeed.yml
```yaml
name: CodSpeed Benchmarks

on:
  push:
    branches:
      - "main" # or "master"
  pull_request:
  # `workflow_dispatch` allows CodSpeed to trigger backtest
  # performance analysis in order to generate initial data.
  workflow_dispatch:

jobs:
  benchmarks:
    name: Run Go benchmarks
    runs-on: codspeed-macro
    steps:
      - uses: actions/checkout@v4
      - uses: actions/setup-go@v5
      - name: Run the benchmarks
        uses: CodSpeedHQ/action@v4
        with:
          mode: walltime
          run: go test -bench=. -benchtime=5s
          token: ${{ secrets.CODSPEED_TOKEN }} # optional for public repos
```
Two important things to note:
We’re using the codspeed-macro runners to run the benchmarks on an optimized
and isolated CI machine, removing noise from virtualization and shared
resources. Check out the Macro Runners page for
more details.
We’re again using the -benchtime=5s flag to run the benchmarks for 5 seconds
each, making sure we collect enough samples for a reliable estimate of the
performance. Feel free to adjust it to your needs.
This example is for GitHub Actions, but you can use CodSpeed with any other CI
provider. Check out the CI integration docs for more details.
And now each pull request will automatically run the benchmarks, and you’ll be
able to see the results in the CodSpeed dashboard.
For example, let’s use SQLite instead of the in-memory storage to see the
performance impact (check out the code here):
GitHub comment with the benchmark results
This also emits a status check on the pull request:
Status check on the pull request which can be used to prevent merging regressions
Now, we can analyze the performance report to see the details of the performance
regression:
Performance report with the Differential Flamegraph
Here, we clearly see the dramatic performance regression in getAlbums (in red)
introduced by the new database/sql usage (in blue).
However, there is almost no difference in the postAlbums benchmark:
Diving deeper, we can see that the changes to the postAlbums function are
significant, but they only account for 0.1% of the total time:
The function is getting 31% slower but only impacts 0.1% of the total time
In the end, using SQLite would primarily impact the read operations, with no
impact on the write operations.
Check out the pull request and the CodSpeed performance report for more details.