Avatar for the OpenMathLib user
OpenMathLib
OpenBLAS
BlogDocsChangelog

Work around gcc15.1 on POWER misoptimizing DGEMV at -O3

#5409Merged
Comparing
martin-frbg:issue5372
(
a3b9c93
) with
develop
(
9a64b32
)
CodSpeed Performance Gauge
0%
Untouched
62

Benchmarks

62 total
test_dot[1000]
benchmark/pybench/benchmarks/bench_blas.py
CodSpeed Performance Gauge
+1%
28.4 µs28.2 µs
test_dgbmv[1-100-d]
benchmark/pybench/benchmarks/bench_blas.py
CodSpeed Performance Gauge
+1%
37.9 µs37.7 µs
test_dgemv[100-s]
benchmark/pybench/benchmarks/bench_blas.py
CodSpeed Performance Gauge
+1%
103.7 µs103.2 µs
test_dgemv[100-c]
benchmark/pybench/benchmarks/bench_blas.py
CodSpeed Performance Gauge
+1%
149.7 µs148.9 µs
test_daxpy[100-d]
benchmark/pybench/benchmarks/bench_blas.py
CodSpeed Performance Gauge
0%
24 µs23.9 µs
test_daxpy[100-s]
benchmark/pybench/benchmarks/bench_blas.py
CodSpeed Performance Gauge
0%
23.9 µs23.8 µs
test_dgemv[100-d]
benchmark/pybench/benchmarks/bench_blas.py
CodSpeed Performance Gauge
0%
141 µs140.4 µs
test_dgbmv[1-100-s]
benchmark/pybench/benchmarks/bench_blas.py
CodSpeed Performance Gauge
0%
37.4 µs37.2 µs
test_daxpy[100-z]
benchmark/pybench/benchmarks/bench_blas.py
CodSpeed Performance Gauge
0%
25.8 µs25.6 µs
test_daxpy[1000-s]
benchmark/pybench/benchmarks/bench_blas.py
CodSpeed Performance Gauge
0%
27.4 µs27.3 µs
test_gesv[100-d]
benchmark/pybench/benchmarks/bench_blas.py
CodSpeed Performance Gauge
0%
395.5 µs394 µs
test_nrm2[100-dz]
benchmark/pybench/benchmarks/bench_blas.py
CodSpeed Performance Gauge
0%
28.8 µs28.7 µs
test_nrm2[1000-d]
benchmark/pybench/benchmarks/bench_blas.py
CodSpeed Performance Gauge
0%
30.5 µs30.4 µs
test_dgbmv[1-1000-d]
benchmark/pybench/benchmarks/bench_blas.py
CodSpeed Performance Gauge
0%
83.7 µs83.4 µs
test_dgbmv[1-1000-s]
benchmark/pybench/benchmarks/bench_blas.py
CodSpeed Performance Gauge
0%
75.1 µs75 µs
test_nrm2[100-d]
benchmark/pybench/benchmarks/bench_blas.py
CodSpeed Performance Gauge
0%
36.7 µs36.7 µs
test_daxpy[1000-z]
benchmark/pybench/benchmarks/bench_blas.py
CodSpeed Performance Gauge
0%
40.5 µs40.4 µs
test_dot[100]
benchmark/pybench/benchmarks/bench_blas.py
CodSpeed Performance Gauge
0%
22.3 µs22.2 µs
test_gesdd[mn0-d]
benchmark/pybench/benchmarks/bench_blas.py
CodSpeed Performance Gauge
0%
120 µs119.9 µs
test_dgbmv[1-100-c]
benchmark/pybench/benchmarks/bench_blas.py
CodSpeed Performance Gauge
0%
40.3 µs40.2 µs
test_daxpy[100-c]
benchmark/pybench/benchmarks/bench_blas.py
CodSpeed Performance Gauge
0%
25 µs24.9 µs
test_dgbmv[1-1000-c]
benchmark/pybench/benchmarks/bench_blas.py
CodSpeed Performance Gauge
0%
99.5 µs99.4 µs
test_daxpy[1000-d]
benchmark/pybench/benchmarks/bench_blas.py
CodSpeed Performance Gauge
0%
32.3 µs32.3 µs
test_dgbmv[1-100-z]
benchmark/pybench/benchmarks/bench_blas.py
CodSpeed Performance Gauge
0%
42.1 µs42.1 µs
test_gesv[100-s]
benchmark/pybench/benchmarks/bench_blas.py
CodSpeed Performance Gauge
0%
256.8 µs256.7 µs

Commits

Click on a commit to change the comparison range
Base
develop
9a64b32
+0.12%
mark xbuffer as volatile to work around gcc15.1 optimizer bug
a3b9c93
10 months ago
by martin-frbg
© 2026 CodSpeed Technology
Home Terms Privacy Docs