OpenMathLib
OpenBLAS
BlogDocsChangelog

Small gemm kernel improvements for AArch64

#5091Merged
Comparing
goplanid:develop
(
d1bfa97
) with
develop
(
518e376
)
CodSpeed Performance Gauge
0%
Untouched
62

Benchmarks

62 total
test_dgemv[100-c]
benchmark/pybench/benchmarks/bench_blas.py
CodSpeed Performance Gauge
+1%
148.2 µs147.3 µs
test_dgemv[100-s]
benchmark/pybench/benchmarks/bench_blas.py
CodSpeed Performance Gauge
+1%
108.1 µs107.5 µs
test_dgemv[100-d]
benchmark/pybench/benchmarks/bench_blas.py
CodSpeed Performance Gauge
+1%
145.3 µs144.6 µs
test_dot[1000]
benchmark/pybench/benchmarks/bench_blas.py
CodSpeed Performance Gauge
0%
27.3 µs27.3 µs
test_daxpy[1000-s]
benchmark/pybench/benchmarks/bench_blas.py
CodSpeed Performance Gauge
0%
26.4 µs26.3 µs
test_daxpy[100-z]
benchmark/pybench/benchmarks/bench_blas.py
CodSpeed Performance Gauge
0%
24.8 µs24.8 µs
test_gesdd[mn0-d]
benchmark/pybench/benchmarks/bench_blas.py
CodSpeed Performance Gauge
0%
115.6 µs115.5 µs
test_gesv[100-d]
benchmark/pybench/benchmarks/bench_blas.py
CodSpeed Performance Gauge
0%
394.4 µs394.2 µs
test_gesv[100-c]
benchmark/pybench/benchmarks/bench_blas.py
CodSpeed Performance Gauge
0%
692.7 µs692.4 µs
test_dgemv[1000-s]
benchmark/pybench/benchmarks/bench_blas.py
CodSpeed Performance Gauge
0%
7.7 ms7.7 ms
test_gesv[100-z]
benchmark/pybench/benchmarks/bench_blas.py
CodSpeed Performance Gauge
0%
934.7 µs934.4 µs
test_gesdd[mn1-d]
benchmark/pybench/benchmarks/bench_blas.py
CodSpeed Performance Gauge
0%
93.7 ms93.7 ms
test_daxpy[1000-z]
benchmark/pybench/benchmarks/bench_blas.py
CodSpeed Performance Gauge
0%
39.6 µs39.6 µs
test_gesv[1000-d]
benchmark/pybench/benchmarks/bench_blas.py
CodSpeed Performance Gauge
0%
93.3 ms93.3 ms
test_daxpy[1000-d]
benchmark/pybench/benchmarks/bench_blas.py
CodSpeed Performance Gauge
0%
31 µs31 µs
test_gesv[1000-c]
benchmark/pybench/benchmarks/bench_blas.py
CodSpeed Performance Gauge
0%
188.6 ms188.6 ms
test_gemm[100-c]
benchmark/pybench/benchmarks/bench_blas.py
CodSpeed Performance Gauge
0%
658.5 µs658.5 µs
test_dgemv[1000-d]
benchmark/pybench/benchmarks/bench_blas.py
CodSpeed Performance Gauge
0%
14.7 ms14.7 ms
test_syev[200-d]
benchmark/pybench/benchmarks/bench_blas.py
CodSpeed Performance Gauge
0%
58.5 ms58.5 ms
test_gesdd[mn1-s]
benchmark/pybench/benchmarks/bench_blas.py
CodSpeed Performance Gauge
0%
65 ms65 ms
test_syev[50-s]
benchmark/pybench/benchmarks/bench_blas.py
CodSpeed Performance Gauge
0%
1.3 ms1.3 ms
test_syev[50-d]
benchmark/pybench/benchmarks/bench_blas.py
CodSpeed Performance Gauge
0%
1.4 ms1.4 ms
test_gemm[1000-d]
benchmark/pybench/benchmarks/bench_blas.py
CodSpeed Performance Gauge
0%
239.4 ms239.4 ms
test_syrk[1000-d]
benchmark/pybench/benchmarks/bench_blas.py
CodSpeed Performance Gauge
0%
130.4 ms130.4 ms
test_syrk[1000-c]
benchmark/pybench/benchmarks/bench_blas.py
CodSpeed Performance Gauge
0%
227.5 ms227.5 ms

Commits

Click on a commit to change the comparison range
Base
develop
518e376
-0.05%
small gemm kernel packing modifications
d1bfa97
1 year ago
by goplanid
© 2026 CodSpeed Technology
Home Terms Privacy Docs