OpenMathLib
OpenBLAS

Branches performance

Pull requests

fix RVV 1.0 detection code #5432
last run
3 days ago
fix RVV 1.0 detection code

There were a couple of issues with the detection code used to check for RVV 1.0 on kernels that do not support hwprobe.

1. The vtype clobber was missing.
2. The wrong form of vsetvli was being used. The vsetvli x0, x0 form is inappropriate for this use case as it can only be safely used in code where the value of vtype is known. The use of vsetvli x0, x0 here can lead to a failure to detect RVV 1.0 if, for example, the vill bit happens to be set before detect_riscv64_rvv100 is called.

We fix both issues by adding the missing clobber and replacing the first parameter to vsetvli with t0 (which we add to our clobbers).
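The commit message above can be illustrated with a rough sketch. This is not the OpenBLAS source: the function names sketch_vtype_reports_rvv100 and sketch_detect_rvv100 are hypothetical, the "vl" clobber and the preprocessor guard are assumptions, and the sketch only compiles the vector instruction when the toolchain already targets the V extension. It relies on the RVV 1.0 rule that vill is the most significant bit of vtype.

```c
#include <stdint.h>

/*
 * Hypothetical helper (not from OpenBLAS): interpret a vtype value read
 * after a vsetvli. In RVV 1.0 the vill bit is the most significant bit of
 * vtype (bit 63 on RV64); it is set when the requested configuration is
 * not supported.
 */
int sketch_vtype_reports_rvv100(uint64_t vtype)
{
    return (vtype >> 63) == 0;
}

/*
 * Hypothetical probe (assumption, not the actual detect_riscv64_rvv100).
 * Returns 1 if vill is clear after the vsetvli, 0 if it is set, and -1
 * when this sketch was not built for a RISC-V target with V enabled.
 */
int sketch_detect_rvv100(void)
{
#if defined(__riscv) && (__riscv_xlen == 64) && defined(__riscv_vector)
    uint64_t vtype_val;

    /*
     * "vsetvli t0, x0, ..." requests VLMAX and writes the new vl to t0, so
     * it does not depend on the previous contents of vtype, unlike the
     * "vsetvli x0, x0, ..." form. The instruction changes vl and vtype and
     * writes t0, so all three are listed as clobbers.
     */
    __asm__ volatile(
        "vsetvli t0, x0, e8, m1, ta, ma\n\t"
        "csrr %0, vtype\n\t"
        : "=r"(vtype_val)
        :
        : "t0", "vl", "vtype");

    return sketch_vtype_reports_rvv100(vtype_val);
#else
    return -1; /* not built for RV64 with the V extension */
#endif
}
```

A caller would only run such a probe after confirming by other means that executing a vector instruction will not trap; that check is outside the scope of this sketch.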
3 days ago
7fcad02
markdryan:markdyan/fix-rvv-detection
CodSpeed Performance Gauge
0%
disable fp16 flags on RISC-V unless BUILD_HFLOAT16=1

The compiler options that enable 16 bit floating point instructions should not be enabled by default when building the RISCV64_ZVL128B and RISCV64_ZVL256B targets. The zfh and zvfh extensions are not part of the 'V' extension and are not required by any of the RVA profiles. There's no guarantee that kernels built with zfh and zvfh will work correctly on fully compliant RVA23U64 devices.

To fix the issue we only build the RISCV64_ZVL128B and RISCV64_ZVL256B kernels with the half float flags if BUILD_HFLOAT16=1. We also update the RISC-V dynamic detection code to disable the RISCV64_ZVL128B and RISCV64_ZVL256B kernels at runtime if we've built with DYNAMIC_ARCH=1 and BUILD_HFLOAT16=1 and are running on a device that does not support both Zfh and Zvfh.

Fixes: https://github.com/OpenMathLib/OpenBLAS/issues/5428
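The runtime rule described in this commit message can be sketched as a small decision function. This is a minimal illustration under stated assumptions, not the OpenBLAS dynamic detection code: the struct sketch_rv_caps, its fields, and sketch_zvl_kernel_allowed are hypothetical names, and the requirement that RVV 1.0 be present at all is an assumption added for completeness.

```c
#include <stdbool.h>

/* Hypothetical capability record; field names are illustrative only. */
struct sketch_rv_caps {
    bool has_rvv100; /* device implements RVV 1.0 */
    bool has_zfh;    /* scalar half-precision extension */
    bool has_zvfh;   /* vector half-precision extension */
};

/*
 * Hypothetical check: may a RISCV64_ZVL128B or RISCV64_ZVL256B kernel be
 * selected at runtime? built_with_hfloat16 stands in for a build made
 * with BUILD_HFLOAT16=1.
 */
bool sketch_zvl_kernel_allowed(const struct sketch_rv_caps *caps,
                               bool built_with_hfloat16)
{
    /* Assumption: the vector kernels need RVV 1.0 in every case. */
    if (!caps->has_rvv100)
        return false;

    /*
     * Kernels compiled with the half float flags may contain Zfh/Zvfh
     * instructions, so both extensions must be present on the device.
     */
    if (built_with_hfloat16 && !(caps->has_zfh && caps->has_zvfh))
        return false;

    return true;
}
```

Under this sketch, a device with RVV 1.0 but without Zvfh would still be allowed to use the kernels from a default build, and would be rejected only when the library was built with the half float flags.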
4 days ago
ce79fe1
markdryan:markdryan/riscv-hf16-fix
CodSpeed Performance Gauge
0%
Optimize the gemv_t_vector.c kernel for RISCV64_ZVL256B target #5427
last run
9 days ago
riscv64: optimize gemv_t_vector.c
9 days ago
c2cc7a3
yuanjia111:develop
CodSpeed Performance Gauge
0%
Merge remote-tracking branch 'upstream/develop' into issue5417_axpy_sve
10 days ago
e23f9c6
hideaki-motoki:issue5417_axpy_sve
CodSpeed Performance Gauge
0%