spiraldb/fastlanes

[experimental] Unify VBMI untranspose onto one vpermi2b kernel for all widths

#146
Comparing claude/dreamy-goodall-vbmi-uniform (ec0597e) with develop (6c10ea7)
Overall performance change: 0%
Untouched: 158
Skipped: 123

Benchmarks

281 total
Benchmark                  File                              Change  Times (as reported)
scalar_transpose           benches/bit_transpose.rs          0%      444 µs / 444 µs
scalar_untranspose[u64]    benches/bit_transpose.rs          0%      612.5 µs / 612.5 µs
dispatch_untranspose[u16]  benches/bit_transpose.rs          0%      412.4 µs / 412.4 µs
dispatch_untranspose[u8]   benches/bit_transpose.rs          0%      424.3 µs / 424.3 µs
scalar_untranspose[u16]    benches/bit_transpose.rs          0%      297.5 µs / 297.5 µs
unpack_single_16_from_3    benches/bitpacking.rs             0%      11.6 µs / 11.6 µs
cmp_fused[u32, 25]         benches/bitpacking_cmp.rs::bench  0%      4.8 µs / 4.8 µs
cmp_fused[u32, 13]         benches/bitpacking_cmp.rs::bench  0%      4 µs / 4 µs
dispatch_untranspose[u64]  benches/bit_transpose.rs          0%      404.6 µs / 404.6 µs
cmp_fused[u32, 12]         benches/bitpacking_cmp.rs::bench  0%      3.9 µs / 3.9 µs
cmp_fused[u16, 8]          benches/bitpacking_cmp.rs::bench  0%      2.6 µs / 2.6 µs
cmp_fused[u32, 18]         benches/bitpacking_cmp.rs::bench  0%      4.4 µs / 4.4 µs
cmp_fused[u32, 19]         benches/bitpacking_cmp.rs::bench  0%      4.4 µs / 4.4 µs
cmp_fused[u32, 11]         benches/bitpacking_cmp.rs::bench  0%      3.8 µs / 3.8 µs
cmp_fused[u16, 12]         benches/bitpacking_cmp.rs::bench  0%      2.5 µs / 2.5 µs
pack_16_to_3_heap          benches/bitpacking.rs             0%      2.4 µs / 2.4 µs
dispatch_transpose         benches/bit_transpose.rs          0%      386.4 µs / 386.4 µs
cmp_fused[u16, 14]         benches/bitpacking_cmp.rs::bench  0%      2.7 µs / 2.7 µs
cmp_fused[u16, 15]         benches/bitpacking_cmp.rs::bench  0%      2.8 µs / 2.8 µs
cmp_fused[u16, 1]          benches/bitpacking_cmp.rs::bench  0%      1.7 µs / 1.7 µs
dispatch_untranspose[u32]  benches/bit_transpose.rs          0%      409.2 µs / 409.2 µs
cmp_fused[u32, 14]         benches/bitpacking_cmp.rs::bench  0%      4 µs / 4 µs
cmp_fused[u16, 2]          benches/bitpacking_cmp.rs::bench  0%      1.7 µs / 1.7 µs
cmp_fused[u16, 3]          benches/bitpacking_cmp.rs::bench  0%      1.9 µs / 1.9 µs
cmp_fused[u32, 15]         benches/bitpacking_cmp.rs::bench  0%      4.1 µs / 4.1 µs

Commits

Base: claude/dreamy-goodall-3vpHT (6c10ea7)
0%  Unify VBMI untranspose onto a single vpermi2b kernel for all widths (ec0597e), 25 days ago, by joseph-isaacs