vortex-data/vortex

bitunpacking cuda kernels store output into shared memory before copying to main memory
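The title describes a staging pattern: unpacked values are first written to a small, fast buffer and only then copied out in bulk to the destination. The sketch below is a CPU-side analogy in Rust, not the CUDA code from this pull request. The function name `unpack_staged`, the tile size, and the simple little-endian bit layout are all assumptions made for illustration; the fixed-size `staging` array merely stands in for GPU shared memory.

```rust
/// Number of values staged per tile (hypothetical; stands in for one block's shared-memory tile).
const TILE: usize = 32;

/// Unpack `count` values of `bit_width` bits each from `packed`.
/// Assumed layout: values are packed back to back, least significant bit first.
fn unpack_staged(packed: &[u32], bit_width: usize, count: usize) -> Vec<u32> {
    assert!((1..=32).contains(&bit_width));
    assert!(packed.len() * 32 >= count * bit_width);

    let mask: u64 = (1u64 << bit_width) - 1;
    let mut out = vec![0u32; count];
    // Small scratch buffer: the analogue of shared memory in this sketch.
    let mut staging = [0u32; TILE];

    let mut start = 0;
    while start < count {
        let len = TILE.min(count - start);

        // Phase 1: unpack one tile into the staging buffer.
        for i in 0..len {
            let bit = (start + i) * bit_width;
            let word = bit / 32;
            let shift = bit % 32;
            let lo = packed[word] as u64;
            // A value may straddle two words, so read the next word when it exists.
            let hi = if word + 1 < packed.len() { packed[word + 1] as u64 } else { 0 };
            let window = lo | (hi << 32);
            staging[i] = ((window >> shift) & mask) as u32;
        }

        // Phase 2: copy the whole tile to the destination in one contiguous write.
        out[start..start + len].copy_from_slice(&staging[..len]);
        start += len;
    }
    out
}
```

For example, `unpack_staged(&[22737], 3, 5)` returns `[1, 2, 3, 4, 5]`, because 22737 is those five values packed at 3 bits each. On a GPU the motivation for the two phases would be that the final copy can be issued as contiguous writes; on a CPU the split only illustrates the structure.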

#6384 (Merged)
Comparing rk/fasterbitpack (d196877) with develop (3cb7fab)
Performance change: -13%
Improvement: 1
Regression: 3
Untouched: 1134
Skipped: 1265

Benchmarks

2403 total
| Benchmark | File | Change | Base | Head |
| --- | --- | --- | --- | --- |
| true_count_vortex_buffer[128] | vortex-buffer/benches/vortex_bitbuffer.rs | -13% | 984.7 ns | 1,130.6 ns |
| true_count_vortex_buffer[1024] | vortex-buffer/benches/vortex_bitbuffer.rs | -12% | 1.1 µs | 1.2 µs |
| true_count_vortex_buffer[2048] | vortex-buffer/benches/vortex_bitbuffer.rs | -10% | 1.2 µs | 1.4 µs |
| true_count_arrow_buffer[128] | vortex-buffer/benches/vortex_bitbuffer.rs | +10% | 946.9 ns | 859.4 ns |
| true_count_arrow_buffer[1024] | vortex-buffer/benches/vortex_bitbuffer.rs | +6% | 980.3 ns | 921.9 ns |
| bitwise_not_vortex_buffer_mut[128] | vortex-buffer/benches/vortex_bitbuffer.rs | +6% | 530.3 ns | 501.1 ns |
| true_count_arrow_buffer[2048] | vortex-buffer/benches/vortex_bitbuffer.rs | +5% | 1.1 µs | 1.1 µs |
| set_indices_arrow_buffer[128] | vortex-buffer/benches/vortex_bitbuffer.rs | +5% | 1.2 µs | 1.2 µs |
| bitwise_not_vortex_buffer_mut[1024] | vortex-buffer/benches/vortex_bitbuffer.rs | +4% | 681.9 ns | 652.8 ns |
| binary_search_std | vortex-array/benches/search_sorted.rs | +4% | 712.2 ns | 683.1 ns |
| bitwise_not_vortex_buffer_mut[2048] | vortex-buffer/benches/vortex_bitbuffer.rs | +4% | 846.9 ns | 817.8 ns |
| binary_search_vortex | vortex-array/benches/search_sorted.rs | +3% | 909.7 ns | 880.6 ns |
| decode_primitives[f32, (1000, 4)] | vortex-array/benches/dict_compress.rs | +3% | 24.7 µs | 24 µs |
| value_arrow_buffer[128] | vortex-buffer/benches/vortex_bitbuffer.rs | +3% | 1.1 µs | 1.1 µs |
| null_count_run_end[(10000, 256, 0.01)] | encodings/runend/benches/run_end_null_count.rs | +3% | 3.4 µs | 3.3 µs |
| null_count_run_end[(100000, 1024, 0.01)] | encodings/runend/benches/run_end_null_count.rs | +3% | 3.4 µs | 3.3 µs |
| null_count_run_end[(10000, 1024, 0.01)] | encodings/runend/benches/run_end_null_count.rs | +3% | 3.4 µs | 3.3 µs |
| decode_primitives[f32, (1000, 32)] | vortex-array/benches/dict_compress.rs | +3% | 25.6 µs | 24.9 µs |
| append_arrow_buffer[128] | vortex-buffer/benches/vortex_bitbuffer.rs | +2% | 1.5 µs | 1.4 µs |
| take_map[(0.05, 0.05)] | vortex-array/benches/take_patches.rs | +2% | 462.8 µs | 454.6 µs |
| true_count_arrow_buffer[16384] | vortex-buffer/benches/vortex_bitbuffer.rs | +2% | 3.5 µs | 3.5 µs |
| expand_buffer[u32, (256, 0.1)] | vortex-compute/benches/expand_buffer.rs | +2% | 3.7 µs | 3.7 µs |
| expand_buffer[u32, (256, 0.5)] | vortex-compute/benches/expand_buffer.rs | +1% | 4.1 µs | 4 µs |
| append_buffer_arrow_buffer[128] | vortex-buffer/benches/vortex_bitbuffer.rs | +1% | 4.5 µs | 4.4 µs |
| take_map[(0.1, 0.5)] | vortex-array/benches/take_patches.rs | +1% | 2.1 ms | 2.1 ms |
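
The percentages in the benchmark list appear consistent with the first time divided by the second time, minus one, so a slower second measurement gives a negative value. This formula is inferred from the numbers on this page rather than taken from CodSpeed's documentation, and the helper name `relative_change_percent` is hypothetical. A minimal Rust sketch under that assumption:

```rust
/// Relative change in percent, ASSUMING change = first / second - 1.
/// Both arguments must use the same time unit (for example nanoseconds).
fn relative_change_percent(first: f64, second: f64) -> f64 {
    (first / second - 1.0) * 100.0
}
```

With the timings for `true_count_vortex_buffer[128]` (984.7 ns and 1,130.6 ns) this rounds to -13, and with those for `true_count_arrow_buffer[128]` (946.9 ns and 859.4 ns) it rounds to +10, matching the reported values. Rows whose two times look identical (such as 1.1 µs and 1.1 µs) differ only below the displayed precision, so their percentages cannot be rechecked from this page.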

Commits

Base: develop (3cb7fab)
+0.01%: bitunpacking cuda kernels store output into shared memory before copying to main memory (af6471f, 13 days ago, by robert3005)
-26.12%: more (172b510, 13 days ago, by robert3005)
+13.22%: better? (d196877, 13 days ago, by robert3005)