vortex-data/vortex

fix: build CUDA kernels as multi-arch fatbin with PTX fallback

#8047 (Merged)
Comparing fix-cuda-ptx-gpu-invalidation (b890192) with develop (a8fb30e)
Performance change: +24%
Improvement: 4
Untouched: 1233

Benchmarks

1237 total
Benchmark | File | Change | Base | Head
fast_eq_out_of_range[4, 65536] | encodings/fastlanes/benches/bitpack_compare.rs | +31% | 246.3 µs | 188.5 µs
fast_eq_out_of_range[16, 65536] | encodings/fastlanes/benches/bitpack_compare.rs | +25% | 291.3 µs | 233.9 µs
fast_lt_out_of_range[16, 65536] | encodings/fastlanes/benches/bitpack_compare.rs | +23% | 306.6 µs | 248.8 µs
chunked_varbinview_opt_canonical_into[(1000, 10)] | vortex-array/benches/chunk_array_builder.rs | +20% | 225.1 µs | 187.9 µs
iter_arrow_buffer[128] | vortex-buffer/benches/vortex_bitbuffer.rs | +7% | 919.4 ns | 861.1 ns
set_indices_arrow_buffer[128] | vortex-buffer/benches/vortex_bitbuffer.rs | +6% | 1,010.3 ns | 951.9 ns
slice_vortex_buffer[128] | vortex-buffer/benches/vortex_bitbuffer.rs | +5% | 1.3 µs | 1.2 µs
slice_vortex_buffer[2048] | vortex-buffer/benches/vortex_bitbuffer.rs | +5% | 1.3 µs | 1.2 µs
slice_vortex_buffer[65536] | vortex-buffer/benches/vortex_bitbuffer.rs | +5% | 1.3 µs | 1.2 µs
slice_vortex_buffer[1024] | vortex-buffer/benches/vortex_bitbuffer.rs | +5% | 1.3 µs | 1.2 µs
slice_vortex_buffer[16384] | vortex-buffer/benches/vortex_bitbuffer.rs | +5% | 1.3 µs | 1.2 µs
encode_varbinview[(10000, 8)] | vortex-array/benches/dict_compress.rs | +4% | 1.1 ms | 1 ms
chunked_dict_fsst_into_canonical[(1000, 100, 100)] | encodings/fsst/benches/chunked_dict_fsst_builder.rs | +3% | 14 ms | 13.7 ms
chunked_dict_fsst_canonical_into[(1000, 100, 100)] | encodings/fsst/benches/chunked_dict_fsst_builder.rs | +3% | 14 ms | 13.6 ms
chunked_dict_fsst_into_canonical[(1000, 1000, 100)] | encodings/fsst/benches/chunked_dict_fsst_builder.rs | +3% | 14.9 ms | 14.5 ms
chunked_dict_fsst_into_canonical[(1000, 100, 10)] | encodings/fsst/benches/chunked_dict_fsst_builder.rs | +3% | 1.5 ms | 1.4 ms
patched_take_10_contiguous | encodings/fastlanes/benches/bitpacking_take.rs | +3% | 31.1 µs | 30.4 µs
chunked_dict_fsst_canonical_into[(1000, 10, 100)] | encodings/fsst/benches/chunked_dict_fsst_builder.rs | +2% | 13.9 ms | 13.6 ms
decode_primitives[f32, (1000, 512)] | vortex-array/benches/dict_compress.rs | +2% | 18.7 µs | 18.2 µs
encode_varbin[(1000, 32)] | vortex-array/benches/dict_compress.rs | +2% | 171.2 µs | 167.3 µs
chunked_dict_fsst_canonical_into[(1000, 1000, 100)] | encodings/fsst/benches/chunked_dict_fsst_builder.rs | +2% | 14.9 ms | 14.5 ms
encode_varbin[(1000, 8)] | vortex-array/benches/dict_compress.rs | +2% | 165.6 µs | 162.1 µs
fast_lt_out_of_range[4, 65536] | encodings/fastlanes/benches/bitpack_compare.rs | +2% | 262.3 µs | 257.2 µs
dict_canonicalize_zipfian[16, 1000] | vortex-array/benches/take_primitive.rs | +2% | 47.8 µs | 47 µs
push_n_vortex_buffer[u32, 128] | vortex-buffer/benches/vortex_buffer.rs | +2% | 1.7 µs | 1.7 µs
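
The percentages listed above appear to follow the ratio base / head - 1 rather than the fraction of time saved. The short Python sketch below checks that reading against the four largest changes. It rests on two assumptions that the report itself does not state: that these four entries are the ones counted under "Improvement", and that the overall figure is a geometric mean of their ratios. The variable and function names are illustrative only.

```python
from math import prod

# (base µs, head µs) copied from the four largest changes in the list above.
timings = {
    "fast_eq_out_of_range[4, 65536]": (246.3, 188.5),
    "fast_eq_out_of_range[16, 65536]": (291.3, 233.9),
    "fast_lt_out_of_range[16, 65536]": (306.6, 248.8),
    "chunked_varbinview_opt_canonical_into[(1000, 10)]": (225.1, 187.9),
}


def pct_change(base: float, head: float) -> float:
    """Speedup expressed as a percentage: (base / head - 1) * 100."""
    return (base / head - 1.0) * 100.0


# Per-benchmark change, to compare with the listed +31%, +25%, +23%, +20%.
changes = {name: pct_change(b, h) for name, (b, h) in timings.items()}

# Assumption: the headline number aggregates the ratios with a geometric mean.
ratios = [b / h for b, h in timings.values()]
geo_mean_pct = (prod(ratios) ** (1.0 / len(ratios)) - 1.0) * 100.0

for name, value in changes.items():
    print(f"{name}: {value:+.1f}%")
print(f"geometric mean: {geo_mean_pct:+.2f}%")
```

Under these assumptions the per-benchmark values round to the listed percentages, and the geometric mean lands within rounding error of the +24.49% reported for the head commit. The inputs are the rounded timings shown in the report, so the last digit may differ.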

Commits

Base: develop (a8fb30e)
+24.49%: compile cuda kernels as fatbin (b890192), 13 hours ago by 0ax1