vortex-data / vortex

Add Mojo AOT-compiled SIMD take/filter kernels for primitive arrays

#7387
Comparing claude/plan-mojo-simd-kernels-IDywB (5f2a781) with develop (8d9052e)
Performance gauge: +82%
Improvement: 34
Untouched: 1088
Skipped: 1455

Benchmarks

2577 total
| Benchmark | File | Change | Base | Head |
|---|---|---|---|---|
| decompress[u8, (100000, 4096)] | encodings/runend/benches/run_end_compress.rs | +82% | 116.9 µs | 64.4 µs |
| decompress[u16, (100000, 4)] | encodings/runend/benches/run_end_compress.rs | +65% | 936.9 µs | 568.7 µs |
| decompress[u16, (100000, 16)] | encodings/runend/benches/run_end_compress.rs | +64% | 317.7 µs | 194.3 µs |
| decompress[u64, (100000, 4)] | encodings/runend/benches/run_end_compress.rs | +57% | 1,257.7 µs | 799.3 µs |
| decompress[u16, (10000, 4)] | encodings/runend/benches/run_end_compress.rs | +52% | 108.1 µs | 71.2 µs |
| decompress[u8, (100000, 4)] | encodings/runend/benches/run_end_compress.rs | +52% | 816.6 µs | 538.7 µs |
| decompress[u32, (100000, 4)] | encodings/runend/benches/run_end_compress.rs | +51% | 1,030 µs | 682.6 µs |
| decompress[u64, (10000, 4)] | encodings/runend/benches/run_end_compress.rs | +49% | 139.8 µs | 93.9 µs |
| decode_primitives[u8, (10000, 8)] | vortex-array/benches/dict_compress.rs | +47% | 53.5 µs | 36.3 µs |
| decode_primitives[u8, (10000, 2)] | vortex-array/benches/dict_compress.rs | +47% | 53.5 µs | 36.3 µs |
| decode_primitives[u8, (10000, 512)] | vortex-array/benches/dict_compress.rs | +47% | 54 µs | 36.6 µs |
| decode_primitives[u8, (10000, 32)] | vortex-array/benches/dict_compress.rs | +47% | 53.6 µs | 36.5 µs |
| decode_primitives[u8, (10000, 4)] | vortex-array/benches/dict_compress.rs | +46% | 53.5 µs | 36.7 µs |
| decompress[u32, (100000, 16)] | encodings/runend/benches/run_end_compress.rs | +42% | 432.3 µs | 303.6 µs |
| take_indices[(100000, 4)] | encodings/runend/benches/run_end_compress.rs | +42% | 1,364.8 µs | 958.8 µs |
| decompress[u32, (10000, 4)] | encodings/runend/benches/run_end_compress.rs | +42% | 117.1 µs | 82.5 µs |
| decompress[u8, (10000, 4)] | encodings/runend/benches/run_end_compress.rs | +41% | 96.1 µs | 68.2 µs |
| decompress[u8, (100000, 16)] | encodings/runend/benches/run_end_compress.rs | +40% | 243.6 µs | 173.9 µs |
| decompress[u16, (10000, 16)] | encodings/runend/benches/run_end_compress.rs | +37% | 46.3 µs | 33.8 µs |
| decompress[u32, (10000, 16)] | encodings/runend/benches/run_end_compress.rs | +29% | 57.8 µs | 44.8 µs |
| varbinview_zip_block_mask | vortex-array/benches/varbinview_zip.rs | +28% | 3.7 ms | 2.9 ms |
| take_indices[(100000, 16)] | encodings/runend/benches/run_end_compress.rs | +27% | 669.5 µs | 526.7 µs |
| decompress[u64, (100000, 16)] | encodings/runend/benches/run_end_compress.rs | +23% | 609.9 µs | 496.9 µs |
| decompress[u8, (10000, 16)] | encodings/runend/benches/run_end_compress.rs | +23% | 38.9 µs | 31.7 µs |
| decompress[u64, (1000, 4)] | encodings/runend/benches/run_end_compress.rs | +20% | 28.6 µs | 23.9 µs |
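The percentages listed for each benchmark appear to be the relative speedup of the head commit over the base, that is, (base time / head time - 1) × 100. This formula is an inference from the numbers shown here, not something the report states, and the helper names below are hypothetical. A minimal sketch for checking an entry by hand:

```rust
/// Relative speedup of `head` over `base`, in percent.
/// Both arguments are timings in the same unit (for example µs).
/// Assumed formula (inferred from the reported figures): (base / head - 1) * 100.
fn improvement_percent(base: f64, head: f64) -> f64 {
    (base / head - 1.0) * 100.0
}

/// The same value rounded to the nearest whole percent,
/// which is how the figures are displayed in the list above.
fn rounded_improvement(base: f64, head: f64) -> i64 {
    improvement_percent(base, head).round() as i64
}
```

For the first entry, `rounded_improvement(116.9, 64.4)` gives 82. Because the displayed timings are themselves rounded, a recomputed value can differ from the listed percentage by a point.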

Commits

Base: develop (8d9052e)
0%
6bfda92 Install Mojo SDK in codspeed benchmark CI for vortex-array (5 days ago, by claude)
-0.02%
59bb1ea Fix SIGILL in CI: pin Mojo target to x86-64-v3 (AVX2) (5 days ago, by claude)
-14.68%
64fdd36 Fix nightly rustfmt: split grouped imports, reorder super:: imports (5 days ago, by claude)
+14.63%
457d81a Deprioritize Mojo below AVX2 on x86_64 in take dispatch (5 days ago, by claude)
+0.01%
2afb139 Fix Mojo build in Cargo: pass --target-triple from TARGET env var (5 days ago, by claude)
0%
f14c2f8 Optimize Mojo gather: 4x unroll + skylake target for vpgatherqd (5 days ago, by claude)
-0.01%
d60f190 Add --mtune to Mojo build for better instruction scheduling (5 days ago, by claude)
+48.81%
0d0fd77 Promote Mojo to top-priority take kernel when available (5 days ago, by claude)
+2.47%
900985a Merge develop, resolve conflicts in Cargo.toml and take/mod.rs (5 days ago, by claude)
-0.05%
f2f14b4 Add scalar baseline to runend decode benchmark for codspeed comparison (5 days ago, by claude)
+31.42%
d51800a Support u64 ends in Mojo runend decode to hit existing benchmarks (5 days ago, by claude)
-1.27%
7ec0e46 Clean up PR: split kernels per crate, remove unnecessary benchmarks (5 days ago, by claude)
+0.2%
4d7e86d Fix lint: use #[allow(unused)] not #[expect(unused)] for TakeKernelScalar (5 days ago, by claude)
0%
5f2a781 Fix Mojo build on macOS: skip --mcpu=native and --target-triple on Apple targets (5 days ago, by joseph-isaacs)