vortex-data / vortex

Add Mojo AOT-compiled SIMD take/filter kernels for primitive arrays

#7387
Comparing claude/plan-mojo-simd-kernels-IDywB (5f2a781) with develop (8d9052e)
CodSpeed Performance Gauge: +82%
Improvement: 34 | Untouched: 1088 | Skipped: 1455

Benchmarks

1455 total
decompress[u32, (1000000, 8192)] | encodings/runend/benches/run_end_compress.rs | Skipped | 3.8 MB*
compress[(1000000, 256)] | encodings/runend/benches/run_end_compress.rs | Skipped | 120 KB*
decompress[u64, (1000, 4)] | encodings/runend/benches/run_end_compress.rs | Skipped | 8 KB*
take_indices[(1000, 4)] | encodings/runend/benches/run_end_compress.rs | Skipped | 2.8 KB*
filter_runend[(1000, 4, 0.03)] | encodings/runend/benches/run_end_filter.rs | Skipped | 3 KB*
decompress[u64, (10000, 256)] | encodings/runend/benches/run_end_compress.rs | Skipped | 78.2 KB*
take_indices[(1000, 16, 0.005)] | encodings/runend/benches/run_end_filter.rs | Skipped | 1.1 KB*
decompress[u8, (1000, 4)] | encodings/runend/benches/run_end_compress.rs | Skipped | 1.2 KB*
decompress[u16, (1000000, 4096)] | encodings/runend/benches/run_end_compress.rs | Skipped | 1.9 MB*
null_count_run_end[(100000, 16, 0.1)] | encodings/runend/benches/run_end_null_count.rs | Skipped | 12.9 KB*
decompress[u8, (100000, 4)] | encodings/runend/benches/run_end_compress.rs | Skipped | 97.9 KB*
decompress[u32, (100000, 16)] | encodings/runend/benches/run_end_compress.rs | Skipped | 390.8 KB*
decompress[u16, (1000000, 1024)] | encodings/runend/benches/run_end_compress.rs | Skipped | 1.9 MB*
null_count_run_end[(10000, 16, 0.01)] | encodings/runend/benches/run_end_null_count.rs | Skipped | 1.9 KB*
pco_canonical[(10000, 0.5)] | encodings/pco/benches/pco.rs | Skipped | 128 B*
decompress[u16, (100000, 4096)] | encodings/runend/benches/run_end_compress.rs | Skipped | 192.2 KB*
decompress[u8, (1000000, 256)] | encodings/runend/benches/run_end_compress.rs | Skipped | 976.7 KB*
decompress[u32, (10000, 1024)] | encodings/runend/benches/run_end_compress.rs | Skipped | 36.2 KB*
decompress[u8, (1000000, 4)] | encodings/runend/benches/run_end_compress.rs | Skipped | 976.8 KB*
decompress[u64, (10000, 4)] | encodings/runend/benches/run_end_compress.rs | Skipped | 78.3 KB*
pco_canonical[(50000, 0.1)] | encodings/pco/benches/pco.rs | Skipped | 128 B*
take_indices[(1000, 256, 0.005)] | encodings/runend/benches/run_end_filter.rs | Skipped | 1 KB*
decompress[u32, (100000, 256)] | encodings/runend/benches/run_end_compress.rs | Skipped | 390.2 KB*
pco_canonical[(50000, 0.9)] | encodings/pco/benches/pco.rs | Skipped | 128 B*
pco_canonical[(50000, 0.5)] | encodings/pco/benches/pco.rs | Skipped | 128 B*

Commits

Base | develop | 8d9052e
0% | Install Mojo SDK in codspeed benchmark CI for vortex-array | 6bfda92 | 5 days ago | by claude
-0.02% | Fix SIGILL in CI: pin Mojo target to x86-64-v3 (AVX2) | 59bb1ea | 5 days ago | by claude
-14.68% | Fix nightly rustfmt: split grouped imports, reorder super:: imports | 64fdd36 | 5 days ago | by claude
+14.63% | Deprioritize Mojo below AVX2 on x86_64 in take dispatch | 457d81a | 5 days ago | by claude
+0.01% | Fix Mojo build in Cargo: pass --target-triple from TARGET env var | 2afb139 | 5 days ago | by claude
0% | Optimize Mojo gather: 4x unroll + skylake target for vpgatherqd | f14c2f8 | 5 days ago | by claude
-0.01% | Add --mtune to Mojo build for better instruction scheduling | d60f190 | 5 days ago | by claude
+48.81% | Promote Mojo to top-priority take kernel when available | 0d0fd77 | 5 days ago | by claude
+2.47% | Merge develop, resolve conflicts in Cargo.toml and take/mod.rs | 900985a | 5 days ago | by claude
-0.05% | Add scalar baseline to runend decode benchmark for codspeed comparison | f2f14b4 | 5 days ago | by claude
+31.42% | Support u64 ends in Mojo runend decode to hit existing benchmarks | d51800a | 5 days ago | by claude
-1.27% | Clean up PR: split kernels per crate, remove unnecessary benchmarks | 7ec0e46 | 5 days ago | by claude
+0.2% | Fix lint: use #[allow(unused)] not #[expect(unused)] for TakeKernelScalar | 4d7e86d | 5 days ago | by claude
0% | Fix Mojo build on macOS: skip --mcpu=native and --target-triple on Apple targets | 5f2a781 | 5 days ago | by joseph-isaacs
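
For readers unfamiliar with the operations named in the title and commits, the following is a minimal Rust sketch of what take (gather) and filter compute on a primitive array, plus a 4x manually unrolled take loop in the spirit of the "4x unroll" commit. It is an illustration under stated assumptions, not the code from this pull request: the function names take_scalar, take_unrolled4, and filter_scalar are hypothetical, the element type is fixed to u32 for brevity, and no SIMD intrinsics or Mojo are involved.

```rust
/// Scalar reference for take: out[i] = values[indices[i]].
/// Hypothetical name; not taken from the vortex codebase.
pub fn take_scalar(values: &[u32], indices: &[u32]) -> Vec<u32> {
    indices.iter().map(|&i| values[i as usize]).collect()
}

/// The same gather with the loop body unrolled four times.
/// The four loads are independent of each other, which gives the
/// compiler and CPU more room to overlap them. Whether this is
/// faster in practice depends on the target and must be measured.
pub fn take_unrolled4(values: &[u32], indices: &[u32]) -> Vec<u32> {
    let mut out = Vec::with_capacity(indices.len());
    let mut chunks = indices.chunks_exact(4);
    for c in &mut chunks {
        let v0 = values[c[0] as usize];
        let v1 = values[c[1] as usize];
        let v2 = values[c[2] as usize];
        let v3 = values[c[3] as usize];
        out.extend_from_slice(&[v0, v1, v2, v3]);
    }
    // Handle the 0 to 3 indices left over after the full chunks.
    for &i in chunks.remainder() {
        out.push(values[i as usize]);
    }
    out
}

/// Scalar reference for filter: keep values[i] where mask[i] is true.
/// Hypothetical name; assumes values and mask have the same length.
pub fn filter_scalar(values: &[u32], mask: &[bool]) -> Vec<u32> {
    values
        .iter()
        .zip(mask)
        .filter(|pair| *pair.1)
        .map(|(&v, _)| v)
        .collect()
}
```

A scalar version like take_scalar is also a convenient correctness baseline: any optimized kernel should return the same output for the same inputs, and out-of-range indices panic here rather than reading invalid memory.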