napi-rs
json-escape-simd

Performance History

Latest Results

chore: release v3.0.3 (#76)

Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
main
1 day ago
feat(avx512): working, optimized AVX-512 escape kernel; fix escape_into overflow

The avx512 kernel + feature existed but had never run on AVX-512 hardware and was a regression. Enabling only `avx512f` makes LLVM emulate the u8 byte compares (no native vpcmpub), so it ran ~1.75x slower than avx2, and its <64B tail did a 64-byte speculative store that overflowed the destination buffer on short/dense inputs (heap-buffer-overflow / SIGABRT).

Kernel (src/simd/avx512.rs):
- enable avx512f,avx512bw,avx512vl so LLVM emits vpcmpub + korq + kortestq natively instead of emulating the byte compares
- fast path: a single OR-combined mask branch over each 256B chunk (1 branch, the 4 load+mask chains pipeline) instead of 4 short-circuit `&&` branches
- tail (<64B): AVX-512 masked load/store (maskz_loadu_epi8 + mask_storeu_epi8, k = (1<<nb)-1): page-safe load, writes exactly nb bytes, no over-store

Dispatch (src/lib.rs): gate the avx512 kernel on runtime `avx512bw && avx512vl` (not just `avx512f`) to match the enabled features.

Also fixes a pre-existing soundness hole unrelated to avx512: escape_into never reserved capacity, so a caller buffer sized to the exact output could overflow. Confirmed under ASAN on BOTH the default avx2 path (32-byte write) and the avx512 path (8-byte write via escape_unchecked). Now reserves len*6+32+3 up front (a no-op when the caller already sized dst large enough).

Benchmarks on Intel Xeon 8581C (Emerald Rapids), --features avx512:
  rxjs      249 us    (avx2 289 us,   sonic-rs 311 us)
  fixtures  10.31 ms  (avx2 11.04 ms, sonic-rs 11.66 ms)
  short     80 ns     (avx2 84 ns,    sonic-rs 202 ns)

Adds tests/stress.rs: brute-force differential vs serde_json across all lengths (0..=600) x escape densities, plus an escape_into exact-capacity regression.

Verified: 16 unit + 3 stress tests pass on both paths; ASAN clean.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
avx512-support
2 days ago
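
The tail mask and the up-front reservation described in the commit message above can be illustrated with a small scalar sketch. This is not the crate's code: `tail_mask`, `masked_store` and `reserve_hint` are hypothetical names, and the masked store is emulated with a plain loop instead of the AVX-512 intrinsics the commit names. The only constants taken from the source are `k = (1<<nb)-1` and `len*6+32+3`. The factor 6 matches the longest JSON escape for one input byte (`\u00XX`). The commit does not explain the `+32+3` terms, so the sketch leaves them uninterpreted.

```rust
/// Mask selecting the low `nb` byte lanes of a 64-byte vector: k = (1 << nb) - 1.
/// Only meaningful for nb < 64, which is the tail case the commit describes.
pub fn tail_mask(nb: usize) -> u64 {
    assert!(nb < 64, "tail must be shorter than one 64-byte vector");
    (1u64 << nb) - 1
}

/// Scalar stand-in for a masked store (hypothetical helper):
/// lane i of `dst` is overwritten only if bit i of `k` is set,
/// so bytes past the tail are never touched.
pub fn masked_store(dst: &mut [u8; 64], src: &[u8; 64], k: u64) {
    for (i, (d, s)) in dst.iter_mut().zip(src.iter()).enumerate() {
        if (k >> i) & 1 == 1 {
            *d = *s;
        }
    }
}

/// Up-front reservation quoted in the commit message: len * 6 + 32 + 3.
pub fn reserve_hint(len: usize) -> usize {
    len * 6 + 32 + 3
}
```

With `nb = 5`, `tail_mask(5)` is `0b11111`, and `masked_store` writes exactly the first five bytes of `dst`. Calling `Vec::reserve(reserve_hint(input.len()))` before writing is a no-op when the vector already has that much spare capacity, which is consistent with the behaviour the commit describes for callers that pre-size `dst`.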
chore(deps): lock file maintenance
renovate/lock-file-maintenance
2 days ago
chore(deps): update actions/checkout action to v7
renovate/actions-checkout-7.x
15 days ago

Latest Branches

CodSpeed Performance Gauge: 0%
feat(avx512): working, optimized AVX-512 escape kernel + escape_into overflow fix #86
2 days ago
530df03
avx512-support
CodSpeed Performance Gauge: 0%
2 days ago
a0bf570
renovate/lock-file-maintenance
CodSpeed Performance Gauge: 0%
15 days ago
2ec5c78
renovate/actions-checkout-7.x