napi-rs/json-escape-simd - CodSpeed

json-escape-simd


Performance History

Latest Results

chore: release v3.0.3 (#76)
Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>

main

1 day ago

feat(avx512): working, optimized AVX-512 escape kernel + escape_into overflow fix (#86)

- Enable avx512f,avx512bw,avx512vl on the kernel (native vpcmpub) so it actually beats avx2.
- OR-combined single-branch 256B fast path.
- AVX-512 masked load/store tail (no over-store).
- Runtime dispatch gated on avx512bw+avx512vl.
- Fixes a pre-existing escape_into exact-capacity overflow via up-front reserve.
- Adds brute-force differential stress tests vs serde_json.
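The up-front reserve can be illustrated with a scalar sketch. The fuller commit message on the avx512-support branch gives the reservation as len*6+32+3; the factor of 6 matches the longest single-byte JSON escape (`\u00XX`, six bytes), while the source does not explain the +32+3 terms, so they are copied verbatim. `worst_case_capacity` and `escape_into_sketch` are hypothetical names, the byte-at-a-time loop stands in for the SIMD kernels, and emitting the surrounding quotes is an assumption about the output format.

```rust
/// Worst-case reservation, using the formula quoted in the commit message.
fn worst_case_capacity(input_len: usize) -> usize {
    input_len * 6 + 32 + 3
}

/// Hypothetical scalar stand-in for `escape_into`: reserve first, then append.
/// Because capacity is reserved before any write, a caller buffer sized to the
/// exact output can no longer be overrun by wide stores in a real kernel.
fn escape_into_sketch(input: &str, dst: &mut Vec<u8>) {
    // No reallocation happens if `dst` already has enough spare capacity.
    dst.reserve(worst_case_capacity(input.len()));
    dst.push(b'"');
    for &b in input.as_bytes() {
        match b {
            b'"' => dst.extend_from_slice(b"\\\""),
            b'\\' => dst.extend_from_slice(b"\\\\"),
            b'\n' => dst.extend_from_slice(b"\\n"),
            b'\r' => dst.extend_from_slice(b"\\r"),
            b'\t' => dst.extend_from_slice(b"\\t"),
            0x08 => dst.extend_from_slice(b"\\b"),
            0x0c => dst.extend_from_slice(b"\\f"),
            // Remaining control bytes take the six-byte \u00XX form.
            0x00..=0x1f => {
                const HEX: &[u8; 16] = b"0123456789abcdef";
                dst.extend_from_slice(b"\\u00");
                dst.push(HEX[(b >> 4) as usize]);
                dst.push(HEX[(b & 0x0f) as usize]);
            }
            _ => dst.push(b),
        }
    }
    dst.push(b'"');
}
```

Calling `escape_into_sketch` with an empty `Vec` grows it once, before the loop, rather than relying on the caller to have sized the buffer.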

main

2 days ago

test(stress): satisfy clippy manual_repeat_n/manual_str_repeat

CI `cargo clippy --all-targets --all-features -- -D warnings` (rust 1.96) rejected the `repeat(x).take(n)` string builders in the stress test. Switch to `str::repeat` for literal fills and `iter::repeat_n` for the char-variable fills. MSRV-safe (repeat_n stable since 1.82; crate MSRV 1.85). Behavior unchanged; all 3 stress tests still pass.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
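A minimal sketch of the two replacements named in this commit. The helper names `literal_fill` and `char_fill` are hypothetical and do not come from tests/stress.rs; only `str::repeat` and `std::iter::repeat_n` are real standard-library APIs.

```rust
use std::iter;

/// Literal fill: `str::repeat` replaces
/// `iter::repeat("\\\"").take(n).collect::<String>()`.
fn literal_fill(n: usize) -> String {
    "\\\"".repeat(n)
}

/// Char-variable fill: `iter::repeat_n(c, n)` replaces `iter::repeat(c).take(n)`.
fn char_fill(c: char, n: usize) -> String {
    iter::repeat_n(c, n).collect()
}
```

Both forms build the same strings as the `repeat(x).take(n)` versions, which is consistent with the commit's "behavior unchanged" note.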

avx512-support

2 days ago

feat(avx512): working, optimized AVX-512 escape kernel; fix escape_into overflow

The avx512 kernel + feature existed but had never run on AVX-512 hardware and was a regression. Enabling only `avx512f` makes LLVM emulate the u8 byte compares (no native vpcmpub), so it ran ~1.75x slower than avx2, and its <64B tail did a 64-byte speculative store that overflowed the destination buffer on short/dense inputs (heap-buffer-overflow / SIGABRT).

Kernel (src/simd/avx512.rs):
- enable avx512f,avx512bw,avx512vl so LLVM emits vpcmpub + korq + kortestq natively instead of emulating the byte compares
- fast path: a single OR-combined mask branch over each 256B chunk (1 branch, the 4 load+mask chains pipeline) instead of 4 short-circuit `&&` branches
- tail (<64B): AVX-512 masked load/store (maskz_loadu_epi8 + mask_storeu_epi8, k = (1<<nb)-1): page-safe load, writes exactly nb bytes, no over-store

Dispatch (src/lib.rs): gate the avx512 kernel on runtime `avx512bw && avx512vl` (not just `avx512f`) to match the enabled features.

Also fixes a pre-existing soundness hole unrelated to avx512: escape_into never reserved capacity, so a caller buffer sized to the exact output could overflow. Confirmed under ASAN on BOTH the default avx2 path (32-byte write) and the avx512 path (8-byte write via escape_unchecked). Now reserves len*6+32+3 up front (a no-op when the caller already sized dst large enough).

Benchmarks on Intel Xeon 8581C (Emerald Rapids), --features avx512:
  rxjs      249 us    (avx2 289 us,   sonic-rs 311 us)
  fixtures  10.31 ms  (avx2 11.04 ms, sonic-rs 11.66 ms)
  short     80 ns     (avx2 84 ns,    sonic-rs 202 ns)

Adds tests/stress.rs: brute-force differential vs serde_json across all lengths (0..=600) x escape densities, plus an escape_into exact-capacity regression.

Verified: 16 unit + 3 stress tests pass on both paths; ASAN clean.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
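The tail handling described above can be modeled in scalar Rust to show why a mask of k = (1<<nb)-1 touches exactly nb bytes. This is only a sketch: `tail_mask`, `masked_load`, and `masked_store` are hypothetical helpers that imitate the lane semantics of a zero-masking load and a masked store, and they are not the intrinsics or the code in src/simd/avx512.rs.

```rust
/// Tail mask from the commit message: the low `nb` bits set, valid for nb < 64.
fn tail_mask(nb: usize) -> u64 {
    assert!(nb < 64, "the tail is strictly shorter than one 64-byte vector");
    (1u64 << nb) - 1
}

/// Scalar model of a zero-masking load: lanes whose mask bit is clear read as 0
/// and the corresponding source bytes are never accessed.
fn masked_load(src: &[u8], k: u64) -> [u8; 64] {
    let mut lanes = [0u8; 64];
    for i in 0..64 {
        if (k >> i) & 1 == 1 {
            lanes[i] = src[i];
        }
    }
    lanes
}

/// Scalar model of a masked store: only lanes whose mask bit is set are written,
/// so bytes past the tail in `dst` stay untouched.
fn masked_store(dst: &mut [u8], k: u64, lanes: &[u8; 64]) {
    for i in 0..64 {
        if (k >> i) & 1 == 1 {
            dst[i] = lanes[i];
        }
    }
}
```

With a 5-byte tail, `tail_mask(5)` is `0b11111`, so the model reads five source bytes and writes five destination bytes, in contrast to the 64-byte speculative store the commit message identifies as the overflow.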

avx512-support

2 days ago

chore(deps): lock file maintenance (#85)

main

2 days ago

chore(deps): lock file maintenance

renovate/lock-file-maintenance

2 days ago

chore(deps): update actions/checkout action to v7 (#84)
Co-authored-by: renovate[bot] <29139614+renovate[bot]@users.noreply.github.com>

main

8 days ago

chore(deps): update actions/checkout action to v7

renovate/actions-checkout-7.x

15 days ago

Latest Branches

0%

feat(avx512): working, optimized AVX-512 escape kernel + escape_into overflow fix #86

2 days ago

530df03

avx512-support

0%

chore(deps): lock file maintenance #85

2 days ago

a0bf570

renovate/lock-file-maintenance

0%

chore(deps): update actions/checkout action to v7 #84

15 days ago

2ec5c78

renovate/actions-checkout-7.x
