perf: skip OpenMP prange for small batches
Summary
add a PRANGE_THRESHOLD gate so that to_checksum_address_many uses a serial loop for small batches and prange for larger ones, in both the packed and sequence paths
add inline comments at the loop sites and near the threshold constant to explain the OpenMP-overhead rationale and benchmark provenance
mirror the comments in the generated C output for traceability
Rationale
small batches pay more in OpenMP scheduling overhead than they gain from parallelism, so a serial fast path is faster for them; the accompanying comments make the reason for the split clear to maintainers
Details
introduce a benchmark-driven threshold and an explicit serial/prange split
document the threshold as a “temporary heuristic” that may vary with hardware, runtime, and OpenMP behavior
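
For illustration, the serial/prange split described above could look roughly like the following at the C level. This is a minimal sketch, not the actual generated output: the value 256, the `<` comparison, and the names process_one and process_many are hypothetical stand-ins (only PRANGE_THRESHOLD is named in this change), and the per-item work is a placeholder for the real checksum logic.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical value: the real threshold is benchmark-derived and is not
 * stated in this description. */
#define PRANGE_THRESHOLD 256

/* Placeholder for the per-item work (the real code checksums an address). */
static uint32_t process_one(uint32_t x) {
    return x * 2654435761u;
}

/* Assumed shape of the gate: a plain loop for small batches, an OpenMP
 * parallel loop otherwise. Without OpenMP enabled at compile time the
 * pragma is ignored and both branches run serially. */
void process_many(const uint32_t *in, uint32_t *out, ptrdiff_t n) {
    ptrdiff_t i;
    if (n < PRANGE_THRESHOLD) {
        /* Small batch: skip thread-team startup and scheduling cost. */
        for (i = 0; i < n; i++) {
            out[i] = process_one(in[i]);
        }
    } else {
        /* Large batch: iterations are independent, so split them. */
        #pragma omp parallel for schedule(static)
        for (i = 0; i < n; i++) {
            out[i] = process_one(in[i]);
        }
    }
}
```

Both branches produce identical results; only the scheduling differs, which is why the threshold can be retuned later without affecting correctness.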