Commits
Click on a commit to change the comparison rangeperf(allocator/vec2): optimize reserving memory (#9792)
resolve: https://github.com/oxc-project/oxc/pull/9656#issuecomment-2723678880
#9656 brought a small performance improvement for the transformer but also led the parser to 1% performance hits. This PR returns performance by splitting `reserve_internal` to `reserve_exact_internal` and `reserve_amortized_internal` respectively the internal implementation of `reserve_exact` and `reserve`.
Why the change can improve performance?
The original `reserve_internal` implementation has a check for reserve strategy, https://github.com/oxc-project/oxc/blob/fef680a4775559805e99622fb5aa6155cdf47034/crates/oxc_allocator/src/vec2/raw_vec.rs#L664-L668 which is can be avoided because the caller of `reserve_internal` already knows the reserve strategy. After the change, the `reserve_exact` and `reserve` can call the corresponding internal implementation directly, which can avoid unnecessary checks.
Likewise, the `Fallibility` check can also be avoided, https://github.com/oxc-project/oxc/blob/fef680a4775559805e99622fb5aa6155cdf47034/crates/oxc_allocator/src/vec2/raw_vec.rs#L681-L683
because we know where the errors should be handled.
~~Due to this change, I also replaced Bumpalo's `CollecitonAllocErr` with allocator-api2's `TryReserveError` because `CollecitonAllocErr::AllocErr` cannot pass in a `Layout`.~~ I ended up reverting https://github.com/oxc-project/oxc/pull/9792/commits/937c61a65815f84d6998764e77b1aabac4a8dc71 as it caused transformer performance 1%-2% regression (See [codspeed](https://codspeed.io/oxc-project/oxc/branches/03-15-pref_allocator_vec2_optimize_reserving_memory) and switch to "replace CollectionAllocErr with TryReserveError" commit), and replaced by https://github.com/oxc-project/oxc/pull/9792/commits/84edacdcc131e1891a02e9f5862cbf52da71a355 I've tried various way to save the performance but it not work. I suspect the cause is that `TryReserveError` is 16 bytes whereas `CollecitonAllocErr` is only 1 byte.
So, after both checks are removed, the performance returns to the original. The whole change is according to standard `RawVec`'s implementation. See https://doc.rust-lang.org/src/alloc/raw_vec.rs.html
<img width="608" alt="image" src="https://github.com/user-attachments/assets/53066d8e-26f0-4eb1-8f33-4ca9e517e75b" />