yk/proof-result-batching - Branch - paradigmxyz/reth

perf(trie): add batching for storage proof results [WIP]

#19792

Comparing

yk/proof-result-batching

(

a5afac0

) with

main

(

2ade18d

)

Untouched: 81

Benchmarks

Passed

remove_leaf[1000]

crates/trie/sparse/benches/update.rs::benches::remove_leaf

+7%

281.1 µs262.9 µs

hash builder[init size 10000 | update size 100 | num updates 3]

crates/trie/sparse/benches/root.rs::root::calculate_root_from_leaves_repeated::calculate root from leaves repeated

+1%

27.3 ms27 ms

sparse trie[1000]

crates/trie/sparse/benches/root.rs::root::calculate_root_from_leaves::calculate root from leaves

+1%

5.7 ms5.7 ms

remove_leaf[5000]

crates/trie/sparse/benches/update.rs::benches::remove_leaf

+1%

1.2 ms1.1 ms

prefix set | size: 10 | `BTreeSet` with `BTreeSet:range` lookup

crates/trie/common/benches/prefix_set.rs::prefix_set::prefix_set_lookups::Prefix Set Lookups

+1%

5.1 µs5 µs

size 100000 | updated 0.1% | depth 5

crates/trie/sparse/benches/rlp_node.rs::rlp_node::update_rlp_node_level::update rlp node level

1.1 ms1.1 ms

hash builder[init size 10000 | update size 1000 | num updates 10]

crates/trie/sparse/benches/root.rs::root::calculate_root_from_leaves_repeated::calculate root from leaves repeated

253.7 ms253.1 ms

sparse trie[init size 1000 | update size 100 | num updates 1]

crates/trie/sparse/benches/root.rs::root::calculate_root_from_leaves_repeated::calculate root from leaves repeated

1.1 ms1.1 ms

prefix set | size: 100 | `BTreeSet` with `BTreeSet:range` lookup

crates/trie/common/benches/prefix_set.rs::prefix_set::prefix_set_lookups::Prefix Set Lookups

40.5 µs40.4 µs

sparse trie[init size 10000 | update size 1000 | num updates 3]

crates/trie/sparse/benches/root.rs::root::calculate_root_from_leaves_repeated::calculate root from leaves repeated

33.9 ms33.8 ms

sparse trie[5000]

crates/trie/sparse/benches/root.rs::root::calculate_root_from_leaves::calculate root from leaves

28.3 ms28.3 ms

hash builder[5000]

crates/trie/sparse/benches/root.rs::root::calculate_root_from_leaves::calculate root from leaves

22.3 ms22.3 ms

size 100000 | updated 1% | depth 4

crates/trie/sparse/benches/rlp_node.rs::rlp_node::update_rlp_node_level::update rlp node level

11.4 ms11.3 ms

receipts root | size: 10 | HashBuilder

crates/trie/trie/benches/trie_root.rs::benches::trie_root_benchmark::Receipts root calculation

112 µs111.9 µs

sparse trie[init size 1000 | update size 100 | num updates 3]

crates/trie/sparse/benches/root.rs::root::calculate_root_from_leaves_repeated::calculate root from leaves repeated

3.3 ms3.3 ms

sparse trie[init size 10000 | update size 1000 | num updates 1]

crates/trie/sparse/benches/root.rs::root::calculate_root_from_leaves_repeated::calculate root from leaves repeated

11.7 ms11.6 ms

sparse trie[init size 1000 | update size 1000 | num updates 1]

crates/trie/sparse/benches/root.rs::root::calculate_root_from_leaves_repeated::calculate root from leaves repeated

5.1 ms5.1 ms

sparse trie[init size 1000 | update size 100 | num updates 5]

crates/trie/sparse/benches/root.rs::root::calculate_root_from_leaves_repeated::calculate root from leaves repeated

5.4 ms5.4 ms

hash builder[init size 10000 | update size 100 | num updates 1]

crates/trie/sparse/benches/root.rs::root::calculate_root_from_leaves_repeated::calculate root from leaves repeated

9.6 ms9.6 ms

parallel hashing[100]

crates/trie/trie/benches/hash_post_state.rs::post_state::hash_post_state::Hash Post State

259 ms258.7 ms

sequence hashing[100]

crates/trie/trie/benches/hash_post_state.rs::post_state::hash_post_state::Hash Post State

258.9 ms258.7 ms

sparse trie[init size 1000 | update size 1000 | num updates 5]

crates/trie/sparse/benches/root.rs::root::calculate_root_from_leaves_repeated::calculate root from leaves repeated

24.9 ms24.9 ms

validate_blob | num blobs: 1 | ValidateBlob

crates/primitives/benches/validate_blob_tx.rs::validate_blob::blob_validation::Blob Transaction KZG validation

149.8 µs149.8 µs

sparse trie[init size 10000 | update size 100 | num updates 5]

crates/trie/sparse/benches/root.rs::root::calculate_root_from_leaves_repeated::calculate root from leaves repeated

11.1 ms11.1 ms

prefix set | size: 100 | `BTreeSet` with `Iterator:any` lookup

crates/trie/common/benches/prefix_set.rs::prefix_set::prefix_set_lookups::Prefix Set Lookups

149.3 µs149.3 µs

prefix set | size: 1000 | `Vec` with custom cursor lookup

crates/trie/common/benches/prefix_set.rs::prefix_set::prefix_set_lookups::Prefix Set Lookups

146.2 µs146.1 µs

receipts root | size: 1000 | HashBuilder

crates/trie/trie/benches/trie_root.rs::benches::trie_root_benchmark::Receipts root calculation

9.1 ms9.1 ms

sparse trie[init size 10000 | update size 1000 | num updates 5]

crates/trie/sparse/benches/root.rs::root::calculate_root_from_leaves_repeated::calculate root from leaves repeated

56 ms56 ms

validate_blob | num blobs: 3 | ValidateBlob

crates/primitives/benches/validate_blob_tx.rs::validate_blob::blob_validation::Blob Transaction KZG validation

150.9 µs150.8 µs

sparse trie[init size 10000 | update size 1000 | num updates 10]

crates/trie/sparse/benches/root.rs::root::calculate_root_from_leaves_repeated::calculate root from leaves repeated

111.4 ms111.4 ms

prefix set | size: 1000 | `BTreeSet` with `Iterator:any` lookup

crates/trie/common/benches/prefix_set.rs::prefix_set::prefix_set_lookups::Prefix Set Lookups

14.9 ms14.9 ms

sparse trie[init size 10000 | update size 100 | num updates 3]

crates/trie/sparse/benches/root.rs::root::calculate_root_from_leaves_repeated::calculate root from leaves repeated

6.7 ms6.7 ms

hash builder[init size 10000 | update size 100 | num updates 5]

crates/trie/sparse/benches/root.rs::root::calculate_root_from_leaves_repeated::calculate root from leaves repeated

44 ms44 ms

recover ECDSA

crates/primitives/benches/recover_ecdsa_crit.rs::benches::criterion_benchmark

206.8 µs206.8 µs

prefix set | size: 1000 | `Vec` with binary search lookup

crates/trie/common/benches/prefix_set.rs::prefix_set::prefix_set_lookups::Prefix Set Lookups

205.7 µs205.6 µs

sparse trie[init size 10000 | update size 100 | num updates 1]

crates/trie/sparse/benches/root.rs::root::calculate_root_from_leaves_repeated::calculate root from leaves repeated

2.3 ms2.3 ms

validate_blob | num blobs: 4 | ValidateBlob

crates/primitives/benches/validate_blob_tx.rs::validate_blob::blob_validation::Blob Transaction KZG validation

151.9 µs151.9 µs

hash builder[init size 1000 | update size 1000 | num updates 10]

crates/trie/sparse/benches/root.rs::root::calculate_root_from_leaves_repeated::calculate root from leaves repeated

57.2 ms57.2 ms

sequence hashing[1000]

crates/trie/trie/benches/hash_post_state.rs::post_state::hash_post_state::Hash Post State

2.6 s2.6 s

prefix set | size: 1000 | `BTreeSet` with `BTreeSet:range` lookup

crates/trie/common/benches/prefix_set.rs::prefix_set::prefix_set_lookups::Prefix Set Lookups

473 µs473 µs

prefix set | size: 100 | `Vec` with custom cursor lookup

crates/trie/common/benches/prefix_set.rs::prefix_set::prefix_set_lookups::Prefix Set Lookups

16.7 µs16.7 µs

sparse trie[init size 1000 | update size 100 | num updates 10]

crates/trie/sparse/benches/root.rs::root::calculate_root_from_leaves_repeated::calculate root from leaves repeated

10.8 ms10.8 ms

validate_blob | num blobs: 5 | ValidateBlob

crates/primitives/benches/validate_blob_tx.rs::validate_blob::blob_validation::Blob Transaction KZG validation

153.5 µs153.5 µs

hash builder[init size 10000 | update size 1000 | num updates 1]

crates/trie/sparse/benches/root.rs::root::calculate_root_from_leaves_repeated::calculate root from leaves repeated

26.3 ms26.3 ms

update_leaf[5000]

crates/trie/sparse/benches/update.rs::benches::update_leaf

178.7 µs178.7 µs

ordered_trie_root

crates/trie/trie/benches/trie_root.rs::benches::trie_root_benchmark::Receipts root calculation::receipts root | size: 1000 | triehash

11.8 ms11.8 ms

ordered_trie_root

crates/trie/trie/benches/trie_root.rs::benches::trie_root_benchmark::Receipts root calculation::receipts root | size: 10 | triehash

131.9 µs131.9 µs

sparse trie[init size 10000 | update size 100 | num updates 10]

crates/trie/sparse/benches/root.rs::root::calculate_root_from_leaves_repeated::calculate root from leaves repeated

22 ms22.1 ms

validate_blob | num blobs: 6 | ValidateBlob

crates/primitives/benches/validate_blob_tx.rs::validate_blob::blob_validation::Blob Transaction KZG validation

154.9 µs155 µs

ordered_trie_root

crates/trie/trie/benches/trie_root.rs::benches::trie_root_benchmark::Receipts root calculation::receipts root | size: 100 | triehash

1.2 ms1.2 ms

parallel hashing[1000]

crates/trie/trie/benches/hash_post_state.rs::post_state::hash_post_state::Hash Post State

2.6 s2.6 s

sparse trie[init size 1000 | update size 1000 | num updates 3]

crates/trie/sparse/benches/root.rs::root::calculate_root_from_leaves_repeated::calculate root from leaves repeated

15 ms15 ms

receipts root | size: 100 | HashBuilder

crates/trie/trie/benches/trie_root.rs::benches::trie_root_benchmark::Receipts root calculation

937.4 µs937.9 µs

hash builder[1000]

crates/trie/sparse/benches/root.rs::root::calculate_root_from_leaves::calculate root from leaves

4.5 ms4.5 ms

hash builder[init size 1000 | update size 1000 | num updates 3]

crates/trie/sparse/benches/root.rs::root::calculate_root_from_leaves_repeated::calculate root from leaves repeated

17.3 ms17.3 ms

sparse trie[init size 1000 | update size 1000 | num updates 10]

crates/trie/sparse/benches/root.rs::root::calculate_root_from_leaves_repeated::calculate root from leaves repeated

49.6 ms49.6 ms

hash builder[init size 1000 | update size 1000 | num updates 5]

crates/trie/sparse/benches/root.rs::root::calculate_root_from_leaves_repeated::calculate root from leaves repeated

28.7 ms28.7 ms

hash builder[init size 10000 | update size 1000 | num updates 5]

crates/trie/sparse/benches/root.rs::root::calculate_root_from_leaves_repeated::calculate root from leaves repeated

127.3 ms127.5 ms

hash builder[init size 10000 | update size 100 | num updates 10]

crates/trie/sparse/benches/root.rs::root::calculate_root_from_leaves_repeated::calculate root from leaves repeated

85.9 ms86 ms

hash builder[init size 1000 | update size 100 | num updates 10]

crates/trie/sparse/benches/root.rs::root::calculate_root_from_leaves_repeated::calculate root from leaves repeated

24.6 ms24.7 ms

prefix set | size: 100 | `Vec` with binary search lookup

crates/trie/common/benches/prefix_set.rs::prefix_set::prefix_set_lookups::Prefix Set Lookups

19.4 µs19.4 µs

validate_blob | num blobs: 2 | ValidateBlob

crates/primitives/benches/validate_blob_tx.rs::validate_blob::blob_validation::Blob Transaction KZG validation

149.6 µs149.9 µs

size 100000 | updated 0.1% | depth 4

crates/trie/sparse/benches/rlp_node.rs::rlp_node::update_rlp_node_level::update rlp node level

1.4 ms1.4 ms

hash builder[init size 1000 | update size 100 | num updates 5]

crates/trie/sparse/benches/root.rs::root::calculate_root_from_leaves_repeated::calculate root from leaves repeated

12.4 ms12.4 ms

hash builder[init size 10000 | update size 1000 | num updates 3]

crates/trie/sparse/benches/root.rs::root::calculate_root_from_leaves_repeated::calculate root from leaves repeated

77.3 ms77.5 ms

hash builder[init size 1000 | update size 1000 | num updates 1]

crates/trie/sparse/benches/root.rs::root::calculate_root_from_leaves_repeated::calculate root from leaves repeated

5.8 ms5.8 ms

size 100000 | updated 0.1% | depth 3

crates/trie/sparse/benches/rlp_node.rs::rlp_node::update_rlp_node_level::update rlp node level

2.5 ms2.5 ms

size 100000 | updated 1% | depth 2

crates/trie/sparse/benches/rlp_node.rs::rlp_node::update_rlp_node_level::update rlp node level

-1%

24.5 ms24.7 ms

size 100000 | updated 1% | depth 0

crates/trie/sparse/benches/rlp_node.rs::rlp_node::update_rlp_node_level::update rlp node level

-1%

24.7 ms24.8 ms

size 100000 | updated 1% | depth 1

crates/trie/sparse/benches/rlp_node.rs::rlp_node::update_rlp_node_level::update rlp node level

-1%

24.7 ms24.9 ms

prefix set | size: 10 | `Vec` with binary search lookup

crates/trie/common/benches/prefix_set.rs::prefix_set::prefix_set_lookups::Prefix Set Lookups

-1%

3.8 µs3.9 µs

size 100000 | updated 1% | depth 3

crates/trie/sparse/benches/rlp_node.rs::rlp_node::update_rlp_node_level::update rlp node level

-1%

21.7 ms21.9 ms

prefix set | size: 10 | `Vec` with custom cursor lookup

crates/trie/common/benches/prefix_set.rs::prefix_set::prefix_set_lookups::Prefix Set Lookups

-1%

3.2 µs3.3 µs

size 100000 | updated 0.1% | depth 1

crates/trie/sparse/benches/rlp_node.rs::rlp_node::update_rlp_node_level::update rlp node level

-1%

3.6 ms3.6 ms

hash builder[init size 1000 | update size 100 | num updates 3]

crates/trie/sparse/benches/root.rs::root::calculate_root_from_leaves_repeated::calculate root from leaves repeated

-1%

7.5 ms7.5 ms

hash builder[init size 1000 | update size 100 | num updates 1]

crates/trie/sparse/benches/root.rs::root::calculate_root_from_leaves_repeated::calculate root from leaves repeated

-1%

2.5 ms2.5 ms

size 100000 | updated 0.1% | depth 0

crates/trie/sparse/benches/rlp_node.rs::rlp_node::update_rlp_node_level::update rlp node level

-1%

3.6 ms3.6 ms

update_leaf[1000]

crates/trie/sparse/benches/update.rs::benches::update_leaf

-1%

112.2 µs113.4 µs

size 100000 | updated 0.1% | depth 2

crates/trie/sparse/benches/rlp_node.rs::rlp_node::update_rlp_node_level::update rlp node level

-1%

3.4 ms3.4 ms

size 100000 | updated 1% | depth 5

crates/trie/sparse/benches/rlp_node.rs::rlp_node::update_rlp_node_level::update rlp node level

-1%

8.2 ms8.3 ms

prefix set | size: 10 | `BTreeSet` with `Iterator:any` lookup

crates/trie/common/benches/prefix_set.rs::prefix_set::prefix_set_lookups::Prefix Set Lookups

-2%

3.8 µs3.9 µs

Commits

Click on a commit to change the comparison range

Base

main

2ade18d

+0.13%

perf(trie): add adaptive batching for storage proof results Problem: Storage proof workers send one ProofResultMessage per proof through crossbeam channels. For blocks with many small storage changes (100+ accounts), this creates 100+ individual send/recv syscalls, adding significant overhead. Solution: Implement adaptive batching at the worker level that collects multiple storage proof jobs based on queue pressure and processes them together. Changes: - Add batching constants (MAX_BATCH_SIZE: 32, MIN_QUEUE_FOR_BATCHING: 2) - Add BatchedProofResults type for batched proof containers - Implement try_collect_batch_static() for adaptive batch collection - Modify StorageProofWorker::run() to use batching when beneficial - Add process_storage_proof_batch() for batch processing - Preserve individual channels to maintain existing architecture Batching Strategy: - Queue depth < 2: Process individually (minimize latency) - Queue depth 2-32: Batch = queue depth (balanced) - Queue depth > 32: Batch = 32 (maximize throughput) - Blinded node requests: Never batch (latency-sensitive) Expected Impact: - 70%+ reduction in channel syscalls under high load - No latency regression for low-load scenarios - Better CPU cache utilization through sequential processing Baseline: 100 storage proofs = 200 syscalls (100 sends + 100 recvs) With batching (avg batch size 4-8): ~30-50 syscalls = 75-85% reduction

f6f41c9

23 hours ago

by yongkangc

+0.12%

fix(trie): address code review issues in proof batching Fixed critical bugs and issues identified in code review: 1. **Critical: Fix job loss bug for BlindedStorageNode requests** - Previously, blinded node jobs were silently dropped during batch collection, causing indefinite hangs - Now properly includes all job types in batch and separates them for appropriate processing 2. **Remove unused BatchedProofResults struct** - Struct was defined but never instantiated or used - Removed to eliminate dead code and confusion 3. **Fix misleading documentation** - Clarified that batching optimizes job *processing*, not result sending - Updated docs to accurately reflect performance benefits: * Reduced recv() syscalls on work queue * Better CPU cache locality * Reduced context switching overhead - Removed unsubstantiated claims about channel syscall reduction 4. **Improve job handling** - Worker loop now properly separates storage proofs from blinded nodes - Storage proofs batched for cache benefits when multiple available - Blinded node requests always processed individually (latency-sensitive) 5. **Code quality improvements** - Added accurate inline documentation - Fixed move-after-use error with job type checking - Ensured all jobs are processed, never lost Changes address all issues raised in PR review including: - Job loss bug (P0) - Dead code (P1) - Misleading performance claims (P1) - Proper mixed job type handling (P1) All clippy checks pass with -D warnings.

882bea7

16 hours ago

by yongkangc

-0.05%

refactor: simplify proof result batching implementation Simplifies the batching implementation based on code review feedback: - Remove unnecessary performance comments from documentation - Streamline function documentation to focus on behavior - Fix unused worker_id parameter warning - Keep code clear and concise All fmt and clippy checks pass.

46c4bc2

15 hours ago

by yongkangc

-0.24%

fix(trie): address code review issues in proof batching **Problem:** Code review identified several issues with the batching implementation: - Redundant pattern matching (jobs classified by type twice) - Unnecessary process_storage_proof_batch wrapper function - Unused worker_id parameter in try_collect_batch_static **Solution:** Simplify the implementation by removing redundancy: - Destructure jobs immediately when separating by type - Remove process_storage_proof_batch function (just inline the loop) - Remove unused worker_id parameter **Changes:** - Worker loop now destructures jobs in first match, eliminating redundant pattern checks - Removed 23 lines of dead/redundant code - Cleaner, more direct implementation **Expected Impact:** No functional changes, just cleaner code that's easier to understand and maintain.

a5afac0

12 hours ago

by yongkangc

Home Terms Privacy Docs