Eventual-Inc
Daft
Latest Results
perf(inline-agg): add Product accumulator type (#6975)

## Summary

Implements the Product accumulator from #6585 (item 7) for the inline grouped aggregation path. Mirrors Sum's shape: widens Int8/16/32 → Int64 and UInt8/16/32 → UInt64, and keeps Float32/Float64 native. The reduce op is `*` instead of `+`; the per-group `Option<T>` state and the `init_groups` / `update_batch` / `finalize` methods are essentially identical to `SumAccum*`.

Overflow semantics match Sum exactly: native `*` (wraps silently in release mode), consistent with the fallback `Series::product()` implementation. Grouping semantics and final query results are unchanged.

## Why

`AggExpr::Product` already exists in the DSL and is wired in the fallback path (`src/daft-recordbatch/src/lib.rs`), but currently falls back to `make_groups + eval_agg_expression` even when the rest of the query qualifies for the inline path. Adding Product to the inline accumulator framework completes coverage of the standard reducer-style aggregates (Count / Sum / Min / Max / Product) that all share the same `Vec<Option<T>>` per-group state shape.

- Closes the Product gap after #6604 added Min/Max.

## Changes Made

- `src/daft-recordbatch/src/ops/inline_agg.rs`:
  - New `define_product_accum!` macro generating `ProductAccumI64` / `ProductAccumU64` / `ProductAccumF32` / `ProductAccumF64`, byte-identical to `define_sum_accum!` except for the reduce op (`*` vs `+`).
  - `define_agg_accumulator_enum!` extended with `ProductI64` / `ProductU64` / `ProductF32` / `ProductF64` variants.
  - `try_create_accumulator` dispatches `AggExpr::Product(expr)` with the same numeric dtype widening as Sum.
  - `can_inline_agg` accepts Product on the same numeric dtype set as Sum.
  - 5 new tests + 1 helper.

## Behavior

- Queries with `Product` over Int8/16/32/64, UInt8/16/32/64, and Float32/Float64 columns now take the inline path instead of falling back to `make_groups + eval_agg_expression`.
- Output values are identical to the fallback path (verified by inline-vs-fallback tests).
- All other agg types and dispatch paths are unchanged.

## Test Plan

- `cargo test -p daft-recordbatch --release inline_agg` → 37 passed (32 pre-existing + 5 new Product).
- `cargo fmt -p daft-recordbatch --check` → clean.
- `cargo clippy -p daft-recordbatch --release --features python` → clean, no `#[allow]`s added.

New test cases:

- `test_inline_product_matches_fallback`: Utf8 keys + Int64 vals (exercises null-value branch).
- `test_inline_int_key_product_matches_fallback`: Int64 keys + Int64 vals (FNV int-key fast path).
- `test_inline_int_key_with_nulls_product_matches_fallback`: null keys.
- `test_inline_product_float_matches_fallback`: Float64 vals (finite values chosen to avoid overflow ambiguity).
- `test_inline_all_null_vals_product_matches_fallback`: all-null vals (exercises empty `Option<T>` finalize path).

## Related Issues

- Part of #6585 (item 7).
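The per-group state shape described in this PR can be illustrated with a minimal sketch. It is not the code generated by `define_product_accum!`: the type `ProductAccumSketch`, the helper `widen_i32`, and all method signatures are hypothetical, chosen only to show a `Vec<Option<T>>` accumulator whose reduce op is multiplication. It uses `wrapping_mul` to make the wrap-on-overflow behavior explicit, whereas the PR describes native `*`.

```rust
/// Hypothetical stand-in for one generated accumulator (compare the PR's
/// `ProductAccumI64`); names and signatures are illustrative assumptions.
#[derive(Debug, Default)]
pub struct ProductAccumSketch {
    /// One slot per group; `None` until the group sees a non-null value.
    state: Vec<Option<i64>>,
}

impl ProductAccumSketch {
    /// Grow the state so that group ids `0..num_groups` are addressable.
    pub fn init_groups(&mut self, num_groups: usize) {
        if self.state.len() < num_groups {
            self.state.resize(num_groups, None);
        }
    }

    /// Fold one batch into the state. `group_ids[i]` is the group of
    /// `values[i]`; null inputs (`None`) are skipped.
    pub fn update_batch(&mut self, group_ids: &[usize], values: &[Option<i64>]) {
        for (&gid, value) in group_ids.iter().zip(values) {
            if let Some(v) = *value {
                let slot = &mut self.state[gid];
                *slot = Some(match *slot {
                    // Assumption: wrapping_mul models the silent wrap explicitly.
                    Some(acc) => acc.wrapping_mul(v),
                    None => v,
                });
            }
        }
    }

    /// Return one result per group; groups with no non-null input stay `None`.
    pub fn finalize(self) -> Vec<Option<i64>> {
        self.state
    }
}

/// Widen a narrow integer column to the Int64 accumulator input type,
/// mirroring the Int8/16/32 → Int64 widening described above.
pub fn widen_i32(values: &[Option<i32>]) -> Vec<Option<i64>> {
    values.iter().map(|v| v.map(i64::from)).collect()
}
```

In this sketch a group that only ever sees nulls finalizes to `None`, which corresponds to the all-null case the new tests are described as exercising.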
main
7 minutes ago
feat(iceberg): support SQL read_iceberg ignore_corrupt_files (#7130)

## Changes Made

Adds `ignore_corrupt_files` support for SQL `read_iceberg` by passing the option through the Rust scan builder into the existing Python Iceberg scan operator. Includes SQL coverage for skipping a corrupt Iceberg data file.

## Related Issues

Closes #7129
main
1 hour ago
fix(sql): allow read_parquet file options (#7128)

## Changes Made

Allowed SQL `read_parquet` to accept named `path`, `file_path_column`, and `hive_partitioning` arguments, matching the existing Python API.

## Related Issues

N/A
main
1 hour ago
lerobot with formatting changes
slade/lerobot
8 hours ago
test(iceberg): strengthen corrupt SQL read assertion
jackylee-ch:codex-sql-read-iceberg-ignore-corrupt-files
9 hours ago
fix(sql): allow read_parquet file options
jackylee-ch:codex-sql-read-parquet-file-options
11 hours ago
test(session): cover namespace narrowing and multi-catalog without pattern
YuangGao:fix/session-list-tables-4400
13 hours ago
test(flotilla): reuse builder's min_cpu_per_task in with_resource_request

MockTaskBuilder.with_resource_request was constructing a fresh DaftExecutionConfig::default() to fetch the min_cpu_per_task fallback, which silently reset any non-default value already set on the builder. Reuse self.resource_request.min_cpu_per_task instead so the fallback threads through chained .with_* calls correctly.

Test-helper-only; no production behavior change. Addresses greptile P2 review comment.
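The builder fix described in this commit can be sketched as follows. This is an illustration under stated assumptions, not the Daft test helper itself: `ResourceRequestSketch`, `MockTaskBuilderSketch`, `with_min_cpu_per_task`, and the `DEFAULT_MIN_CPU_PER_TASK` value are all hypothetical stand-ins, and the real `with_resource_request` signature may differ.

```rust
/// Hypothetical, simplified stand-in for a resource request.
#[derive(Debug, Clone, PartialEq)]
pub struct ResourceRequestSketch {
    pub num_cpus: Option<f64>,
    pub min_cpu_per_task: f64,
}

/// Assumed placeholder for the value a default execution config would supply.
pub const DEFAULT_MIN_CPU_PER_TASK: f64 = 0.5;

/// Hypothetical, simplified stand-in for the mock task builder.
#[derive(Debug, Clone)]
pub struct MockTaskBuilderSketch {
    pub resource_request: ResourceRequestSketch,
}

impl Default for MockTaskBuilderSketch {
    fn default() -> Self {
        Self {
            resource_request: ResourceRequestSketch {
                num_cpus: None,
                min_cpu_per_task: DEFAULT_MIN_CPU_PER_TASK,
            },
        }
    }
}

impl MockTaskBuilderSketch {
    /// Hypothetical setter for a non-default fallback.
    pub fn with_min_cpu_per_task(mut self, min_cpu: f64) -> Self {
        self.resource_request.min_cpu_per_task = min_cpu;
        self
    }

    /// Replace the request while carrying the builder's existing fallback
    /// forward, rather than re-reading it from a fresh default config.
    pub fn with_resource_request(mut self, num_cpus: Option<f64>) -> Self {
        self.resource_request = ResourceRequestSketch {
            num_cpus,
            // Reusing the current value keeps earlier `.with_*` calls intact.
            min_cpu_per_task: self.resource_request.min_cpu_per_task,
        };
        self
    }
}
```

Had `with_resource_request` written `DEFAULT_MIN_CPU_PER_TASK` into the new request instead, a chain such as `with_min_cpu_per_task(2.0).with_resource_request(Some(4.0))` would end with the fallback reset to the default, which is the behavior the commit message describes fixing.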
XiaoHongbo-Hope:fix/min-cpu-per-task-wiring
14 hours ago
Latest Branches
CodSpeed Performance Gauge
0%
feat(lerobot): Add `daft.datasets.lerobot` for working with LeRobot v3 datasets
#7090
9 hours ago
fae1584
slade/lerobot
CodSpeed Performance Gauge
0%
feat(iceberg): support SQL read_iceberg ignore_corrupt_files
#7130
10 hours ago
8f1959c
jackylee-ch:codex-sql-read-iceberg-ignore-corrupt-files
CodSpeed Performance Gauge
0%
fix(sql): allow read_parquet file options
#7128
12 hours ago
b0047c5
jackylee-ch:codex-sql-read-parquet-file-options
© 2026 CodSpeed Technology