Eventual-Inc
Daft

Performance History

Latest Results

Merge remote-tracking branch 'upstream/main' into codex-sql-read-parquet-ignore-corrupt-files
jackylee-ch:codex-sql-read-parquet-ignore-corrupt-files
3 minutes ago
Merge remote-tracking branch 'upstream/main' into codex-sql-read-parquet-ignore-corrupt-files
jackylee-ch:codex-sql-read-parquet-ignore-corrupt-files
1 hour ago
Merge remote-tracking branch 'upstream/main' into codex-sql-read-parquet-ignore-corrupt-files
jackylee-ch:codex-sql-read-parquet-ignore-corrupt-files
3 hours ago
implemented
slade/file-exists
8 hours ago
perf(inline-agg): add BoolAnd and BoolOr accumulator types (#6984)

## Summary

Implements BoolAnd and BoolOr accumulators from #6585 (item 7) for the inline grouped aggregation path. Each accumulator holds a per-group `Option<bool>` state; the first non-null value seeds the state, and subsequent non-null values combine via `&&` (BoolAnd) or `||` (BoolOr). Output dtype is Boolean. Grouping semantics and final query results are unchanged.

## Why

`AggExpr::BoolAnd` and `AggExpr::BoolOr` already exist in the DSL and are wired in the fallback path (`src/daft-recordbatch/src/lib.rs` → `Series::bool_and(groups)` / `Series::bool_or(groups)`), but currently fall back to `make_groups + eval_agg_expression` even when the rest of the query qualifies for the inline path. Adding them to the inline accumulator framework completes inline coverage of the standard reducer-style aggregates (Count / Sum / Min / Max / Product / BoolAnd / BoolOr) that all share the same `Vec<Option<T>>` per-group state shape.

## Changes Made

- `src/daft-recordbatch/src/ops/inline_agg.rs`:
  - New `define_bool_and_accum!` and `define_bool_or_accum!` macros (kept separate per the Sum/Product precedent; these are semantically distinct ops with different identity and absorbing elements).
  - `define_agg_accumulator_enum!` extended with `BoolAnd` and `BoolOr` variants.
  - `try_create_accumulator` dispatches `AggExpr::BoolAnd(expr)` and `AggExpr::BoolOr(expr)` on `DataType::Boolean`.
  - `can_inline_agg` adds a separate Boolean-only dtype arm; the existing numeric arm for Sum/Min/Max/Product is unchanged.
  - 5 new tests + 4 helpers.

**Implementation note:** `BooleanArray::values()` doesn't expose a `&[bool]` slice because Arrow stores bools bit-packed. The null-free tight loop uses `self.source.to_bitmap()` + `bitmap.value(row_idx)` instead of the `.zip(values().iter())` pattern Sum/Product use over primitive slices. Functionally equivalent, just a different access pattern forced by the storage layout.

## Behavior

- Queries with `BoolAnd` / `BoolOr` over Boolean columns now take the inline path instead of falling back to `make_groups + eval_agg_expression`.
- Output values are identical to the fallback path (verified by inline-vs-fallback tests).
- All other agg types and dispatch paths are unchanged.
- **Not implemented (deferred):** short-circuit optimization (stop scanning a group once BoolAnd hits `false` / BoolOr hits `true`). Adding a per-row branch to the hot loop would regress non-short-circuiting groups; Sum/Min/Max have analogous opportunities and intentionally don't take them. Revisit if benchmarks show it matters.

## Test Plan

- `cargo test -p daft-recordbatch --release inline_agg`: 37 passed (32 pre-existing + 5 new).
- `cargo fmt -p daft-recordbatch --check`: clean.
- `cargo clippy -p daft-recordbatch --release --features python`: clean, no `#[allow]`s added.

New test cases:

- `test_inline_bool_and_matches_fallback`: Utf8 keys + Boolean vals (no-null tight loop).
- `test_inline_bool_or_matches_fallback`: Utf8 keys + Boolean vals (no-null tight loop, OR semantics).
- `test_inline_int_key_bool_and_matches_fallback`: Int64 keys + Boolean vals (FNV int-key fast path).
- `test_inline_bool_and_with_nulls_matches_fallback`: Boolean vals with `None` interspersed (exercises null-value branch).
- `test_inline_all_null_bool_or_matches_fallback`: all-null vals (exercises empty `Option<bool>` finalize path).

## Related Issues

- Part of #6585 (item 7).
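The accumulator semantics described in the commit message above (per-group `Option<bool>` state, seeded by the first non-null value and then combined with `&&` or `||`) can be illustrated with a minimal standalone sketch. This is not Daft's implementation: `BoolOp`, `BoolAccum`, `update`, `finalize`, and `demo` are hypothetical names chosen for illustration, and the sketch leaves out the macro-generated accumulators, the accumulator enum dispatch, and the bit-packed bitmap access the commit mentions.

```rust
/// Hypothetical reducer kind; stands in for the BoolAnd / BoolOr distinction.
#[derive(Clone, Copy)]
enum BoolOp {
    And,
    Or,
}

/// Hypothetical accumulator with one `Option<bool>` slot per group.
/// `None` means the group has not yet seen a non-null value.
struct BoolAccum {
    op: BoolOp,
    state: Vec<Option<bool>>,
}

impl BoolAccum {
    fn new(op: BoolOp) -> Self {
        Self { op, state: Vec::new() }
    }

    /// Fold one row into its group's state. Null inputs are skipped.
    fn update(&mut self, group: usize, value: Option<bool>) {
        // Grow the state vector so the group index is always valid.
        if group >= self.state.len() {
            self.state.resize(group + 1, None);
        }
        if let Some(v) = value {
            self.state[group] = Some(match (self.state[group], self.op) {
                // The first non-null value seeds the state.
                (None, _) => v,
                (Some(acc), BoolOp::And) => acc && v,
                (Some(acc), BoolOp::Or) => acc || v,
            });
        }
    }

    /// Groups that saw only nulls finalize to `None`.
    fn finalize(self) -> Vec<Option<bool>> {
        self.state
    }
}

/// Runs both reducers over the same (group index, value) rows.
fn demo() -> (Vec<Option<bool>>, Vec<Option<bool>>) {
    let rows = [
        (0, Some(true)),
        (1, None),
        (0, Some(false)),
        (2, Some(false)),
        (2, None),
        (2, Some(true)),
    ];
    let mut and_acc = BoolAccum::new(BoolOp::And);
    let mut or_acc = BoolAccum::new(BoolOp::Or);
    for (group, value) in rows {
        and_acc.update(group, value);
        or_acc.update(group, value);
    }
    (and_acc.finalize(), or_acc.finalize())
}
```

In this sketch, group 1 receives only a null and therefore finalizes to `None`, which mirrors the all-null case the commit's test list describes. There is no early exit once a group reaches `false` (And) or `true` (Or), matching the deferred short-circuit optimization noted under Behavior.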
main
11 hours ago
add docs
slade/droid
11 hours ago
ci: retrigger flaky HF TLS handshake timeout
BABTUNA:perf/inline-agg-bool
13 hours ago
feat(sql): support read_parquet ignore_corrupt_files
jackylee-ch:codex-sql-read-parquet-ignore-corrupt-files
16 hours ago

Latest Branches

CodSpeed Performance Gauge
0%
feat(sql): support read_parquet ignore_corrupt_files #7133
4 hours ago
6667566
jackylee-ch:codex-sql-read-parquet-ignore-corrupt-files
CodSpeed Performance Gauge
0%
9 hours ago
3441ae0
slade/file-exists
CodSpeed Performance Gauge
0%
12 hours ago
c4eccd0
slade/droid
© 2026 CodSpeed Technology