Eventual-Inc
Daft
Blog
Docs
Changelog
Performance History
Latest Results
Merge branch 'main' into feat/symbolize-string-groupby-only
BABTUNA:feat/symbolize-string-groupby-only
35 minutes ago
test(pycapsule): use PyArrow 15.0.0-compatible APIs

CI runs unit tests against pyarrow==15.0.0 (matrix lower bound). Earlier tests used APIs only available in newer pyarrow:
- pa.chunked_array(obj) as PyCapsule consumer (added post-15)
- pa.record_batch(dict, schema=...) dict-form constructor

Replaced with pa.RecordBatch.from_pydict / from_arrays and exercised requested_schema cast via pa.RecordBatchReader.from_stream(obj, schema=) which works across all supported pyarrow versions.
aaron-ang:arrow-pycapsule
4 hours ago
address comments
YuangGao:fix/iceberg-respect-target-file-size-3823
5 hours ago
feat(iceberg): honor table-level write.target-file-size-bytes property
YuangGao:fix/iceberg-respect-target-file-size-3823
8 hours ago
re-enable Iceberg map-type round-trip test
YuangGao:test/unskip-iceberg-roundtrip-2459
9 hours ago
feat: add conv function for PySpark parity
YuangGao:feat/add-conv-function
11 hours ago
feat: add Python UDAF support (@daft.udaf) (#6790)

## Changes Made

Adds a `@daft.udaf` decorator that lets users define custom aggregate functions in Python with a three-stage `aggregate`/`combine`/`finalize` pipeline. Supports single-state, multi-state, parameterized, and multi-input UDAFs, working with both `groupby().agg()` and global `agg()`.

Rust side adds `PyAggFn` implementing the `AggFn` trait, bridging to Python via the GIL. Python side adds the decorator API (`udaf.py`) and execution bridge (`agg_execution.py`). Includes 19 tests and user docs.

---

During Python UDAF development, two supporting changes were made to daft-core.

- `AggFn::name()` return type relaxed from `&'static str` to `&str`, removing the `Box::leak` workaround needed by runtime-named implementations.
- Introduced `AggExpr::AggFnCombine`, an internal planner node that enables proper distributed two-phase aggregation staging, also replacing the ad-hoc `try_eager_combine` mechanism.

## Related Issues

#6698
main
18 hours ago
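The three-stage `aggregate`/`combine`/`finalize` pipeline described in #6790 can be illustrated with a minimal pure-Python sketch. This is not Daft's `@daft.udaf` API: the `MeanAgg` class and the `run_grouped` driver are hypothetical names invented for illustration, and the partitioning scheme is an assumption about how a two-phase aggregation might be staged.

```python
from collections import defaultdict


class MeanAgg:
    """Hypothetical three-stage aggregate (NOT the actual @daft.udaf API)."""

    def aggregate(self, values):
        # Stage 1: reduce one partition's values to a partial state.
        vals = list(values)
        return (sum(vals), len(vals))

    def combine(self, states):
        # Stage 2: merge partial states produced by different partitions.
        total, count = 0.0, 0
        for s, c in states:
            total += s
            count += c
        return (total, count)

    def finalize(self, state):
        # Stage 3: turn the merged state into the final result.
        total, count = state
        return total / count if count else None


def run_grouped(agg, partitions):
    """Hypothetical driver: each partition is a list of (key, value) pairs."""
    partials = defaultdict(list)
    for part in partitions:
        by_key = defaultdict(list)
        for key, value in part:
            by_key[key].append(value)
        # One partial state per key per partition.
        for key, vals in by_key.items():
            partials[key].append(agg.aggregate(vals))
    return {k: agg.finalize(agg.combine(s)) for k, s in partials.items()}


partitions = [[("a", 1.0), ("b", 4.0)], [("a", 3.0)]]
result = run_grouped(MeanAgg(), partitions)
```

Keeping `combine` separate from `aggregate` is what allows partial states to be computed independently per partition and merged later, which is the property a distributed two-phase plan relies on.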
refactor(distributed): decouple shuffle cleanup from Ray (#6809)

## Changes Made

Flight shuffle writes data to local disk on each worker, and cleanup currently hardcodes a `crate::python::ray::clear_shuffle_dirs_on_all_nodes` call in `runner.rs` that uses `ray.remote` to fan out deletion. This PR decouples that by moving cleanup into a `cleanup_shuffle_dirs` method on the `WorkerManager` trait, so the runner no longer knows which communication channel is used. `RayWorkerManager` delegates to the same bridge function; future non-Ray backends (e.g. K8s) can implement their own.

## Related Issues

This lays groundwork for Daft Native Kubernetes Support (#6639).
main
18 hours ago
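The decoupling described in #6809 can be sketched as follows. The real change is a Rust trait method; this Python analogue only illustrates the shape of the design. The method signature, the `LocalWorkerManager` backend, and the `Runner` class are all assumptions made for illustration and do not reflect Daft's actual code.

```python
import os
import shutil
import tempfile
from abc import ABC, abstractmethod
from typing import List


class WorkerManager(ABC):
    """Python stand-in for the Rust `WorkerManager` trait (signature assumed)."""

    @abstractmethod
    def cleanup_shuffle_dirs(self, dirs: List[str]) -> None:
        """Delete shuffle directories using a backend-specific channel."""


class LocalWorkerManager(WorkerManager):
    """Hypothetical backend that deletes directories on the local disk."""

    def cleanup_shuffle_dirs(self, dirs: List[str]) -> None:
        for d in dirs:
            shutil.rmtree(d, ignore_errors=True)


class Runner:
    """Hypothetical runner: depends only on the abstract interface."""

    def __init__(self, manager: WorkerManager) -> None:
        self.manager = manager

    def finish(self, shuffle_dirs: List[str]) -> None:
        # The runner does not know how deletion is fanned out.
        self.manager.cleanup_shuffle_dirs(shuffle_dirs)


shuffle_dir = tempfile.mkdtemp(prefix="shuffle-")
existed_before = os.path.isdir(shuffle_dir)
Runner(LocalWorkerManager()).finish([shuffle_dir])
exists_after = os.path.exists(shuffle_dir)
```

A Ray-backed or Kubernetes-backed manager would implement the same method with its own fan-out mechanism, leaving the runner unchanged.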
Latest Branches
CodSpeed Performance Gauge
-1%
feat(inline-agg): symbolize string group keys in multi-column grouped aggregation
#6748
1 hour ago
971192b
BABTUNA:feat/symbolize-string-groupby-only
CodSpeed Performance Gauge
-1%
feat: Arrow PyCapsule Interface
#6745
19 days ago
ec4b272
aaron-ang:arrow-pycapsule
CodSpeed Performance Gauge
0%
feat(iceberg): honor table-level write.target-file-size-bytes property
#6912
6 hours ago
5c477ed
YuangGao:fix/iceberg-respect-target-file-size-3823
© 2026 CodSpeed Technology