Eventual-Inc
Daft
Latest Results
ci: revert TEMP Windows rust-tests enablement on PRs

Windows verification on this PR is complete (run 25080590564): rust-tests-platform (Windows, false) passed in 40m. Restore the matrix exclude so PR runs don't pay the 50+ min Windows tax going forward; Windows continues to gate `main` only.
rohit/bugfix/test-local-full-ls-windows-uri
29 minutes ago
fix: ensure paimon is installed for integration tests
rchowell/paimon-integ
38 minutes ago
style: drop unused MicroPartition import in limit tests

Leftover from an earlier draft of the test that constructed partitions inline; the helper now does it. Trips clippy under `--all-features` with `-D warnings`.

https://claude.ai/code/session_019UkmN6JibhPKmMEqXrDSdk
sam/parquet-early-stats
1 hour ago
feat: genericize RandomShuffle to support Flight shuffle backend (#6808)

## Changes Made

Genericize RandomShuffleNode to support both Ray and Flight shuffle backends, following the same ShuffleBackend pattern established in #6751 and #6764. Replaces hardcoded ShuffleBackend::Ray with configurable backend dispatch, and uses build_refs_task_builder() for the read side so both backends work correctly.

## Related Issues

Part of Phase 2 shuffle genericization (#6472).
main
1 hour ago
fix(io): switch hand-rolled file:// strips to canonical helper

Audit follow-up to #6817. Eight call sites across the workspace were hand-rolling `uri.strip_prefix("file://")` or `trim_start_matches`, which leaves `/C:/...` on Windows-canonical inputs and feeds Windows file APIs paths they reject with os error 123 ("filename, directory name, or volume label syntax is incorrect").

Switch each to `daft_io::strip_file_uri_to_path`, which has the `#[cfg(windows)] strip_leading_slash_before_drive` branch and already covers POSIX and macOS unchanged.

Sites fixed:
- daft-json/src/local.rs (read_json_local)
- daft-csv/src/local.rs (stream_csv_local)
- daft-writers/src/utils.rs (build_local_file_path)
- daft-scan/src/glob.rs (file_path_column trimming)
- daft-functions-uri/src/upload.rs (instantiate_and_trim_path)
- daft-parquet/src/read.rs (two sites in read_parquet)

daft-compression/src/compression.rs is left as-is: it strips the prefix only for extension detection, where the leading "/" is harmless. A doc comment is added clarifying that.

Caught by Windows rust-tests on PR #6824 (with the temp matrix-enable): the daft-json read_local tests were panicking with os error 123 once iter_dir started emitting canonical URIs.
rohit/bugfix/test-local-full-ls-windows-uri
1 hour ago
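The Windows failure described in the fix(io) commit above can be illustrated with a short Python sketch. This is not the Rust helper `daft_io::strip_file_uri_to_path`; the function name `strip_file_uri_sketch` and the `windows` flag are hypothetical, and the real helper may handle cases this sketch ignores (percent-encoding and UNC hosts, for example).

```python
import re

# Matches a leading slash followed by a drive letter, e.g. "/C:/" or "/c:\".
_SLASH_BEFORE_DRIVE = re.compile(r"^/[A-Za-z]:(/|\\|$)")


def strip_file_uri_sketch(uri: str, windows: bool) -> str:
    """Hypothetical illustration: turn a file:// URI into a local path.

    Naively stripping the "file://" prefix leaves "/C:/..." for
    Windows-canonical URIs, so the Windows branch also drops the
    slash that precedes the drive letter.
    """
    prefix = "file://"
    path = uri[len(prefix):] if uri.startswith(prefix) else uri
    if windows and _SLASH_BEFORE_DRIVE.match(path):
        path = path[1:]  # "/C:/data" -> "C:/data"
    return path
```

With `windows=False` the sketch behaves like a plain prefix strip, so POSIX paths such as `/tmp/a.json` pass through unchanged.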
add from impl for enums
chris/taskevent-metadata
2 hours ago
test(udf): cover combined large-stderr + no-newline use_process case

Follow-up to #6793. The two existing tests in test_use_process_deadlocks.py cover each deadlock pathology in isolation:
- Large stderr WITH newline (mp.connection.wait drain).
- Small stderr WITHOUT newline (prepended-newline divider).

Per a review concern, even with the drain in place readline() in trace_output() could in principle still block on a no-newline buffer while the child is stalled on a full OS pipe. Add a regression test that exercises both pathologies together: a UDF that writes >64 KiB to stderr without ever emitting a newline. Confirms the current implementation handles the combined case correctly.
rohit/test/use-process-large-stderr-no-newline
2 hours ago
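The no-newline pathology that this test commit refers to can be shown in miniature. The sketch below is an assumption-laden model, not Daft's `trace_output()`: the `DIVIDER` value and the `read_until_divider` helper are hypothetical, and an in-memory stream stands in for the real pipe (where reaching the end would mean blocking rather than returning).

```python
import io

# Hypothetical divider value; the real _OUTPUT_DIVIDER is not shown in the source.
DIVIDER = b"<<<divider>>>\n"


def read_until_divider(stream):
    """Collect lines until one equals DIVIDER exactly.

    Returns (lines, found). With a real pipe, the "not found" case would
    be a blocked readline() instead of an end-of-stream return.
    """
    lines = []
    while True:
        line = stream.readline()
        if line == b"":
            return lines, False
        if line == DIVIDER:
            return lines, True
        lines.append(line)


# Output without a trailing newline fuses with the divider into one line.
merged = io.BytesIO(b"partial output" + DIVIDER)
# Prepending "\n" to the divider write keeps the divider on its own line.
fixed = io.BytesIO(b"partial output" + b"\n" + DIVIDER)
```

Reading `merged` never yields a line equal to the divider, while reading `fixed` does, which is the behavior the prepended-newline fix relies on.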
fix(udf): resolve use_process=True subprocess deadlocks (#6793)

## Summary

Fixes two distinct deadlocks in `UdfHandle`'s stdout plumbing (see #6762 for full root-cause analysis):

1. **Pipe-buffer overflow**: when a `use_process=True` UDF writes more than the OS pipe buffer (~64 KiB on Linux/macOS) to stderr per batch, the child blocks in `write()` before reaching `conn.send(_SUCCESS)` while the parent blocks in `recv()` without draining the pipe: a classic two-channel ordering deadlock. Fixed by using `multiprocessing.connection.wait()` to multiplex between the control channel and the stdout pipe, draining lines into a per-handle buffer while waiting for the response.
2. **Divider-merge**: when a UDF's last stderr write lacks a trailing newline, those bytes and `_OUTPUT_DIVIDER` fuse into a single `readline()` result that never matches the exact-equality check in `trace_output()`, hanging the parent on an empty pipe forever. Fixed by prepending `\n` to the divider write so it always lands on its own line.

Closes #6762.

## Commit structure (TDD)

Three commits on the branch, each verified locally:

1. **`test(udf): add failing regression tests`**: introduces `pytest-timeout` dev dep and two integration tests that reproduce each deadlock. Tests hang and time out at this commit (60s total runtime).
2. **`fix(udf): prepend newline to output divider`**: one-line fix in `udf_worker.py`. Divider-merge test passes; pipe-buffer test still hangs.
3. **`fix(udf): drain stdout concurrently with recv`**: `mp.connection.wait`-based drain in `udf.py`. Both tests pass (~0.3s).

## Test plan

- [x] `pytest tests/udf/test_use_process_deadlocks.py` passes (0.4s) on macOS, native runner
- [x] `pytest tests/expressions/test_legacy_udf.py`: 165 passed, 0 regressions
- [x] `pytest tests/udf/test_row_wise_udf.py -k use_process`: existing async use_process test still passes
- [ ] CI verification

## Notes

- This subsystem is POSIX-only in practice: `UdfHandle.__init__` uses a Unix-socket `Listener` with a `tempfile.NamedTemporaryFile` path that won't work on Windows. `mp.connection.wait()` is stdlib-blessed for this "wait on Connection + pipe fd concurrently" pattern on POSIX. A follow-up could add an explicit `NotImplementedError` for Windows users.
- At commit #2 of this branch, running the pipe-buffer test first causes `pytest-timeout`'s SIGALRM to leave a dangling subprocess that affects the next test in the same pytest session. Cosmetic: HEAD has both fixes so the pathology can't reproduce.

---------

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
main
2 hours ago
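The pipe-buffer fix in #6793 above multiplexes a control channel and an output pipe with `multiprocessing.connection.wait()`. The following is a minimal POSIX-only sketch of that pattern under simplifying assumptions: a thread stands in for the UDF subprocess, the names `fake_child` and `run_with_drain` are hypothetical, and raw bytes are drained rather than lines. It is not the code in `udf.py`.

```python
import multiprocessing.connection as mpc
import os
import threading


def fake_child(conn, write_fd, payload):
    # Stand-in for the worker: emit output first, report success afterwards.
    view = memoryview(payload)
    while view:
        written = os.write(write_fd, view)  # blocks once the pipe buffer is full
        view = view[written:]
    os.close(write_fd)
    conn.send("success")


def run_with_drain(payload):
    parent_conn, child_conn = mpc.Pipe()
    read_fd, write_fd = os.pipe()
    worker = threading.Thread(
        target=fake_child, args=(child_conn, write_fd, payload), daemon=True
    )
    worker.start()

    drained = bytearray()
    response = None
    waiting = [parent_conn, read_fd]
    while waiting:
        # Wake up when either the control channel or the output pipe is readable.
        for ready in mpc.wait(waiting):
            if ready is parent_conn:
                response = parent_conn.recv()
                waiting.remove(parent_conn)
            else:
                chunk = os.read(read_fd, 65536)
                if chunk:
                    drained += chunk  # keep the pipe from filling up
                else:
                    waiting.remove(read_fd)  # writer closed its end
                    os.close(read_fd)
    worker.join()
    return response, bytes(drained)
```

Calling `recv()` on the control channel alone, without the `wait()` loop, would leave the writer stuck on a full pipe for any payload larger than the OS pipe buffer, which is the ordering deadlock the summary describes.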
Latest Branches
CodSpeed Performance Gauge: 0%
fix(test): update test_local_full_ls to expect canonical file:// URIs
#6824
58 minutes ago
bdd66bb
rohit/bugfix/test-local-full-ls-windows-uri
CodSpeed Performance Gauge: 0%
fix: ensure paimon is installed for integration tests
#6827
1 hour ago
e99fbd9
rchowell/paimon-integ
CodSpeed Performance Gauge: -10%
feat: use scan task metadata to parallelize initial limit reads
#6763
2 hours ago
ed74beb
sam/parquet-early-stats
© 2026 CodSpeed Technology