Eventual-Inc
Daft
Latest Results
chore(deps): drop pyarrow 8.0.0 support, bump minimum to >= 15.0.0 (#6378)

## Changes Made

Bump minimum PyArrow version from `>= 8.0.0` to `>= 15.0.0` (44 files, -450 lines).

- Update version constraints in `pyproject.toml` and CI test matrix
- Remove `_FixSliceOffsets` workaround (pyarrow < 12.0.0 struct slice offset bug, fixed upstream)
- Remove `pyarrow_supports_fixed_shape_tensor()` and all conditional branches
- Remove obsolete version checks (`< 12.0.1`, `< 13.0.0`, `>= 9.0.0`) and `try/except` imports
- Clean up ~50 `pytest.mark.skipif` markers and unused `PYARROW_GE_*` constants in test files

Note: `_FixEmptyStructArrays` is intentionally kept; Daft internally cannot handle empty StructArrays, not just an arrow2 FFI issue.

## Related Issues

Closes #6347
main
2 hours ago
fix(io): make JSON chunk_size byte-based and fix local byte-range reading

- Change chunk_size semantic from row count to byte size
- Fix stream_json() ignoring byte range for local files, which made scan_task_split_and_merge ineffective for local JSONL
- Remove calculate_chunks_and_size() that silently overwrote user chunk_size
- Clear stale metadata/statistics on byte-range sub-ScanTasks
- Delete dead code: read_json_bulk, ChunkError, file-level allow(deprecated)
- Guard JSON array detection to skip byte-range sub-slices
- Add integration tests for chunk_size partitioning (native + Ray)
everySympathy:json-chunk-by-bytes
6 hours ago
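As a rough illustration of what byte-based chunking of a JSONL file involves, the sketch below splits a file into fixed-size byte ranges and reads only the records that start inside each range. It is not Daft's implementation: `byte_ranges` and `read_jsonl_range` are hypothetical helpers, and the boundary rule used here (a record belongs to the range that contains its first byte) is an assumption, since the commit message does not say how `stream_json()` aligns ranges to line boundaries.

```python
import io
import json


def byte_ranges(file_size: int, chunk_size: int) -> list[tuple[int, int]]:
    """Split [0, file_size) into half-open ranges of at most chunk_size bytes."""
    return [
        (start, min(start + chunk_size, file_size))
        for start in range(0, file_size, chunk_size)
    ]


def read_jsonl_range(f, start: int, end: int) -> list[dict]:
    """Return the records whose first byte lies in [start, end).

    `f` must be a seekable binary file object. Assumed boundary rule:
    a record belongs to the range that contains its first byte.
    """
    if start > 0:
        # Step back one byte and consume through the next newline, so the
        # cursor lands on the first record that starts at or after `start`.
        f.seek(start - 1)
        f.readline()
    else:
        f.seek(0)

    rows = []
    while f.tell() < end:
        line = f.readline()
        if not line:  # end of file
            break
        if line.strip():  # skip blank lines
            rows.append(json.loads(line))
    return rows


if __name__ == "__main__":
    data = b'{"a": 1}\n{"a": 2}\n{"a": 3}\n'
    for start, end in byte_ranges(len(data), 10):
        print((start, end), read_jsonl_range(io.BytesIO(data), start, end))
```

Under this rule every record is read by exactly one range, even when a range boundary falls in the middle of a line, so some ranges may legitimately come back empty.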
delete file
PhysicsACE:geospatial
1 day ago
feat: support split csv file
linguoxuan:split_csv
3 days ago
refactor(udf): Inline is_complex_ray_options and update docstring
Jay-ju:feature/udf-ray-options
3 days ago
fix: also detect verbose option in has_custom_options()

Address review feedback: has_custom_options() was missing the verbose check, so df.show(verbose=True) without explicit format would also fall through to the __repr__ path and ignore the option.
mango766:fix/show-max-width-without-format
3 days ago
chore: Make the partition threshold to use `pre_shuffle_merge` configurable

Signed-off-by: plotor <zhenchao.wang@hotmail.com>
plotor:zhenchao-jungle
3 days ago
fix(scheduler): fix autoscaler underscaling after Ray upgrade

Two independent bugs caused the autoscaler to scale to fewer workers than demanded:

1. Ratio underestimation: needs_autoscaling() runs after schedule_tasks() drains pending tasks, so the demand/capacity ratio only reflected residual demand. Track last_scheduled_count so the ratio uses total demand (pending + just-dispatched).
2. Monotonic watermark: max_resources_requested in worker_manager never reset, permanently blocking re-scaling after the first peak. Reset the watermark when workers join or die so the autoscaler can re-evaluate demand against the new topology.

Also strip zero-valued GPU/memory keys from Ray resource bundles and add tracing to both decision points.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
desmond/fix-autoscaler-underscaling
3 days ago
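The two autoscaler bugs described in the commit above can be modelled with a toy sketch. This is not the Daft scheduler: `SchedulerSketch`, `WorkerManagerSketch`, and their methods are hypothetical, and only the names `schedule_tasks`, `last_scheduled_count`, and `max_resources_requested` come from the commit message. The exact watermark comparison is also an assumption (requests at or below the watermark are taken to be ignored).

```python
from dataclasses import dataclass


@dataclass
class SchedulerSketch:
    """Toy model of bug 1 (ratio underestimation). Hypothetical class."""

    capacity: int                  # task slots across current workers
    pending: int = 0               # tasks waiting to be scheduled
    last_scheduled_count: int = 0  # tasks dispatched by the latest schedule_tasks()

    def schedule_tasks(self) -> None:
        # Dispatch as many pending tasks as the current capacity allows.
        dispatched = min(self.pending, self.capacity)
        self.pending -= dispatched
        self.last_scheduled_count = dispatched

    def residual_ratio(self) -> float:
        # Buggy view: only demand left over after scheduling is counted.
        return self.pending / self.capacity

    def total_ratio(self) -> float:
        # Fixed view: pending plus just-dispatched tasks.
        return (self.pending + self.last_scheduled_count) / self.capacity


@dataclass
class WorkerManagerSketch:
    """Toy model of bug 2 (monotonic watermark). Hypothetical class."""

    max_resources_requested: int = 0

    def try_autoscale(self, requested: int) -> bool:
        # Assumption: requests at or below the watermark are ignored.
        if requested <= self.max_resources_requested:
            return False
        self.max_resources_requested = requested
        return True

    def on_topology_change(self) -> None:
        # The fix: reset the watermark when workers join or die.
        self.max_resources_requested = 0


if __name__ == "__main__":
    sched = SchedulerSketch(capacity=4, pending=10)
    sched.schedule_tasks()
    print(sched.residual_ratio(), sched.total_ratio())

    manager = WorkerManagerSketch()
    print(manager.try_autoscale(8), manager.try_autoscale(6))
    manager.on_topology_change()
    print(manager.try_autoscale(6))
```

With 10 pending tasks and 4 slots, the residual ratio sees only the 6 tasks left after dispatch, while the total ratio accounts for all 10. In the watermark model, a request for 6 after a peak of 8 is blocked until the topology changes and the watermark is reset.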
Active Branches
fix: json chunk by bytes
last run 6 hours ago
#6374
CodSpeed Performance Gauge: 0%
feat: Basic geospatial support
last run 1 day ago
#6392
CodSpeed Performance Gauge: 0%
feat: support split csv file
last run 3 days ago
#6370
CodSpeed Performance Gauge: 0%
© 2026 CodSpeed Technology