Eventual-Inc
Daft
Latest Results
fix(iceberg): pass ignore_corrupt_files through table reads
jackylee-ch:codex-iceberg-table-ignore-corrupt-files
5 hours ago
Merge remote-tracking branch 'upstream/main' into codex-sql-read-parquet-ignore-corrupt-files
jackylee-ch:codex-sql-read-parquet-ignore-corrupt-files
12 hours ago
fix type hint and style.
refactor--embed_text-public-api-to-delegate-expression-building-to-providers
18 hours ago
perf(grouped-agg): bump NUM_SHARDS_PER_MORSEL to 8 (empirically tuned)
BABTUNA:perf/sharded-grouped-agg
1 day ago
fix(flotilla): honor explicit num_cpus=0 in autoscaler bundles

aggregate_ray_bundles forced CPU >= 1 (via .max(1) on individual bundles and a fixed CPU:1 on every GPU bundle), so a task that explicitly sets num_cpus=0 still requested a CPU. This broke "explicit num_cpus passes through unchanged" and over-requested CPU for GPU-only / memory-only workloads.

Drop the .max(1); give GPU bundles CPU only when the packed tasks actually need it (gpu_cpu_sum > 0); and omit the CPU key from the Ray bundle dict when it is zero. Add a test for num_cpus=0 GPU-only and memory-only tasks.
XiaoHongbo-Hope:fix/min-cpu-per-task-wiring
1 day ago
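The "omit the CPU key when it is zero" rule from the num_cpus=0 commit above can be sketched roughly as follows. This is an illustration only, not the code in aggregate_ray_bundles: the function name `ray_bundle`, its integer parameters, and the "memory" key are assumptions made for the example.

```rust
use std::collections::BTreeMap;

/// Hypothetical helper (not Daft's actual API): build one Ray-style
/// resource bundle, leaving out every resource whose amount is zero.
/// A task that asks for num_cpus=0 therefore requests no "CPU" key at all.
fn ray_bundle(cpu: u64, gpu: u64, memory_bytes: u64) -> BTreeMap<&'static str, u64> {
    let mut bundle = BTreeMap::new();
    if cpu > 0 {
        bundle.insert("CPU", cpu);
    }
    if gpu > 0 {
        bundle.insert("GPU", gpu);
    }
    if memory_bytes > 0 {
        bundle.insert("memory", memory_bytes);
    }
    bundle
}
```

Under these assumptions, a GPU-only task yields a bundle with only a GPU entry, and a memory-only task yields a bundle with only a memory entry, instead of both carrying a forced CPU:1.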
fix(flotilla): emit unit GPU autoscaler bundles, not oversized shapes

Carrying GPU tasks' CPU as ceil(gpu_cpu_sum / gpu_bundles) produced bundles like {CPU:2, GPU:1}. As a single Ray request_resources shape, that fits no standard 1-CPU/1-GPU node, so the autoscaler can't scale up, and the value is recorded as the high-water mark, stalling further attempts.

Emit unit {CPU:1, GPU:1} bundles instead (a sub-GPU task's cpu and gpu are each <= 1, so one always fits a standard GPU node), with the count covering both dimensions: ceil(max(gpu_sum, gpu_cpu_sum)). Two 1-CPU/0.5-GPU tasks now request two {CPU:1, GPU:1} shapes (2 CPU / 2 GPU) rather than one unschedulable {CPU:2, GPU:1}. Assert the schedulable shape in the regression test.
XiaoHongbo-Hope:fix/min-cpu-per-task-wiring
1 day ago
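The bundle arithmetic in the unit-GPU-bundle commit above can be checked with a small sketch. The function names are hypothetical, and the old formula assumes gpu_bundles = ceil(gpu_sum), which the commit message does not state explicitly.

```rust
/// Old shape (as described in the commit message): CPU carried per GPU
/// bundle as ceil(gpu_cpu_sum / gpu_bundles).
/// Assumption: gpu_bundles = ceil(gpu_sum), clamped to at least 1 here
/// so the sketch never divides by zero.
fn old_cpu_per_gpu_bundle(gpu_sum: f64, gpu_cpu_sum: f64) -> u64 {
    let gpu_bundles = gpu_sum.ceil().max(1.0);
    (gpu_cpu_sum / gpu_bundles).ceil() as u64
}

/// New shape: every bundle is {CPU:1, GPU:1}; only the count varies,
/// and it must cover both the GPU and the CPU dimension.
fn unit_gpu_bundle_count(gpu_sum: f64, gpu_cpu_sum: f64) -> u64 {
    gpu_sum.max(gpu_cpu_sum).ceil() as u64
}
```

For the two 1-CPU/0.5-GPU tasks in the message, gpu_sum is 1.0 and gpu_cpu_sum is 2.0: the old formula gives a single bundle with CPU:2, while the new one gives two unit bundles.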
fix(flotilla): track post-aggregation request as autoscaler high-water mark

The high-water mark recorded the fractional cpu_sum, but the request actually sent to Ray is the integer-aggregated bundle total. With min_cpu_per_task=0.1 the mark grew ~0.1 per cycle while ceil() only bumped the real CPU request every ~10 cycles, so scale-up for many pending tasks stalled for ~1/min_cpu_per_task cycles (≈50s at the default 5s interval) per extra CPU.

Record the aggregated integer bundle totals (what Ray actually receives) as the mark instead. Because each cycle selects bundles until the fractional cpu_sum exceeds the integer mark, ceil() now bumps by at least one CPU every cycle, restoring the intended one-unit-per-cycle ramp while still never requesting less than before. Convergence is unchanged: once pending demand can no longer exceed the mark, the cycle is skipped.

Verified: cargo test -p daft-distributed --lib (8 task tests pass), cargo check/clippy -p daft-distributed --features python clean.
XiaoHongbo-Hope:fix/min-cpu-per-task-wiring
2 days ago
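The stall described in the high-water-mark commit above can be reproduced with a toy simulation. This is a simplified model, not the scheduler's code: it assumes an unlimited supply of pending 0.1-CPU tasks, works in milli-CPU to avoid floating-point drift, and the function name and flag are hypothetical.

```rust
/// Toy model: count autoscaler cycles until the integer CPU request
/// sent to Ray reaches `target_cpus`.
/// `record_integer_mark = false` models the old behavior (mark = fractional
/// cpu_sum); `true` models the fix (mark = integer-aggregated request).
fn cycles_until_request(target_cpus: u64, record_integer_mark: bool) -> u64 {
    const TASK_MILLI_CPU: u64 = 100; // min_cpu_per_task = 0.1, in milli-CPU
    let mut mark_milli: u64 = 0;
    let mut cycles = 0;
    loop {
        cycles += 1;
        // Select the fewest tasks whose fractional sum exceeds the mark.
        let selected_tasks = mark_milli / TASK_MILLI_CPU + 1;
        let cpu_sum_milli = selected_tasks * TASK_MILLI_CPU;
        // ceil() to whole CPUs: this is what Ray actually receives.
        let requested_cpus = (cpu_sum_milli + 999) / 1000;
        if requested_cpus >= target_cpus {
            return cycles;
        }
        mark_milli = if record_integer_mark {
            requested_cpus * 1000
        } else {
            cpu_sum_milli
        };
    }
}
```

In this model, reaching a 3-CPU request takes 21 cycles with the fractional mark and 3 cycles with the integer mark, which matches the roughly ten-cycles-per-CPU stall and the one-unit-per-cycle ramp described in the message.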
fix(flotilla): keep multi-CPU tasks as individual autoscaler bundles

aggregate_ray_bundles packed every CPU-only task into unit {"CPU": 1} bundles. That is wrong for a task requesting num_cpus >= 1: a 4-CPU task runs on one worker, so splitting it into 4 spread bundles lets the autoscaler provision 4 single-CPU nodes and leaves the task unschedulable. It also turned CPU magnitude into the loop count, so a huge or non-finite explicit num_cpus (inf as i64 == i64::MAX) could hang/OOM, and a NaN poisoned the running sum and zeroed the batch's CPU request.

Only pack sub-1.0 CPU-only tasks now; tasks with GPU, memory, or num_cpus >= 1 keep an individual bundle (CPU rounded up to at least 1). Non-finite / non-positive CPU contributes nothing. The packed sum is now bounded by task count, so the loop can no longer blow up.
XiaoHongbo-Hope:fix/min-cpu-per-task-wiring
2 days ago
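The packing rule in the multi-CPU commit above can be sketched for CPU-only tasks as below. The enum and function names are invented for illustration, and GPU and memory handling is left out. The float pitfalls the message mentions are standard Rust behavior: `f64::INFINITY as i64` saturates to `i64::MAX`, and any sum that includes NaN is NaN.

```rust
/// Hypothetical classification of a CPU-only task's num_cpus.
#[derive(Debug, PartialEq)]
enum CpuPlan {
    /// 0 < num_cpus < 1: packed with other small tasks into unit bundles.
    Packed(f64),
    /// num_cpus >= 1: one bundle of its own, CPU rounded up.
    Individual(u64),
    /// Non-finite or non-positive: contributes nothing.
    Ignored,
}

fn plan_cpu_only_task(num_cpus: f64) -> CpuPlan {
    if !num_cpus.is_finite() || num_cpus <= 0.0 {
        CpuPlan::Ignored
    } else if num_cpus < 1.0 {
        CpuPlan::Packed(num_cpus)
    } else {
        CpuPlan::Individual(num_cpus.ceil() as u64)
    }
}

/// Unit {"CPU": 1} bundles needed for the packed tasks only. Each packed
/// task is below 1.0, so the result never exceeds the number of tasks.
fn packed_bundle_count(task_cpus: &[f64]) -> u64 {
    let packed_sum: f64 = task_cpus
        .iter()
        .filter_map(|&cpus| match plan_cpu_only_task(cpus) {
            CpuPlan::Packed(fraction) => Some(fraction),
            _ => None,
        })
        .sum();
    packed_sum.ceil() as u64
}
```

With this rule a 4-CPU task stays a single 4-CPU bundle rather than four spread unit bundles, and an infinite or NaN num_cpus can neither drive a loop count nor poison the packed sum.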
Latest Branches
CodSpeed Performance Gauge: 0%
fix(iceberg): pass ignore_corrupt_files through table reads
#7147
8 hours ago
e4ce31d
jackylee-ch:codex-iceberg-table-ignore-corrupt-files
CodSpeed Performance Gauge: 0%
feat(sql): support read_parquet ignore_corrupt_files
#7133
5 days ago
c19a075
jackylee-ch:codex-sql-read-parquet-ignore-corrupt-files
CodSpeed Performance Gauge: 0%
refactor(ai): delegate embed_text expression building to providers
#6026
5 months ago
1749308
refactor--embed_text-public-api-to-delegate-expression-building-to-providers
© 2026 CodSpeed Technology