Eventual-Inc
Daft

Performance History

Latest Results

perf(file): add buffer_size to File.open() to reduce wasteful pre-reads (#6876)

## Changes Made

`DaftFile::load()` hardcoded a 16MB BufReader buffer for all calls, so metadata reads (needing ~1KB) and MIME sniffing (needing 16 bytes) each prefetched 16MB. Added a `buffer_size` parameter through the full Rust→Python chain, letting each call site specify an appropriate size: internal callers now use 4KB for sniffing and 64KB for metadata. Fully backward-compatible: `None` preserves existing 16MB default.
main
11 minutes ago
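
The idea behind #6876 (let each call site pick a read buffer that matches how much it needs) can be illustrated with the Python standard library alone. This is a minimal sketch, not Daft's implementation: `open_buffered` and `sniff_header` are hypothetical helpers, and only the sizes (16MB default, 4KB for sniffing, 64KB for metadata) are taken from the commit message.

```python
import io

# Sizes quoted in the commit message; the constant names are hypothetical.
DEFAULT_BUFFER_SIZE = 16 * 1024 * 1024  # 16MB default kept when buffer_size is None
SNIFF_BUFFER_SIZE = 4 * 1024            # 4KB for MIME sniffing
METADATA_BUFFER_SIZE = 64 * 1024        # 64KB for metadata reads


def open_buffered(path, buffer_size=None):
    """Hypothetical helper (not Daft's API): open a file with a caller-chosen buffer.

    Passing None falls back to the large default, mirroring the
    backward-compatible behaviour described in the commit.
    """
    size = DEFAULT_BUFFER_SIZE if buffer_size is None else buffer_size
    raw = open(path, "rb", buffering=0)  # unbuffered raw file object
    return io.BufferedReader(raw, buffer_size=size)


def sniff_header(path, n=16):
    """Read the first n bytes using the small sniffing buffer."""
    with open_buffered(path, buffer_size=SNIFF_BUFFER_SIZE) as f:
        return f.read(n)
```

With a small buffer, a 16-byte sniff asks the underlying file for at most a few kilobytes instead of filling a 16MB buffer first.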
test: restore fixed-size list parquet roundtrip
TuodiAunty:test/fixed-size-list-parquet-roundtrip
4 hours ago
test: restore fixed-size list parquet roundtrip
TuodiAunty:test/fixed-size-list-parquet-roundtrip
5 hours ago
feat(dashboard): per-task progress updates from flotilla workers (#6838)

## Summary

Long-running distributed tasks (e.g. PhysicalScan reading millions of rows) currently produce no dashboard signal until task end. This PR adds mid-execution per-task progress updates from flotilla workers, kept separate from the coordinator-aggregated operator-stats roll-up so the two paths don't double-count.

- New `Event::TaskStatsUpdate` carrying a batched `Vec<TaskStatsSnapshot>` per worker per tick; emitted at 1Hz on flotilla workers, gated on `DAFT_TASK_EVENTS_ENABLED`.
- New dashboard endpoint `/engine/query/{query_id}/tasks/stats` writes scalar totals onto `TaskInfo` (separate from `OperatorInfo.stats`).
- Retention and the top-K running view rank by busy time (`cpu_us`) instead of wall-clock duration; `cpu_us` and `total_cpu_us` are now refreshed mid-flight rather than only at task end.
- Propagates `DAFT_DASHBOARD_URL` and `DAFT_TASK_EVENTS_ENABLED` to `RaySwordfishActor` via runtime_env so the worker's `DashboardSubscriber` can POST.

Per-operator breakdown is intentionally **not** carried in the new event: local NodeID doesn't cleanly map to either a distributed plan node id or a stable within-task position. Rows/bytes are also omitted for now (naive sums across local nodes double-count intermediate transfers in fused pipelines, same reason `TaskEnd` already omits them). Both can be added later if needed.

## Test plan

- [x] Run a distributed query with `DAFT_TASK_EVENTS_ENABLED=true` and `DAFT_DASHBOARD_URL` pointed at a running dashboard; confirm the running tasks panel updates cpu_us live for in-flight tasks
- [x] Confirm running tasks panel sorts by busy time and uses cpu_us column (not wall-clock duration)
- [x] Confirm group totals (`total_cpu_us`) advance during execution rather than only on task end
- [ ] Run with `DAFT_TASK_EVENTS_ENABLED` unset; confirm no task POSTs go out and dashboard behavior matches main

🤖 Generated with [Claude Code](https://claude.com/claude-code)

---------

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
main
17 hours ago
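
The ranking behaviour described in #6838 (a top-K running view ordered by busy time, with totals that advance mid-flight) can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the dashboard's code: the `TaskInfo` dataclass here is a hypothetical stand-in, the snapshots are modelled as `(task_id, cpu_us)` pairs, and each snapshot is assumed to carry a cumulative `cpu_us` total that overwrites the previous value.

```python
import heapq
from dataclasses import dataclass


@dataclass
class TaskInfo:
    """Hypothetical stand-in for the dashboard's per-task record."""
    task_id: str
    cpu_us: int = 0


def apply_snapshots(tasks, snapshots):
    """Apply one batch of (task_id, cpu_us) snapshots.

    Assumption: cpu_us is a cumulative total, so the latest value
    replaces the stored one rather than being added to it.
    """
    for task_id, cpu_us in snapshots:
        tasks.setdefault(task_id, TaskInfo(task_id)).cpu_us = cpu_us


def top_k_by_busy_time(tasks, k):
    """Return the k tasks with the largest cpu_us, busiest first."""
    return heapq.nlargest(k, tasks.values(), key=lambda t: t.cpu_us)


def total_cpu_us(tasks):
    """Group total that advances as snapshots arrive, not only at task end."""
    return sum(t.cpu_us for t in tasks.values())
```

Overwriting rather than summing is what keeps repeated updates for the same task from double-counting; if the real snapshots carried deltas instead, the update step would need to add them.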
paimon read fallback when has deletion vectors or blob file
gavin9402:fix_paimon_deletion_vector_read
17 hours ago
feat: rename to forward fill
euan/optimize-asof
18 hours ago

Latest Branches

CodSpeed Performance Gauge
0%
test: restore fixed-size list parquet roundtrip (#6895)
6 hours ago
7ea0509
TuodiAunty:test/fixed-size-list-parquet-roundtrip
CodSpeed Performance Gauge
-14%
feat: Add map_keys function to extract keys from Map type columns (#6875)
4 days ago
d1201fb
qingfeng-occ:map_keys
CodSpeed Performance Gauge
0%
8 days ago
0cd32f3
gavin9402:fix_paimon_deletion_vector_read