Eventual-Inc
Daft

Performance History

Latest Results

test(lance): drop misleading pragma: no cover on exercised count_rows

The count_rows() body in the broken-fragment fixture is actually executed during the test: the call raises and the except branch in distribute_fragments_balanced catches it, so coverage.py reports the line as covered. The 'pragma: no cover' annotation is therefore misleading. Drop it.

Addresses Greptile review feedback.
XuQianJin-Stars:fix/lance-utils-redundant-num-groups
22 minutes ago
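The coverage point in the commit message above can be illustrated with a minimal sketch. The names `BrokenFragment`, `GoodFragment`, and `safe_row_counts` are hypothetical stand-ins, not the actual Daft fixture or `distribute_fragments_balanced`; the fallback value is likewise an assumption made for illustration.

```python
class GoodFragment:
    """Hypothetical fragment that reports its row count normally."""

    def __init__(self, n):
        self.n = n

    def count_rows(self):
        return self.n


class BrokenFragment:
    """Hypothetical fixture: count_rows always fails."""

    def count_rows(self):
        # This line IS executed when the caller invokes count_rows(),
        # so a coverage tool records it as covered even though it raises.
        raise RuntimeError("simulated broken fragment")


def safe_row_counts(fragments, fallback=1):
    """Collect row counts, substituting a fallback when a fragment fails.

    The fallback of 1 is an assumption for this sketch only.
    """
    counts = []
    for fragment in fragments:
        try:
            counts.append(fragment.count_rows())
        except Exception:
            # The except branch swallows the error raised by the fixture.
            counts.append(fallback)
    return counts


print(safe_row_counts([GoodFragment(5), BrokenFragment()]))
```

Because the `raise` statement runs before the exception propagates, marking it `pragma: no cover` would hide a line that the test does exercise.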
fix(flotilla): drive LimitNode cancellation from actor's contributor set

`StreamingSinkOutput::Finished` from `DistributedLimitSink` terminated the entire streaming-sink node (see base.rs:326), killing the cached pipeline on a flotilla worker. Other SwordfishTasks sharing the cached pipeline as separate input_ids died with "Plan execution task has died", silently truncating queries like `df.filter(...).limit(N).count_rows()` and `df.limit(N).into_batches(M)`.

Polling `actor.is_done()` after each notify also conflated "limit reached" with "safe to cancel". Cancelling the moment the actor reports done could kill an in-flight contributor whose data wasn't yet materialized, losing limit rows.

Rework:
- Sink always returns `NeedMoreInput`. The cached pipeline stays alive across input_ids; per-input streams drain naturally via finalize.
- Actor exposes `wait_for_contributors()`: awaits `is_done`, then returns the input_ids that consumed budget (`take > 0`).
- Notify token payload changes from `usize` (row count) to `TaskID`, so the LimitNode learns *which* tasks completed.
- LimitNode awaits `wait_for_contributors` and the notify_tokens; it only cancels `parent_cancel` once every contributing input_id has appeared in `completed_ids`. The scheduler's is_cancelled filter then drops pending tasks; in-flight tasks that get killed are non-contributors (their data is 0), so cancellation never loses limit rows.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
colin/distributed-limit-actor
5 hours ago
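The difference between "limit reached" and "safe to cancel" described in the commit above can be modelled with a small synchronous sketch. This is not the Rust implementation: `LimitActor`, `take`, and `safe_to_cancel` are hypothetical names, and the real code is asynchronous. The sketch only assumes what the message states, namely that contributors are the input_ids whose `take` was greater than zero.

```python
class LimitActor:
    """Toy model of a shared limit budget (hypothetical, not Daft's actor)."""

    def __init__(self, limit):
        self.remaining = limit
        self.contributors = set()  # input_ids that consumed budget

    def take(self, input_id, rows):
        """Grant up to `rows` rows from the remaining budget."""
        granted = min(rows, self.remaining)
        self.remaining -= granted
        if granted > 0:
            self.contributors.add(input_id)
        return granted

    def is_done(self):
        return self.remaining == 0


def safe_to_cancel(actor, completed_ids):
    """Cancel only when the limit is reached AND every contributor has completed."""
    return actor.is_done() and actor.contributors <= set(completed_ids)


actor = LimitActor(limit=10)
actor.take(1, 6)  # input 1 contributes 6 rows
actor.take(2, 6)  # input 2 contributes the remaining 4 rows
actor.take(3, 5)  # input 3 gets nothing, so it is not a contributor

print(actor.is_done())                 # limit reached
print(safe_to_cancel(actor, {1}))      # input 2 still in flight
print(safe_to_cancel(actor, {1, 2}))   # all contributors completed
```

In this model, cancelling as soon as `is_done()` is true would risk killing input 2 before its rows are materialized, while input 3 can be dropped at any time because it contributed no rows.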
fix(json): address greptile review feedback

Three review comments from greptile-apps on PR:

1. P1 json_tuple: reject duplicate field names. Calling json_tuple(expr, "a", "a") used to silently produce a Struct with two "a" fields, which caused .get("a") to return only the first match and broke serialization. We now validate uniqueness in both extract_keys_from_* paths and raise a clear ValueError at planning time.
2. P2 json_object_keys: document that key order is alphabetical. serde_json's Value::Object is backed by BTreeMap (preserve_order is not enabled), so keys come back sorted. Spark preserves insertion order. The Rust doc-comment, the SQL docstring, and the Python wrapper docstring now all explicitly call out this difference.
3. P2 json_tuple: propagate row-level nullability to the struct. Previously the result Struct was constructed with nulls=None, so every row was struct-level non-null even when the input was NULL / malformed / not an object; df["t"].is_null() always returned False on those rows. We now build a per-row validity NullBuffer and pass it to StructArray::new so is_null() correctly reflects bad inputs. Missing-key cases keep producing only field-level NULLs (the row stays valid).

Tests:
- New test_json_tuple_rejects_duplicate_field_names.
- test_json_tuple_invalid_and_null extended to assert is_null() == True on the bad rows and to read back the struct itself.
- All 17 json tests + 36 doctests pass.
XuQianJin-Stars:feat/json-functions-array-length-object-keys-tuple
8 hours ago
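The three behaviours listed in the commit above can be sketched in plain Python with the standard `json` module. This is an illustrative model only, not Daft's API or its Rust implementation: `json_tuple_model` and `json_object_keys_model` are hypothetical helpers, and the per-row validity list stands in for the NullBuffer mentioned in the message.

```python
import json


def json_tuple_model(values, *fields):
    """Model of json_tuple semantics: returns (rows, validity).

    Hypothetical helper; `values` is assumed to be a list of str or None.
    """
    # 1. Reject duplicate field names up front.
    if len(set(fields)) != len(fields):
        raise ValueError(f"duplicate field names: {fields}")

    rows, validity = [], []
    for raw in values:
        try:
            parsed = json.loads(raw) if raw is not None else None
        except json.JSONDecodeError:
            parsed = None  # malformed input
        if isinstance(parsed, dict):
            # Missing keys give field-level None; the row itself stays valid.
            rows.append({f: parsed.get(f) for f in fields})
            validity.append(True)
        else:
            # 3. NULL, malformed, or non-object input: the whole row is null.
            rows.append(None)
            validity.append(False)
    return rows, validity


def json_object_keys_model(raw):
    """2. Return keys in sorted order, mimicking a BTreeMap-backed object."""
    return sorted(json.loads(raw).keys())


rows, validity = json_tuple_model(
    ['{"a": 1, "b": 2}', None, "not json", "[1]", '{"b": 3}'], "a", "b"
)
print(rows)
print(validity)
print(json_object_keys_model('{"b": 1, "a": 2}'))
```

Python's `dict` preserves insertion order, so the explicit `sorted` call is what reproduces the alphabetical ordering described for serde_json without `preserve_order`.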
fix
feat/asof-benchmarks
11 hours ago

Latest Branches

CodSpeed Performance Gauge
-1%
fix(lance): drop dead floor-division of num_groups in distribute_fragments_balanced #6946
47 minutes ago
f562930
XuQianJin-Stars:fix/lance-utils-redundant-num-groups
CodSpeed Performance Gauge
-1%
5 hours ago
84cc897
colin/distributed-limit-actor
CodSpeed Performance Gauge
-1%
8 hours ago
7b1d121
XuQianJin-Stars:feat/json-functions-array-length-object-keys-tuple