Eventual-Inc
Daft

Performance History

Latest Results

fix: use uuid crate for uuidv7 generation
everettVT/uuidv7-arrow-kernel
42 minutes ago
style: apply ruff format and cargo fmt
BABTUNA:feat/temporal-add-months
17 hours ago
fix: Handle dtype mismatch error in join_asof join keys (#6904)

Currently, as-of joins with mismatched `by` keys fail with a `mismatch dtype error`. The fix is to:

1. Normalize and cast the keys to a shared supertype (e.g. int64 and float64 are normalized to float64), which is the same methodology used for the on_key, as well as for the join keys of equality joins.
2. Remove the computation of right_cols_to_drop in the local executor, because it does not drop the casted expressions computed during normalization, e.g. Cast(Column("left_on_key"), Utf8), and led to duplicate columns in the output (the "left_on_key" column was duplicated in the result). Since we already computed the desired output schema in the logical plan, we can simply use this as the basis to prune columns during execution.

```
left  = {"ts": [1, 3, 5], "v": [...]}   # ts is Int64
right = {"ts": [2.0, 4.0], "w": [...]}  # ts is Float64

# correct output
{"ts": [1, 3, 5], "v": [10, 30, 50], "w": [None, 20, 40]}

# without the fix: the second "ts" silently overwrites the first:
{"ts": [None, 2.0, 4.0], "v": [10, 30, 50], "w": [None, 20, 40]}
# ^^^ left ts [1, 3, 5] is gone, and no error is raised
```

**A more in-depth explanation for the second bug:**

1. This bug requires three conditions to trigger:
   - a join key (meaning it should have been a candidate for right_cols_to_drop)
   - that shares a name with the other side (explained later)
   - mismatched types on that key (causing normalization to wrap it in a Cast expression, which prevents it from being caught in right_cols_to_drop).
2. Here's how the bug occurs:
   1. At the logical plan layer, right_cols_to_drop is computed from bare unresolved column expressions, before any normalization has occurred. It is then passed to deduplicate_asof_join_columns, which uses it to determine which right-side columns need to be renamed with a `right.` prefix. Since the join keys are inside the dropped-columns set, the deduplication step skips them, because they are already being dropped.
   2. After translation, at the physical plan layer, AsofJoinOperator::new ignored the output_schema and recomputed right_cols_to_drop from scratch. By this point, translation had already wrapped bare Column("g") expressions in Cast(Column("g"), Utf8). The extract_name closure used in the recomputation only handled bare column expressions, so it returned None for any cast-wrapped expression, silently omitting the right key column from right_cols_to_drop.
   3. Without it in the drop set, prune_right_batch kept the column, producing a record batch with duplicate column names. When to_pydict() built a Python dict, the duplicate key caused the right-side values to silently overwrite the left-side values, corrupting the output.
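The chain of events in this commit message can be reproduced in miniature with plain Python. The sketch below is a toy model, not Daft's code: the `Column` and `Cast` classes, the `join_output` helper, and the sample batches are all hypothetical stand-ins, and only the names `extract_name` and `right_cols_to_drop` are borrowed from the description. It shows how a name extractor that only recognizes bare columns yields an empty drop set once the key is wrapped in a cast, and how the resulting duplicate column name is then silently collapsed when a dict is built.

```python
from dataclasses import dataclass
from typing import Optional, Union


@dataclass(frozen=True)
class Column:
    """Toy stand-in for a bare column expression."""
    name: str


@dataclass(frozen=True)
class Cast:
    """Toy stand-in for a cast-wrapped expression."""
    child: Column
    dtype: str  # illustrative label only; no real casting happens here


Expr = Union[Column, Cast]

# Hypothetical batches as (name, values) pairs, echoing the commit's example.
LEFT = [("ts", [1, 3, 5]), ("v", [10, 30, 50])]
RIGHT = [("ts", [None, 2.0, 4.0]), ("w", [None, 20, 40])]


def extract_name(expr: Expr) -> Optional[str]:
    # Like the closure described above: only bare columns yield a name.
    return expr.name if isinstance(expr, Column) else None


def right_cols_to_drop(right_keys: list) -> set:
    return {n for n in map(extract_name, right_keys) if n is not None}


def join_output(right_keys: list):
    """Prune the right batch, then build a dict from the combined columns."""
    drop = right_cols_to_drop(right_keys)
    kept = [(n, c) for n, c in RIGHT if n not in drop]
    names = [n for n, _ in LEFT + kept]
    # dict() keeps the last value for a repeated key, so a duplicate
    # column name overwrites the earlier column without any error.
    return names, dict(LEFT + kept)


# Keys as seen before normalization (bare) and after (cast-wrapped).
names_ok, out_ok = join_output([Column("ts")])
names_bad, out_bad = join_output([Cast(Column("ts"), "Float64")])

print(names_ok, out_ok)
print(names_bad, out_bad)
```

With the bare key, the right-side `ts` is dropped and the left values survive. With the cast-wrapped key, the column names come out as `ts, v, ts, w` and the dict keeps only the right-side `ts`. This matches the corrupted output quoted in the commit message.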
main
18 hours ago

Latest Branches

CodSpeed Performance Gauge
0%
feat: add uuidv7 generation (#6909)
1 hour ago
138009b
everettVT/uuidv7-arrow-kernel
CodSpeed Performance Gauge
0%
11 hours ago
9c6bea1
BABTUNA:feat/temporal-unix-extractors
CodSpeed Performance Gauge
-1%
11 hours ago
9339879
BABTUNA:feat/temporal-tz-conversions