fix(udf): honor per-call kwargs in udf v2
Fix row-wise and batch UDF v2 so that per-call keyword arguments (including Expression kwargs) are honored at each call site instead of being shared across call sites. Add a regression test that mirrors the reported `format_number` example with default, literal, and expression overrides.
The v2 UDF wrapper (`daft.udf.udf_v2.Func.__call__`) used a single `func_id` derived from the decorated function to identify all UDF expressions produced by that function. This `func_id` was passed through to the Rust `row_wise_udf` / `batch_udf` builders and ultimately into the logical plan as part of `RowWisePyFn` / batch UDF metadata.
Because all logical UDF nodes shared the same `func_id` regardless of their concrete arguments, they could be treated as the *same* expression by downstream components (e.g. optimizations, caching, or expression reuse keyed by this identifier). As a result, multiple calls like:
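```python
import daft

@daft.func
def format_number(value: int, prefix: str = "$", suffix: str = "") -> str:
    return f"{prefix}{value}{suffix}"

# Illustrative input; any DataFrame with an integer "amount" column works.
df = daft.from_pydict({"amount": [10, 20, 30]})

format_number(df["amount"])  # defaults
format_number(df["amount"], prefix="€", suffix=" EUR")  # literal overrides
format_number(df["amount"], suffix=df["amount"].cast(daft.DataType.string()))  # expression override
```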
could end up sharing underlying UDF state keyed only by `func_id`, so that overrides for `prefix` / `suffix` were not reliably respected per call site.
Introduce a per-call identifier in `Func.__call__` so that each logical UDF call site is uniquely identified, while still keeping the stable human-readable name for display:
- Add a monotonically increasing `_daft_call_seq` counter on `Func` instances.
- For each call that involves Expression arguments, derive a `call_id = f"{self.func_id}-{call_seq}"`.
- Pass `call_id` instead of `self.func_id` as the `func_id` argument when constructing the underlying `row_wise_udf` / `batch_udf` expressions (for generator, batch, and regular row-wise variants).
This keeps the original `name` used for plan display intact, but guarantees that each distinct call site (with its own bound `args`/`kwargs`) has a unique function identifier, preventing unintended sharing across calls.
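The per-call identifier scheme can be sketched roughly as follows. This is a simplified stand-in rather than the real `daft.udf.udf_v2.Func`: it returns a plain dict instead of building `row_wise_udf` / `batch_udf` expressions, it skips the check for Expression arguments, and the use of `itertools.count` and `__qualname__` for `func_id` are assumptions made for illustration.

```python
import itertools
from typing import Any, Callable, Dict


class Func:
    """Simplified stand-in for the v2 UDF wrapper (illustration only)."""

    def __init__(self, fn: Callable[..., Any]) -> None:
        self.fn = fn
        self.name = fn.__name__          # stable, human-readable display name
        self.func_id = fn.__qualname__   # assumed derivation; shared by all calls
        self._daft_call_seq = itertools.count()  # monotonically increasing

    def __call__(self, *args: Any, **kwargs: Any) -> Dict[str, Any]:
        call_seq = next(self._daft_call_seq)
        call_id = f"{self.func_id}-{call_seq}"
        # The real wrapper would pass call_id as the func_id argument to the
        # underlying UDF builder; here we just record what would be passed.
        return {"name": self.name, "func_id": call_id, "args": args, "kwargs": kwargs}


@Func
def format_number(value: int, prefix: str = "$", suffix: str = "") -> str:
    return f"{prefix}{value}{suffix}"


a = format_number(1)
b = format_number(1, prefix="€", suffix=" EUR")

assert a["name"] == b["name"] == "format_number"  # display name unchanged
assert a["func_id"] != b["func_id"]               # identifiers differ per call
```

With this shape, the display name stays the same across call sites while the identifier used for keying differs on every call, so each call site keeps its own bound arguments.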