feat(temporal): add Spark-style timezone conversions (#6919)
## Summary
Implements three more functions from issue #3798 by adding Spark-style
`from_utc_timestamp`, `to_utc_timestamp`, and `convert_timezone` as
native Daft temporal expressions.
This PR adds two new scalar UDFs in the temporal module for UTC↔local
conversions that return tz-naive timestamps (matching Spark semantics),
plus a `convert_timezone` alias over the existing `convert_time_zone`
that reverses the argument order to match Spark. Both the Python and SQL
surfaces are wired up.
## Why
The issue asks for parity with PySpark's temporal functions. This PR
focuses on:
- UTC → local wall-clock conversion (`from_utc_timestamp`) producing
tz-naive output.
- Local wall-clock → UTC conversion (`to_utc_timestamp`) producing
tz-naive output.
- A Spark-style `convert_timezone(target_tz, source_ts)` alias that
matches Spark's argument order.
## Changes Made
- Add `FromUtcTimestamp` and `ToUtcTimestamp` scalar UDFs in
`src/daft-functions-temporal/src/time.rs`:
  - Both reuse `daft_schema::time_unit` helpers (`parse_timezone`,
    `timestamp_to_naive_local`, `naive_local_to_timestamp`,
    `naive_datetime_to_timestamp`).
  - Output dtype is always `Timestamp(unit, None)` regardless of input tz
    label.
- Add `daft-schema` as a direct dependency in
`src/daft-functions-temporal/Cargo.toml` so the helpers are available to
the UDFs.
- Change `mod time` to `pub mod time` in
`src/daft-functions-temporal/src/lib.rs` so the SQL crate can register
handlers.
- Register `FromUtcTimestamp` and `ToUtcTimestamp` in
`TemporalFunctions`.
- Add SQL handlers `SQLFromUtcTimestamp`, `SQLToUtcTimestamp`, and
`SQLConvertTimezone` in `src/daft-sql/src/modules/temporal.rs`. The
convert_timezone handler delegates to the existing `ConvertTimeZone` UDF
with Spark's reversed argument order.
- Add Python wrappers `from_utc_timestamp`, `to_utc_timestamp`, and
`convert_timezone` in `daft/functions/datetime.py` and export them from
`daft/functions/__init__.py`.
- Add focused tests in `tests/dataframe/test_temporals.py`:
  - `from_utc_timestamp` coverage: named tz (Europe/London BST), fixed
    offset (+05:30), tz-aware input.
  - `to_utc_timestamp` coverage: named tz.
  - Round-trip identity for non-DST instants.
  - Null propagation, invalid timezone error path.
  - SQL integration for both UTC conversions.
  - `convert_timezone` Spark-style alias.
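
The argument-order delegation used by the `convert_timezone` alias can be sketched in plain Python. This is only an illustration built on the standard library `datetime` module, not the Daft or SQL handler code: the two functions below borrow the names from this PR but are hypothetical stand-ins, and the stand-in for `convert_time_zone` models only the two-argument, tz-aware path.

```python
from datetime import datetime, timedelta, timezone, tzinfo


def convert_time_zone(source_ts: datetime, target_tz: tzinfo) -> datetime:
    """Hypothetical stand-in for the existing function: timestamp first, zone second."""
    if source_ts.tzinfo is None:
        # The alias has no way to name a source zone, so the input must carry one.
        raise ValueError("source timestamp must be tz-aware")
    return source_ts.astimezone(target_tz)


def convert_timezone(target_tz: tzinfo, source_ts: datetime) -> datetime:
    """Spark-style alias: same operation, with the target zone as the first argument."""
    return convert_time_zone(source_ts, target_tz)


# Fixed +05:30 offset, used here so the sketch needs no tz database.
IST = timezone(timedelta(hours=5, minutes=30))
utc_ts = datetime(2017, 7, 14, 2, 40, tzinfo=timezone.utc)

converted = convert_timezone(IST, utc_ts)
print(converted.replace(tzinfo=None))  # 2017-07-14 08:10:00
```

Keeping the alias as a thin wrapper means both spellings share one code path, so only the argument order differs between them.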
## Behavior
- `from_utc_timestamp('2017-07-14 02:40:00', 'Europe/London')` returns
`2017-07-14 03:40:00` (BST is UTC+1 in July).
- `to_utc_timestamp('2017-07-14 03:40:00', 'Europe/London')` returns
`2017-07-14 02:40:00`.
- Both functions always return a tz-naive `Timestamp(unit, None)`,
matching Spark.
- `from_utc_timestamp` treats the underlying i64 value as a UTC instant
regardless of any tz label on the input. `to_utc_timestamp` extracts the
wall-clock time using the input's own tz label (a naive input is treated
as UTC), then reinterprets that wall-clock time in the supplied tz.
- `convert_timezone(target_tz, source_ts)` is equivalent to
`convert_time_zone(source_ts, target_tz)` and requires the source to be
tz-aware (no `from_timezone` argument).
- Invalid timezone strings (e.g. `"Not/A/Zone"`) error at planning time
with a clear message.
- Null in the input row propagates to null in the output.
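
The UTC conversion semantics listed above can be modeled with the standard library. The sketch below is a plain-Python approximation of the described behavior, not the Rust UDFs: the function names mirror the PR, but the signatures (a `datetime` plus a `tzinfo`) are assumptions made for illustration, and fixed offsets stand in for named zones such as Europe/London so that no tz database is required.

```python
from datetime import datetime, timedelta, timezone, tzinfo
from typing import Optional


def from_utc_timestamp(ts: Optional[datetime], tz: tzinfo) -> Optional[datetime]:
    """Read `ts` as a UTC instant and return the tz-naive wall-clock time in `tz`."""
    if ts is None:
        return None  # null in, null out
    # A naive input is taken to be UTC; an aware input already pins the instant.
    instant = ts if ts.tzinfo is not None else ts.replace(tzinfo=timezone.utc)
    return instant.astimezone(tz).replace(tzinfo=None)


def to_utc_timestamp(ts: Optional[datetime], tz: tzinfo) -> Optional[datetime]:
    """Read the wall-clock fields of `ts` as local time in `tz`; return tz-naive UTC."""
    if ts is None:
        return None  # null in, null out
    # Dropping tzinfo keeps the wall-clock fields exactly as the input shows them.
    wall_clock = ts.replace(tzinfo=None)
    return wall_clock.replace(tzinfo=tz).astimezone(timezone.utc).replace(tzinfo=None)


# Assumption: fixed offsets replace named zones. +01:00 matches BST in July.
BST = timezone(timedelta(hours=1))
IST = timezone(timedelta(hours=5, minutes=30))

print(from_utc_timestamp(datetime(2017, 7, 14, 2, 40), BST))  # 2017-07-14 03:40:00
print(to_utc_timestamp(datetime(2017, 7, 14, 3, 40), BST))    # 2017-07-14 02:40:00
```

With a fixed offset the two functions are exact inverses. With a named zone, wall-clock times that are skipped or repeated around a DST transition have no unique UTC instant, which is presumably why the round-trip test is limited to non-DST instants.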
## Test Plan
- `cargo check -p daft-functions-temporal -p daft-sql`
- `make build`
- `DAFT_RUNNER=native pytest -q tests/dataframe/test_temporals.py -k
"utc_timestamp or convert_timezone or round_trip"`
## Related Issues
- Part of #3798
---------
Co-authored-by: Varun Madan <varun.madan@gmail.com>