Eventual-Inc
Daft

Performance History

Latest Results

Merge remote-tracking branch 'upstream/main' into codex-sql-read-parquet-ignore-corrupt-files
jackylee-ch:codex-sql-read-parquet-ignore-corrupt-files
1 hour ago
feat(temporal): add Spark-style timezone conversions (#6919)

## Summary

Implements three more functions from issue #3798 by adding Spark-style `from_utc_timestamp`, `to_utc_timestamp`, and `convert_timezone` as native Daft temporal expressions.

This PR adds two new scalar UDFs in the temporal module for UTC↔local conversions that return tz-naive timestamps (matching Spark semantics), plus a `convert_timezone` alias over the existing `convert_time_zone` that reverses the argument order to match Spark. Python and SQL surfaces are both wired.

## Why

The issue asks for parity with PySpark's temporal functions. This PR focuses on:

- UTC → local wall-clock conversion (`from_utc_timestamp`) producing tz-naive output.
- Local wall-clock → UTC conversion (`to_utc_timestamp`) producing tz-naive output.
- A Spark-style `convert_timezone(target_tz, source_ts)` alias that matches Spark's argument order.

## Changes Made

- Add `FromUtcTimestamp` and `ToUtcTimestamp` scalar UDFs in `src/daft-functions-temporal/src/time.rs`:
  - Both reuse `daft_schema::time_unit` helpers (`parse_timezone`, `timestamp_to_naive_local`, `naive_local_to_timestamp`, `naive_datetime_to_timestamp`).
  - Output dtype is always `Timestamp(unit, None)` regardless of input tz label.
- Add `daft-schema` as a direct dependency in `src/daft-functions-temporal/Cargo.toml` so the helpers are available to the UDFs.
- Change `mod time` to `pub mod time` in `src/daft-functions-temporal/src/lib.rs` so the SQL crate can register handlers.
- Register `FromUtcTimestamp` and `ToUtcTimestamp` in `TemporalFunctions`.
- Add SQL handlers `SQLFromUtcTimestamp`, `SQLToUtcTimestamp`, and `SQLConvertTimezone` in `src/daft-sql/src/modules/temporal.rs`. The `convert_timezone` handler delegates to the existing `ConvertTimeZone` UDF with Spark's reversed argument order.
- Add Python wrappers `from_utc_timestamp`, `to_utc_timestamp`, and `convert_timezone` in `daft/functions/datetime.py` and export them from `daft/functions/__init__.py`.
- Add focused tests in `tests/dataframe/test_temporals.py`:
  - `from_utc_timestamp` coverage: named tz (Europe/London BST), fixed offset (+05:30), tz-aware input.
  - `to_utc_timestamp` coverage: named tz.
  - Round-trip identity for non-DST instants.
  - Null propagation, invalid timezone error path.
  - SQL integration for both UTC conversions.
  - `convert_timezone` Spark-style alias.

## Behavior

- `from_utc_timestamp('2017-07-14 02:40:00', 'Europe/London')` returns `2017-07-14 03:40:00` (BST is UTC+1 in July).
- `to_utc_timestamp('2017-07-14 03:40:00', 'Europe/London')` returns `2017-07-14 02:40:00`.
- Both functions always return a tz-naive `Timestamp(unit, None)` matching Spark.
- `from_utc_timestamp` treats the i64 as a UTC instant regardless of any tz label on the input; `to_utc_timestamp` extracts the wall-clock using the input's own tz label (or treats naive as UTC), then re-interprets that wall-clock in the supplied tz.
- `convert_timezone(target_tz, source_ts)` is equivalent to `convert_time_zone(source_ts, target_tz)` and requires the source to be tz-aware (no `from_timezone` argument).
- Invalid timezone strings (e.g. `"Not/A/Zone"`) error at planning time with a clear message.
- Null in the input row propagates to null in the output.

## Test Plan

- `cargo check -p daft-functions-temporal -p daft-sql`
- `make build`
- `DAFT_RUNNER=native pytest -q tests/dataframe/test_temporals.py -k "utc_timestamp or convert_timezone or round_trip"`

## Related Issues

- Part of #3798

---------

Co-authored-by: Varun Madan <varun.madan@gmail.com>
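
The UTC↔local semantics described in the commit message above can be illustrated with a minimal sketch using only Python's standard library. This is not Daft's implementation: the helper names `from_utc_timestamp_sketch` and `to_utc_timestamp_sketch` are hypothetical, and the sketch assumes IANA timezone data is available to `zoneinfo` (it does not handle fixed offsets such as `+05:30`).

```python
from datetime import datetime, timezone
from zoneinfo import ZoneInfo


def from_utc_timestamp_sketch(ts: datetime, tz: str) -> datetime:
    """Read ts as a UTC instant and return the tz-naive wall-clock time in tz."""
    if ts.tzinfo is None:
        # A naive input is taken to be a UTC instant.
        ts = ts.replace(tzinfo=timezone.utc)
    # An aware input already identifies an instant; only the instant matters.
    return ts.astimezone(ZoneInfo(tz)).replace(tzinfo=None)


def to_utc_timestamp_sketch(ts: datetime, tz: str) -> datetime:
    """Read ts's wall-clock fields as local time in tz and return tz-naive UTC."""
    # Dropping tzinfo keeps the wall-clock fields as shown in the input's own zone.
    wall_clock = ts.replace(tzinfo=None)
    local = wall_clock.replace(tzinfo=ZoneInfo(tz))
    return local.astimezone(timezone.utc).replace(tzinfo=None)


if __name__ == "__main__":
    utc_instant = datetime(2017, 7, 14, 2, 40)
    local = from_utc_timestamp_sketch(utc_instant, "Europe/London")
    print(local)  # 2017-07-14 03:40:00 (BST is UTC+1 in July)
    print(to_utc_timestamp_sketch(local, "Europe/London"))  # 2017-07-14 02:40:00
```

Both helpers return tz-naive values, mirroring the `Timestamp(unit, None)` output described in the commit message. The round trip is an identity only for instants that do not fall in a DST gap or overlap.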
main
2 hours ago
Merge remote-tracking branch 'upstream/main' into codex-sql-read-parquet-ignore-corrupt-files
jackylee-ch:codex-sql-read-parquet-ignore-corrupt-files
4 hours ago
Merge remote-tracking branch 'upstream/main' into codex-sql-read-parquet-ignore-corrupt-files
jackylee-ch:codex-sql-read-parquet-ignore-corrupt-files
9 hours ago
Merge branch 'main' into issue-2423
Lucas61000:issue-2423
10 hours ago
Merge remote-tracking branch 'upstream/main' into codex-sql-read-parquet-ignore-corrupt-files
jackylee-ch:codex-sql-read-parquet-ignore-corrupt-files
10 hours ago
Merge remote-tracking branch 'upstream/main' into codex-sql-read-parquet-ignore-corrupt-files
jackylee-ch:codex-sql-read-parquet-ignore-corrupt-files
12 hours ago

Latest Branches

CodSpeed Performance Gauge
0%
feat(sql): support read_parquet ignore_corrupt_files #7133
12 hours ago
67a6171
jackylee-ch:codex-sql-read-parquet-ignore-corrupt-files
CodSpeed Performance Gauge
0%
12 hours ago
e17161d
Lucas61000:issue-6901
CodSpeed Performance Gauge
0%
11 hours ago
eed3823
Lucas61000:issue-2423