feat(functions): add UUIDv7 timestamp-extraction partition transforms (#7032)
Adds extract_minute_uuid7 / extract_hour_uuid7 / extract_day_uuid7 /
extract_month_uuid7 — partition transforms that decode the 48-bit
Unix-ms timestamp embedded in a UUIDv7's first 6 bytes and bucket it,
mirroring the Iceberg-style partition_hours / partition_days /
partition_months transforms.
- Input: Uuid or FixedSizeBinary(16) (128 bits); output: Int64.
- minute/hour/day are counts since the Unix epoch (floor division);
month is calendar months since 1970-01 ((year-1970)*12 + month-1),
matching partition_months. Version/variant bits are ignored — only the
leading 48 timestamp bits are read.
- Implemented as ScalarUDFs in daft-functions (auto-registered,
SQL-callable), exposed via daft.functions.extract_*_uuid7.
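The bucketing rules above can be sketched in pure Python as a reference model. This is not the daft-functions implementation: the helper names (`uuid7_unix_ms`, `minute_bucket`, `hour_bucket`, `day_bucket`, `month_bucket`, `make_uuid7_bytes`) are hypothetical and exist only for illustration. It assumes the timestamp is stored big-endian in the first 6 bytes, as in the UUIDv7 layout, and that buckets are computed in UTC.

```python
from datetime import datetime, timedelta, timezone

MS_PER_MINUTE = 60_000
MS_PER_HOUR = 3_600_000
MS_PER_DAY = 86_400_000
EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def uuid7_unix_ms(raw: bytes) -> int:
    """Read the leading 48 bits as a big-endian Unix-ms timestamp.

    Version and variant bits live in later bytes and are never inspected.
    """
    if len(raw) != 16:
        raise ValueError(f"expected 16 bytes, got {len(raw)}")
    return int.from_bytes(raw[:6], "big")


def minute_bucket(raw: bytes) -> int:
    return uuid7_unix_ms(raw) // MS_PER_MINUTE  # floor division


def hour_bucket(raw: bytes) -> int:
    return uuid7_unix_ms(raw) // MS_PER_HOUR


def day_bucket(raw: bytes) -> int:
    return uuid7_unix_ms(raw) // MS_PER_DAY


def month_bucket(raw: bytes) -> int:
    """Calendar months since 1970-01: (year - 1970) * 12 + (month - 1).

    Sketch limitation: datetime tops out at year 9999, so very large
    48-bit timestamps raise OverflowError here.
    """
    ts = EPOCH + timedelta(milliseconds=uuid7_unix_ms(raw))
    return (ts.year - 1970) * 12 + (ts.month - 1)


def make_uuid7_bytes(unix_ms: int, version_byte: int = 0x70) -> bytes:
    """Hypothetical test helper: 16 bytes with only the timestamp,
    version nibble, and variant bits set; the random bits are zeroed."""
    return unix_ms.to_bytes(6, "big") + bytes([version_byte, 0x00, 0x80]) + bytes(7)


if __name__ == "__main__":
    # 2024-01-01T00:00:00Z is 1_704_067_200_000 ms after the epoch.
    u = make_uuid7_bytes(1_704_067_200_000)
    print(minute_bucket(u), hour_bucket(u), day_bucket(u), month_bucket(u))
```

Because only the first 6 bytes are read, two inputs that differ solely in their version, variant, or random bits fall into the same bucket.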
Tests: per-unit correctness against known instants (FixedSizeBinary and
Uuid inputs), epoch, calendar-month boundaries, nulls, version/variant
independence, type validation, and SQL parity.
> Note: this PR is now independent of the clustering-spec work — it was
rebased off the clustering stack onto `main`. The end-to-end
"shuffle-free over a custom DataSource clustered by extract_hour_uuid7"
test lives with the clustering PR, since it depends on
`DataSource.get_clustering_spec()` / `ClusteringSpec`.
---------
Co-authored-by: Claude Opus 4.8 (1M context) <noreply@anthropic.com>