Eventual-Inc
Daft
Latest Results
feat(functions): add video_frames_from_bytes for row-level decoding from a binary column

Adds `daft.functions.video_frames_from_bytes`, a row-level expression that decodes frames straight from a Binary column of encoded video bytes. Useful when the encoded video is already in memory (custom downloader UDFs, bytes streaming through other operators) and you don't want to round-trip through a path. Returns the same per-frame Struct schema as `video_frames`.

Behavior notes:
- Null inputs are mapped to an empty frame list rather than raised. This lets a sibling expression branch on the original null column (typically populating an `extract_error` column via `when(col.is_null(), ...)`) without aborting the whole batch when an upstream download UDF signalled failure with `None`.
- The BytesIO buffer wrapping each row's bytes is explicitly closed after decoding. PyAV's `container.__exit__` does not close the underlying file-like; without an explicit close the raw bytes (often hundreds of MB per row) stay pinned until the next GC pass — costly inside long-running Ray actor processes.
- Eager input validation matches `video_frames`: Pillow availability, width/height pairing, positive `sample_interval_seconds`.

Refactor: the per-container iteration loop in `VideoFile.frames` is extracted into a module-level `_iter_frames_from_container` helper that both the file and bytes paths share. The return-dtype of the per-frame struct is also hoisted into a `_VIDEO_FRAMES_RETURN_DTYPE` module constant so both Funcs share a single source of truth.

5 new tests (bytes happy path, sample_interval interaction, resize, keyframe filtering, null-input → empty list) and a docs example in `docs/modalities/videos.md`.
TheR1sing3un:feat_video_frames_from_bytes
6 hours ago
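The null-handling and buffer-closing behavior described in the `video_frames_from_bytes` commit above can be sketched in plain Python. This is a minimal illustration under stated assumptions, not Daft's implementation: `FakeContainer` and `frames_from_bytes` are hypothetical names, the "frames" are just 4-byte chunks, and the container only mimics what the commit message says about PyAV (its `__exit__` leaves the wrapped file-like open).

```python
import io
from typing import Iterator, List, Optional


class FakeContainer:
    """Hypothetical stand-in for a decoder container.

    Its __exit__ deliberately does NOT close the wrapped file-like,
    mirroring the behavior the commit message attributes to PyAV.
    """

    def __init__(self, file_like: io.BytesIO) -> None:
        self._file = file_like

    def __enter__(self) -> "FakeContainer":
        return self

    def __exit__(self, exc_type, exc, tb) -> bool:
        return False  # leaves self._file open; do not swallow exceptions

    def frames(self) -> Iterator[bytes]:
        # Pretend every 4-byte chunk is one decoded frame.
        data = self._file.read()
        for start in range(0, len(data), 4):
            yield data[start:start + 4]


def frames_from_bytes(payload: Optional[bytes]) -> List[bytes]:
    """Decode one row; a null input yields an empty frame list."""
    if payload is None:
        return []  # do not raise: a sibling expression can inspect the null
    buffer = io.BytesIO(payload)
    try:
        with FakeContainer(buffer) as container:
            return list(container.frames())
    finally:
        # Explicit close releases the raw bytes without waiting for GC.
        buffer.close()
```

The `try`/`finally` is the point of the sketch: the buffer is released whether decoding succeeds or raises, rather than relying on the container's context manager to do it.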
Merge branch 'main' into feat_sample_interval
TheR1sing3un:feat_sample_interval
7 hours ago
fix: formatting
euanlimzx:euan/better_distributed_asof_alt
7 hours ago
fix(temporal): align trunc alias with Spark argument order

Make trunc use Spark-style order (input, interval) in both Python and SQL aliases. Also move SQL temporal test imports to module scope to satisfy style checks and update alias tests accordingly.
BABTUNA:feat/temporal-alias-batch4
9 hours ago
fix(pmod): mirror Spark's conditional adjustment exactly
YuangGao:feat/add-pmod
9 hours ago
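For context on the pmod fix above, here is a rough sketch of the "conditional adjustment" as I understand it from Spark's `pmod` expression: take a truncated remainder, and only when it is negative, add the divisor and take the remainder again. The commit does not show its code, so treat this as an assumption about the intended semantics. The helper names are hypothetical and only integers are covered.

```python
def truncated_rem(a: int, n: int) -> int:
    """Remainder that takes the sign of the dividend (JVM-style %).

    Python's own % is floor-based and takes the sign of the divisor,
    so it cannot be used directly to mirror JVM arithmetic.
    """
    r = abs(a) % abs(n)
    return -r if a < 0 else r


def pmod(a: int, n: int) -> int:
    """Sketch of a Spark-style positive modulus (assumed semantics).

    Raises ZeroDivisionError for n == 0; null handling for a zero
    divisor is out of scope for this sketch.
    """
    r = truncated_rem(a, n)
    if r < 0:
        # Adjust only when the first remainder is negative.
        return truncated_rem(r + n, n)
    return r
```

Under these assumptions `pmod(-7, 3)` gives `2`, while `pmod(-7, -3)` gives `-1`: the adjustment is conditional, so the result is not forced to be non-negative when the divisor is negative.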
update remote uri check
gavin9402:introduce_file_resource
9 hours ago
feat(temporal): add spark-style datetime aliases

Add Python and SQL temporal aliases (dayofmonth/dayofyear/weekofyear, date_format, trunc, dateadd, datediff, datepart) mapped to existing implementations for parity and discoverability. Includes targeted SQL and dataframe alias coverage.
BABTUNA:feat/temporal-alias-batch4
10 hours ago
introduce task resource
gavin9402:introduce_file_resource
12 hours ago
Latest Branches
CodSpeed Performance Gauge
0%
feat(functions): add video_frames_from_bytes for row-level decoding from a binary column
#6833
7 hours ago
92d5caf
TheR1sing3un:feat_video_frames_from_bytes
CodSpeed Performance Gauge
0%
feat(functions): add sample_interval_seconds to video_frames
#6832
7 hours ago
99dee72
TheR1sing3un:feat_sample_interval
CodSpeed Performance Gauge
0%
feat: distributed range repartitioned asof joins v2
#6816
8 hours ago
23cd9e0
euanlimzx:euan/better_distributed_asof_alt
© 2026 CodSpeed Technology