Eventual-Inc
Daft

Performance History

Latest Results

fix(filesystem): fix pyarrow fs memory by caching by value, not identity (#7025)

## Changes Made

Fixes a memory leak in pyarrow fs. In long-running `write_iceberg` jobs this drained file descriptors and threads until the process OOM'd. This was triggered with a refresh-credentials S3 setup, but the cache is broken for every IOConfig. This PR keys the cache on `repr(io_config)`.

The audit found that `IOConfig.__hash__` returns equal values for semantically-equal configs, but `__eq__` is identity-based on the PyO3 wrapper. The dict-keyed cache at `daft/filesystem.py:35` therefore missed on **every** call when the Rust side handed a fresh Python wrapper to each writer, rebuilding a new PyArrow `S3FileSystem` (with its own thread pool and connection pool) per output file.

| | FD slope / iter | RSS slope MiB / iter |
|---|---|---|
| Before fix | +63.9 | +2.15 |
| After fix | **−0.05** | +0.39 |

## Related Issues

- N/A
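
The cache miss described in the PR can be reproduced without Daft or PyArrow. The sketch below is illustrative only: `FakeIOConfig`, `FakeFileSystem`, `get_fs_by_identity`, `get_fs_by_value`, and `count_builds` are hypothetical names, not Daft's API, and the real code at `daft/filesystem.py` may differ. It assumes only what the description states: a value-based `__hash__`, an identity-based `__eq__`, and a fresh wrapper object per writer.

```python
class FakeIOConfig:
    """Hypothetical stand-in for a wrapper with value-based hash, identity-based eq."""

    def __init__(self, region):
        self.region = region

    def __hash__(self):
        # Equal for semantically-equal configs.
        return hash(self.region)

    def __repr__(self):
        return f"FakeIOConfig(region={self.region!r})"

    # __eq__ is deliberately not defined, so equality falls back to identity.


class FakeFileSystem:
    """Hypothetical stand-in for an expensive filesystem object."""

    instances = 0

    def __init__(self, config):
        FakeFileSystem.instances += 1
        self.config = config


def get_fs_by_identity(cache, config):
    # Keyed on the object: hashes collide, but == fails for every fresh wrapper.
    if config not in cache:
        cache[config] = FakeFileSystem(config)
    return cache[config]


def get_fs_by_value(cache, config):
    # Keyed on repr(config): equal-valued configs share one string key.
    key = repr(config)
    if key not in cache:
        cache[key] = FakeFileSystem(config)
    return cache[key]


def count_builds(lookup, n_files):
    """Return (filesystems built, cache entries) after n_files lookups."""
    FakeFileSystem.instances = 0
    cache = {}
    for _ in range(n_files):
        # A fresh wrapper per output file, as in the scenario described above.
        lookup(cache, FakeIOConfig("us-west-2"))
    return FakeFileSystem.instances, len(cache)
```

With the object-keyed lookup, both the number of filesystems built and the cache size grow with the number of output files; with the `repr`-keyed lookup they stay at one. Keying on `repr` only works if the representation captures every field that affects the resulting filesystem, which this sketch assumes.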
main
10 hours ago
cleaned up while statement
euan/window-firstlastval-agg
16 hours ago
test(iceberg): cover partition field transform conversion
jackylee-ch:codex-test-iceberg-partition-field-transforms
19 hours ago

Latest Branches

CodSpeed Performance Gauge
0%
feat(checkpoint): distributed observability counters #7026
11 hours ago
7a447e2
rohit/feature/checkpoint-metrics
CodSpeed Performance Gauge
0%
fix(filesystem): fix pyarrow fs memory by caching by value, not identity #7025
11 hours ago
c3d4776
rchowell/write-leak-fix
CodSpeed Performance Gauge
0%
17 hours ago
f6c00c4
euan/window-firstlastval-agg
© 2026 CodSpeed Technology