Eventual-Inc
Daft

Performance History

Latest Results

feat(gcs): implement delete for GCS object store

## Summary

- Implements `delete` method for the GCS object store backend (`GCSSource`)
- Fixes `write_parquet(write_mode='overwrite')` on GCS paths (closes #6928)

## Changes

- Adds `UnableToDeleteFile` error variant to the GCS error enum
- Implements `delete` on `GCSClientWrapper` using `google_cloud_storage`'s `delete_object` API
- Handles 404/410 responses as success (idempotent delete per `ObjectSource` trait contract)
- Overrides `delete` in `ObjectSource` impl for `GCSSource`

## Test plan

- Verify `write_parquet(write_mode="overwrite")` succeeds on a `gs://` path
- Verify delete is idempotent (overwriting a non-existent path does not fail)
- `make build` passes
- `DAFT_RUNNER=native make test EXTRA_ARGS="-v tests/io/"` passes

Author: daiping8 <dai.ping88@zte.com.cn>
daiping8:fix/gcs-delete-support
5 hours ago
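The idempotent-delete rule described in this change (404/410 treated as success) can be illustrated with a small sketch. This is not the Rust implementation from the PR: `idempotent_delete`, `FakeBucket`, and the status-code-returning `delete_object` callable are hypothetical names introduced only for illustration, and the Python `UnableToDeleteFile` exception merely mirrors the error variant named above.

```python
from typing import Callable, Iterable


class UnableToDeleteFile(Exception):
    """Raised when a delete fails for a reason other than 'already gone'."""


# HTTP statuses that mean the object no longer exists (Not Found, Gone).
ALREADY_GONE = {404, 410}


def idempotent_delete(delete_object: Callable[[str], int], path: str) -> None:
    """Delete `path`, treating 'not found' and 'gone' as success.

    `delete_object` is a hypothetical callable that returns an HTTP status
    code; it stands in for whatever client call performs the delete.
    """
    status = delete_object(path)
    if 200 <= status < 300 or status in ALREADY_GONE:
        return  # deleted now, or already absent: both count as success
    raise UnableToDeleteFile(f"delete of {path} failed with HTTP {status}")


class FakeBucket:
    """In-memory stand-in for a bucket, for illustration only."""

    def __init__(self, keys: Iterable[str]) -> None:
        self.keys = set(keys)

    def delete_object(self, path: str) -> int:
        if path in self.keys:
            self.keys.remove(path)
            return 204  # deleted
        return 404  # nothing to delete
```

Calling `idempotent_delete` twice on the same key succeeds both times, which is the property an overwrite path relies on when the target may not exist yet.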
refactor: parameterize uuid generation
everettVT/uuid-param-generator
9 hours ago
perf(shuffle): Write one shuffle file per task instead of N partition files (#6948)

## Changes Made

Reduce write-side CPU and scheduling overhead on the Flight shuffle path:

- Write one file per map task with offsets for each partition, instead of one file per partition. In theory this should significantly speed up writes (and NVMe reads), and also avoid the max ulimit problem.
- Write the final file when repartitioning finalizes instead of incrementally. This allows us to write fewer, larger IPC messages; the Ray repartition path also just accumulates at the end anyway.
- Repartition incrementally in the sink based on a byte threshold, to amortize tiny repartitions.
- Chunk Flight server responses at ~4 MiB instead of emitting one `FlightData` per source batch, amortizing the reader's per-batch flatbuffer parse + array construction.

## Benchmarks

Wall-clock seconds, lower is better. `daft pypi flt` is daft 0.7.13 with the Flight backend; `daft built flt` is this PR. `mr` = map-reduce shuffle, `psm` = pre-shuffle merge.

| scale | parts | daft pypi mr | daft pypi psm | daft pypi flt | daft built flt | Δ vs pypi flt |
|---|---:|---:|---:|---:|---:|---:|
| sf100_top8 | 32 | 15.84 | 13.52 | 12.33 | 13.70 | +11% |
| sf100 | 32 | 19.75 | 23.94 | 19.76 | **15.80** | **−20%** |
| sf100 | 256 | 24.59 | 22.56 | 23.17 | 21.99 | −5% |
| sf1000_top64 | 128 | 19.25 | 20.26 | 20.89 | **18.35** | −12% |
| sf1000 | 256 | 380.13 | 276.90 | 106.83 | 107.59 | ~0% |
| sf1000 | 512 | — | — | 140.53 | **110.07** | **−22%** |
| sf1000 | 1024 | — | — | 211.68 | **140.70** | **−34%** |
| sf10000_top1000 | 256 | 260.41 | 394.09 | 120.96 | **99.70** | **−18%** |
| sf10000 | 1024 | — | — | 2556.05 | **1849.32** | **−28%** |
| sf10000 | 2048 | — | — | 4529.25 | **3024.56** | **−33%** |

---------

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
main
10 hours ago
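Two ideas from the shuffle change above can be sketched in a few lines: one file per map task with a per-partition offset index, and coalescing small batches into chunks of roughly a target size. This is a simplified illustration, not the PR's code. It works on raw bytes rather than Arrow IPC messages, and `write_shuffle_file`, `read_partition`, and `coalesce_batches` are hypothetical names.

```python
from typing import Iterable, Iterator, List, Tuple

# (offset, length) of each partition's bytes inside the single shuffle file.
Index = List[Tuple[int, int]]


def write_shuffle_file(path: str, partitions: List[bytes]) -> Index:
    """Write every partition into one file and return its offset index."""
    index: Index = []
    offset = 0
    with open(path, "wb") as f:
        for payload in partitions:
            f.write(payload)
            index.append((offset, len(payload)))
            offset += len(payload)
    return index


def read_partition(path: str, index: Index, partition_id: int) -> bytes:
    """Read one partition back by seeking to its recorded offset."""
    offset, length = index[partition_id]
    with open(path, "rb") as f:
        f.seek(offset)
        return f.read(length)


def coalesce_batches(
    batches: Iterable[bytes], target_bytes: int = 4 * 1024 * 1024
) -> Iterator[bytes]:
    """Group small batches into chunks of at least `target_bytes`.

    The last chunk may be smaller. Joining raw bytes is a stand-in for
    combining record batches in a real Flight response.
    """
    chunk: List[bytes] = []
    size = 0
    for batch in batches:
        chunk.append(batch)
        size += len(batch)
        if size >= target_bytes:
            yield b"".join(chunk)
            chunk, size = [], 0
    if chunk:
        yield b"".join(chunk)
```

With this layout a map task that produces N partitions opens one file instead of N, which is the source of the file-handle savings the PR description mentions; a reader needs only the index entry for its partition.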

Latest Branches

CodSpeed Performance Gauge: 0%
feat(gcs): implement delete for GCS object store #6958
14 hours ago
b7e5386
daiping8:fix/gcs-delete-support
CodSpeed Performance Gauge: 0%
9 hours ago
0848830
everettVT/uuid-param-generator
© 2026 CodSpeed Technology