Eventual-Inc
Daft

Performance History

Latest Results

feat: add forward ASOF joins (#6918)

## Forward vs Backward ASOF Joins

Forward ASOF joins are fundamentally the same as backward ASOF joins, with a few algorithmic differences:

### Local Execution

- **Binary search:** Backward finds the ceiling left row (first `left.on_key >= right.on_key`). Forward finds the floor left row (last `left.on_key <= right.on_key`).
- **Candidate selection:** Backward keeps the largest right `on_key` among candidates (more recent = closer). Forward keeps the smallest (sooner = closer).
- **Fill:** Backward forward-fills unmatched rows (past event propagates to later rows). Forward backward-fills (future event propagates to earlier rows).

### Distributed Execution

- **Carryover extraction:** Backward extracts the max right row per partition as the carryover (the most recent event that could match a future partition). Forward extracts the min (the earliest event that could match a preceding partition).
- **Carryover partition propagation:** Backward forward-propagates: partition `i` inherits the carryover of partition `i-1` if it has none. Forward backward-propagates: partition `i` inherits from partition `i+1`.
- **Per-task carryover assignment:** Backward prepends the carryover from the preceding partition to each task's right-side input. Forward appends the carryover from the following partition.
main
24 minutes ago
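
The forward/backward distinction described in the ASOF join commit above can be illustrated with a small pure-Python sketch. This is not Daft's implementation: `asof_match` and `propagate_carryovers` are hypothetical helper names, keys are plain integers, and matching is done per left key (rather than per right row, as the commit describes) purely to show the resulting semantics. Exact key matches are assumed to be allowed, in line with the `>=` / `<=` comparisons in the commit message.

```python
from bisect import bisect_left, bisect_right


def asof_match(left_keys, right_keys, direction="backward"):
    """Return, for each left key, the matched right key or None.

    Hypothetical helper for illustration; `right_keys` must be sorted ascending.
    """
    matches = []
    for key in left_keys:
        if direction == "backward":
            # Backward: largest right key that is <= the left key.
            idx = bisect_right(right_keys, key) - 1
            matches.append(right_keys[idx] if idx >= 0 else None)
        elif direction == "forward":
            # Forward: smallest right key that is >= the left key.
            idx = bisect_left(right_keys, key)
            matches.append(right_keys[idx] if idx < len(right_keys) else None)
        else:
            raise ValueError(f"unknown direction: {direction!r}")
    return matches


def propagate_carryovers(carryovers, direction="backward"):
    """Fill missing (None) per-partition carryovers from a neighbouring partition.

    Backward joins forward-propagate (partition i inherits from i-1);
    forward joins backward-propagate (partition i inherits from i+1).
    """
    filled = list(carryovers)
    if direction == "backward":
        for i in range(1, len(filled)):
            if filled[i] is None:
                filled[i] = filled[i - 1]
    else:
        for i in range(len(filled) - 2, -1, -1):
            if filled[i] is None:
                filled[i] = filled[i + 1]
    return filled


left = [1, 5, 10]
right = [2, 5, 8]
backward = asof_match(left, right, "backward")  # [None, 5, 8]
forward = asof_match(left, right, "forward")    # [2, 5, None]

carry_backward = propagate_carryovers([3, None, None, 9], "backward")  # [3, 3, 3, 9]
carry_forward = propagate_carryovers([None, 4, None, 9], "forward")    # [4, 4, 9, 9]
```

The two directions are mirror images: swapping `bisect_right` for `bisect_left` flips which neighbour is chosen, and reversing the iteration order flips the direction in which carryovers propagate.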
fix(docs): update data_sources.list -> list_sources in api reference
helmanofer:codex/register-datasource-api-gmail
8 hours ago
Merge branch 'main' into codex/register-datasource-api
helmanofer:codex/register-datasource-api
10 hours ago
Merge branch 'main' into feat/add-conv-function
YuangGao:feat/add-conv-function
15 hours ago
fix(sql): omit connection URL from read_sql error messages and __repr__ (#6933)

## Why

A user reported credentials still leaking in `daft.read_sql` errors on v0.7.11 even after the original redactor in #6902. The URL shape from their stack trace:

```
trino://<USER>@<HOST>:<PORT>?auth=jwt&access_token=<SECRET>&http_scheme=https
```

There's no `user:password@host` here; the secret lives in `access_token=` as a query parameter. The previous fix only redacted userinfo passwords, so the JWT was emitted in plaintext in the `Failed to execute sql: ... from connection: ...` error.

Separately, `urllib.parse.urlparse` silently mis-parses passwords containing `#` / `/` / `?` (it pushes them into the fragment field), so `parsed.password` came back `None` and the old `_redact_url` returned the URL unchanged for inputs like `trino://alice:p#ss@host/db`.

I initially tried to plug both holes inside `_redact_url` (sqlalchemy `make_url` as the parser, plus query-parameter redaction by sensitive-name substring). Reviewers (rchowell) pushed back: *"secrets appear in many places around the connection string and are not limited to query params; we probably shouldn't log it"*. Agreed. Trying to enumerate every shape a credential can take in a connection URL is a losing game.

## What this PR actually does

**Stop echoing the connection URL in error messages and in `SQLConnection.__repr__`.** That's it.

- `RuntimeError(f"Failed to execute sql: {sql} from connection: {self.conn}, error: {e}")` → `RuntimeError(f"Failed to execute sql: {sql}, error: {e}")`. The caller knows which connection they passed in; the URL is redundant in the message.
- `SQLConnection.__repr__` now returns `SQLConnection(dialect='trino', driver='')` instead of the URL.

Removed alongside: `_redact_url`, `_SENSITIVE_PARAM_KEYWORDS`, `_is_sensitive_param`, the opportunistic SQLAlchemy `make_url` import, and the urllib.parse fallback with its `@`-in-authority heuristic. None of it is needed once the URL itself is not echoed.

## After the fix

```
Failed to execute sql: SELECT * FROM (SELECT * FROM iceberg.namespace."table") AS subquery LIMIT 10, error: (trino.exceptions.TrinoConnectionError) failed to execute: HTTPSConnectionPool(...): ... [SQL: SELECT * FROM (SELECT * FROM iceberg.namespace."table") AS subquery LIMIT 10]
```

No URL, no userinfo, no query parameters.

## Verified end-to-end

Tested locally against the customer's exact URL shape plus two regressions:

| Shape | Secret value | Present in error? |
|---|---|---|
| Trino JWT (`?access_token=<JWT>&auth=jwt`) | the JWT | ✅ No |
| Userinfo password (`alice:hunter2@...`) | `hunter2` | ✅ No |
| Password with `#` (`alice:p#ss@...`) | `p#ss` | ✅ No |

## Tests

`tests/io/test_sql.py`:

- `test_execute_sql_error_does_not_leak_credentials`: parametrized across the three leak shapes; asserts the secret is not in the raised `RuntimeError`.
- `test_repr_does_not_leak_url`: asserts `repr(conn)` contains neither the secret, the host, nor the username.

## Related

Follow-up to #6902 / issue #6903. Customer-reported continued leak after v0.7.11.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

---------

Co-authored-by: Varun Madan <varun@Mac.attlocal.net>
Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
main
18 hours ago
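
The approach in the `read_sql` commit above (never echo the connection URL, rather than trying to redact it) can be sketched as follows. This is a minimal illustration and not Daft's actual `SQLConnection`: the class name `ConnectionSketch`, the `runner` callback, and the way dialect and driver are derived from the URL scheme are all assumptions made for the example.

```python
class ConnectionSketch:
    """Hypothetical stand-in for a SQL connection wrapper (not Daft's class)."""

    def __init__(self, url):
        self._url = url  # kept private; never formatted into messages
        # The scheme precedes any credentials, so it is safe to expose.
        scheme = url.split("://", 1)[0]
        self.dialect, _, self.driver = scheme.partition("+")

    def __repr__(self):
        # Only dialect and driver are shown: no host, userinfo, or query string.
        return f"ConnectionSketch(dialect={self.dialect!r}, driver={self.driver!r})"

    def execute(self, sql, runner):
        # `runner` is an assumed callback that actually talks to the database.
        try:
            return runner(self._url, sql)
        except Exception as e:
            # The URL is deliberately omitted from the error message.
            raise RuntimeError(f"Failed to execute sql: {sql}, error: {e}") from e


def failing_runner(url, sql):
    # Simulates a driver failure without contacting any server.
    raise ValueError("boom")


conn = ConnectionSketch(
    "trino://alice@host:443?auth=jwt&access_token=SECRET&http_scheme=https"
)
conn_repr = repr(conn)

try:
    conn.execute("SELECT 1", failing_runner)
    message = None
except RuntimeError as err:
    message = str(err)
```

Omitting the URL sidesteps the parsing problem entirely: there is no need to decide which parts of a connection string are sensitive if none of it is ever formatted into output. Note that the underlying driver error `e` is still included, as in the commit's own message format.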

Latest Branches

CodSpeed Performance Gauge
0%
feat: register python data sources #6936
9 hours ago
fbaeef7
helmanofer:codex/register-datasource-api-gmail
CodSpeed Performance Gauge
-12%
feat: register python data sources #6935
18 days ago
803038d
helmanofer:codex/register-datasource-api-rebased
CodSpeed Performance Gauge
0%
10 hours ago
845c52d
helmanofer:codex/register-datasource-api