Avatar for the langchain-ai user
langchain-ai
langchain
BlogDocsChangelog

Performance History

Latest Results

fix(langchain): close three structural gaps in `PIIMiddleware` stream redaction Corridor flagged three real gaps: 1. **Stream transformer not installed for `apply_to_tool_results`-only configurations.** Users enabling tool-result scrubbing without `apply_to_output=True` received no streaming protection — raw PII on `tools` channel events and `ToolMessage` payloads on `messages` and `values` channels reached consumers before the state-level `before_model` enforcer ran. Install the transformer whenever either flag is set. 2. **Structured `BaseMessage.content` (list of content blocks) passed through unredacted.** The legacy `(BaseMessage, metadata)` path and `_redact_value`'s `BaseMessage` branch only handled `str` content. When `.content` is a list (the standard shape for v1 content-blocks), the entire list flowed through without inspection — text blocks with PII, reasoning blocks, and any other structured content leaked on the wire. Walk list-typed content through `_redact_value` so every string leaf inside any block is detected and redacted. 3. **Dictionary keys not redacted.** `_redact_value` recursed into dict values but copied keys unchanged. PII used as a map key (e.g., an email-indexed result map) streamed unchanged on the `tools` and `values` channels. Redact both keys and values; non-string keys pass through the identity branch unchanged. Adds regression tests covering each gap: - `test_transformer_installed_for_tool_results_only` - `test_redact_value_walks_structured_message_content` - `test_redact_value_redacts_dict_keys`
nh/pii-middleware-stream-redaction
25 minutes ago
feat(langchain): apply lookback withholding to cumulative tool-call args Corridor flagged a partial-exposure window on the streaming tool-call args path: the in-place redaction caught complete PII patterns, but intermediate cumulative states (`{"to": "alice@`, `{"to": "alice@example`, …) were forwarded unredacted before the final chunk triggered a match. Extend the same lookback semantics already used for `text-delta` and `reasoning-delta` to the cumulative tool-call args path: detect on the full cumulative args, then emit only `args[:len(args) - lookback]` on each chunk. The trailing `stream_lookback` characters are withheld — they might be the start of a partial PII match completing in a future cumulative delta. The withheld tail surfaces at `content-block- finish` where `_finalize_block` redacts the parsed args dict. For args ≤ `stream_lookback` (the typical case for tool calls), this withholds the entire args during streaming — the redacted dict appears only at finalize. For args > `stream_lookback`, the safe prefix streams incrementally as the cumulative state grows. Residual exposure note: PII appearing more than `stream_lookback` chars from the cumulative tail in a delta where the pattern hasn't yet completed can still surface in the emit prefix — same shape as "PII longer than `stream_lookback`" on the text path. The `content- block-finish` snapshot redaction remains the backstop. `block` strategy unconditionally emits empty args during streaming and lets `_finalize_block` empty the args dict on PII detection; `after_model` raises shortly after.
nh/pii-middleware-stream-redaction
15 hours ago
style(langchain): ruff format on `PIIMiddleware`
nh/pii-middleware-stream-redaction
16 hours ago
fix(langchain): close `block` strategy bypass on `PIIMiddleware` stream `PIIMiddleware(..., strategy="block", apply_to_output=True)` installed no stream transformer, so raw `content-block-delta` events flowed unfiltered to consumers iterating `astream_events` / `run.messages`. By the time `after_model` raised `PIIDetectionError`, the consumer had already seen the PII on the wire. The previous reasoning — that raising mid-stream would leave projections torn — was correct in principle but left the streaming surface unprotected. Install a buffering transformer for `block` instead. It withholds every delta from the consumer (empty `delta["text"]`) and runs detection once on the assembled block at `content-block-finish`. If PII is present the finalize content is zeroed too; `after_model` then raises on the original state message to terminate the run. If no PII, the finalize event carries the full text and the consumer sees the message all at once. No mid-stream raises, no leaked PII. For the Python-3.10 legacy `(BaseMessage, metadata)` event shape, returning `False` from `process()` only drops the event from the main log — `MessagesTransformer` still aprocesses it before that and projects the live AIMessage. Replacing the event's `data` tuple with an empty-content copy keeps the `run.messages` projection clean while leaving the original message in graph state for `after_model` to raise on. Adds: - a unit test that streams PII under `block` and asserts every delta and the finalize content are empty - a unit test that streams clean text under `block` and asserts the finalize content carries the full text - an end-to-end test under `create_agent` that confirms no PII characters reach the consumer and `PIIDetectionError` is raised
nh/pii-middleware-stream-redaction
16 hours ago

Latest Branches

CodSpeed Performance Gauge
0%
feat(langchain): redact streamed PII in flight on `PIIMiddleware`#37616
29 minutes ago
2d75e0d
nh/pii-middleware-stream-redaction
CodSpeed Performance Gauge
0%
CodSpeed Performance Gauge
0%
chore(infra): bump `langchain-tests` floor to 1.1.9#37610
19 hours ago
d53385c
mdrxy/bump-standard-floor
© 2026 CodSpeed Technology
Home Terms Privacy Docs