langchain-ai
langchain

Performance History

Latest Results

Merge branch 'master' into mmk/store_cached_generation
keenborder786:mmk/store_cached_generation
5 hours ago
Merge branch 'master' into mmk/reasoning_strip
keenborder786:mmk/reasoning_strip
5 hours ago
Merge branch 'master' into mmk/bound_parameters
keenborder786:mmk/bound_parameters
5 hours ago
Merge branch 'master' into mmk/over_load_variant_middlewares
keenborder786:mmk/over_load_variant_middlewares
5 hours ago
fix(langchain): preserve structured pii redaction in state hooks
Alexxigang:fix/pii-state-hook-redaction
2 days ago
fix(ci): close shell-injection vector in middleware evals workflow

Addresses the Corridor security review on this PR. GitHub textually expands `${{ inputs.* }}` / `${{ matrix.* }}` expressions before the shell runs, so splicing those expressions into the `run:` script body lets a value containing `'` break out of the string literal and execute arbitrary commands in a job that has every provider API key and `LANGSMITH_API_KEY` in scope.

The fix is the canonical mitigation:

- Every value derived from `inputs.*` or `matrix.*` is now passed via `env:` instead of spliced into the script source. GitHub fills env vars at job start; the values never appear in the script text and bash treats them as data via `"$VAR"`.
- The script body invokes `pytest` directly rather than going through `make evals`. The Makefile target's `$(MODEL)` / `$(PYTEST_EXTRA)` are Make's textual expansion (the second issue Corridor flagged); bypassing `make` from the workflow keeps user-controlled values out of the Make layer entirely. The Makefile target itself is unchanged and remains the supported local-run path; running it locally is safe because the operator controls their own inputs.
- `pytest` args are built as a bash array (`PYTEST_ARGS+=(...)`) so word-splitting is handled by bash, not by the script source.
- `set -euo pipefail` so any earlier failure halts the job before the pytest invocation.

Realistic exposure of the pre-fix code: workflow_dispatch requires repository write access, so the realistic attacker was a malicious insider or a compromised maintainer account, not an external actor. Mitigating anyway because secrets-in-scope is the wrong default.
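The difference between splicing a value into script source and passing it through the environment can be reproduced outside GitHub Actions. The following is a minimal sketch, not the workflow from this PR: the payload, the `MODEL` variable name, and both helper functions are hypothetical, and `sh` stands in for the runner's shell. `run_spliced` mimics textual expansion, while `run_via_env` keeps the script text constant and reads the value as data.

```python
import os
import subprocess

# Hypothetical attacker-controlled value, standing in for a workflow input.
PAYLOAD = "x'; echo INJECTED; echo '"


def run_spliced(value: str) -> str:
    """Unsafe pattern: the value becomes part of the script source."""
    script = f"echo 'model={value}'"
    result = subprocess.run(["sh", "-c", script], capture_output=True, text=True)
    return result.stdout


def run_via_env(value: str) -> str:
    """Safer pattern: constant script text; the value arrives as an env var."""
    script = r'''printf 'model=%s\n' "$MODEL"'''
    result = subprocess.run(
        ["sh", "-c", script],
        capture_output=True,
        text=True,
        env={**os.environ, "MODEL": value},
    )
    return result.stdout


if __name__ == "__main__":
    print("spliced:", run_spliced(PAYLOAD).splitlines())
    print("via env:", run_via_env(PAYLOAD).splitlines())
```

In the spliced case the quote in the payload closes the string literal and the injected `echo` runs as its own command; in the env case the same bytes are printed verbatim.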
nh/todo-middleware-loop-contract
3 days ago
fix(langchain): emit final answer after the final `write_todos` call in `TodoListMiddleware`

Appends a short "Finishing a task" section to `WRITE_TODOS_SYSTEM_PROMPT` that tells the model to call `write_todos(completed)` in one turn and then deliver its substantive final answer in the next turn, after the tool result is returned. The existing prompt content is preserved unchanged.

Why: with Anthropic models (Sonnet 4.6 in particular), the model tends to put a substantive final answer in the same turn as the final `write_todos` call that marks every todo `completed`. The agent loop then forces one more model turn after the tool result, and that loop-terminating message ends up being either empty or a content-free recap ("All tasks complete! ✅"). Downstream consumers that read the last `AIMessage` to extract the agent's answer see the wrap-up, not the real answer.

The new section is structural guidance, not Anthropic-specific: it restores the natural agent-loop contract that the last message is the substantive answer.

Validated under the eval suite added in this PR:

Sonnet 4.6, baseline tier (n=1 per task):
  test_density_rank_lands_in_final_message PASS
  test_population_compare_lands_in_final_message PASS
  test_rank_with_unknown_lookup_lands_in_final_message PASS
  test_trivial_plan_skips_write_todos PASS

Sonnet 4.6, hillclimb tier (n=1 per task):
  test_design_api_lands_in_final_message PASS
  test_density_cairo_lands_in_final_message PASS

Under the prior prompt, `density-rank` and `population-compare` failed 0/8 trials at n=8 (the final `AIMessage` was a 21-character "All tasks complete!" wrap with the substantive ranking living in the previous turn). With this change the failure mode does not reproduce.

GPT-5 is unaffected: it does not exhibit the wasted-turn pattern under either prompt. The added guidance is structural, so it is a no-op for models that already place their final answer in the loop-terminating message.

AI-agent involvement is disclosed below.

---
Co-developed with Claude (Anthropic).
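The failure mode described above can be illustrated with plain data structures. This is a hedged sketch that does not use LangChain's message classes: `Msg`, `last_ai_text`, and both transcripts are hypothetical stand-ins, and the answer text is invented for illustration. It shows why a consumer that reads the last AI message gets the wrap-up under the old turn ordering and the real answer under the new one.

```python
from dataclasses import dataclass, field


@dataclass
class Msg:
    role: str  # "human", "ai", or "tool"
    content: str
    tool_calls: list = field(default_factory=list)


def last_ai_text(messages: list) -> str:
    """Return the content of the last AI message, as a downstream consumer might."""
    for message in reversed(messages):
        if message.role == "ai":
            return message.content
    return ""


ANSWER = "Ranking: A > B > C"  # illustrative placeholder answer

# Old ordering: the answer shares a turn with the final write_todos call,
# so the loop-terminating message is only a recap.
answer_with_tool_call = [
    Msg("human", "Rank the cities."),
    Msg("ai", ANSWER, tool_calls=["write_todos"]),
    Msg("tool", "todos updated"),
    Msg("ai", "All tasks complete!"),
]

# New ordering: write_todos is called first, and the answer follows the tool result.
answer_after_tool_result = [
    Msg("human", "Rank the cities."),
    Msg("ai", "", tool_calls=["write_todos"]),
    Msg("tool", "todos updated"),
    Msg("ai", ANSWER),
]

if __name__ == "__main__":
    print(last_ai_text(answer_with_tool_call))
    print(last_ai_text(answer_after_tool_result))
```

The consumer code is identical in both cases; only the order in which the model emits the tool call and the answer changes what it extracts.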
nh/todo-middleware-loop-contract
3 days ago

Latest Branches

CodSpeed Performance Gauge
+1%
chore(core): Normalization of `total_cost` for `prompt` key in `llm_cache` #35312
5 hours ago
a01e1bc
keenborder786:mmk/store_cached_generation
CodSpeed Performance Gauge
+1%
5 hours ago
1ad3061
keenborder786:mmk/reasoning_strip
CodSpeed Performance Gauge
+1%
5 hours ago
b2dc160
keenborder786:mmk/bound_parameters
© 2026 CodSpeed Technology