[Feat] Day-0 support for GPT-5.5 and GPT-5.5 Pro (#26449)
* feat(openai): day-0 support for GPT-5.5 and GPT-5.5 Pro
Add pricing + capability entries for the new GPT-5.5 family launched by
OpenAI on 2026-04-24:
- gpt-5.5 / gpt-5.5-2026-04-23 (chat): $5/$30/$0.50 per 1M
input/output/cached input
- gpt-5.5-pro / gpt-5.5-pro-2026-04-23 (responses-only): $60/$360/$6
per 1M input/output/cached input
Other fees (long-context >272k, flex, batches, priority, cache
discounts) follow the same ratios as GPT-5.4, with context window
retained at 1.05M input / 128K output.
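The per-1M prices above imply straightforward per-token costs. A minimal sketch (the table and function below are illustrative, not LiteLLM's actual cost-map schema or API):

```python
# Hypothetical per-1M price table mirroring the entries above
# (illustrative names, not the real model-cost JSON structure).
PRICES_PER_1M = {
    "gpt-5.5": {"input": 5.00, "output": 30.00, "cached_input": 0.50},
    "gpt-5.5-pro": {"input": 60.00, "output": 360.00, "cached_input": 6.00},
}

def cost_usd(model: str, input_tokens: int, output_tokens: int,
             cached_input_tokens: int = 0) -> float:
    """Compute request cost in USD from the per-1M-token price table."""
    p = PRICES_PER_1M[model]
    uncached = input_tokens - cached_input_tokens
    return (uncached * p["input"]
            + cached_input_tokens * p["cached_input"]
            + output_tokens * p["output"]) / 1_000_000

# e.g. 1M input + 1M output on gpt-5.5 -> 5 + 30 = 35.0 USD
```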
No transformation / classifier code changes are required:
OpenAIGPT5Config.is_model_gpt_5_4_plus_model() already matches 5.5+ via
numeric version parsing, and model registration is driven from the
JSON. The existing responses-API bridge for tools + reasoning_effort
(litellm/main.py:970) already covers gpt-5.5-pro.
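The "numeric version parsing" that lets 5.5 match without code changes can be sketched as follows (the real `is_model_gpt_5_4_plus_model` implementation may differ; this is an assumption about its shape):

```python
import re

def is_gpt_5_4_plus(model: str) -> bool:
    """Sketch: treat any gpt-5.x model with minor version >= 4 as 5.4+.

    Illustrative only; the actual OpenAIGPT5Config helper may parse
    versions differently. Dated and -pro suffixes still match because
    only the leading "gpt-5.<minor>" prefix is inspected.
    """
    m = re.match(r"gpt-5\.(\d+)", model)
    return bool(m) and int(m.group(1)) >= 4
```

Under this scheme a newly released `gpt-5.5-pro-2026-04-23` is classified as 5.4+ purely from its name, which is why only JSON registration is needed.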
Tests:
- GPT5_MODELS regression list now covers gpt-5.5-pro and dated variants
- New test_generic_cost_per_token_gpt55_pro cost-calc test
- Updated test_generic_cost_per_token_gpt55 for long-context fields
* fix(openai): mirror reasoning_effort flags onto gpt-5.5 dated variants
gpt-5.5-2026-04-23 and gpt-5.5-pro-2026-04-23 were missing the
supports_none_reasoning_effort, supports_xhigh_reasoning_effort, and
supports_minimal_reasoning_effort flags that their non-dated
counterparts define. Reasoning-effort routing in OpenAIGPT5Config is
fully capability-driven from these JSON flags, and an absent flag is
treated as False for opt-in levels (xhigh), so users pinning to a
dated snapshot would silently lose xhigh support and diverge from the
base alias on logprobs + flexible temperature handling.
Copy the flags onto both dated variants so every dated snapshot
inherits the base model's reasoning-effort capability profile.
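The flag copy amounts to mirroring the base alias's capability entries onto each dated snapshot. A minimal sketch (the registry structure and helper name are illustrative, not the actual model-cost JSON or LiteLLM code):

```python
# Flags to mirror from a base alias onto its dated snapshots
# (flag names are from the commit; the helper is hypothetical).
REASONING_EFFORT_FLAGS = (
    "supports_none_reasoning_effort",
    "supports_minimal_reasoning_effort",
    "supports_xhigh_reasoning_effort",
)

def mirror_flags(registry: dict, base: str, dated: str) -> None:
    """Copy any reasoning-effort flags the base model defines onto the
    dated variant, so pinned snapshots keep the same capability profile."""
    for flag in REASONING_EFFORT_FLAGS:
        if flag in registry[base]:
            registry[dated][flag] = registry[base][flag]

registry = {
    "gpt-5.5": {"supports_xhigh_reasoning_effort": True},
    "gpt-5.5-2026-04-23": {},
}
mirror_flags(registry, "gpt-5.5", "gpt-5.5-2026-04-23")
# the dated snapshot now advertises xhigh support as well
```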
Adds a parametrized regression test that asserts
supports_{none,minimal,xhigh}_reasoning_effort parity between each
dated variant and its non-dated counterpart, preventing future drift
when new snapshots are added.

fix(adapters,vertex): pass output_config through to backends that accept it
Resolves the silent strip of Anthropic Structured Outputs across the
Vertex AI Claude transformation paths and the Anthropic-adapter
re-merge. Consolidates and supersedes four stalled community PRs
addressing overlapping aspects of the same root bug:
- #23475 (Vertex AI Claude blanket-strip removal)
- #23396 (Vertex AI Claude conditional passthrough)
- #23706 (Anthropic adapter exclude output_config from non-Anthropic
backends)
- #22727 (Anthropic adapter strip output_config for non-Anthropic
backends)
Closes / addresses: #23380 (Vertex AI Claude output_config drop),
related: #26423, #25079, #24549, #25971, #25957, #26163, #24856.
What was broken
---------------
* Vertex AI Claude paths called ``data.pop("output_config")`` and
``data.pop("output_format")`` unconditionally even when Vertex
accepted those fields. Callers asking for Structured Outputs got a
200 with prose and never knew the schema constraints had been
silently dropped (often masked for months by permissive fallback
parsers).
* The ``/v1/messages`` -> ``/chat/completions`` adapter
(``LiteLLMMessagesToCompletionTransformationHandler``) re-merged the
raw Anthropic-shaped ``output_config`` into ``completion_kwargs``
AFTER the translator already mapped its meaningful parts to
``response_format`` / ``reasoning_effort``. Non-Anthropic backends
(Azure OpenAI, Fireworks, Bedrock Nova, etc.) then 400'd with
"Extra inputs are not permitted".
Approach
--------
Vertex AI Claude (chat-completion + experimental_pass_through paths):
Replace the unconditional pop with a sanitizer
``_sanitize_vertex_anthropic_output_params`` that strips only the
Vertex-unsupported keys (today: ``effort``) from ``output_config``
while forwarding ``format`` and the legacy top-level
``output_format``. Defensive: non-dict ``output_config`` values are
dropped to avoid sending malformed payloads downstream.
Greptile P1 from PR #23396 addressed: when ``output_config`` carries
both ``format`` and ``effort``, the prior conditional pass-through
forwarded ``effort`` and reproduced the 400. The new helper filters
per-key.
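The per-key filtering can be sketched as follows (a minimal sketch; the actual `_sanitize_vertex_anthropic_output_params` and the constant name below are assumptions about shape, not the real code):

```python
# Keys Vertex does not accept inside output_config today
# (constant name is hypothetical).
VERTEX_UNSUPPORTED_OUTPUT_CONFIG_KEYS = {"effort"}

def sanitize_output_config(data: dict) -> dict:
    """Strip only Vertex-unsupported keys from output_config, forwarding
    the rest (e.g. "format"); legacy top-level output_format is untouched."""
    cfg = data.get("output_config")
    if cfg is not None:
        if not isinstance(cfg, dict):
            # Defensive: drop malformed values rather than forward them.
            data.pop("output_config", None)
        else:
            data["output_config"] = {
                k: v for k, v in cfg.items()
                if k not in VERTEX_UNSUPPORTED_OUTPUT_CONFIG_KEYS
            }
    return data
```

With this shape, a request carrying both `format` and `effort` forwards only `format`, avoiding the 400 the earlier all-or-nothing conditional reproduced.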
Anthropic ``/v1/messages`` adapter:
Add ``output_config`` to a named module-level constant
``ANTHROPIC_ONLY_REQUEST_KEYS`` and wire it into ``excluded_keys`` so
the post-translation re-merge skips re-adding the raw key. This
fixes the 400 on non-Anthropic backends and avoids the conflicting
duplicate (``response_format`` + raw ``output_config``) on
Anthropic-family backends.
Greptile P2 from PR #23706 addressed: the constant gives reviewers
one grep target instead of an inline literal that silently grows.
Greptile P2 from PR #22727 addressed: ``extra_kwargs or {}`` is
replaced with explicit ``is None`` checks so empty-dict callers no
longer skip the fallback path.
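The adapter-side change can be sketched like this (only `ANTHROPIC_ONLY_REQUEST_KEYS` is named in the commit; the merge helper and its signature are illustrative assumptions):

```python
# One grep target for keys that must never be re-merged onto
# non-Anthropic backends after translation.
ANTHROPIC_ONLY_REQUEST_KEYS = frozenset({"output_config"})

def merge_extra_kwargs(completion_kwargs: dict, extra_kwargs) -> dict:
    """Re-merge caller extras, skipping Anthropic-only raw keys.

    Explicit `is None` check: a caller passing an empty dict is honored
    as-is instead of being silently swapped via `extra_kwargs or {}`.
    """
    if extra_kwargs is None:
        extra_kwargs = {}
    for key, value in extra_kwargs.items():
        if key not in ANTHROPIC_ONLY_REQUEST_KEYS:
            completion_kwargs[key] = value
    return completion_kwargs
```

Because the translator has already mapped `output_config` to `response_format` / `reasoning_effort`, skipping the raw key both fixes the 400 on strict backends and avoids the duplicate on Anthropic-family ones.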
Tests
-----
* tests/test_litellm/llms/vertex_ai/vertex_ai_partner_models/anthropic/
test_vertex_ai_partner_models_anthropic_transformation.py:
- 5 new/updated cases plus a direct unit test for
``_sanitize_vertex_anthropic_output_params``.
- Updated ``test_vertex_ai_claude_sonnet_4_5_structured_output_fix``
so its mock-injected ``output_format`` is asserted to FLOW THROUGH
(the original test asserted the now-buggy strip behavior).
* tests/test_litellm/llms/anthropic/experimental_pass_through/
adapters/test_handler_output_config_passthrough.py (new):
- Constant export sanity, output_config strip with ``effort`` only,
output_config strip with ``format`` only, regression guard that
unrelated extras still flow, explicit-empty-dict path, and the
``extra_kwargs=None`` no-crash path.
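The new file's cases reduce to a small table of input/expected pairs; a sketch of their intent, with a stand-in filter rather than the adapter's real code:

```python
# Stand-in for the adapter's excluded-keys filtering (illustrative).
def strip_anthropic_only(extra_kwargs):
    if extra_kwargs is None:  # the explicit no-crash path
        return {}
    return {k: v for k, v in extra_kwargs.items() if k != "output_config"}

cases = [
    ({"output_config": {"effort": "high"}}, {}),               # effort only
    ({"output_config": {"format": {}}}, {}),                   # format only
    ({"output_config": {}, "metadata": {"x": 1}},
     {"metadata": {"x": 1}}),                                  # extras flow
    ({}, {}),                                                  # explicit empty dict
    (None, {}),                                                # extra_kwargs=None
]
for extra, expected in cases:
    assert strip_anthropic_only(extra) == expected
```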
Test-quality fixes incorporated from Greptile review on the
superseded PRs:
* No ``inspect.getsource`` source-text assertions (PR #24114 / #23475).
* ``sys.path`` insertion is anchored to ``__file__`` (PR #23706).
* Assertion messages are positional, not tuple (PR #24114-class bug).
* No ``or {}`` masking explicit empty dicts in helper signatures
(PR #22727).
Verified locally: 26/26 pass with this commit. The new tests
fail (or fail to import) on ``main`` without it.
Out of scope
------------
* The ``max_tokens`` capping logic from PR #22727 — independent
concern, deserves its own PR with a focused test plan.
* Architectural rework of the ``excluded_keys`` mechanism (Greptile
P2 on PR #23706 noted point-fix growth). The named constant gives
maintainers a clear place to extend; a registry-based approach
would be a follow-up.
Co-Authored-By: netbrah <netbrah>
Co-Authored-By: s-zx <s-zx>
Co-Authored-By: invoicepulse <invoicepulse>
Co-Authored-By: cfdude <cfdude>
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>