BerriAI
litellm

Performance History

Latest Results

[Feat] Day-0 support for GPT-5.5 and GPT-5.5 Pro (#26449)

* feat(openai): day-0 support for GPT-5.5 and GPT-5.5 Pro

  Add pricing + capability entries for the new GPT-5.5 family launched by OpenAI on 2026-04-24:
  - gpt-5.5 / gpt-5.5-2026-04-23 (chat): $5/$30/$0.50 per 1M input/output/cached input
  - gpt-5.5-pro / gpt-5.5-pro-2026-04-23 (responses-only): $60/$360/$6 per 1M input/output/cached input

  Other fees (long-context >272k, flex, batches, priority, cache discounts) follow the same ratios as GPT-5.4, with the context window retained at 1.05M input / 128K output.

  No transformation / classifier code changes are required: OpenAIGPT5Config.is_model_gpt_5_4_plus_model() already matches 5.5+ via numeric version parsing, and model registration is driven from the JSON. The existing responses-API bridge for tools + reasoning_effort (litellm/main.py:970) already covers gpt-5.5-pro.

  Tests:
  - The GPT5_MODELS regression list now covers gpt-5.5-pro and dated variants
  - New test_generic_cost_per_token_gpt55_pro cost-calc test
  - Updated test_generic_cost_per_token_gpt55 for long-context fields

* fix(openai): mirror reasoning_effort flags onto gpt-5.5 dated variants

  gpt-5.5-2026-04-23 and gpt-5.5-pro-2026-04-23 were missing the supports_none_reasoning_effort, supports_xhigh_reasoning_effort, and supports_minimal_reasoning_effort flags that their non-dated counterparts define. Reasoning-effort routing in OpenAIGPT5Config is fully capability-driven from these JSON flags; since an absent flag is treated as False for opt-in levels (xhigh), users pinning to a dated snapshot would silently lose xhigh support and diverge from the base alias on logprobs + flexible temperature handling.

  Copy the flags onto both dated variants so every dated snapshot inherits the base model's reasoning-effort capability profile.

  Adds a parametrized regression test that asserts supports_{none,minimal,xhigh}_reasoning_effort parity between each dated variant and its non-dated counterpart, preventing future drift when new snapshots are added.
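A minimal sketch of how per-request cost falls out of pricing entries like the ones above. The field names mirror the conventions of litellm's model_prices_and_context_window.json (per-token costs rather than per-1M), but the dict and the `cost` helper below are illustrative, not the actual file or cost-calculation code:

```python
# Illustrative pricing table built from the dollar figures in the commit
# message above; keys are assumed to follow the per-token JSON convention.
PRICES = {
    "gpt-5.5": {
        "input_cost_per_token": 5 / 1_000_000,            # $5 per 1M input tokens
        "output_cost_per_token": 30 / 1_000_000,          # $30 per 1M output tokens
        "cache_read_input_token_cost": 0.50 / 1_000_000,  # $0.50 per 1M cached input
    },
    "gpt-5.5-pro": {
        "input_cost_per_token": 60 / 1_000_000,
        "output_cost_per_token": 360 / 1_000_000,
        "cache_read_input_token_cost": 6 / 1_000_000,
    },
}

def cost(model: str, prompt_tokens: int, completion_tokens: int,
         cached_tokens: int = 0) -> float:
    """Request cost in dollars; cached input is billed at the discounted rate."""
    p = PRICES[model]
    uncached = prompt_tokens - cached_tokens
    return (
        uncached * p["input_cost_per_token"]
        + cached_tokens * p["cache_read_input_token_cost"]
        + completion_tokens * p["output_cost_per_token"]
    )
```

For example, 1M fully cached input tokens on gpt-5.5 cost $0.50 instead of $5, which is the 10x cache discount the pricing lines above encode.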
main
13 minutes ago
fix: address Greptile review feedback on PR #26439

Three concerns raised by bot reviewers, all addressed:

1. CodeQL cyclic-import warning

   ``experimental_pass_through/transformation.py`` imported from the parent ``..transformation`` module, which CodeQL flagged as a potential cycle. Extracted the helper into a new leaf module ``vertex_ai_partner_models/anthropic/output_params_utils.py`` that has no heavy imports of its own. Both transformation files now import from it cleanly. Renamed the helper from the underscore-prefixed ``_sanitize_vertex_anthropic_output_params`` to the public ``sanitize_vertex_anthropic_output_params`` since it is now shared across modules.

2. Greptile P2: redundant ``None`` guard on ``extra_kwargs``

   ``handler.py`` had two ``extra_kwargs = extra_kwargs if ... else {}`` coercions; the second was a no-op because line 220 already coerced. Removed the second one and added a NOTE comment so future readers understand ``extra_kwargs`` is guaranteed non-None at the point of use.

3. Greptile P2: misleading "already translated" docstring

   The docstring claimed the translator above mapped ``output_config.format`` to ``response_format``, but Greptile correctly traced the code and found that only the legacy top-level ``output_format`` was being translated; ``output_config.format`` was being silently dropped on the adapter path. Two-part fix:

   a. Code: extended ``_translate_output_format_to_openai`` to accept both shapes (top-level ``output_format`` AND the ``output_config.format`` sub-key). Top-level still takes precedence when both are supplied. This means callers using the newer Anthropic Structured Outputs API now have their schema properly forwarded to non-Anthropic backends as ``response_format``.

   b. Tests: rewrote the misleading docstring to describe what actually happens, plus added two new tests:
      * ``test_output_format_top_level_still_translates``: regression guard for the legacy path
      * ``test_output_format_takes_precedence_over_output_config_format``: documents the precedence rule explicitly

Tests: 28/28 pass (was 26/26 before; +2 for the new translation behavior + precedence). All run in ~0.5s, no real network calls.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
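A minimal sketch of the precedence rule described above. The real ``_translate_output_format_to_openai`` lives inside litellm's Anthropic adapter; this standalone function only illustrates the two accepted request shapes and the "top-level wins" rule, not the full translation to an OpenAI ``response_format``:

```python
# Sketch only: both input shapes and the precedence rule are from the
# commit message; the returned response_format structure is an assumption.
from typing import Any, Dict, Optional

def translate_output_format(request: Dict[str, Any]) -> Optional[Dict[str, Any]]:
    """Map either Anthropic output shape to an OpenAI-style response_format."""
    # Legacy top-level key takes precedence when both are supplied.
    fmt = request.get("output_format")
    if fmt is None:
        output_config = request.get("output_config")
        if isinstance(output_config, dict):
            # Newer Anthropic Structured Outputs shape: output_config.format
            fmt = output_config.get("format")
    if fmt is None:
        return None
    return {"type": "json_schema", "json_schema": fmt}
```

With this shape, a caller supplying only ``output_config.format`` still gets a ``response_format`` forwarded, while a caller supplying both shapes keeps the legacy top-level behavior unchanged.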
dkindlund:fix/output-config-passthrough-consolidated
5 hours ago
fix(ui): add missing 'zai' (Z.AI / Zhipu AI) provider to Add-Model dropdown (#25482)

The Z.AI (Zhipu AI) provider was missing from the Add-Model dropdown in the admin UI, even though the rest of the stack already supports it:

- /public/providers returns 'zai' in the provider list
- provider_endpoints_support.json includes a full 'zai' entry with endpoints and a docs URL (https://docs.litellm.ai/docs/providers/zai)
- Backend routing works for zai/* models (e.g. zai/glm-4.5, zai/glm-5)
- There are many zai/* entries in model_prices_and_context_window.json

The dropdown is driven by the hard-coded Providers enum and provider_map in provider_info_helpers.tsx, which did not include 'zai', so users could not select Z.AI when adding a model through the UI.

This PR:
- Adds Providers.ZAI ('Z.AI (Zhipu AI)') to the enum.
- Maps it to 'zai' in provider_map so the UI round-trips the existing backend provider key.
- Wires a reasonable placeholder 'zai/glm-4.5' in getPlaceholder, since glm-4.5 is an established zai/* model in the pricing catalog.
- Adds two regression tests in provider_info_helpers.test.tsx:
  1. getProviderLogoAndName('zai') resolves to Providers.ZAI.
  2. getPlaceholder(Providers.ZAI) returns 'zai/glm-4.5'.

No logo asset is added in this PR; getProviderLogoAndName already gracefully returns an empty logo string for providers missing from providerLogoMap, matching the existing pattern for several other providers. A follow-up can add a dedicated logo.

Fixes #25482
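A sketch of the enum-to-backend-key round-trip the PR wires up. The real implementation is TypeScript in provider_info_helpers.tsx; this Python version mirrors it only loosely, and every name below is a paraphrase rather than the actual UI code:

```python
# Hypothetical mirror of the Providers enum / provider_map pattern:
# the display label lives in the enum, the backend key in the map.
from enum import Enum

class Providers(Enum):
    OPENAI = "OpenAI"
    ZAI = "Z.AI (Zhipu AI)"   # the entry this PR adds

# display enum -> backend provider key (must round-trip what the API returns)
provider_map = {
    Providers.OPENAI: "openai",
    Providers.ZAI: "zai",
}

def get_placeholder(provider: Providers) -> str:
    """Placeholder model string shown in the Add-Model form."""
    placeholders = {Providers.ZAI: "zai/glm-4.5"}
    return placeholders.get(provider, f"{provider_map[provider]}/<model>")
```

The point of the round-trip is that selecting "Z.AI (Zhipu AI)" in the dropdown must produce exactly the 'zai' key the backend routing and pricing catalog already use.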
MackDing:fix-ui-zai-provider-dropdown
6 hours ago
fix(adapters,vertex): pass output_config through to backends that accept it

Resolves the silent strip of Anthropic Structured Outputs across the Vertex AI Claude transformation paths and the Anthropic-adapter re-merge. Consolidates and supersedes four stalled community PRs addressing overlapping aspects of the same root bug:

- #23475 (Vertex AI Claude blanket-strip removal)
- #23396 (Vertex AI Claude conditional passthrough)
- #23706 (Anthropic adapter: exclude output_config from non-Anthropic backends)
- #22727 (Anthropic adapter: strip output_config for non-Anthropic backends)

Closes / addresses: #23380 (Vertex AI Claude output_config drop); related: #26423, #25079, #24549, #25971, #25957, #26163, #24856.

What was broken
---------------
* Vertex AI Claude paths called ``data.pop("output_config")`` and ``data.pop("output_format")`` unconditionally, even when Vertex accepted those fields. Callers asking for Structured Outputs got a 200 with prose and never knew the schema constraints had been silently dropped (often masked for months by permissive fallback parsers).
* The ``/v1/messages`` -> ``/chat/completions`` adapter (``LiteLLMMessagesToCompletionTransformationHandler``) re-merged the raw Anthropic-shaped ``output_config`` into ``completion_kwargs`` AFTER the translator had already mapped its meaningful parts to ``response_format`` / ``reasoning_effort``. Non-Anthropic backends (Azure OpenAI, Fireworks, Bedrock Nova, etc.) then 400'd with "Extra inputs are not permitted".

Approach
--------
Vertex AI Claude (chat-completion + experimental_pass_through paths): Replace the unconditional pop with a sanitizer ``_sanitize_vertex_anthropic_output_params`` that strips only the Vertex-unsupported keys (today: ``effort``) from ``output_config`` while forwarding ``format`` and the legacy top-level ``output_format``. Defensive: non-dict ``output_config`` values are dropped to avoid sending malformed payloads downstream.

Greptile P1 from PR #23396 addressed: when ``output_config`` carries both ``format`` and ``effort``, the prior conditional pass-through forwarded ``effort`` and reproduced the 400. The new helper filters per key.

Anthropic ``/v1/messages`` adapter: Add ``output_config`` to a named module-level constant ``ANTHROPIC_ONLY_REQUEST_KEYS`` and wire it into ``excluded_keys`` so the post-translation re-merge skips re-adding the raw key. This fixes the 400 on non-Anthropic backends and avoids the conflicting duplicate (``response_format`` + raw ``output_config``) on Anthropic-family backends.

Greptile P2 from PR #23706 addressed: the constant gives reviewers one grep target instead of an inline literal that silently grows.

Greptile P2 from PR #22727 addressed: ``extra_kwargs or {}`` is replaced with explicit ``is None`` checks so empty-dict callers no longer skip the fallback path.

Tests
-----
* tests/test_litellm/llms/vertex_ai/vertex_ai_partner_models/anthropic/test_vertex_ai_partner_models_anthropic_transformation.py:
  - 5 new/updated cases plus a direct unit test for ``_sanitize_vertex_anthropic_output_params``.
  - Updated ``test_vertex_ai_claude_sonnet_4_5_structured_output_fix`` so its mock-injected ``output_format`` is asserted to FLOW THROUGH (the original test asserted the now-buggy strip behavior).
* tests/test_litellm/llms/anthropic/experimental_pass_through/adapters/test_handler_output_config_passthrough.py (new):
  - Constant-export sanity, output_config strip with ``effort`` only, output_config strip with ``format`` only, a regression guard that unrelated extras still flow, the explicit-empty-dict path, and the ``extra_kwargs=None`` no-crash path.

Test-quality fixes incorporated from Greptile review on the superseded PRs:
* No ``inspect.getsource`` source-text assertions (PR #24114 / #23475).
* ``sys.path`` insertion is anchored to ``__file__`` (PR #23706).
* Assertion messages are positional, not tuple (PR #24114-class bug).
* No ``or {}`` masking explicit empty dicts in helper signatures (PR #22727).

Verified locally: 26/26 pass with this commit. The new tests fail (or fail to import) on ``main`` without it.

Out of scope
------------
* The ``max_tokens`` capping logic from PR #22727: an independent concern that deserves its own PR with a focused test plan.
* Architectural rework of the ``excluded_keys`` mechanism (Greptile P2 on PR #23706 noted point-fix growth). The named constant gives maintainers a clear place to extend; a registry-based approach would be a follow-up.

Co-Authored-By: netbrah <netbrah>
Co-Authored-By: s-zx <s-zx>
Co-Authored-By: invoicepulse <invoicepulse>
Co-Authored-By: cfdude <cfdude>
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
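A minimal sketch of the per-key sanitizer approach described above, under two stated assumptions: the helper mutates the request dict, and ``effort`` is the only Vertex-unsupported sub-key today. It is not the actual ``_sanitize_vertex_anthropic_output_params`` implementation:

```python
# Sketch: filter output_config per key instead of popping it wholesale.
from typing import Any, Dict

# Assumption: 'effort' is the only key Vertex rejects today.
VERTEX_UNSUPPORTED_OUTPUT_CONFIG_KEYS = {"effort"}

def sanitize_output_params(data: Dict[str, Any]) -> Dict[str, Any]:
    """Strip only Vertex-unsupported sub-keys; forward everything else."""
    output_config = data.get("output_config")
    if output_config is not None:
        if not isinstance(output_config, dict):
            # Defensive: drop malformed values rather than forwarding them.
            data.pop("output_config")
        else:
            cleaned = {
                k: v
                for k, v in output_config.items()
                if k not in VERTEX_UNSUPPORTED_OUTPUT_CONFIG_KEYS
            }
            if cleaned:
                data["output_config"] = cleaned
            else:
                data.pop("output_config")
    # Legacy top-level output_format is left untouched and flows through.
    return data
```

The key contrast with the superseded conditional pass-through is visible in the mixed case: a request carrying both ``format`` and ``effort`` keeps its ``format`` and loses only ``effort``, rather than forwarding both and reproducing the 400.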
dkindlund:fix/output-config-passthrough-consolidated
6 hours ago

Latest Branches

- [Feat]: Add GradientAI tool calling support #22011 (CodSpeed gauge: N/A) · 14 minutes ago · d21e90f · main
- [Feat] Day-0 support for GPT-5.5 and GPT-5.5 Pro #26449 (CodSpeed gauge: 0%) · 1 hour ago · d2d676c · litellm_hotfix_gpt-5.5-support
- fix(anthropic): drop temperature from reasoning-family supported params #26445 (CodSpeed gauge: 0%) · 2 hours ago · 0cbe669 · Anai-Guo:fix/anthropic-temperature-reasoning-models
© 2026 CodSpeed Technology