langchain-ai/langchain

feat: parallelize sync generate method for improved LLM throughput

#34043
Comparing ambershen:optimize/llm-sync-generate-parallelization (e85d221) with master (525d5c0)
Overall: -24% · Regressions: 1 · Untouched: 12 · Skipped: 21

Benchmarks

Skipped (21)

test_create_chat_prompt_init_time (libs/partners/prompty/tests/unit_tests/test_standard.py): 311.9 µs*
test_exa_retriever_init_time (libs/partners/exa/tests/unit_tests/test_standard.py): 325.3 µs*
test_qdrant_vectorstore_init_time (libs/partners/qdrant/tests/unit_tests/test_standard.py): 224.2 ms*
test_chroma_init_time (libs/partners/chroma/tests/unit_tests/test_standard.py): 57.2 ms*
test_init_time (libs/partners/deepseek/tests/unit_tests/test_chat_models.py::TestChatDeepSeekUnit): 1.6 s*
test_init_time (libs/partners/perplexity/tests/unit_tests/test_chat_models_standard.py::TestPerplexityStandard): 837.5 ms*
test_init_time (libs/partners/ollama/tests/unit_tests/test_chat_models.py::TestChatOllama): 1.6 s*
test_init_time (libs/partners/xai/tests/unit_tests/test_chat_models_standard.py::TestXAIStandard): 3.3 s*
test_init_time (libs/partners/fireworks/tests/unit_tests/test_standard.py::TestFireworksStandard): 6.6 s*
test_init_time (libs/partners/mistralai/tests/unit_tests/test_standard.py::TestMistralStandard): 9.1 ms*
test_nomic_embeddings_init_time (libs/partners/nomic/tests/unit_tests/test_standard.py): 1.5 ms*
test_init_time (libs/partners/groq/tests/unit_tests/test_standard.py::TestGroqStandard): 1.6 s*
test_stream_time (libs/partners/openai/tests/integration_tests/chat_models/test_responses_standard.py::TestOpenAIResponses): 857.3 ms*
test_init_time (libs/partners/openai/tests/unit_tests/chat_models/test_responses_standard.py::TestOpenAIResponses): 12.2 ms*
test_stream_time (libs/partners/openai/tests/integration_tests/chat_models/test_responses_standard.py::TestOpenAIStandard): 1.2 s*
test_init_time (libs/partners/openai/tests/unit_tests/chat_models/test_base_standard.py::TestOpenAIStandard): 12.1 ms*
test_stream_time (libs/partners/openai/tests/integration_tests/chat_models/test_base_standard.py::TestOpenAIStandard): 1.2 s*
test_init_time (libs/partners/openai/tests/unit_tests/chat_models/test_azure_standard.py::TestOpenAIStandard): 1.7 s*
test_init_time (libs/partners/anthropic/tests/unit_tests/test_standard.py::TestAnthropicStandard): 763.3 µs*
test_stream_time (libs/partners/anthropic/tests/integration_tests/test_standard.py::TestAnthropicStandard): 34.7 ms*
test_init_time_with_client (libs/partners/anthropic/tests/unit_tests/test_standard.py): 2.2 ms*

Failed (1)

test_async_callbacks_in_sync (libs/core/tests/benchmarks/test_async_callbacks.py): Regression, -24% (18.4 ms → 24.3 ms)

Passed (12)

test_import_time[Document] (libs/core/tests/benchmarks/test_imports.py): -5% (174.9 ms → 183.6 ms)
test_import_time[InMemoryVectorStore] (libs/core/tests/benchmarks/test_imports.py): -5% (559.9 ms → 589.4 ms)
test_import_time[RunnableLambda] (libs/core/tests/benchmarks/test_imports.py): -5% (447.5 ms → 471.9 ms)
test_import_time[InMemoryRateLimiter] (libs/core/tests/benchmarks/test_imports.py): -6% (160.3 ms → 170.4 ms)
test_import_time[Runnable] (libs/core/tests/benchmarks/test_imports.py): -6% (444.2 ms → 472.7 ms)
test_import_time[ChatPromptTemplate] (libs/core/tests/benchmarks/test_imports.py): -6% (534.9 ms → 570 ms)
test_import_time[LangChainTracer] (libs/core/tests/benchmarks/test_imports.py): -6% (395.2 ms → 421.3 ms)
test_import_time[BaseChatModel] (libs/core/tests/benchmarks/test_imports.py): -6% (468.8 ms → 500.9 ms)
test_import_time[CallbackManager] (libs/core/tests/benchmarks/test_imports.py): -8% (407.1 ms → 442.9 ms)
test_import_time[HumanMessage] (libs/core/tests/benchmarks/test_imports.py): -9% (236.9 ms → 260.4 ms)
test_import_time[PydanticOutputParser] (libs/core/tests/benchmarks/test_imports.py): -9% (465.9 ms → 512.2 ms)
test_import_time[tool] (libs/core/tests/benchmarks/test_imports.py): -9% (451.6 ms → 497.3 ms)

Commits

Base: master (525d5c0)

e85d221 (-24.16%), 2 days ago, by ambershen:
feat: parallelize sync generate method for improved LLM throughput

- Replace sequential loop with thread-pool executor mapping for multi-input processing
- Preserve ordering, callback behavior, and error propagation
- Add fast path for single input to avoid unnecessary overhead
- Use get_executor_for_config context manager for proper resource management

This optimization improves throughput when processing multiple prompts without breaking existing functionality or changing the API.
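
For orientation, the sketch below illustrates the pattern the commit message describes: a fast path for a single input, and a thread-pool map over the inputs otherwise. It is a hedged approximation, not the actual diff; the real change sits in LangChain's sync generate path and uses get_executor_for_config, while the generate_batch/generate_one names and the plain ThreadPoolExecutor here are stand-ins so the example is self-contained.

```python
# Hedged sketch of the approach described in commit e85d221 (not the actual
# langchain diff). The real code obtains its executor via
# langchain_core.runnables.config.get_executor_for_config; a plain
# ThreadPoolExecutor stands in here so the example runs on its own.
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, List


def generate_batch(prompts: List[str], generate_one: Callable[[str], str]) -> List[str]:
    """Apply generate_one to each prompt, concurrently when there are several."""
    # Fast path: a single prompt is handled inline, avoiding executor overhead.
    if len(prompts) == 1:
        return [generate_one(prompts[0])]
    # Multi-prompt path: map the work over a thread pool. Executor.map yields
    # results in input order and re-raises the first worker exception when the
    # results are consumed, matching the "preserve ordering ... and error
    # propagation" notes in the commit message.
    with ThreadPoolExecutor() as executor:
        return list(executor.map(generate_one, prompts))


if __name__ == "__main__":
    # Toy stand-in for an LLM call; callbacks and run configs are omitted.
    print(generate_batch(["first prompt", "second prompt"], str.upper))
    # ['FIRST PROMPT', 'SECOND PROMPT']
```

The key property of this pattern is that Executor.map keeps results in input order and surfaces worker exceptions to the caller, so batching several prompts can run concurrently without changing the observable behavior of the sequential loop it replaces.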