langchain-ai/langchain

feat: parallelize sync generate method for improved LLM throughput

#34043
Comparing ambershen:optimize/llm-sync-generate-parallelization (e85d221) with master (525d5c0)
CodSpeed Performance Gauge: -24%
Regressions: 1
Untouched: 12
Skipped: 21

Benchmarks

Skipped (21)

Failed

File: libs/core/tests/benchmarks/test_async_callbacks.py

Benchmark                     Status      Change  Base     Head
test_async_callbacks_in_sync  Regression  -24%    18.4 ms  24.3 ms
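
The gauge percentage can be reproduced from the two timings. The sketch below assumes the gauge is computed as (base - head) / head, which is inferred from the numbers in this report rather than taken from CodSpeed documentation; the function name `gauge_change` is hypothetical. The small gap between this result and the -24.16% shown in the commit list presumably comes from the displayed timings being rounded.

```python
def gauge_change(base_ms: float, head_ms: float) -> float:
    """Relative change in percent; negative means head is slower than base.

    Assumed formula, inferred from the figures in this report.
    """
    return (base_ms - head_ms) / head_ms * 100.0


# Timings of the regressed benchmark: 18.4 ms on base, 24.3 ms on head.
change = gauge_change(18.4, 24.3)
print(f"{change:.1f}%")  # prints "-24.3%"
```

The same formula also matches the import-time rows, for example 174.9 ms against 183.6 ms rounds to -5%.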

Passed

File: libs/core/tests/benchmarks/test_imports.py

Benchmark                               Change  Base      Head
test_import_time[Document]              -5%     174.9 ms  183.6 ms
test_import_time[InMemoryVectorStore]   -5%     559.9 ms  589.4 ms
test_import_time[RunnableLambda]        -5%     447.5 ms  471.9 ms
test_import_time[InMemoryRateLimiter]   -6%     160.3 ms  170.4 ms
test_import_time[Runnable]              -6%     444.2 ms  472.7 ms
test_import_time[ChatPromptTemplate]    -6%     534.9 ms  570 ms
test_import_time[LangChainTracer]       -6%     395.2 ms  421.3 ms
test_import_time[BaseChatModel]         -6%     468.8 ms  500.9 ms
test_import_time[CallbackManager]       -8%     407.1 ms  442.9 ms
test_import_time[HumanMessage]          -9%     236.9 ms  260.4 ms
test_import_time[PydanticOutputParser]  -9%     465.9 ms  512.2 ms
test_import_time[tool]                  -9%     451.6 ms  497.3 ms

Commits

Base: master (525d5c0)
-24.16%  e85d221, 3 hours ago, by ambershen

feat: parallelize sync generate method for improved LLM throughput

- Replace sequential loop with thread-pool executor mapping for multi-input processing
- Preserve ordering, callback behavior, and error propagation
- Add fast path for single input to avoid unnecessary overhead
- Use get_executor_for_config context manager for proper resource management

This optimization improves throughput when processing multiple prompts without breaking existing functionality or changing the API.
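
The approach described in the commit message can be illustrated with a minimal sketch. This is not the PR's code: it uses the standard library's `ThreadPoolExecutor` in place of `get_executor_for_config`, and `generate_many`, `generate_one`, and `fake_llm` are hypothetical names introduced only for illustration. Callback handling is omitted.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, List, Optional, Sequence


def generate_many(
    prompts: Sequence[str],
    generate_one: Callable[[str], str],
    max_workers: Optional[int] = None,
) -> List[str]:
    """Run generate_one over prompts and return results in input order."""
    # Fast path: with zero or one prompt there is nothing to parallelize,
    # so skip the cost of creating a thread pool.
    if len(prompts) <= 1:
        return [generate_one(p) for p in prompts]
    # Executor.map yields results in the order of the inputs and re-raises
    # a worker's exception when that result is consumed.
    with ThreadPoolExecutor(max_workers=max_workers) as executor:
        return list(executor.map(generate_one, prompts))


def fake_llm(prompt: str) -> str:
    # Hypothetical stand-in for a blocking model call.
    if prompt == "boom":
        raise ValueError("model call failed")
    return prompt.upper()


print(generate_many(["a", "b", "c"], fake_llm))  # prints ['A', 'B', 'C']
```

Threads help here only when the per-prompt call blocks on I/O, as a network request to a model provider would. For cheap or CPU-bound calls the pool adds overhead.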
© 2025 CodSpeed Technology