LCEL (LangChain Expression Language) is LangChain’s way of composing chains declaratively using a pipe operator. Think of it as Unix pipes for LLM calls. You wire together prompts, models, and output parsers into a single chain that supports streaming, batching, and async out of the box.
Here’s the fastest way to get started:
That | operator is the whole idea. Each component receives the output of the previous one. The prompt formats your input into messages, the model generates a response, and the parser pulls out the string. No wrapper classes, no inheritance hierarchies.
Why LCEL Over Legacy Chains
LangChain’s old LLMChain and SequentialChain classes still work, but they’re officially in maintenance mode. LCEL replaces them with something better in every way: automatic streaming support, native async, built-in retries, and a consistent interface across all components.
My recommendation: use LCEL for everything new. The old chain classes add abstraction without adding value.
The Pipe Operator in Detail
Every LCEL component implements the Runnable interface, which gives you .invoke(), .stream(), .batch(), and their async counterparts. When you pipe two runnables together with |, you get a RunnableSequence that chains their execution.
RunnableLambda wraps any function into a runnable. This is how you inject custom logic – validation, transformation, logging – anywhere in the chain.
Parallel Branches with RunnableParallel
Sometimes you need to run multiple operations on the same input simultaneously. RunnableParallel handles this:
Both chains run concurrently. The output is a dictionary keyed by the names you assign. This is genuinely useful for fan-out patterns where you need multiple LLM perspectives on the same input.
RunnablePassthrough for Input Forwarding
A common pattern: you want to pass the original input alongside a computed value. RunnablePassthrough forwards whatever it receives unchanged.
This is the standard RAG pattern in LCEL. The retriever fetches documents while RunnablePassthrough forwards the original question. Both feed into the prompt template as separate variables.
You can also use RunnablePassthrough.assign() to add new keys to an existing dictionary without losing the original data:
Streaming Output
LCEL chains stream by default. You don’t need to configure anything special – just call .stream() instead of .invoke():
Each token arrives as it’s generated. The StrOutputParser passes chunks through transparently. This works because LCEL components implement a streaming protocol – intermediate steps that can’t stream (like a retriever) will emit their full output as a single chunk, while streamable components (like chat models) emit token by token.
For async streaming in a web server:
Error Handling in Chains
Chains fail. Models time out, rate limits hit, output parsing breaks. LCEL gives you .with_retry() and .with_fallbacks() to handle this without try/except spaghetti:
This tries GPT-4o up to three times with exponential backoff. If all retries fail, it falls back to GPT-4o-mini. Attach .with_retry() to the specific component that’s flaky, not the whole chain.
Common Errors
TypeError: Expected a Runnable, callable or dict – You tried to pipe something that isn’t a Runnable. Wrap plain functions with RunnableLambda() before using them in a chain.
KeyError in prompt template – Your chain passes a dictionary that’s missing a variable the prompt expects. Check that upstream components output all required keys. Use RunnablePassthrough.assign() to add missing keys.
OutputParserException – The model returned text that doesn’t match your parser’s expected format. This happens constantly with JsonOutputParser. Either add format instructions to your prompt or use PydanticOutputParser with a retry parser wrapping it.
Streaming returns one big chunk – If you placed a non-streaming component (like RunnableLambda doing a database call) after the model, it blocks the stream. Reorder your chain so streaming-compatible components come last.
NotImplementedError: astream not implemented – You’re calling .astream() on a custom runnable that only implements sync methods. Override _atransform or _ainvoke in your custom class.
When LCEL Isn’t the Right Call
LCEL works best for linear or fan-out workflows. If you need loops, conditional branching, or human-in-the-loop steps, look at LangGraph instead. LCEL chains are DAGs – they execute once, start to finish. The moment you need a cycle or state machine, you’ve outgrown LCEL.
For simple one-shot LLM calls, LCEL is also overkill. Just call the model directly. Don’t wrap a single API call in three pipes because it looks clean.
Related Guides
- How to Build AI Apps with the Vercel AI SDK and Next.js
- How to Build AI Search Pipelines with Haystack
- How to Build Serverless AI Workflows with Modal
- How to Build AI Assistants with the Cohere API
- How to Use the Anthropic Token Streaming API for Real-Time UIs
- How to Use the Anthropic Extended Thinking API for Complex Reasoning
- How to Use the Anthropic Message Batches API for Async Workloads
- How to Use the Anthropic Tool Use API for Agentic Workflows
- How to Use the OpenRouter API for Multi-Provider LLM Access
- How to Use the OpenAI Agents SDK