LangChain is one of the most widely used frameworks for building applications powered by large language models (LLMs). If you want to build AI apps that go beyond a simple chatbot (apps that can search documents, call APIs, and remember conversations), LangChain is a strong choice.

This guide takes you from zero to a working AI application. No prior LangChain experience needed. Just basic Python.

1. What Is LangChain?

LangChain is an open-source framework for Python (a JavaScript version also exists) for building applications with LLMs. It provides reusable components, such as chains, agents, memory, and retrievers, that you compose to build complex AI workflows.

What LangChain Solves:

Think of LangChain as LEGO for AI applications. Each piece does one thing well. You snap them together to build something bigger.

2. Installing LangChain

pip install langchain langchain-openai langchain-community faiss-cpu python-dotenv
ⓘ Version note: This guide uses LangChain 0.3.x and langchain-openai 0.2.x. API changes between minor versions are common. Check python.langchain.com for migration guides if you encounter import errors.

You also need an OpenAI API key. Set it as an environment variable:

# macOS / Linux
export OPENAI_API_KEY="sk-your-key-here"

# Windows (Command Prompt)
set OPENAI_API_KEY=sk-your-key-here

# Windows (PowerShell)
$env:OPENAI_API_KEY="sk-your-key-here"
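
Before running the examples, it can help to confirm that Python actually sees the key. The check below is a minimal sketch using only the standard library; require_api_key is a hypothetical helper name, not part of LangChain or the OpenAI SDK.

```python
import os

def require_api_key(name="OPENAI_API_KEY", env=None):
    """Return the key stored under `name`, or raise a readable error.

    `require_api_key` is a hypothetical helper for this guide.
    """
    env = os.environ if env is None else env
    key = env.get(name, "").strip()
    if not key:
        raise RuntimeError(
            f"{name} is not set. Export it in your shell before running the examples."
        )
    return key

if __name__ == "__main__":
    try:
        require_api_key()
        print("OPENAI_API_KEY found.")
    except RuntimeError as err:
        print(err)
```

If the script reports that the key is missing, open a new terminal and repeat the export step for your platform. Environment variables set in one shell session are not visible in another.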

3. Your First Chain: A Simple LLM Call

The simplest LangChain application is a single LLM call wrapped in a chain:

from langchain_openai import ChatOpenAI
from langchain_core.prompts import ChatPromptTemplate
from langchain_core.output_parsers import StrOutputParser

# 1. Create the LLM
llm = ChatOpenAI(model="gpt-4o-mini", temperature=0.7)

# 2. Create a prompt template
prompt = ChatPromptTemplate.from_messages([
    ("system", "You are a helpful AI assistant that explains concepts simply."),
    ("user", "Explain {topic} in one paragraph.")
])

# 3. Create an output parser
output_parser = StrOutputParser()

# 4. Build the chain: prompt → llm → parser
chain = prompt | llm | output_parser

# 5. Run it
result = chain.invoke({"topic": "quantum computing"})
print(result)

The pipe (|) operator is part of LCEL (LangChain Expression Language). It connects components so that each one's output becomes the next one's input. The data flows left to right: input → prompt → llm → parser → output.
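
To build intuition for the pipe, here is a toy sketch of how an operator like | can compose steps in plain Python. It is an illustration only: the Step class and fake_llm are hypothetical stand-ins, not LangChain's real implementation, and no API call is made.

```python
class Step:
    """A toy stand-in for a chain component (not LangChain's real Runnable)."""

    def __init__(self, func):
        self.func = func

    def invoke(self, value):
        return self.func(value)

    def __or__(self, other):
        # a | b builds a new Step that runs a, then feeds its output to b
        return Step(lambda value: other.invoke(self.invoke(value)))

# Hypothetical stand-ins for the prompt, the model, and the parser
prompt = Step(lambda inputs: f"Explain {inputs['topic']} in one paragraph.")
fake_llm = Step(lambda text: {"content": f"[model reply to: {text}]"})
parser = Step(lambda message: message["content"])

chain = prompt | fake_llm | parser
print(chain.invoke({"topic": "quantum computing"}))
# prints: [model reply to: Explain quantum computing in one paragraph.]
```

Each step only needs to accept the previous step's output, which is why the real prompt, model, and parser can be swapped independently.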

4. Adding Memory: A Conversational Chatbot

A single LLM call has no memory. Every time you ask something, it starts fresh. To build a chatbot that remembers the conversation, you need memory.

from langchain_openai import ChatOpenAI
from langchain_core.prompts import ChatPromptTemplate, MessagesPlaceholder
from langchain_core.output_parsers import StrOutputParser
from langchain_community.chat_message_histories import ChatMessageHistory
from langchain_core.runnables.history import RunnableWithMessageHistory

# Store conversation history (in production, use a database)
store = {}

def get_session_history(session_id: str):
    if session_id not in store:
        store[session_id] = ChatMessageHistory()
    return store[session_id]

# Prompt with a placeholder for history
prompt = ChatPromptTemplate.from_messages([
    ("system", "You are a helpful assistant."),
    MessagesPlaceholder(variable_name="history"),
    ("user", "{input}")
])

chain = prompt | ChatOpenAI(model="gpt-4o-mini") | StrOutputParser()

# Wrap the chain with memory
chain_with_memory = RunnableWithMessageHistory(
    chain,
    get_session_history,
    input_messages_key="input",
    history_messages_key="history"
)

# First message
response1 = chain_with_memory.invoke(
    {"input": "My name is Alice."},
    config={"configurable": {"session_id": "user-123"}}
)
print(response1)

# Second message — it remembers!
response2 = chain_with_memory.invoke(
    {"input": "What is my name?"},
    config={"configurable": {"session_id": "user-123"}}
)
print(response2)  # e.g. "Your name is Alice." (exact wording varies)

The key piece is RunnableWithMessageHistory. It automatically loads the conversation history from get_session_history, injects it into the prompt, and saves new messages after each call.
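
The load, inject, and save cycle can be sketched in plain Python. This is a simplified illustration of the idea, not how RunnableWithMessageHistory is implemented; fake_model and invoke_with_history are hypothetical names, and the "model" only reports how many earlier messages it received.

```python
store = {}  # session_id -> list of (role, text) tuples

def get_history(session_id):
    # Create an empty history the first time a session is seen
    return store.setdefault(session_id, [])

def fake_model(messages):
    # Hypothetical stand-in for an LLM call: reports how much history it saw
    earlier = len(messages) - 1
    return f"(reply after seeing {earlier} earlier messages)"

def invoke_with_history(user_input, session_id):
    history = get_history(session_id)            # 1. load the stored history
    messages = history + [("user", user_input)]  # 2. inject it ahead of the new input
    reply = fake_model(messages)
    history.append(("user", user_input))         # 3. save both new turns
    history.append(("assistant", reply))
    return reply

print(invoke_with_history("My name is Alice.", "user-123"))  # 0 earlier messages
print(invoke_with_history("What is my name?", "user-123"))   # 2 earlier messages
print(invoke_with_history("Hello", "user-456"))              # new session: 0 earlier messages
```

Because history is keyed by session_id, two users never see each other's messages. A different session_id starts a fresh conversation.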

5. Building a RAG App: Chat with Your Documents

The most powerful LangChain pattern is RAG (Retrieval-Augmented Generation). It lets you ask questions about your own documents — PDFs, text files, web pages — and get answers grounded in that content.

⚠ Watch out: RAG pipelines consume tokens for both the retrieved documents and the user question. Each query can use 2-5x the tokens of a plain chat message. Test on small documents first and monitor your API usage.
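
For a rough sense of scale, the context added by retrieval is at most k chunks of chunk_size characters each. The sketch below turns that into an approximate token count, using the chunk size (500) from the pipeline in this section. The figure of 4 characters per token is a common rule of thumb for English text and an assumption here, not an exact tokenizer count; estimate_context_tokens is a hypothetical helper.

```python
def estimate_context_tokens(k, chunk_size, chars_per_token=4):
    """Upper-bound estimate of the tokens added by retrieved chunks.

    chars_per_token=4 is a rule-of-thumb assumption for English text,
    not an exact tokenizer count.
    """
    return (k * chunk_size) // chars_per_token

for k in (3, 5, 10):
    tokens = estimate_context_tokens(k, chunk_size=500)
    print(f"{k} chunks of 500 characters -> about {tokens} tokens")
```

The estimate covers only the retrieved context. The question, the prompt template, and the model's answer add to the total.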

Step-by-Step RAG Pipeline:

from langchain_openai import ChatOpenAI, OpenAIEmbeddings
from langchain_core.prompts import ChatPromptTemplate
from langchain_core.output_parsers import StrOutputParser
from langchain_text_splitters import RecursiveCharacterTextSplitter
from langchain_community.vectorstores import FAISS
from langchain_community.document_loaders import TextLoader
from langchain_core.runnables import RunnablePassthrough

# 1. Load documents
loader = TextLoader("./my_document.txt")
documents = loader.load()

# 2. Split into chunks
text_splitter = RecursiveCharacterTextSplitter(
    chunk_size=500,
    chunk_overlap=50
)
chunks = text_splitter.split_documents(documents)

# 3. Create embeddings and store in a vector database
embeddings = OpenAIEmbeddings()
vectorstore = FAISS.from_documents(chunks, embeddings)

# 4. Create a retriever
retriever = vectorstore.as_retriever(search_kwargs={"k": 3})

# 5. Build a prompt that uses retrieved context
template = """Answer the question based only on the following context:

{context}

Question: {question}
"""
prompt = ChatPromptTemplate.from_template(template)

llm = ChatOpenAI(model="gpt-4o-mini")

# 6. Build the RAG chain
def format_docs(docs):
    return "\n\n".join(doc.page_content for doc in docs)

rag_chain = (
    {"context": retriever | format_docs, "question": RunnablePassthrough()}
    | prompt
    | llm
    | StrOutputParser()
)

# 7. Ask questions!
answer = rag_chain.invoke("What is the main topic of my document?")
print(answer)

How RAG Works:

  1. Load: Read documents (TXT, PDF, web pages, etc.)
  2. Split: Break documents into small chunks
  3. Embed: Convert each chunk into a numerical vector (embedding)
  4. Store: Save vectors in a vector database (FAISS, Chroma, Pinecone)
  5. Retrieve: When the user asks a question, find the most similar chunks
  6. Generate: Feed chunks + the question to the LLM for a grounded answer
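
The six steps can be traced end to end with a toy example that needs no API key. This is a sketch under loud assumptions: word counts stand in for real embeddings, a Python list stands in for the vector database, and the final step only builds the prompt instead of calling an LLM.

```python
import math
import re
from collections import Counter

def embed(text):
    """Toy embedding: word counts. Real pipelines use a learned embedding model."""
    return Counter(re.findall(r"[a-z]+", text.lower()))

def cosine(a, b):
    # Cosine similarity between two word-count vectors
    dot = sum(a[w] * b[w] for w in a)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0

# 1. Load (a hard-coded string stands in for a file)
document = (
    "FAISS stores vectors in memory. "
    "Paris is the capital of France. "
    "Chunk overlap preserves context between chunks."
)

# 2. Split (one chunk per sentence, for simplicity)
chunks = [s.strip() for s in document.split(".") if s.strip()]

# 3. Embed and 4. Store (a list of pairs stands in for the vector database)
index = [(chunk, embed(chunk)) for chunk in chunks]

# 5. Retrieve the k chunks most similar to the question
def retrieve(question, k=1):
    q = embed(question)
    ranked = sorted(index, key=lambda item: cosine(q, item[1]), reverse=True)
    return [chunk for chunk, _ in ranked[:k]]

# 6. Generate (here: only build the prompt an LLM would receive)
question = "What is the capital of France?"
context = "\n\n".join(retrieve(question))
prompt = (
    "Answer the question based only on the following context:\n\n"
    f"{context}\n\nQuestion: {question}"
)
print(prompt)
```

Only the sentence about Paris shares words with the question, so it is the chunk that ends up in the prompt. Real embeddings match on meaning rather than exact words, but the flow is the same.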

6. When to Use LangChain vs Plain API Calls

Use LangChain When                         | Use Plain API Calls When
You need multi-step LLM workflows (chains) | You just need a single LLM call
You want to query your own documents (RAG) | You want maximum control and minimal abstraction
You need conversation memory               | You are building a simple prototype
You want to use agents with tools          | You want to avoid dependency overhead
You plan to switch LLM providers later     | You are locked into one provider

7. Where to Go Next

LangChain has a steep learning curve, but you do not need all of it at once. Learn the three patterns in this guide (chain, memory, and RAG) and you can build a large share of real-world AI applications.

8. Common Mistakes

8.1. Hardcoding the API Key

Embedding your OpenAI key directly in source code creates a security risk and makes the key visible in version control. Use environment variables or a .env file with python-dotenv, and keep the .env file itself out of version control (for example, by listing it in .gitignore).

# BAD: key in source code
llm = ChatOpenAI(api_key="sk-...")

# GOOD: from environment
from dotenv import load_dotenv
load_dotenv()
llm = ChatOpenAI()  # reads OPENAI_API_KEY automatically

8.2. Ignoring the Vector Store Persistence

By default, FAISS.from_documents creates an in-memory index. Each time your script restarts, it re-embeds all documents — slow and wasteful. Save the index to disk after creation and load it on subsequent runs.

# Save after creation
vectorstore.save_local("faiss_index")

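# Load instead of re-creating
# (recent langchain-community releases require this flag because part of the
# index is stored with pickle; only load index files you created yourself)
vectorstore = FAISS.load_local(
    "faiss_index", embeddings, allow_dangerous_deserialization=True
)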
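
The save and load calls are usually combined into a single "load if present, otherwise build and save" step. The sketch below shows that pattern with the standard library only: a JSON file stands in for the FAISS index, and load_or_build and expensive_build are hypothetical names.

```python
import json
import os
import tempfile

def load_or_build(path, build):
    """Return the cached index at `path`, building and saving it on the first run."""
    if os.path.exists(path):
        with open(path, encoding="utf-8") as f:
            return json.load(f), "loaded"
    index = build()
    with open(path, "w", encoding="utf-8") as f:
        json.dump(index, f)
    return index, "built"

def expensive_build():
    # Hypothetical stand-in for embedding every chunk through a paid API
    return {"chunk-0": [0.1, 0.2], "chunk-1": [0.3, 0.4]}

cache_path = os.path.join(tempfile.mkdtemp(), "toy_index.json")
print(load_or_build(cache_path, expensive_build)[1])  # first run: built
print(load_or_build(cache_path, expensive_build)[1])  # second run: loaded
```

With FAISS, the same check would test whether the "faiss_index" folder exists and then call either load_local or from_documents followed by save_local. Remember to rebuild the index when the source documents change.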

8.3. Using the Wrong Chunk Size

Chunks that are too large drown the LLM in irrelevant text. Chunks that are too small lose context. A chunk size of 500-1000 characters with 10-20% overlap works well for most documents. Experiment with your specific content.
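
To see how chunk size and overlap interact, here is a minimal fixed-width chunker. It is a simplification for illustration: RecursiveCharacterTextSplitter prefers to break on separators such as paragraphs and sentences, while this hypothetical chunk_text cuts purely by character position.

```python
def chunk_text(text, chunk_size=500, chunk_overlap=50):
    """Cut text into fixed-width chunks; each chunk repeats the last
    `chunk_overlap` characters of the previous one."""
    if not 0 <= chunk_overlap < chunk_size:
        raise ValueError("chunk_overlap must be non-negative and smaller than chunk_size")
    if not text:
        return []
    step = chunk_size - chunk_overlap
    # Stop early enough that the final chunk is never just leftover overlap
    last_start = max(len(text) - chunk_overlap, 1)
    return [text[i:i + chunk_size] for i in range(0, last_start, step)]

alphabet = "abcdefghijklmnopqrstuvwxyz"
for chunk in chunk_text(alphabet, chunk_size=10, chunk_overlap=2):
    print(chunk)
# prints:
# abcdefghij
# ijklmnopqr
# qrstuvwxyz
```

Each chunk starts with the last two characters of the previous one. That repetition is what lets a sentence split across a boundary remain readable in at least one chunk.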

Frequently Asked Questions

Do I need an OpenAI API key to use LangChain?

No. LangChain supports many LLM providers: Anthropic, Google Gemini, Cohere, local models via Ollama, and more. Hosted providers each require their own API key; local models run through Ollama need no key at all.

Is LangChain free?

LangChain itself is free and open-source. You pay for the LLM API calls (OpenAI, Anthropic, etc.) separately. Local models via Ollama carry no per-call API fees, though you need hardware capable of running them.

Should I learn LangChain or just use the OpenAI API directly?

Start with the OpenAI API for simple projects. Add LangChain when you need chains, memory, RAG, or agents. It is a tool, not a religion — use it when it helps.