Langchain on David Lang

FastAPI + LangChain: Building Production-Ready AI APIs

Fri, 05 Sep 2025 00:00:00 +0000

FastAPI’s async support and automatic OpenAPI docs pair naturally with LangChain for production AI backends.

Project Structure

app/
  main.py
  routers/chat.py
  services/rag.py
  models/schemas.py

Async Endpoint

from fastapi import FastAPI
from pydantic import BaseModel

app = FastAPI()

class ChatRequest(BaseModel):
    message: str

@app.post("/chat")
async def chat(req: ChatRequest):
    result = await rag_chain.ainvoke({"input": req.message})
    return {"answer": result["answer"]}

Production Checklist

Rate limiting, API keys, structured logging, health checks, timeout on LLM calls, background tasks for long ingest jobs.

Building AI Agents with LangChain and OpenAI

Sun, 28 Jul 2024 00:00:00 +0000

AI agents loop: observe, plan, act with tools, observe again. They handle multi-step tasks like research, booking, or code changes.

Tool Definition

from langchain.tools import tool

@tool
def search_docs(query: str) -> str:
    '''Search internal documentation.'''
    return vector_store.similarity_search(query, k=5)

Bind tools to the model; the model returns tool calls you execute and feed back as observations.

Control and Safety

Set max iterations. Require human approval for destructive tools. Log every step for audit.

Conclusion

Agents are powerful and unpredictable. Start with a fixed workflow (chain); graduate to agents when the task path genuinely varies per request.

Introduction to LangChain: Building AI-Powered Apps

Wed, 08 Mar 2023 00:00:00 +0000

LangChain composes LLM calls with prompts, memory, tools, and retrieval. It standardizes patterns that every AI app eventually needs.

Chains and Prompts

from langchain_openai import ChatOpenAI
from langchain_core.prompts import ChatPromptTemplate

llm = ChatOpenAI(model="gpt-4")
prompt = ChatPromptTemplate.from_messages([
    ("system", "Answer as a senior engineer."),
    ("user", "{question}"),
])
chain = prompt | llm
response = chain.invoke({"question": "What is RAG?"})

Retrieval

Load documents, chunk text, embed with OpenAI or open models, store in a vector DB, and retrieve relevant chunks at query time-foundation for RAG systems.