<?xml version="1.0" encoding="utf-8" standalone="yes"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom">
  <channel>
    <title>AI on David Lang</title>
    <link>https://www.davidlang.tech/tags/ai/</link>
    <description>Recent content in AI on David Lang</description>
    <generator>Hugo</generator>
    <language>en</language>
    <lastBuildDate>Fri, 10 Apr 2026 00:00:00 +0000</lastBuildDate>
    <atom:link href="https://www.davidlang.tech/tags/ai/index.xml" rel="self" type="application/rss+xml" />
    <item>
      <title>Context Window Strategies: Making the Most of Long-Context LLMs</title>
      <link>https://www.davidlang.tech/posts/context-window-strategies-making-the-most-of-long-context-llms/</link>
      <pubDate>Fri, 10 Apr 2026 00:00:00 +0000</pubDate>
      <guid>https://www.davidlang.tech/posts/context-window-strategies-making-the-most-of-long-context-llms/</guid>
      <description>&lt;p&gt;Million-token context windows tempt teams to dump entire repos into prompts. That is expensive, slow, and often less accurate than targeted retrieval.&lt;/p&gt;&#xA;&lt;h2 id=&#34;when-full-context-helps&#34;&gt;When Full Context Helps&lt;/h2&gt;&#xA;&lt;p&gt;Single-file refactors, analyzing one large document, comparing a few long contracts.&lt;/p&gt;&#xA;&lt;h2 id=&#34;when-retrieval-wins&#34;&gt;When Retrieval Wins&lt;/h2&gt;&#xA;&lt;p&gt;Whole codebases, ticket backlogs, and wiki sites: embed, filter, rerank, then pass only the top-k chunks.&lt;/p&gt;&#xA;&lt;h2 id=&#34;compression-techniques&#34;&gt;Compression Techniques&lt;/h2&gt;&#xA;&lt;p&gt;Summarize conversation history. Use hierarchical memory (session summary + recent turns). Strip comments and generated noise from code context.&lt;/p&gt;</description>
    </item>
    <item>
      <title>The State of AI Coding Assistants in 2026</title>
      <link>https://www.davidlang.tech/posts/the-state-of-ai-coding-assistants-in-2026/</link>
      <pubDate>Thu, 15 Jan 2026 00:00:00 +0000</pubDate>
      <guid>https://www.davidlang.tech/posts/the-state-of-ai-coding-assistants-in-2026/</guid>
      <description>&lt;p&gt;By 2026, AI coding assistants are standard in professional workflows, not experiments. The landscape has consolidated around a few patterns: inline completion, IDE agents, and terminal agents.&lt;/p&gt;&#xA;&lt;h2 id=&#34;market-snapshot&#34;&gt;Market Snapshot&lt;/h2&gt;&#xA;&lt;p&gt;&lt;strong&gt;Cursor&lt;/strong&gt; leads among developers who want an AI-native editor with codebase-wide context. &lt;strong&gt;GitHub Copilot&lt;/strong&gt; remains the enterprise default, tied to the GitHub and Microsoft ecosystems. &lt;strong&gt;Claude Code&lt;/strong&gt; and similar terminal agents dominate backend and automation workflows. &lt;strong&gt;Windsurf, Cody, and others&lt;/strong&gt; compete on price and niche features.&lt;/p&gt;</description>
    </item>
    <item>
      <title>Evaluating LLM Outputs: RAGAS, DeepEval, and Custom Metrics</title>
      <link>https://www.davidlang.tech/posts/evaluating-llm-outputs-ragas-deepeval-and-custom-metrics/</link>
      <pubDate>Sat, 18 Oct 2025 00:00:00 +0000</pubDate>
      <guid>https://www.davidlang.tech/posts/evaluating-llm-outputs-ragas-deepeval-and-custom-metrics/</guid>
      <description>&lt;p&gt;Frameworks like RAGAS and DeepEval codify LLM evaluation metrics so you can regression-test prompts and pipelines in CI.&lt;/p&gt;&#xA;&lt;h2 id=&#34;ragas-rag-assessment&#34;&gt;RAGAS (RAG Assessment)&lt;/h2&gt;&#xA;&lt;p&gt;Measures context precision/recall, faithfulness, and answer relevance, which makes it well suited to retrieval pipelines.&lt;/p&gt;&#xA;&lt;div class=&#34;highlight&#34;&gt;&lt;pre tabindex=&#34;0&#34; style=&#34;color:#93a1a1;background-color:#002b36;-moz-tab-size:4;-o-tab-size:4;tab-size:4;-webkit-text-size-adjust:none;&#34;&gt;&lt;code class=&#34;language-python&#34; data-lang=&#34;python&#34;&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#719e07&#34;&gt;from&lt;/span&gt; ragas &lt;span style=&#34;color:#719e07&#34;&gt;import&lt;/span&gt; evaluate&#xA;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#719e07&#34;&gt;from&lt;/span&gt; ragas.metrics &lt;span style=&#34;color:#719e07&#34;&gt;import&lt;/span&gt; faithfulness, answer_relevancy&#xA;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&#xA;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;result &lt;span style=&#34;color:#719e07&#34;&gt;=&lt;/span&gt; evaluate(dataset&lt;span style=&#34;color:#719e07&#34;&gt;=&lt;/span&gt;eval_dataset, metrics&lt;span style=&#34;color:#719e07&#34;&gt;=&lt;/span&gt;[faithfulness, answer_relevancy])&#xA;&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;h2 id=&#34;deepeval&#34;&gt;DeepEval&lt;/h2&gt;&#xA;&lt;p&gt;Offers pytest-style LLM tests, G-Eval, and hallucination metrics with CI integration.&lt;/p&gt;&#xA;&lt;h2 id=&#34;custom-metrics&#34;&gt;Custom Metrics&lt;/h2&gt;&#xA;&lt;p&gt;Domain-specific checks often outperform generic scores: JSON schema match, SQL execution success, or unit-test pass rate for codegen.&lt;/p&gt;</description>
    </item>
    <item>
      <title>FastAPI &#43; LangChain: Building Production-Ready AI APIs</title>
      <link>https://www.davidlang.tech/posts/fastapi-langchain-building-production-ready-ai-apis/</link>
      <pubDate>Fri, 05 Sep 2025 00:00:00 +0000</pubDate>
      <guid>https://www.davidlang.tech/posts/fastapi-langchain-building-production-ready-ai-apis/</guid>
      <description>&lt;p&gt;FastAPI&amp;rsquo;s async support and automatic OpenAPI docs pair naturally with LangChain for production AI backends.&lt;/p&gt;&#xA;&lt;h2 id=&#34;project-structure&#34;&gt;Project Structure&lt;/h2&gt;&#xA;&lt;div class=&#34;highlight&#34;&gt;&lt;pre tabindex=&#34;0&#34; style=&#34;color:#93a1a1;background-color:#002b36;-moz-tab-size:4;-o-tab-size:4;tab-size:4;-webkit-text-size-adjust:none;&#34;&gt;&lt;code class=&#34;language-fallback&#34; data-lang=&#34;fallback&#34;&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;app/&#xA;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  main.py&#xA;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  routers/chat.py&#xA;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  services/rag.py&#xA;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  models/schemas.py&#xA;&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;h2 id=&#34;async-endpoint&#34;&gt;Async Endpoint&lt;/h2&gt;&#xA;&lt;div class=&#34;highlight&#34;&gt;&lt;pre tabindex=&#34;0&#34; style=&#34;color:#93a1a1;background-color:#002b36;-moz-tab-size:4;-o-tab-size:4;tab-size:4;-webkit-text-size-adjust:none;&#34;&gt;&lt;code class=&#34;language-python&#34; data-lang=&#34;python&#34;&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#719e07&#34;&gt;from&lt;/span&gt; fastapi &lt;span style=&#34;color:#719e07&#34;&gt;import&lt;/span&gt; FastAPI&#xA;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#719e07&#34;&gt;from&lt;/span&gt; pydantic &lt;span style=&#34;color:#719e07&#34;&gt;import&lt;/span&gt; BaseModel&#xA;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&#xA;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;app &lt;span 
style=&#34;color:#719e07&#34;&gt;=&lt;/span&gt; FastAPI()&#xA;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&#xA;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#719e07&#34;&gt;class&lt;/span&gt; &lt;span style=&#34;color:#268bd2&#34;&gt;ChatRequest&lt;/span&gt;(BaseModel):&#xA;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;    message: &lt;span style=&#34;color:#b58900&#34;&gt;str&lt;/span&gt;&#xA;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&#xA;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#268bd2&#34;&gt;@app.post&lt;/span&gt;(&lt;span style=&#34;color:#2aa198&#34;&gt;&amp;#34;/chat&amp;#34;&lt;/span&gt;)&#xA;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#719e07&#34;&gt;async&lt;/span&gt; &lt;span style=&#34;color:#719e07&#34;&gt;def&lt;/span&gt; &lt;span style=&#34;color:#268bd2&#34;&gt;chat&lt;/span&gt;(req: ChatRequest):&#xA;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;    result &lt;span style=&#34;color:#719e07&#34;&gt;=&lt;/span&gt; &lt;span style=&#34;color:#719e07&#34;&gt;await&lt;/span&gt; rag_chain&lt;span style=&#34;color:#719e07&#34;&gt;.&lt;/span&gt;ainvoke({&lt;span style=&#34;color:#2aa198&#34;&gt;&amp;#34;input&amp;#34;&lt;/span&gt;: req&lt;span style=&#34;color:#719e07&#34;&gt;.&lt;/span&gt;message})&#xA;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;    &lt;span style=&#34;color:#719e07&#34;&gt;return&lt;/span&gt; {&lt;span style=&#34;color:#2aa198&#34;&gt;&amp;#34;answer&amp;#34;&lt;/span&gt;: result[&lt;span style=&#34;color:#2aa198&#34;&gt;&amp;#34;answer&amp;#34;&lt;/span&gt;]}&#xA;&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;h2 id=&#34;production-checklist&#34;&gt;Production 
Checklist&lt;/h2&gt;&#xA;&lt;p&gt;Rate limiting, API keys, structured logging, health checks, timeout on LLM calls, background tasks for long ingest jobs.&lt;/p&gt;</description>
    </item>
    <item>
      <title>AI-First Development: Rethinking Your Engineering Workflow</title>
      <link>https://www.davidlang.tech/posts/ai-first-development-rethinking-your-engineering-workflow/</link>
      <pubDate>Tue, 22 Jul 2025 00:00:00 +0000</pubDate>
      <guid>https://www.davidlang.tech/posts/ai-first-development-rethinking-your-engineering-workflow/</guid>
      <description>&lt;p&gt;AI-first development means designing processes assuming LLMs and agents participate in design, implementation, and review, not bolting a chatbot onto a waterfall process.&lt;/p&gt;&#xA;&lt;h2 id=&#34;shifts-in-practice&#34;&gt;Shifts in Practice&lt;/h2&gt;&#xA;&lt;p&gt;&lt;strong&gt;Specs&lt;/strong&gt; - Write acceptance criteria LLMs can verify (tests, schemas).&lt;/p&gt;&#xA;&lt;p&gt;&lt;strong&gt;Architecture&lt;/strong&gt; - Smaller modules with clear boundaries agents can reason about.&lt;/p&gt;&#xA;&lt;p&gt;&lt;strong&gt;Reviews&lt;/strong&gt; - AI first pass, human mandatory for security and product judgment.&lt;/p&gt;&#xA;&lt;p&gt;&lt;strong&gt;Documentation&lt;/strong&gt; - Keep &lt;code&gt;AGENTS.md&lt;/code&gt; or rules files current so tools understand conventions.&lt;/p&gt;&#xA;&lt;h2 id=&#34;team-rituals&#34;&gt;Team Rituals&lt;/h2&gt;&#xA;&lt;p&gt;Start stories with a prompt draft. Pair with AI for spikes; pair with a human for production-critical paths. Track AI-assisted PR defect rates.&lt;/p&gt;</description>
    </item>
    <item>
      <title>Claude Code: Agentic AI Coding from the Terminal</title>
      <link>https://www.davidlang.tech/posts/claude-code-agentic-ai-coding-from-the-terminal/</link>
      <pubDate>Mon, 30 Jun 2025 00:00:00 +0000</pubDate>
      <guid>https://www.davidlang.tech/posts/claude-code-agentic-ai-coding-from-the-terminal/</guid>
      <description>&lt;p&gt;Claude Code brings agentic coding to the terminal: read files, edit code, run tests, and commit changes through natural language, powered by Anthropic&amp;rsquo;s models.&lt;/p&gt;&#xA;&lt;h2 id=&#34;workflow&#34;&gt;Workflow&lt;/h2&gt;&#xA;&lt;p&gt;Run from your repository root. Ask for features or fixes in plain language. Claude Code explores the tree, proposes edits, and executes commands with your approval.&lt;/p&gt;&#xA;&lt;h2 id=&#34;strengths&#34;&gt;Strengths&lt;/h2&gt;&#xA;&lt;p&gt;Strong on refactors spanning many files, understanding build errors from test output, and following git history. Its terminal-native design fits backend and DevOps workflows.&lt;/p&gt;</description>
    </item>
    <item>
      <title>MCP (Model Context Protocol): The Future of AI Tool Integration</title>
      <link>https://www.davidlang.tech/posts/mcp-model-context-protocol-the-future-of-ai-tool-integration/</link>
      <pubDate>Tue, 08 Apr 2025 00:00:00 +0000</pubDate>
      <guid>https://www.davidlang.tech/posts/mcp-model-context-protocol-the-future-of-ai-tool-integration/</guid>
      <description>&lt;p&gt;Model Context Protocol (MCP) standardizes how AI applications connect to data sources and tools, so that filesystems, databases, APIs, and IDEs speak a common protocol.&lt;/p&gt;&#xA;&lt;h2 id=&#34;why-mcp-matters&#34;&gt;Why MCP Matters&lt;/h2&gt;&#xA;&lt;p&gt;Before MCP, every agent framework invented its own plugin format. MCP provides discoverable tools and resources with typed schemas, like LSP for AI tools.&lt;/p&gt;&#xA;&lt;h2 id=&#34;architecture&#34;&gt;Architecture&lt;/h2&gt;&#xA;&lt;ul&gt;&#xA;&lt;li&gt;&lt;strong&gt;Host&lt;/strong&gt; - Cursor, Claude Desktop, custom agent&lt;/li&gt;&#xA;&lt;li&gt;&lt;strong&gt;MCP Server&lt;/strong&gt; - Exposes tools (&lt;code&gt;query_db&lt;/code&gt;, &lt;code&gt;read_file&lt;/code&gt;) and resources&lt;/li&gt;&#xA;&lt;li&gt;&lt;strong&gt;Transport&lt;/strong&gt; - stdio or SSE&lt;/li&gt;&#xA;&lt;/ul&gt;&#xA;&lt;p&gt;Developers implement servers once; any MCP-compatible host can use them.&lt;/p&gt;</description>
    </item>
    <item>
      <title>Improving LLM Accuracy: Techniques Beyond Prompt Engineering</title>
      <link>https://www.davidlang.tech/posts/improving-llm-accuracy-techniques-beyond-prompt-engineering/</link>
      <pubDate>Tue, 25 Mar 2025 00:00:00 +0000</pubDate>
      <guid>https://www.davidlang.tech/posts/improving-llm-accuracy-techniques-beyond-prompt-engineering/</guid>
      <description>&lt;p&gt;When prompts plateau, these engineering levers move accuracy more than another adjective in the system message.&lt;/p&gt;&#xA;&lt;h2 id=&#34;better-retrieval&#34;&gt;Better Retrieval&lt;/h2&gt;&#xA;&lt;p&gt;Hybrid search (BM25 + vectors), rerankers (Cohere, cross-encoders), and metadata filters reduce wrong context reaching the model.&lt;/p&gt;&#xA;&lt;h2 id=&#34;structured-outputs&#34;&gt;Structured Outputs&lt;/h2&gt;&#xA;&lt;p&gt;Force JSON with schemas (Zod, Pydantic, OpenAI structured outputs). Parse failures trigger retry with repair prompts.&lt;/p&gt;&#xA;&lt;h2 id=&#34;model-routing&#34;&gt;Model Routing&lt;/h2&gt;&#xA;&lt;p&gt;Small models classify intent; large models answer hard questions. Cuts cost and reduces overconfident rambling on simple queries.&lt;/p&gt;</description>
    </item>
    <item>
      <title>How to Validate and Measure LLM Accuracy in Production</title>
      <link>https://www.davidlang.tech/posts/how-to-validate-and-measure-llm-accuracy-in-production/</link>
      <pubDate>Tue, 18 Feb 2025 00:00:00 +0000</pubDate>
      <guid>https://www.davidlang.tech/posts/how-to-validate-and-measure-llm-accuracy-in-production/</guid>
      <description>&lt;p&gt;Shipping an LLM feature without measurement is shipping a bug generator. Production validation combines automated metrics, human review, and business KPIs.&lt;/p&gt;&#xA;&lt;h2 id=&#34;levels-of-evaluation&#34;&gt;Levels of Evaluation&lt;/h2&gt;&#xA;&lt;ol&gt;&#xA;&lt;li&gt;&lt;strong&gt;Unit-level&lt;/strong&gt; - Schema validation, regex checks, refusal detection&lt;/li&gt;&#xA;&lt;li&gt;&lt;strong&gt;Golden set&lt;/strong&gt; - Curated Q&amp;amp;A pairs scored automatically&lt;/li&gt;&#xA;&lt;li&gt;&lt;strong&gt;Online&lt;/strong&gt; - User thumbs, task completion, support escalations&lt;/li&gt;&#xA;&lt;li&gt;&lt;strong&gt;Human&lt;/strong&gt; - Expert rubrics on sampled traffic&lt;/li&gt;&#xA;&lt;/ol&gt;&#xA;&lt;h2 id=&#34;metrics-that-matter&#34;&gt;Metrics That Matter&lt;/h2&gt;&#xA;&lt;ul&gt;&#xA;&lt;li&gt;&lt;strong&gt;Faithfulness&lt;/strong&gt; - Answer grounded in retrieved context (RAG)&lt;/li&gt;&#xA;&lt;li&gt;&lt;strong&gt;Relevance&lt;/strong&gt; - Addresses the user question&lt;/li&gt;&#xA;&lt;li&gt;&lt;strong&gt;Toxicity / PII&lt;/strong&gt; - Safety filters&lt;/li&gt;&#xA;&lt;li&gt;&lt;strong&gt;Latency and cost&lt;/strong&gt; - p95 tokens and dollars per session&lt;/li&gt;&#xA;&lt;/ul&gt;&#xA;&lt;h2 id=&#34;implementation-sketch&#34;&gt;Implementation Sketch&lt;/h2&gt;&#xA;&lt;div class=&#34;highlight&#34;&gt;&lt;pre tabindex=&#34;0&#34; style=&#34;color:#93a1a1;background-color:#002b36;-moz-tab-size:4;-o-tab-size:4;tab-size:4;-webkit-text-size-adjust:none;&#34;&gt;&lt;code class=&#34;language-python&#34; data-lang=&#34;python&#34;&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#719e07&#34;&gt;def&lt;/span&gt; &lt;span style=&#34;color:#268bd2&#34;&gt;validate_response&lt;/span&gt;(answer: &lt;span style=&#34;color:#b58900&#34;&gt;str&lt;/span&gt;, context: &lt;span style=&#34;color:#b58900&#34;&gt;str&lt;/span&gt;) &lt;span style=&#34;color:#719e07&#34;&gt;-&amp;gt;&lt;/span&gt; 
&lt;span style=&#34;color:#b58900&#34;&gt;dict&lt;/span&gt;:&#xA;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;    &lt;span style=&#34;color:#719e07&#34;&gt;return&lt;/span&gt; {&#xA;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;        &lt;span style=&#34;color:#2aa198&#34;&gt;&amp;#34;has_citation&amp;#34;&lt;/span&gt;: &lt;span style=&#34;color:#2aa198&#34;&gt;&amp;#34;[source:&amp;#34;&lt;/span&gt; &lt;span style=&#34;color:#719e07&#34;&gt;in&lt;/span&gt; answer,&#xA;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;        &lt;span style=&#34;color:#2aa198&#34;&gt;&amp;#34;length_ok&amp;#34;&lt;/span&gt;: &lt;span style=&#34;color:#2aa198&#34;&gt;50&lt;/span&gt; &lt;span style=&#34;color:#719e07&#34;&gt;&amp;lt;&lt;/span&gt; &lt;span style=&#34;color:#b58900&#34;&gt;len&lt;/span&gt;(answer) &lt;span style=&#34;color:#719e07&#34;&gt;&amp;lt;&lt;/span&gt; &lt;span style=&#34;color:#2aa198&#34;&gt;4000&lt;/span&gt;,&#xA;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;        &lt;span style=&#34;color:#2aa198&#34;&gt;&amp;#34;grounded&amp;#34;&lt;/span&gt;: entailment_score(context, answer) &lt;span style=&#34;color:#719e07&#34;&gt;&amp;gt;&lt;/span&gt; &lt;span style=&#34;color:#2aa198&#34;&gt;0.7&lt;/span&gt;,&#xA;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;    }&#xA;&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;p&gt;Log scores to your observability stack (Datadog, LangSmith, Phoenix).&lt;/p&gt;</description>
    </item>
    <item>
      <title>Cursor vs GitHub Copilot vs Claude Code: The AI Coding Assistant Showdown</title>
      <link>https://www.davidlang.tech/posts/cursor-vs-github-copilot-vs-claude-code-the-ai-coding-assistant-showdown/</link>
      <pubDate>Fri, 10 Jan 2025 00:00:00 +0000</pubDate>
      <guid>https://www.davidlang.tech/posts/cursor-vs-github-copilot-vs-claude-code-the-ai-coding-assistant-showdown/</guid>
      <description>&lt;p&gt;AI coding assistants evolved from inline completions to agentic editors. Cursor, GitHub Copilot, and Claude Code represent three philosophies, and knowing the differences helps you pick the right tool per task.&lt;/p&gt;&#xA;&lt;h2 id=&#34;github-copilot&#34;&gt;GitHub Copilot&lt;/h2&gt;&#xA;&lt;p&gt;&lt;strong&gt;Strengths:&lt;/strong&gt; Deep IDE integration (VS Code, JetBrains), inline Tab completion, Copilot Chat, enterprise policies, broad language support.&lt;/p&gt;&#xA;&lt;p&gt;&lt;strong&gt;Best for:&lt;/strong&gt; Day-to-day completion inside your existing editor, teams already on GitHub, minimal workflow change.&lt;/p&gt;&#xA;&lt;h2 id=&#34;cursor&#34;&gt;Cursor&lt;/h2&gt;&#xA;&lt;p&gt;&lt;strong&gt;Strengths:&lt;/strong&gt; AI-native editor (VS Code fork), multi-file edits, Composer agent, codebase indexing, rules and &lt;code&gt;.cursorrules&lt;/code&gt; for project context, integrated terminal agent.&lt;/p&gt;</description>
    </item>
    <item>
      <title>Multi-Modal AI: Working with Images and Text</title>
      <link>https://www.davidlang.tech/posts/multi-modal-ai-working-with-images-and-text/</link>
      <pubDate>Tue, 05 Nov 2024 00:00:00 +0000</pubDate>
      <guid>https://www.davidlang.tech/posts/multi-modal-ai-working-with-images-and-text/</guid>
      <description>&lt;p&gt;Multi-modal models accept images and text in one request-enabling document OCR, UI screenshot analysis, and visual Q&amp;amp;A.&lt;/p&gt;&#xA;&lt;h2 id=&#34;vision-api-example&#34;&gt;Vision API Example&lt;/h2&gt;&#xA;&lt;div class=&#34;highlight&#34;&gt;&lt;pre tabindex=&#34;0&#34; style=&#34;color:#93a1a1;background-color:#002b36;-moz-tab-size:4;-o-tab-size:4;tab-size:4;-webkit-text-size-adjust:none;&#34;&gt;&lt;code class=&#34;language-typescript&#34; data-lang=&#34;typescript&#34;&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#268bd2&#34;&gt;const&lt;/span&gt; response &lt;span style=&#34;color:#719e07&#34;&gt;=&lt;/span&gt; &lt;span style=&#34;color:#719e07&#34;&gt;await&lt;/span&gt; openai.chat.completions.create({&#xA;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  model&lt;span style=&#34;color:#719e07&#34;&gt;:&lt;/span&gt; &lt;span style=&#34;color:#2aa198&#34;&gt;&amp;#39;gpt-4o&amp;#39;&lt;/span&gt;,&#xA;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  messages&lt;span style=&#34;color:#719e07&#34;&gt;:&lt;/span&gt; [&#xA;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;    {&#xA;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;      role&lt;span style=&#34;color:#719e07&#34;&gt;:&lt;/span&gt; &lt;span style=&#34;color:#2aa198&#34;&gt;&amp;#39;user&amp;#39;&lt;/span&gt;,&#xA;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;      content&lt;span style=&#34;color:#719e07&#34;&gt;:&lt;/span&gt; [&#xA;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;        { &lt;span style=&#34;color:#268bd2&#34;&gt;type&lt;/span&gt;&lt;span style=&#34;color:#719e07&#34;&gt;:&lt;/span&gt; &lt;span style=&#34;color:#2aa198&#34;&gt;&amp;#39;text&amp;#39;&lt;/span&gt;, text&lt;span 
style=&#34;color:#719e07&#34;&gt;:&lt;/span&gt; &lt;span style=&#34;color:#2aa198&#34;&gt;&amp;#39;What error is shown in this screenshot?&amp;#39;&lt;/span&gt; },&#xA;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;        { &lt;span style=&#34;color:#268bd2&#34;&gt;type&lt;/span&gt;&lt;span style=&#34;color:#719e07&#34;&gt;:&lt;/span&gt; &lt;span style=&#34;color:#2aa198&#34;&gt;&amp;#39;image_url&amp;#39;&lt;/span&gt;, image_url&lt;span style=&#34;color:#719e07&#34;&gt;:&lt;/span&gt; { url: &lt;span style=&#34;color:#dc322f&#34;&gt;imageDataUrl&lt;/span&gt; } },&#xA;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;      ],&#xA;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;    },&#xA;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  ],&#xA;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;});&#xA;&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;h2 id=&#34;use-cases&#34;&gt;Use Cases&lt;/h2&gt;&#xA;&lt;p&gt;Receipt parsing, diagram explanation, accessibility alt-text generation, and visual regression triage.&lt;/p&gt;</description>
    </item>
    <item>
      <title>AI-Powered Code Review: Integrating LLMs into Dev Workflows</title>
      <link>https://www.davidlang.tech/posts/ai-powered-code-review-integrating-llms-into-dev-workflows/</link>
      <pubDate>Sun, 22 Sep 2024 00:00:00 +0000</pubDate>
      <guid>https://www.davidlang.tech/posts/ai-powered-code-review-integrating-llms-into-dev-workflows/</guid>
      <description>&lt;p&gt;LLMs can summarize diffs, flag security smells, and suggest tests, but they should augment human review, not replace it.&lt;/p&gt;&#xA;&lt;h2 id=&#34;ci-integration&#34;&gt;CI Integration&lt;/h2&gt;&#xA;&lt;p&gt;Post PR diffs to an LLM with a structured prompt. Output JSON findings consumed by GitHub Actions or GitLab CI. Fail builds only on high-severity, high-confidence issues to reduce noise.&lt;/p&gt;&#xA;&lt;h2 id=&#34;prompt-design-for-reviews&#34;&gt;Prompt Design for Reviews&lt;/h2&gt;&#xA;&lt;p&gt;Include: changed files, diff hunks, coding standards doc, and explicit instruction to cite line numbers and avoid nits.&lt;/p&gt;</description>
    </item>
    <item>
      <title>Claude API vs OpenAI API: Choosing the Right LLM</title>
      <link>https://www.davidlang.tech/posts/claude-api-vs-openai-api-choosing-the-right-llm/</link>
      <pubDate>Wed, 14 Aug 2024 00:00:00 +0000</pubDate>
      <guid>https://www.davidlang.tech/posts/claude-api-vs-openai-api-choosing-the-right-llm/</guid>
      <description>&lt;p&gt;Anthropic&amp;rsquo;s Claude and OpenAI&amp;rsquo;s GPT families both offer strong APIs. Choosing between them depends on task, context length, cost, and compliance, not benchmark hype alone.&lt;/p&gt;&#xA;&lt;h2 id=&#34;strengths-at-a-glance&#34;&gt;Strengths at a Glance&lt;/h2&gt;&#xA;&lt;p&gt;&lt;strong&gt;Claude&lt;/strong&gt; - Long context windows, careful refusals, strong long-document analysis and coding reviews.&lt;/p&gt;&#xA;&lt;p&gt;&lt;strong&gt;OpenAI&lt;/strong&gt; - Broad ecosystem, function calling maturity, image and audio modalities, largest third-party integration surface.&lt;/p&gt;&#xA;&lt;h2 id=&#34;integration-pattern&#34;&gt;Integration Pattern&lt;/h2&gt;&#xA;&lt;p&gt;Abstract the provider behind an interface:&lt;/p&gt;&#xA;&lt;div class=&#34;highlight&#34;&gt;&lt;pre tabindex=&#34;0&#34; style=&#34;color:#93a1a1;background-color:#002b36;-moz-tab-size:4;-o-tab-size:4;tab-size:4;-webkit-text-size-adjust:none;&#34;&gt;&lt;code class=&#34;language-typescript&#34; data-lang=&#34;typescript&#34;&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#268bd2&#34;&gt;interface&lt;/span&gt; LLMProvider {&#xA;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  chat(messages: &lt;span style=&#34;color:#dc322f&#34;&gt;Message&lt;/span&gt;[])&lt;span style=&#34;color:#719e07&#34;&gt;:&lt;/span&gt; Promise&amp;lt;&lt;span style=&#34;color:#268bd2&#34;&gt;string&lt;/span&gt;&amp;gt;;&#xA;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;}&#xA;&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;p&gt;Swap implementations per route (cheap model for classification, premium for generation).&lt;/p&gt;</description>
    </item>
    <item>
      <title>Fine-Tuning LLMs: When and How to Customize AI Models</title>
      <link>https://www.davidlang.tech/posts/fine-tuning-llms-when-and-how-to-customize-ai-models/</link>
      <pubDate>Wed, 15 May 2024 00:00:00 +0000</pubDate>
      <guid>https://www.davidlang.tech/posts/fine-tuning-llms-when-and-how-to-customize-ai-models/</guid>
      <description>&lt;p&gt;Fine-tuning adapts a base model to your domain with labeled examples. Use it when prompting and RAG cannot achieve consistent style, format, or task-specific behavior.&lt;/p&gt;&#xA;&lt;h2 id=&#34;when-to-fine-tune&#34;&gt;When to Fine-Tune&lt;/h2&gt;&#xA;&lt;ul&gt;&#xA;&lt;li&gt;Fixed output schema (legal clauses, medical codes)&lt;/li&gt;&#xA;&lt;li&gt;Brand voice across thousands of responses&lt;/li&gt;&#xA;&lt;li&gt;Specialized terminology poorly covered by general models&lt;/li&gt;&#xA;&lt;/ul&gt;&#xA;&lt;h2 id=&#34;when-not-to-fine-tune&#34;&gt;When Not to Fine-Tune&lt;/h2&gt;&#xA;&lt;ul&gt;&#xA;&lt;li&gt;Facts that change frequently (use RAG)&lt;/li&gt;&#xA;&lt;li&gt;One-off tasks (use prompting)&lt;/li&gt;&#xA;&lt;li&gt;Small datasets without validation (risk overfitting)&lt;/li&gt;&#xA;&lt;/ul&gt;&#xA;&lt;h2 id=&#34;openai-fine-tuning-flow&#34;&gt;OpenAI Fine-Tuning Flow&lt;/h2&gt;&#xA;&lt;p&gt;Prepare JSONL with &lt;code&gt;messages&lt;/code&gt; arrays. Upload, create job, evaluate on a holdout set. Monitor loss and human ratings before promoting to production.&lt;/p&gt;</description>
    </item>
    <item>
      <title>Vector Databases: Pinecone, Weaviate, and Chroma Compared</title>
      <link>https://www.davidlang.tech/posts/vector-databases-pinecone-weaviate-and-chroma-compared/</link>
      <pubDate>Mon, 22 Apr 2024 00:00:00 +0000</pubDate>
      <guid>https://www.davidlang.tech/posts/vector-databases-pinecone-weaviate-and-chroma-compared/</guid>
      <description>&lt;p&gt;Vector databases store embeddings and perform similarity search, forming the retrieval layer in RAG and recommendation systems.&lt;/p&gt;&#xA;&lt;h2 id=&#34;comparison&#34;&gt;Comparison&lt;/h2&gt;&#xA;&lt;table&gt;&#xA;  &lt;thead&gt;&#xA;      &lt;tr&gt;&#xA;          &lt;th&gt;&lt;/th&gt;&#xA;          &lt;th&gt;Pinecone&lt;/th&gt;&#xA;          &lt;th&gt;Weaviate&lt;/th&gt;&#xA;          &lt;th&gt;Chroma&lt;/th&gt;&#xA;      &lt;/tr&gt;&#xA;  &lt;/thead&gt;&#xA;  &lt;tbody&gt;&#xA;      &lt;tr&gt;&#xA;          &lt;td&gt;Hosting&lt;/td&gt;&#xA;          &lt;td&gt;Managed cloud&lt;/td&gt;&#xA;          &lt;td&gt;Self-host or cloud&lt;/td&gt;&#xA;          &lt;td&gt;Embedded / local&lt;/td&gt;&#xA;      &lt;/tr&gt;&#xA;      &lt;tr&gt;&#xA;          &lt;td&gt;Best for&lt;/td&gt;&#xA;          &lt;td&gt;Production scale&lt;/td&gt;&#xA;          &lt;td&gt;Hybrid search + GraphQL&lt;/td&gt;&#xA;          &lt;td&gt;Prototyping&lt;/td&gt;&#xA;      &lt;/tr&gt;&#xA;      &lt;tr&gt;&#xA;          &lt;td&gt;Ops burden&lt;/td&gt;&#xA;          &lt;td&gt;Low&lt;/td&gt;&#xA;          &lt;td&gt;Medium&lt;/td&gt;&#xA;          &lt;td&gt;Low&lt;/td&gt;&#xA;      &lt;/tr&gt;&#xA;  &lt;/tbody&gt;&#xA;&lt;/table&gt;&#xA;&lt;h2 id=&#34;pgvector-alternative&#34;&gt;pgvector Alternative&lt;/h2&gt;&#xA;&lt;p&gt;PostgreSQL with pgvector keeps vectors beside relational data, which works well when you already run Postgres and need ACID transactions.&lt;/p&gt;&#xA;&lt;h2 id=&#34;selection-criteria&#34;&gt;Selection Criteria&lt;/h2&gt;&#xA;&lt;p&gt;Consider QPS, filtering (metadata predicates), hybrid keyword + vector search, cost, and data residency. Prototype on Chroma or pgvector; migrate to Pinecone or Weaviate at scale.&lt;/p&gt;</description>
    </item>
    <item>
      <title>Streaming AI Responses with OpenAI API in Next.js</title>
      <link>https://www.davidlang.tech/posts/streaming-ai-responses-with-openai-api-in-nextjs/</link>
      <pubDate>Sat, 30 Mar 2024 00:00:00 +0000</pubDate>
      <guid>https://www.davidlang.tech/posts/streaming-ai-responses-with-openai-api-in-nextjs/</guid>
      <description>&lt;p&gt;Streaming improves chat UX by showing tokens as they are generated. Next.js Route Handlers make it straightforward to proxy streams securely.&lt;/p&gt;&#xA;&lt;h2 id=&#34;route-handler&#34;&gt;Route Handler&lt;/h2&gt;&#xA;&lt;div class=&#34;highlight&#34;&gt;&lt;pre tabindex=&#34;0&#34; style=&#34;color:#93a1a1;background-color:#002b36;-moz-tab-size:4;-o-tab-size:4;tab-size:4;-webkit-text-size-adjust:none;&#34;&gt;&lt;code class=&#34;language-typescript&#34; data-lang=&#34;typescript&#34;&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#586e75&#34;&gt;// app/api/chat/route.ts&#xA;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#268bd2&#34;&gt;import&lt;/span&gt; OpenAI &lt;span style=&#34;color:#268bd2&#34;&gt;from&lt;/span&gt; &lt;span style=&#34;color:#2aa198&#34;&gt;&amp;#39;openai&amp;#39;&lt;/span&gt;;&#xA;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&#xA;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#268bd2&#34;&gt;export&lt;/span&gt; &lt;span style=&#34;color:#268bd2&#34;&gt;async&lt;/span&gt; &lt;span style=&#34;color:#268bd2&#34;&gt;function&lt;/span&gt; POST(req: &lt;span style=&#34;color:#dc322f&#34;&gt;Request&lt;/span&gt;) {&#xA;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  &lt;span style=&#34;color:#268bd2&#34;&gt;const&lt;/span&gt; { messages } &lt;span style=&#34;color:#719e07&#34;&gt;=&lt;/span&gt; &lt;span style=&#34;color:#719e07&#34;&gt;await&lt;/span&gt; req.json();&#xA;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  &lt;span style=&#34;color:#268bd2&#34;&gt;const&lt;/span&gt; openai &lt;span style=&#34;color:#719e07&#34;&gt;=&lt;/span&gt; &lt;span style=&#34;color:#719e07&#34;&gt;new&lt;/span&gt; OpenAI();&#xA;&lt;/span&gt;&lt;/span&gt;&lt;span 
style=&#34;display:flex;&#34;&gt;&lt;span&gt;&#xA;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  &lt;span style=&#34;color:#268bd2&#34;&gt;const&lt;/span&gt; stream &lt;span style=&#34;color:#719e07&#34;&gt;=&lt;/span&gt; &lt;span style=&#34;color:#719e07&#34;&gt;await&lt;/span&gt; openai.chat.completions.create({&#xA;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;    model&lt;span style=&#34;color:#719e07&#34;&gt;:&lt;/span&gt; &lt;span style=&#34;color:#2aa198&#34;&gt;&amp;#39;gpt-4&amp;#39;&lt;/span&gt;,&#xA;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;    messages,&#xA;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;    stream: &lt;span style=&#34;color:#dc322f&#34;&gt;true&lt;/span&gt;,&#xA;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  });&#xA;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&#xA;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  &lt;span style=&#34;color:#268bd2&#34;&gt;const&lt;/span&gt; encoder &lt;span style=&#34;color:#719e07&#34;&gt;=&lt;/span&gt; &lt;span style=&#34;color:#719e07&#34;&gt;new&lt;/span&gt; TextEncoder();&#xA;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  &lt;span style=&#34;color:#268bd2&#34;&gt;const&lt;/span&gt; readable &lt;span style=&#34;color:#719e07&#34;&gt;=&lt;/span&gt; &lt;span style=&#34;color:#719e07&#34;&gt;new&lt;/span&gt; ReadableStream({&#xA;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;    &lt;span style=&#34;color:#268bd2&#34;&gt;async&lt;/span&gt; start(controller) {&#xA;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;      &lt;span style=&#34;color:#719e07&#34;&gt;for&lt;/span&gt; &lt;span style=&#34;color:#719e07&#34;&gt;await&lt;/span&gt; (&lt;span 
style=&#34;color:#268bd2&#34;&gt;const&lt;/span&gt; chunk &lt;span style=&#34;color:#719e07&#34;&gt;of&lt;/span&gt; stream) {&#xA;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;        &lt;span style=&#34;color:#268bd2&#34;&gt;const&lt;/span&gt; text &lt;span style=&#34;color:#719e07&#34;&gt;=&lt;/span&gt; chunk.choices[&lt;span style=&#34;color:#2aa198&#34;&gt;0&lt;/span&gt;]&lt;span style=&#34;color:#719e07&#34;&gt;?&lt;/span&gt;.delta&lt;span style=&#34;color:#719e07&#34;&gt;?&lt;/span&gt;.content &lt;span style=&#34;color:#719e07&#34;&gt;||&lt;/span&gt; &lt;span style=&#34;color:#2aa198&#34;&gt;&amp;#39;&amp;#39;&lt;/span&gt;;&#xA;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;        &lt;span style=&#34;color:#719e07&#34;&gt;if&lt;/span&gt; (text) controller.enqueue(encoder.encode(text));&#xA;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;      }&#xA;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;      controller.close();&#xA;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;    },&#xA;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  });&#xA;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&#xA;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  &lt;span style=&#34;color:#719e07&#34;&gt;return&lt;/span&gt; &lt;span style=&#34;color:#719e07&#34;&gt;new&lt;/span&gt; Response(readable, {&#xA;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;    headers&lt;span style=&#34;color:#719e07&#34;&gt;:&lt;/span&gt; { &lt;span style=&#34;color:#2aa198&#34;&gt;&amp;#39;Content-Type&amp;#39;&lt;/span&gt;&lt;span style=&#34;color:#719e07&#34;&gt;:&lt;/span&gt; &lt;span style=&#34;color:#2aa198&#34;&gt;&amp;#39;text/plain; charset=utf-8&amp;#39;&lt;/span&gt; 
},&#xA;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  });&#xA;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;}&#xA;&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;h2 id=&#34;client-consumption&#34;&gt;Client Consumption&lt;/h2&gt;&#xA;&lt;p&gt;Use &lt;code&gt;fetch&lt;/code&gt; with a reader loop, or use the Vercel AI SDK&amp;rsquo;s &lt;code&gt;useChat&lt;/code&gt; hook for React state management.&lt;/p&gt;</description>
    </item>
    <item>
      <title>GitHub Copilot in Practice: AI-Assisted Development</title>
      <link>https://www.davidlang.tech/posts/github-copilot-in-practice-ai-assisted-development/</link>
      <pubDate>Sun, 25 Feb 2024 00:00:00 +0000</pubDate>
      <guid>https://www.davidlang.tech/posts/github-copilot-in-practice-ai-assisted-development/</guid>
      <description>&lt;p&gt;GitHub Copilot, trained on public repositories, suggests code inline and in chat. Used well, it accelerates boilerplate; used blindly, it introduces subtle bugs.&lt;/p&gt;&#xA;&lt;h2 id=&#34;effective-workflows&#34;&gt;Effective Workflows&lt;/h2&gt;&#xA;&lt;p&gt;Write descriptive function names and docstrings; Copilot uses them as prompts. Accept suggestions in tests and CRUD handlers; scrutinize auth, crypto, and SQL.&lt;/p&gt;&#xA;&lt;h2 id=&#34;tab-vs-chat&#34;&gt;Tab vs Chat&lt;/h2&gt;&#xA;&lt;p&gt;Inline completions excel at repetitive patterns. Copilot Chat handles explanations, refactors, and multi-file questions inside VS Code and JetBrains IDEs.&lt;/p&gt;&#xA;&lt;h2 id=&#34;team-policies&#34;&gt;Team Policies&lt;/h2&gt;&#xA;&lt;p&gt;Define what code can be sent to cloud models. Some organizations restrict Copilot on regulated codebases. Review license implications for generated code.&lt;/p&gt;</description>
    </item>
    <item>
      <title>Building RAG Systems: Retrieval-Augmented Generation Explained</title>
      <link>https://www.davidlang.tech/posts/building-rag-systems-retrieval-augmented-generation-explained/</link>
      <pubDate>Thu, 18 Jan 2024 00:00:00 +0000</pubDate>
      <guid>https://www.davidlang.tech/posts/building-rag-systems-retrieval-augmented-generation-explained/</guid>
      <description>&lt;p&gt;RAG grounds LLM responses in your private data by retrieving relevant documents before generation. It reduces hallucinations and keeps answers current without retraining models.&lt;/p&gt;&#xA;&lt;h2 id=&#34;pipeline-overview&#34;&gt;Pipeline Overview&lt;/h2&gt;&#xA;&lt;ol&gt;&#xA;&lt;li&gt;&lt;strong&gt;Ingest&lt;/strong&gt; - Load PDFs, wikis, and tickets, then split them into chunks (500–1000 tokens).&lt;/li&gt;&#xA;&lt;li&gt;&lt;strong&gt;Embed&lt;/strong&gt; - Convert chunks to vectors with an embedding model.&lt;/li&gt;&#xA;&lt;li&gt;&lt;strong&gt;Store&lt;/strong&gt; - Save vectors in Pinecone, pgvector, or Chroma.&lt;/li&gt;&#xA;&lt;li&gt;&lt;strong&gt;Retrieve&lt;/strong&gt; - On query, embed the question and find top-k similar chunks.&lt;/li&gt;&#xA;&lt;li&gt;&lt;strong&gt;Generate&lt;/strong&gt; - Pass chunks as context to the LLM.&lt;/li&gt;&#xA;&lt;/ol&gt;&#xA;&lt;div class=&#34;highlight&#34;&gt;&lt;pre tabindex=&#34;0&#34; style=&#34;color:#93a1a1;background-color:#002b36;-moz-tab-size:4;-o-tab-size:4;tab-size:4;-webkit-text-size-adjust:none;&#34;&gt;&lt;code class=&#34;language-python&#34; data-lang=&#34;python&#34;&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;context &lt;span style=&#34;color:#719e07&#34;&gt;=&lt;/span&gt; &lt;span style=&#34;color:#2aa198&#34;&gt;&amp;#34;\n\n&amp;#34;&lt;/span&gt;&lt;span style=&#34;color:#719e07&#34;&gt;.&lt;/span&gt;join(retrieved_chunks)&#xA;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;prompt &lt;span style=&#34;color:#719e07&#34;&gt;=&lt;/span&gt; &lt;span style=&#34;color:#2aa198&#34;&gt;f&lt;/span&gt;&lt;span style=&#34;color:#2aa198&#34;&gt;&amp;#34;Use only this context:\n&lt;/span&gt;{context}&lt;span style=&#34;color:#2aa198&#34;&gt;\n\nQuestion: &lt;/span&gt;{user_query}&lt;span style=&#34;color:#2aa198&#34;&gt;&amp;#34;&lt;/span&gt;&#xA;&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;h2 id=&#34;chunking-strategy&#34;&gt;Chunking Strategy&lt;/h2&gt;&#xA;&lt;p&gt;Overlap chunks by 10–20% so that text split at a chunk boundary still appears intact in a neighboring chunk. Metadata (source, page) helps with citations and debugging.&lt;/p&gt;</description>
    </item>
    <item>
      <title>Prompt Engineering Fundamentals for Developers</title>
      <link>https://www.davidlang.tech/posts/prompt-engineering-fundamentals-for-developers/</link>
      <pubDate>Sat, 14 Oct 2023 00:00:00 +0000</pubDate>
      <guid>https://www.davidlang.tech/posts/prompt-engineering-fundamentals-for-developers/</guid>
      <description>&lt;p&gt;Prompt engineering is the practice of designing inputs so LLMs produce reliable, useful outputs. Developers who treat prompts as code ship better AI features.&lt;/p&gt;&#xA;&lt;h2 id=&#34;structure-your-prompts&#34;&gt;Structure Your Prompts&lt;/h2&gt;&#xA;&lt;p&gt;Use clear sections: role, context, task, format, and constraints.&lt;/p&gt;&#xA;&lt;div class=&#34;highlight&#34;&gt;&lt;pre tabindex=&#34;0&#34; style=&#34;color:#93a1a1;background-color:#002b36;-moz-tab-size:4;-o-tab-size:4;tab-size:4;-webkit-text-size-adjust:none;&#34;&gt;&lt;code class=&#34;language-fallback&#34; data-lang=&#34;fallback&#34;&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;You are a code reviewer for a TypeScript React codebase.&#xA;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;Context: PR diff below.&#xA;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;Task: List bugs, security issues, and style problems.&#xA;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;Format: JSON array of { severity, file, message }.&#xA;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;Constraints: Max 10 items. No speculation beyond the diff.&#xA;&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;h2 id=&#34;few-shot-examples&#34;&gt;Few-Shot Examples&lt;/h2&gt;&#xA;&lt;p&gt;Include 2–3 input/output pairs for classification or extraction tasks. Examples beat lengthy instructions for format adherence.&lt;/p&gt;</description>
    </item>
    <item>
      <title>Introduction to LangChain: Building AI-Powered Apps</title>
      <link>https://www.davidlang.tech/posts/introduction-to-langchain-building-ai-powered-apps/</link>
      <pubDate>Wed, 08 Mar 2023 00:00:00 +0000</pubDate>
      <guid>https://www.davidlang.tech/posts/introduction-to-langchain-building-ai-powered-apps/</guid>
      <description>&lt;p&gt;LangChain composes LLM calls with prompts, memory, tools, and retrieval. It standardizes patterns that every AI app eventually needs.&lt;/p&gt;&#xA;&lt;h2 id=&#34;chains-and-prompts&#34;&gt;Chains and Prompts&lt;/h2&gt;&#xA;&lt;div class=&#34;highlight&#34;&gt;&lt;pre tabindex=&#34;0&#34; style=&#34;color:#93a1a1;background-color:#002b36;-moz-tab-size:4;-o-tab-size:4;tab-size:4;-webkit-text-size-adjust:none;&#34;&gt;&lt;code class=&#34;language-python&#34; data-lang=&#34;python&#34;&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#719e07&#34;&gt;from&lt;/span&gt; langchain_openai &lt;span style=&#34;color:#719e07&#34;&gt;import&lt;/span&gt; ChatOpenAI&#xA;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#719e07&#34;&gt;from&lt;/span&gt; langchain_core.prompts &lt;span style=&#34;color:#719e07&#34;&gt;import&lt;/span&gt; ChatPromptTemplate&#xA;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&#xA;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;llm &lt;span style=&#34;color:#719e07&#34;&gt;=&lt;/span&gt; ChatOpenAI(model&lt;span style=&#34;color:#719e07&#34;&gt;=&lt;/span&gt;&lt;span style=&#34;color:#2aa198&#34;&gt;&amp;#34;gpt-4&amp;#34;&lt;/span&gt;)&#xA;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;prompt &lt;span style=&#34;color:#719e07&#34;&gt;=&lt;/span&gt; ChatPromptTemplate&lt;span style=&#34;color:#719e07&#34;&gt;.&lt;/span&gt;from_messages([&#xA;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;    (&lt;span style=&#34;color:#2aa198&#34;&gt;&amp;#34;system&amp;#34;&lt;/span&gt;, &lt;span style=&#34;color:#2aa198&#34;&gt;&amp;#34;Answer as a senior engineer.&amp;#34;&lt;/span&gt;),&#xA;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;    (&lt;span 
style=&#34;color:#2aa198&#34;&gt;&amp;#34;user&amp;#34;&lt;/span&gt;, &lt;span style=&#34;color:#2aa198&#34;&gt;&amp;#34;&lt;/span&gt;&lt;span style=&#34;color:#2aa198&#34;&gt;{question}&lt;/span&gt;&lt;span style=&#34;color:#2aa198&#34;&gt;&amp;#34;&lt;/span&gt;),&#xA;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;])&#xA;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;chain &lt;span style=&#34;color:#719e07&#34;&gt;=&lt;/span&gt; prompt &lt;span style=&#34;color:#719e07&#34;&gt;|&lt;/span&gt; llm&#xA;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;response &lt;span style=&#34;color:#719e07&#34;&gt;=&lt;/span&gt; chain&lt;span style=&#34;color:#719e07&#34;&gt;.&lt;/span&gt;invoke({&lt;span style=&#34;color:#2aa198&#34;&gt;&amp;#34;question&amp;#34;&lt;/span&gt;: &lt;span style=&#34;color:#2aa198&#34;&gt;&amp;#34;What is RAG?&amp;#34;&lt;/span&gt;})&#xA;&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;h2 id=&#34;retrieval&#34;&gt;Retrieval&lt;/h2&gt;&#xA;&lt;p&gt;Load documents, chunk text, embed with OpenAI or open models, store in a vector DB, and retrieve relevant chunks at query time. This pipeline is the foundation for RAG systems.&lt;/p&gt;</description>
    </item>
    <item>
      <title>Getting Started with the OpenAI API in Node.js</title>
      <link>https://www.davidlang.tech/posts/getting-started-with-the-openai-api-in-nodejs/</link>
      <pubDate>Thu, 12 Jan 2023 00:00:00 +0000</pubDate>
      <guid>https://www.davidlang.tech/posts/getting-started-with-the-openai-api-in-nodejs/</guid>
      <description>&lt;p&gt;The OpenAI API brought large language models to application developers through a simple HTTP interface. Node.js remains a natural fit for BFF layers that call LLMs.&lt;/p&gt;&#xA;&lt;h2 id=&#34;installation-and-first-request&#34;&gt;Installation and First Request&lt;/h2&gt;&#xA;&lt;div class=&#34;highlight&#34;&gt;&lt;pre tabindex=&#34;0&#34; style=&#34;color:#93a1a1;background-color:#002b36;-moz-tab-size:4;-o-tab-size:4;tab-size:4;-webkit-text-size-adjust:none;&#34;&gt;&lt;code class=&#34;language-bash&#34; data-lang=&#34;bash&#34;&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;npm install openai&#xA;&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;div class=&#34;highlight&#34;&gt;&lt;pre tabindex=&#34;0&#34; style=&#34;color:#93a1a1;background-color:#002b36;-moz-tab-size:4;-o-tab-size:4;tab-size:4;-webkit-text-size-adjust:none;&#34;&gt;&lt;code class=&#34;language-typescript&#34; data-lang=&#34;typescript&#34;&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#268bd2&#34;&gt;import&lt;/span&gt; OpenAI &lt;span style=&#34;color:#268bd2&#34;&gt;from&lt;/span&gt; &lt;span style=&#34;color:#2aa198&#34;&gt;&amp;#39;openai&amp;#39;&lt;/span&gt;;&#xA;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&#xA;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#268bd2&#34;&gt;const&lt;/span&gt; client &lt;span style=&#34;color:#719e07&#34;&gt;=&lt;/span&gt; &lt;span style=&#34;color:#719e07&#34;&gt;new&lt;/span&gt; OpenAI({ apiKey: &lt;span style=&#34;color:#dc322f&#34;&gt;process.env.OPENAI_API_KEY&lt;/span&gt; });&#xA;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&#xA;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#268bd2&#34;&gt;const&lt;/span&gt; completion &lt;span 
style=&#34;color:#719e07&#34;&gt;=&lt;/span&gt; &lt;span style=&#34;color:#719e07&#34;&gt;await&lt;/span&gt; client.chat.completions.create({&#xA;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  model&lt;span style=&#34;color:#719e07&#34;&gt;:&lt;/span&gt; &lt;span style=&#34;color:#2aa198&#34;&gt;&amp;#39;gpt-4&amp;#39;&lt;/span&gt;,&#xA;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  messages&lt;span style=&#34;color:#719e07&#34;&gt;:&lt;/span&gt; [&#xA;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;    { role&lt;span style=&#34;color:#719e07&#34;&gt;:&lt;/span&gt; &lt;span style=&#34;color:#2aa198&#34;&gt;&amp;#39;system&amp;#39;&lt;/span&gt;, content&lt;span style=&#34;color:#719e07&#34;&gt;:&lt;/span&gt; &lt;span style=&#34;color:#2aa198&#34;&gt;&amp;#39;You are a helpful coding assistant.&amp;#39;&lt;/span&gt; },&#xA;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;    { role&lt;span style=&#34;color:#719e07&#34;&gt;:&lt;/span&gt; &lt;span style=&#34;color:#2aa198&#34;&gt;&amp;#39;user&amp;#39;&lt;/span&gt;, content&lt;span style=&#34;color:#719e07&#34;&gt;:&lt;/span&gt; &lt;span style=&#34;color:#2aa198&#34;&gt;&amp;#39;Explain async/await in JavaScript.&amp;#39;&lt;/span&gt; },&#xA;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  ],&#xA;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;});&#xA;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&#xA;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;console.log(completion.choices[&lt;span style=&#34;color:#2aa198&#34;&gt;0&lt;/span&gt;].message.content);&#xA;&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;h2 id=&#34;production-considerations&#34;&gt;Production Considerations&lt;/h2&gt;&#xA;&lt;p&gt;Never expose API keys in frontend bundles. 
Proxy requests through your backend. Set &lt;code&gt;max_tokens&lt;/code&gt;, timeouts, and retry policies. Log token usage for cost control.&lt;/p&gt;</description>
    </item>
  </channel>
</rss>
