
Most developers starting with LLMs begin the same way:
you call the OpenAI, Gemini, or Groq API directly, send a prompt, get a response, and return “Hello world.”
But as soon as your application grows into something stateful, multi-provider, retrieval-based, or agent-driven, raw API code becomes messy, repetitive, and difficult to scale.
That’s where LangChain comes in:
a unified, flexible framework that lets you build advanced AI systems without drowning in boilerplate.
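In this article, you’ll see the raw approach for all three providers, the LangChain equivalent, and the cases where each one makes sense.
Let’s start with the reality of raw LLM integration.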
Working directly with OpenAI, Gemini, and Groq APIs works fine for small demos, but becomes a real challenge when you build larger, multi-provider applications. Each provider has its own quirks, meaning you spend more time managing API differences than building features.
Here’s why:
Every provider expects a different payload structure.
OpenAI/Groq use messages[], while Gemini uses contents[].parts[].
Switching providers means rewriting your entire request body.
OpenAI returns choices[0].message.content, while Gemini returns candidates[0].content.parts[0].text.
This forces you to write separate parsing logic for each provider.
You must create separate TypeScript interfaces for each LLM’s request/response shape.
As APIs evolve, you constantly update types and keep them in sync.
HTTP headers, tokens, roles, fetch calls, JSON parsing, and error handling: you copy-paste the same logic across every provider function.
Different role systems, safety settings, error formats, streaming methods, and rate limits require custom handling per provider.
Changing models requires updating the payload structure, response parsing, types, model names, URLs, and error logic.
As soon as you add more providers, more models, or more features, the codebase gets harder to maintain, test, scale, and refactor.
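To make the payload and response differences concrete, here is a minimal sketch of the two helpers you end up writing by hand. The function names buildRequestBody and extractText are hypothetical; only the field paths (messages[], contents[].parts[], choices[0].message.content, candidates[0].content.parts[0].text) come from the provider formats described above.

```typescript
type Provider = "openai" | "groq" | "gemini";

// Build the request body each provider expects for a single user prompt.
// Hypothetical helper: the name and signature are illustrative only.
function buildRequestBody(provider: Provider, model: string, prompt: string): object {
  if (provider === "gemini") {
    // Gemini: the prompt is nested under contents[].parts[].text
    return { contents: [{ parts: [{ text: prompt }] }] };
  }
  // OpenAI and Groq: the model name and a messages[] array go in the body
  return { model, messages: [{ role: "user", content: prompt }] };
}

// Pull the reply text out of each provider's response shape.
// Returns undefined when the expected path is missing.
function extractText(provider: Provider, body: any): string | undefined {
  if (provider === "gemini") {
    return body?.candidates?.[0]?.content?.parts?.[0]?.text;
  }
  return body?.choices?.[0]?.message?.content;
}
```

Every new provider adds another branch to both functions, which is the maintenance cost the points above describe.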
Below is a typical implementation that shows how developers integrate multiple LLMs without LangChain. This raw approach requires handling different request formats, response structures, type definitions, and model-specific behavior manually.
Important Note About OpenAI Roles:
Before the o1 models, OpenAI used
role = "system"
for instruction messages. Starting with the o1 models, OpenAI expects
role = "developer"
This comes from OpenAI’s chat completion specification.
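One way to keep this rule in a single place is a small helper that picks the instruction role from the model name. This is only a sketch: the function names are hypothetical, and the name pattern (models starting with "o" followed by a digit, such as "o1") is an assumption rather than an official rule, so check OpenAI's model documentation before relying on it.

```typescript
type InstructionRole = "system" | "developer";

// Assumption: reasoning models are recognizable by names like "o1" or "o3-mini".
// This pattern is illustrative and not taken from OpenAI's specification.
function instructionRoleFor(model: string): InstructionRole {
  return /^o\d/.test(model) ? "developer" : "system";
}

// Build a messages[] array with the instruction message under the chosen role.
function withInstructions(model: string, instructions: string, prompt: string) {
  return [
    { role: instructionRoleFor(model), content: instructions },
    { role: "user" as const, content: prompt },
  ];
}
```

With this in place, call sites pass a model name and never hard-code the role themselves.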
enum OpenAiRole {
  system = "system",
  user = "user",
  assistant = "assistant",
  developer = "developer",
}

// Normalized result returned by every provider function
export type IHelloOutput = {
  ok: true;
  provider: "openai" | "gemini" | "groq";
  model: string;
  message: string;
  prompt_tokens?: number;
  completion_tokens?: number;
  total_tokens?: number;
};

// Gemini request/response types follow Google’s GenerateContent API spec
export type IGeminiGenerateContent = {
  candidates?: {
    content?: { parts?: { text?: string }[] };
  }[];
  usageMetadata: {
    promptTokenCount: number;
    candidatesTokenCount: number;
    totalTokenCount: number;
  };
};

export type IGeminiRequestContent = {
  contents?: { parts?: { text?: string }[] }[];
};

// OpenAI/Groq follow the OpenAI Chat Completions response format
export type IOpenAiGenerateContent = {
  choices: {
    message: { role: OpenAiRole; content: string };
  }[];
  usage?: {
    prompt_tokens: number;
    completion_tokens: number;
    total_tokens: number;
  };
};

export type IOpenAiRequestContent = {
  model: string;
  messages: { role: OpenAiRole; content: string }[];
  temperature?: number;
};
Gemini’s body structure follows the Google AI Studio GenerateContent endpoint, which uses contents[].parts[].text instead of OpenAI’s messages[].
async function helloGemini(): Promise<IHelloOutput> {
  // Fail fast when the key is missing or empty
  if (!process.env.GEMINI_API_KEY) {
    throw new Error("GEMINI_API_KEY is not set");
  }

  const model = "gemini-2.0-flash-lite";
  const url = `https://generativelanguage.googleapis.com/v1beta/models/${model}:generateContent?key=${process.env.GEMINI_API_KEY}`;

  const requestBody: IGeminiRequestContent = {
    contents: [{ parts: [{ text: "Hello world" }] }],
  };

  const response = await fetch(url, {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify(requestBody),
  });

  if (!response.ok) {
    throw new Error(`Gemini API request failed: ${response.statusText} : ${await response.text()}`);
  }

  const responseData: IGeminiGenerateContent = await response.json();
  const message = responseData.candidates?.[0]?.content?.parts?.[0]?.text || "No message returned";

  // Map Gemini's usageMetadata fields onto the shared token fields
  return {
    ok: true,
    provider: "gemini",
    model,
    message,
    prompt_tokens: responseData.usageMetadata?.promptTokenCount,
    completion_tokens: responseData.usageMetadata?.candidatesTokenCount,
    total_tokens: responseData.usageMetadata?.totalTokenCount,
  };
}
Groq intentionally mirrors the OpenAI Chat Completions format.
Groq documentation: https://console.groq.com/docs/openai#configuring-openai-to-use-groq-api
Its documentation explicitly states that you should use the same structure as OpenAI — the only difference is the base URL.
async function helloGroq(): Promise<IHelloOutput> {
  // Fail fast when the key is missing or empty
  if (!process.env.GROQ_API_KEY) {
    throw new Error("GROQ_API_KEY is not set");
  }

  const url = `https://api.groq.com/openai/v1/chat/completions`;

  const requestBody: IOpenAiRequestContent = {
    model: "llama-3.1-8b-instant",
    messages: [
      // The "system" role is used here because the model is not an OpenAI o1-style model
      { role: OpenAiRole.system, content: "You are a helpful assistant." },
      { role: OpenAiRole.user, content: "Hello world" },
    ],
    temperature: 0.7,
  };

  const response = await fetch(url, {
    method: "POST",
    headers: {
      "Content-Type": "application/json",
      "Authorization": `Bearer ${process.env.GROQ_API_KEY}`,
    },
    body: JSON.stringify(requestBody),
  });

  if (!response.ok) {
    throw new Error(`Groq API request failed: ${response.statusText} : ${await response.text()}`);
  }

  const responseData: IOpenAiGenerateContent = await response.json();
  const message = responseData.choices?.[0]?.message?.content || "No message returned";

  return {
    ok: true,
    provider: "groq",
    model: requestBody.model,
    message,
    prompt_tokens: responseData.usage?.prompt_tokens,
    completion_tokens: responseData.usage?.completion_tokens,
    total_tokens: responseData.usage?.total_tokens,
  };
}
The OpenAI call uses the standard request/response format defined in the OpenAI Chat Completions documentation.
async function helloOpenAi(): Promise<IHelloOutput> {
  // Fail fast when the key is missing or empty
  if (!process.env.OPENAI_API_KEY) {
    throw new Error("OPENAI_API_KEY is not set");
  }

  const url = `https://api.openai.com/v1/chat/completions`;

  const requestBody: IOpenAiRequestContent = {
    // An OpenAI model name; "llama-3.1-8b-instant" is a Groq model and does not belong here
    model: "gpt-4o-mini",
    messages: [
      { role: OpenAiRole.developer, content: "You are a helpful assistant." },
      { role: OpenAiRole.user, content: "Hello world" },
    ],
    temperature: 0.7,
  };

  const response = await fetch(url, {
    method: "POST",
    headers: {
      "Content-Type": "application/json",
      "Authorization": `Bearer ${process.env.OPENAI_API_KEY}`,
    },
    body: JSON.stringify(requestBody),
  });

  if (!response.ok) {
    throw new Error(`OpenAI API request failed: ${response.statusText} : ${await response.text()}`);
  }

  const responseData: IOpenAiGenerateContent = await response.json();
  const message = responseData.choices?.[0]?.message?.content || "No message returned";

  // Token usage is included for parity with the Groq function
  return {
    ok: true,
    provider: "openai",
    model: requestBody.model,
    message,
    prompt_tokens: responseData.usage?.prompt_tokens,
    completion_tokens: responseData.usage?.completion_tokens,
    total_tokens: responseData.usage?.total_tokens,
  };
}
For a basic Node.js server with TypeScript that follows the same structure, see:
Building a TypeScript Express API for Vercel Deployment: A Step-by-Step Guide
import express, { Request, Response, Application } from "express";
import bodyParser from "body-parser";
import dotenv from "dotenv";
import cors from "cors";
import { helloProvider } from "./providers";

dotenv.config();

const PORT = process.env.PORT || 4000;
const app: Application = express();

app.use(bodyParser.json());
app.use(cors());

app.get("/", (req: Request, res: Response) => {
  res.send("Assalamualikum");
});

// Route each request to the matching provider function
app.get("/hello/:provider", async (req: Request, res: Response) => {
  const provider = req.params.provider;
  try {
    switch (provider) {
      case "openai":
        return res.json(await helloProvider.openai());
      case "gemini":
        return res.json(await helloProvider.gemini());
      case "groq":
        return res.json(await helloProvider.groq());
      default:
        return res.status(400).json({ ok: false, error: "Unsupported provider" });
    }
  } catch (err) {
    // Provider functions throw on missing keys and failed requests
    const error = err instanceof Error ? err.message : "Unknown error";
    return res.status(500).json({ ok: false, error });
  }
});

app.listen(PORT, () => console.log(`Server running on port ${PORT}`));
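The route above imports helloProvider from ./providers, a module the article does not show. As a rough sketch, it could be a plain lookup table keyed by provider name. The stub handlers below are placeholders standing in for helloOpenAi, helloGemini, and helloGroq, and resolveProvider is a hypothetical helper that could replace the switch statement.

```typescript
type HelloResult = { ok: true; provider: string; message: string };
type HelloHandler = () => Promise<HelloResult>;

// Placeholder handlers: in the real module these would be the three
// provider functions defined earlier, not canned responses.
const stub = (provider: string): HelloHandler =>
  async () => ({ ok: true, provider, message: "Hello world" });

const helloProvider: Record<string, HelloHandler> = {
  openai: stub("openai"),
  gemini: stub("gemini"),
  groq: stub("groq"),
};

// Look up a handler by name. The own-property check keeps inherited keys
// such as "toString" from being treated as providers.
function resolveProvider(name: string): HelloHandler | undefined {
  return Object.prototype.hasOwnProperty.call(helloProvider, name)
    ? helloProvider[name]
    : undefined;
}
```

A route handler can then call resolveProvider(req.params.provider) and return a 400 response when the result is undefined.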
Using direct APIs works for simple demos, but as soon as your application grows—even slightly—you begin to feel the pain of maintaining three completely different LLM providers. The more features you add, the more your code turns into duplicated logic, scattered integrations, and provider-specific exceptions.
Here’s what developers almost immediately run into:
Each provider requires its own fetch calls, headers, error checks, and JSON parsing. Most of your code ends up doing the same thing three different ways.
OpenAI/Groq use messages[], Gemini uses contents[].parts[].
OpenAI returns choices[].message.content, Gemini returns candidates[].content.parts[].text.
You end up writing custom parsing logic for every provider.
Every time you switch a model—or want to test multiple providers—you must update:
URLs, request bodies, types, parsing logic, token extraction mappings, and role behavior.
Different token fields, different system/developer role rules, different usage objects.
You must track all of these separately.
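The token-field mismatch is a good illustration. A minimal sketch of the mapping code you have to maintain per provider might look like this. The helper names are hypothetical; the field names are the ones used in the types and provider functions earlier in the article.

```typescript
// Shared shape that the rest of the app consumes
type TokenUsage = {
  prompt_tokens?: number;
  completion_tokens?: number;
  total_tokens?: number;
};

// OpenAI and Groq already report usage under these field names
function usageFromOpenAi(body: { usage?: TokenUsage }): TokenUsage {
  return { ...body.usage };
}

// Gemini reports the same numbers under usageMetadata with different names
function usageFromGemini(body: {
  usageMetadata?: {
    promptTokenCount?: number;
    candidatesTokenCount?: number;
    totalTokenCount?: number;
  };
}): TokenUsage {
  const u = body.usageMetadata;
  return {
    prompt_tokens: u?.promptTokenCount,
    completion_tokens: u?.candidatesTokenCount,
    total_tokens: u?.totalTokenCount,
  };
}
```

Each additional provider needs its own mapping function of this kind, plus tests to catch field renames.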
As soon as you add new features like RAG, embeddings, memory, or agents, your raw API code starts to grow uncontrollably. You end up reinventing entire frameworks just to keep your app organized.
In short: Raw API integration doesn’t scale. It only gets more complex the more providers and features you add.
This is exactly why LangChain exists—to unify providers, remove boilerplate, simplify workflows, and let you focus on building features instead of maintaining low-level API details.
With LangChain, the exact same “Hello world” logic becomes dramatically simpler.
No custom fetch calls.
No manual headers.
No API-specific request bodies.
No provider-specific parsing logic.
No duplicated types.
LangChain provides a unified chat model interface, so every provider (OpenAI, Gemini, Groq) behaves the same from your code’s perspective. This removes most of the boilerplate you wrote earlier.
Below is the equivalent implementation using LangChain.
import { ChatGoogleGenerativeAI } from "@langchain/google-genai";

const model = new ChatGoogleGenerativeAI({
  model: "gemini-2.0-flash-lite",
  apiKey: process.env.GEMINI_API_KEY,
});

const response = await model.invoke("Hello world");
console.log(response.text);
✔ No nested contents[].parts[]
✔ No endpoint URLs
✔ No manual parsing of candidates[].content.parts[]
✔ Same interface as OpenAI/Groq
import { ChatOpenAI } from "@langchain/openai";

const model = new ChatOpenAI({
  model: "gpt-4o-mini",
  apiKey: process.env.OPENAI_API_KEY,
});

const response = await model.invoke("Hello world");
console.log(response.text);
✔ No manual messages[] array
✔ No system/developer role management
✔ No response drilling into choices[0].message.content
import { ChatOpenAI } from "@langchain/openai";

// Point the OpenAI client at Groq's OpenAI-compatible endpoint
const model = new ChatOpenAI({
  model: "llama-3.1-8b-instant",
  apiKey: process.env.GROQ_API_KEY,
  configuration: {
    baseURL: "https://api.groq.com/openai/v1",
  },
});

const response = await model.invoke("Hello world");
console.log(response.text);
✔ Groq becomes a drop-in OpenAI replacement
✔ Only difference is the baseURL
✔ No separate provider wrapper required
LangChain isn’t just a convenience upgrade—it solves the core architectural problems that appear the moment your project grows beyond a basic API call. Here’s why developers choose LangChain for production-grade LLM applications:
Every LLM—OpenAI, Gemini, Groq, or others—uses the same method structure in LangChain.
You no longer deal with provider-specific payload structures or response shapes.
One interface. One way to call every provider.
No more custom fetch calls, manual headers, or provider-specific parsing logic.
LangChain abstracts all the plumbing so you focus on your application logic.
Want to swap Gemini → OpenAI → Groq?
In raw APIs, that means rewriting half your file.
In LangChain, it's:
const model = new ChatOpenAI({ ... });
One line. Zero refactoring.
Perfect for testing multiple models or building meta-LLM systems.
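The practical payoff is that application code can depend on the shared call shape alone. The sketch below is not LangChain's own type: ChatLike is a hypothetical minimal interface covering only the invoke() call and the .text field used in the snippets above, and fakeModel is a stand-in so the example runs without any API key.

```typescript
// Assumed minimal shape, modeled on how the earlier snippets use a chat model.
// It is not an official LangChain interface.
interface ChatLike {
  invoke(input: string): Promise<{ text: string }>;
}

// Application logic that never mentions a specific provider
async function sayHello(model: ChatLike): Promise<string> {
  const response = await model.invoke("Hello world");
  return response.text;
}

// Stand-in model for local runs and tests; it echoes the prompt back
const fakeModel: ChatLike = {
  invoke: async (input) => ({ text: `echo: ${input}` }),
};
```

Assuming the installed LangChain version exposes .text on responses as in the earlier snippets, a ChatOpenAI or ChatGoogleGenerativeAI instance could be passed to sayHello in place of fakeModel, and swapping providers would touch only the line that constructs the model.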
Adding advanced features such as RAG, embeddings, memory, or agents without LangChain means building everything from scratch.
LangChain gives you these components out-of-the-box—plug-and-play.
As your application grows, LangChain’s modular design keeps your code easier to maintain, test, scale, and refactor.
Instead of managing multiple provider wrappers and duplicated logic, you build on a unified, future-proof architecture.
Even though LangChain simplifies most real-world AI development, there are still situations where calling LLMs directly through their native APIs is the better choice:
If your app only sends a single prompt and returns a response, raw API calls are straightforward and require minimal setup.
Direct API calls avoid the extra abstraction layer, making them ideal for latency-sensitive use cases like real-time inference or fast UI responses.
Some applications require custom request bodies, fine-tuned headers, or provider-specific features that LangChain abstracts away.
If you're implementing unusual prompting flows, custom streaming logic, or tightly optimized pipelines, raw APIs give you more precise control.
For teams focused on minimalism, performance, or small bundle size, avoiding external frameworks can be a deliberate architectural choice.
Raw LLM APIs are excellent for quick experiments, testing ideas, or building simple one-off scripts. But as soon as your application grows—when you add multiple providers, expand features, introduce retrieval, or need maintainability—the limitations become obvious.
LangChain was built to solve these exact problems.
It eliminates custom fetch calls, provider-specific request bodies and parsing logic, and duplicated types.
With LangChain, your codebase becomes cleaner, your workflows become modular, and your application becomes significantly easier to scale and maintain over time. Instead of wrestling with provider differences, you can focus on delivering features—confident that your LLM layer is flexible, unified, and future-proof.