Julián Duque
Developer and Educator
with Node.js
sforce.co/build-ai-apps-heroku
build-ai-apps.ukoreh.com
Principal Developer Advocate at Heroku
JSConf/NodeConf Colombia Organizer
Node.js Collaborator Emeritus
🦋 @julianduque.co
/in/juliandavidduque
X @julian_duque
💡 For the purposes of this session: AI ~ GenAI
Uses a Large Language Model (LLM) to generate or process content
Runs inference across different modalities (text, code, audio, etc.)
Integrates with tools, data, and user input
Large Language Model (LLM): A type of AI model trained on massive text data to understand and generate human-like language
Inference vs Training: Inference is using a trained model to make predictions. Training is the process of teaching a model with data
Model Size (7B / 13B / 70B): Number of parameters in the model. Larger often means better performance, but slower and more expensive
Context Window: The amount of input (in tokens) an LLM can consider at once. Limits the scope of reasoning
Fine-tuning vs Prompt Engineering: Prompting adapts output using input design. Fine-tuning customizes the model weights using new data
Evaluation: Factuality, Reasoning, Safety: Testing models across multiple axes like correctness, bias, and robustness
Chat: Stateless or contextual assistant
Retrieval Augmented Generation (RAG): Grounded responses using external data
Agent: Multi-step reasoning + tool use
Function Calling / Tools: Structured output and tool delegation
Multi-Modal Apps: Text-to-image, image captioning
Planning + Execution: Break tasks into steps
LLMs: OpenAI, Anthropic, Cohere, Google, etc.
Vector Store: pgvector (via PostgreSQL), Weaviate, Pinecone, etc.
Orchestration: LangChain, LangGraph
Embedding: OpenAI, Cohere, HuggingFace APIs
Frontend: Your choice!
Open standard that enables AI applications to connect seamlessly with external data sources and tools
Think of MCP as a
"USB-C for AI applications"
GPT-4o
Claude 3.7 Sonnet
A type of database designed to store and search for data represented as vectors, enabling efficient semantic similarity searches.
-- Enable pgvector
CREATE EXTENSION IF NOT EXISTS vector;
-- Create a table with the VECTOR type
CREATE TABLE animals(id serial PRIMARY KEY, name VARCHAR(100), embedding VECTOR(100));
-- Insert embeddings
INSERT INTO animals(name, embedding) VALUES ('llama', '[-0.15647223591804504,
…
-0.7506130933761597, 0.1427040845155716]');
-- Query Data using the euclidean distance operator
=> SELECT name FROM animals WHERE name != 'shark' ORDER BY embedding <-> (SELECT embedding FROM animals WHERE name = 'shark') LIMIT 5;
name
-----------
crocodile
dolphin
whale
turtle
alligator
(5 rows)
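The `<->` operator in the query above is pgvector's Euclidean (L2) distance. The same computation in plain JavaScript looks like this (the 3-dimensional vectors are toy values; real embeddings have hundreds or thousands of dimensions):

```javascript
// Euclidean (L2) distance, the metric behind pgvector's <-> operator.
// Toy 3-d vectors stand in for real embedding output.
function euclideanDistance(a, b) {
  return Math.sqrt(a.reduce((sum, x, i) => sum + (x - b[i]) ** 2, 0));
}

console.log(euclideanDistance([0, 0, 0], [3, 4, 0])); // 5
```

Ordering rows by this distance to a reference embedding is what makes the "animals most similar to shark" query work.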
Image source: Understanding similarity or semantic search and vector databases
Sudhir Yelikar
LangChain is a framework for developing applications powered by large language models (LLMs).
@langchain/core
Base abstractions and LangChain Expression Language
@langchain/community
3rd party integrations, document loaders, tools
langchain
Chains, agents, and retrieval strategies
Partner packages
@langchain/openai, @langchain/anthropic, @langchain/mistralai
LangGraph is a low-level orchestration framework for building controllable agents. It can be used standalone but integrates seamlessly with LangChain.
@langchain/langgraph
Stateful: Maintains memory across steps
Graph-based flow: Define control logic as nodes and edges
Concurrent branches: Supports parallel reasoning paths
Retry & fallback logic: Handle failures with control flows
Agent loops: Enables Think → Act → Observe cycles
| Use Case | Use LangChain.js | Use LangGraph.js |
|---|---|---|
| Simple chains / pipelines | ✅ Yes | ❌ Overkill |
| RAG apps (chat + search) | ✅ Yes | ✅ Yes |
| Agent tool use (basic) | ✅ Yes | ✅ Yes |
| Complex logic / branching flows | ❌ Hard to manage | ✅ Graph-based control |
| Stateful agents | ⚠️ Limited | ✅ Built-in memory per node |
| Retry / fallback mechanisms | ❌ Manual | ✅ First-class feature |
| Concurrency / parallel branches | ❌ Not supported | ✅ Supported |
Use LangChain.js for quick LLM integrations, chains, and simple agents.
Use LangGraph.js when your agent needs memory, branching logic, retries, or complex flows.
import { OpenAI } from "@langchain/openai";
// Create an instance of a LLM
const llm = new OpenAI({
modelName: "gpt-3.5-turbo-instruct",
temperature: 0,
});
const result = await llm.invoke("What is the meaning of life?");
Single prompt → LLM → Completion
A declarative way to compose chains together.
Source: LangChain documentation
Chain: A sequence of operations or steps that link together different components or modules to accomplish a specific task.
const chain = prompt
.pipe(llm)
.pipe(parser);
const result =
await chain.invoke({ input });
import { ChatOpenAI } from "@langchain/openai";
import { RunnableSequence } from "@langchain/core/runnables";
import { ChatPromptTemplate } from "@langchain/core/prompts";
import { StringOutputParser } from "@langchain/core/output_parsers";
// Create an instance of a chat model
const llm = new ChatOpenAI({
modelName: "gpt-4o-mini",
temperature: 0,
});
// Create a chat prompt
const promptTemplate = ChatPromptTemplate.fromMessages([
[
"system",`You are a professional software developer who knows about {language}.
Return just the code without any explanations, and not enclosed in markdown.
You can add inline comments if necessary.`,
],
["human", "Generate code for the following use case: {problem}"],
]);
// Example of composing Runnables with pipe
const chain = promptTemplate.pipe(llm).pipe(new StringOutputParser());
// Execute the chain
const result = await chain.invoke({ language: "Python", problem: "Reverse a string" });
Stateless or Stateful (memory-enabled)
Useful for assistants, support bots, simple UX
import { RunnableWithMessageHistory } from "@langchain/core/runnables";
import { ChatMessageHistory } from "langchain/stores/message/in_memory";

const chain = prompt.pipe(llm).pipe(new StringOutputParser());
// Create an in-memory store for the chat history
const messageHistory = new ChatMessageHistory();
// Wrap the chain so each call reads from and writes to the history
const chainWithHistory = new RunnableWithMessageHistory({
  runnable: chain,
  getMessageHistory: () => messageHistory,
  inputMessagesKey: "message",
  historyMessagesKey: "history",
});
Single prompt → LLM → Response
Can maintain memory, make decisions, and use tools
Useful for automation, APIs, multi-step reasoning
// Create a tool to query Wikipedia
const wikipediaTool = new WikipediaQueryRun({...});
// Create a custom tool
const weatherTool = new DynamicTool({...});
// Create a list of tools
const tools = [weatherTool, wikipediaTool];
// Create an agent with the LLM, tools, and prompt
const agent = createToolCallingAgent({
llm,
tools,
prompt,
});
// Create an agent executor with the agent and tools
const executor = new AgentExecutor({
agent,
tools,
});
LLM plans → selects tools → executes → repeats
Embed documents into a vector store using an embedding model
Retrieve the most relevant content for a query
The LLM uses the retrieved context to generate an accurate response
Query vectors → Retrieve relevant context → Ground prompt
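The retrieval step can be sketched without any framework: rank stored vectors by similarity to the query vector and keep the top k. The toy 3-d vectors below stand in for real embedding-model output:

```javascript
// Cosine similarity between two vectors
function cosine(a, b) {
  const dot = a.reduce((s, x, i) => s + x * b[i], 0);
  const norm = (v) => Math.sqrt(v.reduce((s, x) => s + x * x, 0));
  return dot / (norm(a) * norm(b));
}

// Toy document store: in a real RAG app, the embeddings
// come from an embedding model and live in a vector database
const docs = [
  { text: "Llamas live in the Andes", embedding: [0.9, 0.1, 0.0] },
  { text: "Sharks live in the ocean", embedding: [0.0, 0.2, 0.9] },
  { text: "Alpacas are related to llamas", embedding: [0.8, 0.2, 0.1] },
];

// Retrieve the k documents most similar to the query vector
function retrieve(queryEmbedding, k) {
  return docs
    .map((d) => ({ ...d, score: cosine(queryEmbedding, d.embedding) }))
    .sort((a, b) => b.score - a.score)
    .slice(0, k)
    .map((d) => d.text);
}

const context = retrieve([0.85, 0.15, 0.05], 2);
console.log(context);
```

The retrieved texts are then interpolated into the prompt, grounding the LLM's answer in the stored documents.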
| Pattern | Example | Description |
|---|---|---|
| Chat + RAG | Smart FAQ Assistant | Conversational UI with grounded answers from documents |
| Agent + RAG | Context-Rich Research Assistant | Retrieves and reasons over documents step by step |
| Agent + Planning + APIs | Autonomous Task Runner | Executes plans using external tools and APIs |
| RAG + Tool Use | Data-Aware Agent | Fetches docs and uses tools for real-time data |
| Agent + Planning | Task Executor | Breaks tasks into steps and completes them with tools |
| Chat + Agent + Tools | AI Concierge | Conversational system that plans, books, and responds with tools |
| Function Calling + Planner | DevOps Agent | Plans and calls structured DevOps functions |
Latency
RAG, multi-step agents, and tool use can slow response times
Tooling Maturity
JavaScript ecosystem is growing fast but still lags behind Python in AI libraries
Evaluation Complexity
Measuring factuality, reasoning, and safety isn't straightforward
Model Limits
Context window size, token cost, and hallucinations affect reliability
Fast-Moving Ecosystem
Tools, models, and APIs evolve quickly. Stability and long-term support can be uncertain
Node.js is ready
LangChain.js and LangGraph.js enable full agentic AI apps
Pick the right pattern
Chat, RAG, Agent, Hybrid → match architecture to use case
Combine tools
LLMs + memory + orchestration = powerful AI workflows
Build modularly
Swap models, databases, and frontends without breaking core logic
Always bet on JavaScript
- Brendan Eich
By Julián Duque