RAG in 2024: Advancing to Agents

2024-10-30 ODSC West

What are we talking about?

What is LlamaIndex
Why you should use it
What can it do
- Retrieval augmented generation
- World-class parsing
- Agents and multi-agent systems

What is LlamaIndex?

Python: docs.llamaindex.ai

TypeScript: ts.llamaindex.ai

LlamaParse

World's best parser of complex documents

Free for 1000 pages/day!

cloud.llamaindex.ai

LlamaCloud

Turn-key RAG API for Enterprises

Available as SaaS or private cloud deployment

1. Sign up: cloud.llamaindex.ai
2. Get on the waitlist: bit.ly/llamacloud-waitlist

LlamaHub

llamahub.ai

Data loaders
Embedding models
Vector stores
LLMs
Agent tools
Pre-built strategies
More!

Why LlamaIndex?

Build faster
Skip the boilerplate
Avoid early pitfalls
Get best practices for free
Go from prototype to production

What can LlamaIndex do for me?

Why RAG is necessary

How RAG works

Basic RAG pipeline
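A minimal sketch of the retrieve-then-prompt idea behind a basic RAG pipeline. It uses only the standard library. The bag-of-words `embed` function is a toy stand-in for a real embedding model, and `retrieve` and `build_prompt` are hypothetical helper names, not LlamaIndex APIs.

```python
import math
import re
from collections import Counter


def embed(text: str) -> Counter:
    """Toy stand-in for an embedding model: a bag-of-words count vector."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(query: str, chunks: list[str], top_k: int = 2) -> list[str]:
    """Rank chunks by similarity to the query and keep the best top_k."""
    q = embed(query)
    ranked = sorted(chunks, key=lambda c: cosine(q, embed(c)), reverse=True)
    return ranked[:top_k]


def build_prompt(query: str, context: list[str]) -> str:
    """Augment the question with the retrieved context before calling an LLM."""
    joined = "\n".join(f"- {c}" for c in context)
    return f"Answer using only this context:\n{joined}\n\nQuestion: {query}"


chunks = [
    "LlamaParse parses complex documents such as PDFs.",
    "LlamaHub hosts data loaders, vector stores and agent tools.",
    "Workflows are event-driven and support loops.",
]
question = "Where can I find data loaders?"
context = retrieve(question, chunks, top_k=1)
prompt = build_prompt(question, context)
print(prompt)
```

In a real pipeline the prompt would then be sent to an LLM; that generation step is left out here.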

5 line starter

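# Requires `pip install llama-index` and OPENAI_API_KEY (the default LLM and embeddings use OpenAI)
from llama_index.core import VectorStoreIndex, SimpleDirectoryReader

documents = SimpleDirectoryReader("data").load_data()  # load every file in ./data
index = VectorStoreIndex.from_documents(documents)  # chunk, embed, and index in memory
query_engine = index.as_query_engine()  # retriever plus response synthesizer
response = query_engine.query("What did the author do growing up?")
print(response)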

npx create-llama

Limitations of RAG

Summarization
Comparison
Multi-part questions

Naive RAG failure points:

RAG is necessary but not sufficient

Two ways to improve RAG:

Improve your data
Improve your querying

What is an agent anyway?

Semi-autonomous software
Accepts a goal
Uses tools to achieve that goal
Exact steps to resolution not specified
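The four properties above can be sketched as a small loop. This is an illustration only: `scripted_policy` is a hypothetical stand-in for the LLM that would normally choose the next action, and the tool names are made up.

```python
def add(a: float, b: float) -> float:
    return a + b


def multiply(a: float, b: float) -> float:
    return a * b


# Hypothetical tools the agent is allowed to call.
TOOLS = {"add": add, "multiply": multiply}


def scripted_policy(goal: str, history: list) -> dict:
    """Stand-in for the LLM: pick the next action from the goal and past results."""
    if not history:
        return {"tool": "add", "args": (2, 3)}
    if len(history) == 1:
        return {"tool": "multiply", "args": (history[-1], 4)}
    return {"final": history[-1]}


def run_agent(goal: str, policy, max_steps: int = 5):
    """Accept a goal, then loop: ask the policy for an action, run the tool, record the result."""
    history = []
    for _ in range(max_steps):
        action = policy(goal, history)
        if "final" in action:
            return action["final"]
        result = TOOLS[action["tool"]](*action["args"])
        history.append(result)
    raise RuntimeError("agent did not finish within max_steps")


answer = run_agent("Compute (2 + 3) * 4", scripted_policy)
print(answer)
```

The caller supplies only the goal; the sequence of tool calls is decided inside the loop, which is what separates this from a fixed pipeline.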

RAG pipeline

⚠️ Single-shot
⚠️ No query understanding/planning
⚠️ No tool use
⚠️ No reflection, error correction
⚠️ No memory (stateless)

Agentic RAG

✅ Multi-turn
✅ Query / task planning layer
✅ Tool interface for external environment
✅ Reflection
✅ Memory for personalization

From simple to advanced agents

Routing
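A minimal sketch of routing, assuming two hypothetical query engines. A real router would typically ask an LLM to choose; the keyword match here is a deliberately simple stand-in so the example runs on its own.

```python
import re


def summary_engine(query: str) -> str:
    # Hypothetical engine that would summarize whole documents.
    return f"summary engine handled: {query}"


def vector_engine(query: str) -> str:
    # Hypothetical engine that would do top-k retrieval over chunks.
    return f"vector engine handled: {query}"


# Each route: (name, trigger keywords, engine). Keyword matching stands in for an LLM selector.
ROUTES = [
    ("summary", {"summarize", "summary", "overview"}, summary_engine),
]


def route(query: str) -> tuple[str, str]:
    """Send the query to the first matching engine, falling back to vector search."""
    words = set(re.findall(r"[a-z]+", query.lower()))
    for name, keywords, engine in ROUTES:
        if words & keywords:
            return name, engine(query)
    return "vector", vector_engine(query)


print(route("Summarize the report"))
print(route("Who wrote chapter 2?"))
```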

Conversation memory
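A minimal sketch of conversation memory as a bounded buffer of recent messages that is prepended to each new question. `ChatMemory` is a hypothetical class for illustration, not a LlamaIndex API.

```python
from collections import deque


class ChatMemory:
    """Keep only the most recent messages so the prompt stays within a budget."""

    def __init__(self, max_messages: int = 4):
        self.messages = deque(maxlen=max_messages)  # oldest messages drop off automatically

    def add(self, role: str, content: str) -> None:
        self.messages.append((role, content))

    def as_prompt(self, new_question: str) -> str:
        """Render the remembered turns followed by the new user question."""
        lines = [f"{role}: {content}" for role, content in self.messages]
        lines.append(f"user: {new_question}")
        return "\n".join(lines)


memory = ChatMemory(max_messages=2)
memory.add("user", "What is LlamaParse?")
memory.add("assistant", "A document parser.")
memory.add("user", "Is it free?")
prompt = memory.as_prompt("How many pages per day?")
print(prompt)
```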

Query planning
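A minimal sketch of query planning: break a comparison question into sub-questions, answer each, then combine. The regex planner stands in for an LLM, and "Alpha", "Beta", and their numbers are placeholders, not real data.

```python
import re


def plan(query: str) -> list[str]:
    """Stand-in planner: split 'Compare A and B on X' into one sub-question per entity."""
    match = re.match(
        r"compare (.+?) and (.+?) on (.+)", query.strip().rstrip("?"), re.IGNORECASE
    )
    if match is None:
        return [query]  # nothing to decompose
    first, second, topic = match.groups()
    return [f"What is the {topic} of {first}?", f"What is the {topic} of {second}?"]


# Placeholder answers; a real system would run each sub-question through a query engine.
PLACEHOLDER_ANSWERS = {
    "What is the headcount of Alpha?": "120",
    "What is the headcount of Beta?": "45",
}


def answer(sub_question: str) -> str:
    return PLACEHOLDER_ANSWERS.get(sub_question, "unknown")


def run(query: str) -> str:
    """Answer every sub-question and join the results for a final synthesis step."""
    return "; ".join(f"{q} {answer(q)}" for q in plan(query))


combined = run("Compare Alpha and Beta on headcount?")
print(combined)
```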

Tool use

Tools unleash the power of LLMs
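A minimal sketch of the tool interface: describe a function so a model could choose it, then dispatch the model's JSON tool call. `get_weather` is a hypothetical tool with a canned answer, and the JSON shape is an assumption for illustration rather than any provider's exact format.

```python
import inspect
import json


def get_weather(city: str) -> str:
    """Return a canned weather report for a city (hypothetical tool)."""
    return f"Sunny in {city}"


def describe(fn) -> dict:
    """Build the description an LLM would see: name, docstring, parameter names."""
    return {
        "name": fn.__name__,
        "description": inspect.getdoc(fn),
        "parameters": list(inspect.signature(fn).parameters),
    }


REGISTRY = {fn.__name__: fn for fn in (get_weather,)}


def dispatch(tool_call_json: str) -> str:
    """Parse a tool call emitted by the model and run the matching function."""
    call = json.loads(tool_call_json)
    fn = REGISTRY[call["name"]]
    return fn(**call["arguments"])


spec = describe(get_weather)
result = dispatch('{"name": "get_weather", "arguments": {"city": "San Francisco"}}')
print(spec)
print(result)
```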

Combine agentic strategies and then go further

Routing
Memory
Planning
Tool use

Agentic strategies

Multi-turn
Reasoning
Reflection

Full agent

3 agent reasoning loops

Sequential
DAG-based
Tree-based

Sequential reasoning

DAG-based reasoning
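A minimal sketch of DAG-based execution, assuming the plan is already known as a mapping from each step to its prerequisites. The step names and `run_step` are hypothetical; ordering uses the standard library's `graphlib`.

```python
from graphlib import TopologicalSorter

# Hypothetical plan: each step maps to the set of steps it depends on.
plan = {
    "fetch_a": set(),
    "fetch_b": set(),
    "compare": {"fetch_a", "fetch_b"},
    "report": {"compare"},
}


def run_step(name: str, inputs: list[str]) -> str:
    """Stand-in for real work: record which upstream results this step consumed."""
    return f"{name}({', '.join(sorted(inputs))})"


results = {}
order = []
# static_order() yields every step only after all of its prerequisites.
for step_name in TopologicalSorter(plan).static_order():
    upstream = [results[dep] for dep in plan[step_name]]
    results[step_name] = run_step(step_name, upstream)
    order.append(step_name)

print(order)
print(results["report"])
```

Because `fetch_a` and `fetch_b` do not depend on each other, a real executor could run them in parallel; this sketch runs them one after another for simplicity.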

Self reflection
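A minimal sketch of a self-reflection loop: draft, critique, and revise until the critic is satisfied or a round limit is hit. `generate` and `critique` are hypothetical stand-ins for LLM calls with hard-coded behavior.

```python
from typing import Optional


def generate(task: str, feedback: Optional[str]) -> str:
    """Stand-in for an LLM draft; a real model would condition on the feedback text."""
    draft = "LlamaIndex is a framework"
    if feedback:  # revise by addressing the critique
        draft += " for building RAG and agent applications"
    return draft + "."


def critique(draft: str) -> Optional[str]:
    """Stand-in for an LLM critic: return feedback, or None when the draft is acceptable."""
    if "RAG" not in draft:
        return "Mention what it is used for."
    return None


def reflect_loop(task: str, max_rounds: int = 3) -> tuple[str, int]:
    """Alternate between drafting and critiquing; return the draft and rounds used."""
    feedback = None
    draft = ""
    for round_no in range(1, max_rounds + 1):
        draft = generate(task, feedback)
        feedback = critique(draft)
        if feedback is None:
            return draft, round_no
    return draft, max_rounds


final_draft, rounds = reflect_loop("Describe LlamaIndex in one sentence")
print(rounds, final_draft)
```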

Tree-based reasoning

Exploration vs exploitation
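One common way to balance exploration and exploitation when choosing which branch of a tree to expand is the UCB1 score from Monte Carlo tree search. The slides do not name a specific formula, so treat this as an illustrative assumption; the branch names and statistics are made up.

```python
import math


def ucb1(total_reward: float, visits: int, parent_visits: int, c: float = math.sqrt(2)) -> float:
    """Average reward (exploitation) plus a bonus for rarely visited branches (exploration)."""
    if visits == 0:
        return math.inf  # always try an unvisited branch first
    return total_reward / visits + c * math.sqrt(math.log(parent_visits) / visits)


def select(children: dict, parent_visits: int, c: float = math.sqrt(2)) -> str:
    """Pick the child with the highest UCB1 score."""
    return max(
        children,
        key=lambda name: ucb1(children[name]["reward"], children[name]["visits"], parent_visits, c),
    )


# Made-up statistics for three candidate branches.
children = {
    "branch_a": {"reward": 8.0, "visits": 10},  # well explored, good average
    "branch_b": {"reward": 0.5, "visits": 1},   # barely explored, weaker average
    "branch_c": {"reward": 0.0, "visits": 0},   # never tried
}
parent_visits = sum(child["visits"] for child in children.values())
visited = {name: child for name, child in children.items() if child["visits"] > 0}

print(select(children, parent_visits))        # unvisited branch wins
print(select(visited, parent_visits, c=0.0))  # pure exploitation
print(select(visited, parent_visits))         # exploration bonus included
```

Setting `c` to 0 always picks the best-known branch, while a larger `c` favors branches with few visits.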

Workflows

bit.ly/li-workflows

Why workflows?

Workflows primer

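from llama_index.core.workflow import StartEvent, StopEvent, Workflow, step
from llama_index.llms.openai import OpenAI

class OpenAIGenerator(Workflow):
    # A single step: it receives the StartEvent and returns the StopEvent
    @step()
    async def generate(self, ev: StartEvent) -> StopEvent:
        query = ev.get("query")  # keyword arguments passed to run() arrive on the StartEvent
        llm = OpenAI()  # reads OPENAI_API_KEY from the environment
        response = await llm.acomplete(query)
        return StopEvent(result=str(response))

# Top-level await works in a notebook; in a script, wrap these lines in an async function and call asyncio.run()
w = OpenAIGenerator(timeout=10, verbose=False)
result = await w.run(query="What's LlamaIndex?")
print(result)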

bit.ly/li-workflows

Visualization

draw_all_possible_flows()

Workflows enable arbitrarily complex applications

Multi-agent concierge

github.com/run-llama/multi-agent-concierge

Deploying agents to production

pip install llama-deploy

Agents as microservices

Try out llama-deploy

github.com/run-llama/llama_deploy

Recap

What is LlamaIndex
- LlamaCloud, LlamaHub, create-llama
Why RAG is necessary
How to build RAG in LlamaIndex
Limitations of RAG
Agentic RAG
- Routing, memory, planning, tool use
Reasoning patterns
- Sequential, DAG-based, tree-based
Workflows
- Loops, state, customizability
Deploying workflows

What's next?

bit.ly/llamaindex-discord

Thanks!

Follow me on BlueSky:

@seldo.com

bit.ly/rag-in-2024

RAG in 2024: Advancing to Agents (ODSC West)

By seldo
