Agentic RAG in 2024

2024-12-04 Arize @ GitHub

What are we talking about?

  • What is LlamaIndex
  • Why you should use it
  • What can it do
    • Retrieval augmented generation
    • World class parsing
    • Agents and multi-agent systems

What is LlamaIndex?

Python: docs.llamaindex.ai

TypeScript: ts.llamaindex.ai

LlamaParse

cloud.llamaindex.ai

Free for 1000 pages/day!

LlamaCloud

1. Sign up

cloud.llamaindex.ai

2. Get on the waitlist!

bit.ly/llamacloud-waitlist

LlamaHub

llamahub.ai

  • Data loaders
  • Embedding models
  • Vector stores
  • LLMs
  • Agent tools
  • Pre-built strategies
  • More!

Why LlamaIndex?

  • Build faster
  • Skip the boilerplate
  • Avoid early pitfalls
  • Get best practices for free
  • Go from prototype to production

What can LlamaIndex do for me?

Why RAG is necessary

How RAG works

Basic RAG pipeline

5-line starter

from llama_index.core import VectorStoreIndex, SimpleDirectoryReader

documents = SimpleDirectoryReader("data").load_data()
index = VectorStoreIndex.from_documents(documents)
query_engine = index.as_query_engine()
response = query_engine.query("What did the author do growing up?")
print(response)

Limitations of RAG

Naive RAG failure points:

  1. Summarization
  2. Comparison
  3. Multi-part questions

RAG is necessary but not sufficient

Two ways to improve RAG:

  1. Improve your data
  2. Improve your querying

What is an agent anyway?

  • Semi-autonomous software
  • Accepts a goal
  • Uses tools to achieve that goal
  • Exact steps to resolution not specified

RAG pipeline

⚠️ Single-shot
⚠️ No query understanding/planning
⚠️ No tool use
⚠️ No reflection, error correction
⚠️ No memory (stateless)

Agentic RAG

✅ Multi-turn
✅ Query / task planning layer
✅ Tool interface for external environment
✅ Reflection
✅ Memory for personalization

From simple to advanced agents

Routing
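A minimal sketch of routing, assuming two query engines already exist. The `summary_engine` and `vector_engine` functions and the keyword-based `choose_route` are hypothetical stand-ins: in a real system the selector would typically be an LLM call, and the engines would be real indexes.

```python
from typing import Callable, Dict

# Hypothetical stand-ins for real query engines
# (for example a summary index and a vector index).
def summary_engine(query: str) -> str:
    return f"[summary engine] {query}"

def vector_engine(query: str) -> str:
    return f"[vector engine] {query}"

ENGINES: Dict[str, Callable[[str], str]] = {
    "summary": summary_engine,
    "vector": vector_engine,
}

def choose_route(query: str) -> str:
    """Toy selector based on keywords; an LLM would normally make this choice."""
    summary_cues = ("summarize", "summary", "overview")
    if any(cue in query.lower() for cue in summary_cues):
        return "summary"
    return "vector"

def route(query: str) -> str:
    """Send the query to whichever engine the selector picked."""
    return ENGINES[choose_route(query)](query)
```

The point of the pattern is that summarization questions, one of the naive RAG failure points, no longer have to go through top-k retrieval.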

Conversation memory
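One way conversation memory could look, as a sketch only: keep the last few turns and prepend them to the next question so that follow-ups like "and the year after?" can be resolved. The `ConversationMemory` class and its `max_turns` parameter are illustrative names, not a library API.

```python
from collections import deque
from typing import Deque, Tuple

class ConversationMemory:
    """Keeps only the most recent turns (a simple sliding window)."""

    def __init__(self, max_turns: int = 3) -> None:
        # deque with maxlen silently drops the oldest turn when full
        self.turns: Deque[Tuple[str, str]] = deque(maxlen=max_turns)

    def add(self, user: str, assistant: str) -> None:
        self.turns.append((user, assistant))

    def build_prompt(self, new_question: str) -> str:
        """Prefix the new question with the remembered turns."""
        history = "\n".join(
            f"User: {u}\nAssistant: {a}" for u, a in self.turns
        )
        prefix = f"{history}\n" if history else ""
        return f"{prefix}User: {new_question}"
```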

Query planning
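A rough sketch of query planning for comparison and multi-part questions: split the question into sub-questions, answer each one separately, then combine. The `plan` function here is a hypothetical template-based planner (a real one would ask an LLM to decompose the question), and `answer_fn` stands in for any query engine.

```python
from typing import Callable, List

def plan(question: str, subjects: List[str]) -> List[str]:
    """Toy planner: one sub-question per subject."""
    return [f"{question} (for {s})" for s in subjects]

def answer_with_plan(
    question: str,
    subjects: List[str],
    answer_fn: Callable[[str], str],
) -> str:
    """Answer each sub-question independently, then list the results together."""
    sub_questions = plan(question, subjects)
    partials = [answer_fn(q) for q in sub_questions]
    return "\n".join(f"- {q}: {a}" for q, a in zip(sub_questions, partials))
```

Each sub-question gets its own retrieval pass, so no single context window has to hold evidence for both sides of the comparison.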

Tool use

Tools unleash the power of LLMs
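Tool use can be sketched as a registry of plain functions plus a dispatcher. This assumes the LLM emits a tool call as JSON with `name` and `arguments` fields; that shape, and the `TOOLS` and `call_tool` names, are assumptions for illustration rather than any particular framework's format.

```python
import json
from typing import Any, Callable, Dict

def multiply(a: float, b: float) -> float:
    """Multiply two numbers."""
    return a * b

def add(a: float, b: float) -> float:
    """Add two numbers."""
    return a + b

# Registry mapping tool names to callables
TOOLS: Dict[str, Callable[..., Any]] = {"multiply": multiply, "add": add}

def call_tool(tool_call_json: str) -> Any:
    """Parse a JSON tool call and run the matching function."""
    call = json.loads(tool_call_json)
    name = call["name"]
    if name not in TOOLS:
        raise ValueError(f"Unknown tool: {name}")
    return TOOLS[name](**call["arguments"])
```

For example, `call_tool('{"name": "multiply", "arguments": {"a": 6, "b": 7}}')` runs `multiply(6, 7)`.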

Combine agentic strategies and then go further

  • Routing
  • Memory
  • Planning
  • Tool use

Agentic strategies

  • Multi-turn
  • Reasoning
  • Reflection

Full agent

3 agent reasoning loops

  1. Sequential
  2. DAG-based
  3. Tree-based

Sequential reasoning
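A sequential loop, in sketch form: decide on one action, observe the result, decide again, until the policy says it is finished. The `scripted_policy` below is a hard-coded stand-in for the LLM, and the `lookup` tool and its single fact are made up for the example.

```python
from typing import Callable, Dict, List, Tuple

Step = Tuple[str, str]  # (action name, argument)

def lookup(term: str) -> str:
    """Hypothetical tool backed by a tiny in-memory fact table."""
    facts = {"capital of France": "Paris"}
    return facts.get(term, "unknown")

TOOLS: Dict[str, Callable[[str], str]] = {"lookup": lookup}

def scripted_policy(goal: str, observations: List[str]) -> Step:
    """Stand-in for an LLM: look the goal up once, then finish."""
    if not observations:
        return ("lookup", goal)
    return ("finish", observations[-1])

def run_sequential(
    goal: str,
    policy: Callable[[str, List[str]], Step] = scripted_policy,
    max_steps: int = 5,
) -> str:
    """One action per iteration; each choice sees all earlier observations."""
    observations: List[str] = []
    for _ in range(max_steps):
        action, argument = policy(goal, observations)
        if action == "finish":
            return argument
        observations.append(TOOLS[action](argument))
    return "gave up"  # step budget exhausted
```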

DAG-based reasoning
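DAG-based reasoning can be sketched as a plan whose steps declare their dependencies, executed in topological order. This uses the standard library's `graphlib` (Python 3.9+). The step names and the revenue figures are invented for illustration, and `run_step` stands in for real retrieval or LLM calls.

```python
from graphlib import TopologicalSorter
from typing import Dict, Set

# Each step maps to the set of steps it depends on.
PLAN: Dict[str, Set[str]] = {
    "revenue_2023": set(),
    "revenue_2024": set(),
    "compare": {"revenue_2023", "revenue_2024"},
}

def run_step(name: str, results: Dict[str, int]) -> int:
    """Hypothetical step executor with made-up numbers."""
    if name == "revenue_2023":
        return 100
    if name == "revenue_2024":
        return 120
    if name == "compare":
        return results["revenue_2024"] - results["revenue_2023"]
    raise ValueError(f"Unknown step: {name}")

def run_plan(plan: Dict[str, Set[str]]) -> Dict[str, int]:
    """Run every step after all of its dependencies have produced results."""
    results: Dict[str, int] = {}
    for step in TopologicalSorter(plan).static_order():
        results[step] = run_step(step, results)
    return results
```

Unlike the sequential loop, the whole plan is known up front, so independent steps (the two revenue lookups here) could in principle run in parallel.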

Self reflection
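Self reflection, sketched as a generate, critique, retry loop. Both `generate` and `critique` are toy stand-ins for LLM calls: the first draft is deliberately too short, and the critic only checks the word count.

```python
def generate(question: str, feedback: str) -> str:
    """Stand-in for an LLM call: revises its answer once it receives feedback."""
    if not feedback:
        return "Paris."
    return "The capital of France is Paris."

def critique(answer: str) -> str:
    """Return feedback, or an empty string when the answer is acceptable."""
    if len(answer.split()) < 3:
        return "Answer in a full sentence."
    return ""

def answer_with_reflection(question: str, max_rounds: int = 3) -> str:
    """Regenerate until the critic has no feedback or the round budget runs out."""
    feedback = ""
    answer = ""
    for _ in range(max_rounds):
        answer = generate(question, feedback)
        feedback = critique(answer)
        if not feedback:
            break
    return answer
```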

Tree-based reasoning

Exploration vs exploitation
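The slides do not name a specific scoring rule for tree search. One common choice, used here purely as an assumption, is UCB1: score each branch by its average reward plus a bonus that shrinks as the branch gets visited more. The `branches` dictionary layout is hypothetical.

```python
import math
from typing import Dict

def ucb1(
    total_reward: float,
    visits: int,
    parent_visits: int,
    c: float = math.sqrt(2),
) -> float:
    """Average reward (exploitation) plus a visit-based bonus (exploration)."""
    if visits == 0:
        return math.inf  # always try an unvisited branch first
    exploit = total_reward / visits
    explore = c * math.sqrt(math.log(parent_visits) / visits)
    return exploit + explore

def pick_branch(
    branches: Dict[str, Dict[str, float]],
    c: float = math.sqrt(2),
) -> str:
    """Pick the branch with the highest UCB1 score."""
    parent_visits = sum(int(b["visits"]) for b in branches.values())
    return max(
        branches,
        key=lambda name: ucb1(
            branches[name]["reward"],
            int(branches[name]["visits"]),
            parent_visits,
            c,
        ),
    )
```

Setting `c` to 0 gives pure exploitation (always the best average so far); raising it pushes the search toward less-visited branches.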

Thanks!

Follow me on Bluesky:

@seldo.com

Please don't add me on LinkedIn.

RAG in 2024 (Arize @ GitHub)

By seldo