Multi-agent systems in production

2024-10-10 LlamaIndex + Activeloop meetup

What are we talking about?

What is LlamaIndex
Why you should use it
What can it do
- Retrieval augmented generation
- World-class parsing
- Agents and multi-agent systems

What is LlamaIndex?

Python: docs.llamaindex.ai

TypeScript: ts.llamaindex.ai

LlamaParse

Free for 1000 pages/day!

LlamaCloud

1. Sign up: cloud.llamaindex.ai

2. Get on the waitlist! bit.ly/llamacloud-waitlist

LlamaHub

llamahub.ai

Data loaders
Embedding models
Vector stores
LLMs
Agent tools
Pre-built strategies
More!

Why LlamaIndex?

Build faster
Skip the boilerplate
Avoid early pitfalls
Get best practices for free
Go from prototype to production

What can LlamaIndex do for me?

Why RAG is necessary

How RAG works

Basic RAG pipeline
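
To make the pipeline concrete, here is a toy sketch of the two core steps: retrieve the most relevant chunk, then put it into the prompt. It is not how LlamaIndex is implemented. The bag-of-words `embed` function stands in for a real embedding model, and `retrieve` and `build_prompt` are hypothetical helper names chosen for this illustration.

```python
import math
import re
from collections import Counter


def embed(text: str) -> Counter:
    """Toy 'embedding': word counts (assumption: stands in for a real embedding model)."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def retrieve(query: str, chunks: list[str], top_k: int = 2) -> list[str]:
    """Return the top_k chunks most similar to the query."""
    q = embed(query)
    return sorted(chunks, key=lambda c: cosine(q, embed(c)), reverse=True)[:top_k]


def build_prompt(query: str, context: list[str]) -> str:
    """Augment the question with retrieved context before it goes to an LLM."""
    joined = "\n".join(f"- {c}" for c in context)
    return f"Context:\n{joined}\n\nQuestion: {query}\nAnswer using only the context above."


chunks = [
    "LlamaParse is free for 1000 pages per day.",
    "LlamaHub hosts data loaders, vector stores and agent tools.",
    "Workflows are event-driven and support loops.",
]
question = "How many pages per day are free in LlamaParse?"
top = retrieve(question, chunks, top_k=1)
prompt = build_prompt(question, top)
print(prompt)
```

In a real pipeline the prompt would then be sent to an LLM; that call is left out here so the sketch runs offline.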

5-line starter

from llama_index.core import VectorStoreIndex, SimpleDirectoryReader

# Load every file in ./data, then embed it into an in-memory vector index
documents = SimpleDirectoryReader("data").load_data()
index = VectorStoreIndex.from_documents(documents)

# Retrieve relevant chunks and have the LLM answer from them
query_engine = index.as_query_engine()
response = query_engine.query("What did the author do growing up?")
print(response)

npx create-llama

Limitations of RAG

Naive RAG failure points:

Summarization
Comparison
Multi-part questions

RAG is necessary but not sufficient

Two ways to improve RAG:

Improve your data
Improve your querying

What is an agent anyway?

Semi-autonomous software
Accepts a goal
Uses tools to achieve that goal
Exact steps to resolution not specified
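
The four properties above can be sketched as a small loop. This is a minimal illustration, not LlamaIndex's agent implementation: `fake_llm` is a hard-coded stand-in for a model call, and `TOOLS` and `run_agent` are hypothetical names. The point is that the caller supplies only the goal; which tool runs at each step is decided inside the loop.

```python
from typing import Callable

# Hypothetical tools: plain functions the agent is allowed to call.
TOOLS: dict[str, Callable[[float, float], float]] = {
    "add": lambda a, b: a + b,
    "multiply": lambda a, b: a * b,
}


def fake_llm(goal: str, history: list[tuple[str, float]]) -> dict:
    """Stand-in for an LLM (assumption): picks the next action from the history so far."""
    if not history:
        return {"tool": "multiply", "args": (3, 4)}
    if len(history) == 1:
        return {"tool": "add", "args": (history[-1][1], 5)}
    return {"final": history[-1][1]}


def run_agent(goal: str, max_steps: int = 5) -> float:
    """Loop until the 'model' declares a final answer or the step budget runs out."""
    history: list[tuple[str, float]] = []
    for _ in range(max_steps):
        decision = fake_llm(goal, history)
        if "final" in decision:
            return decision["final"]
        result = TOOLS[decision["tool"]](*decision["args"])
        history.append((decision["tool"], result))
    raise RuntimeError("agent did not finish within max_steps")


answer = run_agent("Compute 3 * 4 + 5")
print(answer)
```

The `max_steps` cap is a design choice for this sketch: a semi-autonomous loop needs some bound so that a confused model cannot run forever.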

RAG pipeline

⚠️ Single-shot
⚠️ No query understanding/planning
⚠️ No tool use
⚠️ No reflection, error correction
⚠️ No memory (stateless)

Agentic RAG

✅ Multi-turn
✅ Query / task planning layer
✅ Tool interface for external environment
✅ Reflection
✅ Memory for personalization

From simple to advanced agents

Routing
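
A rough sketch of routing: a selector looks at the query and picks which engine should handle it. In a real system the selector would usually be an LLM call; here `choose_route` is a keyword rule standing in for it, and both engines are hypothetical stubs.

```python
from typing import Callable


def summarize_engine(query: str) -> str:
    """Hypothetical engine that would summarize whole documents."""
    return f"[summary engine] {query}"


def lookup_engine(query: str) -> str:
    """Hypothetical engine that would do top-k vector retrieval."""
    return f"[vector lookup engine] {query}"


ROUTES: dict[str, Callable[[str], str]] = {
    "summary": summarize_engine,
    "lookup": lookup_engine,
}


def choose_route(query: str) -> str:
    """Stand-in for an LLM selector (assumption): a simple keyword rule."""
    wants_summary = any(w in query.lower() for w in ("summarize", "summary", "overview"))
    return "summary" if wants_summary else "lookup"


def route(query: str) -> str:
    """Dispatch the query to whichever engine the selector picked."""
    return ROUTES[choose_route(query)](query)


print(route("Summarize the report"))
print(route("What year was it founded?"))
```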

Conversation memory
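
One simple form of conversation memory is a bounded buffer of recent turns that is prepended to each new message. The sketch below assumes that design; `ChatMemory` is a hypothetical class for illustration, not a LlamaIndex API, and the dialogue is made up.

```python
from collections import deque


class ChatMemory:
    """Keeps only the most recent turns so the prompt stays within a budget."""

    def __init__(self, max_turns: int = 3) -> None:
        # deque with maxlen drops the oldest turn automatically when full
        self.turns: deque[tuple[str, str]] = deque(maxlen=max_turns)

    def add(self, role: str, text: str) -> None:
        self.turns.append((role, text))

    def as_prompt(self, new_message: str) -> str:
        """Render remembered turns followed by the new user message."""
        lines = [f"{role}: {text}" for role, text in self.turns]
        lines.append(f"user: {new_message}")
        return "\n".join(lines)


memory = ChatMemory(max_turns=2)
memory.add("user", "Tell me about LlamaParse.")
memory.add("assistant", "It parses complex documents.")
memory.add("user", "Is it free?")  # the oldest turn is evicted here
prompt = memory.as_prompt("How many pages per day?")
print(prompt)
```

Without the remembered turns, a follow-up such as "How many pages per day?" has no subject; with them, the model can tell what "it" refers to.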

Query planning
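
Query planning can be sketched as: split a complex question into simpler sub-questions, answer each, then combine. In practice an LLM would do the decomposition. In this sketch `plan` is a regex stand-in that only handles one question shape, and the company names and figures in `FACTS` are invented purely for illustration.

```python
import re
from typing import Callable, Optional


def plan(question: str) -> list[str]:
    """Stand-in for an LLM planner (assumption): handles only 'Compare A and B on TOPIC'."""
    m = re.fullmatch(r"Compare (.+) and (.+) on (.+)", question.rstrip("?."))
    if not m:
        return [question]  # nothing to decompose
    a, b, topic = m.groups()
    return [f"What is the {topic} of {a}?", f"What is the {topic} of {b}?"]


def answer_with_plan(question: str, answer_fn: Callable[[str], Optional[str]]) -> str:
    """Answer each sub-question independently, then join the results."""
    sub_answers = [(q, answer_fn(q)) for q in plan(question)]
    return "\n".join(f"{q} -> {a}" for q, a in sub_answers)


# Made-up lookup table standing in for a query engine.
FACTS = {
    "What is the revenue of Acme?": "10M",
    "What is the revenue of Globex?": "12M",
}

report = answer_with_plan("Compare Acme and Globex on revenue", FACTS.get)
print(report)
```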

Tool use

Tools unleash the power of LLMs

Combine agentic strategies and then go further

Routing
Memory
Planning
Tool use

Agentic strategies

Multi-turn
Reasoning
Reflection

Full agent

3 agent reasoning loops

Sequential
DAG-based
Tree-based

Sequential reasoning

DAG-based reasoning
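
One way to picture DAG-based reasoning: the plan is a dependency graph, and each task runs only after the tasks it depends on have finished. The sketch below uses the standard library's `graphlib` for the ordering. The task names and `run_task` are hypothetical, and a real agent would produce the graph with an LLM rather than hard-code it.

```python
from graphlib import TopologicalSorter

# Hypothetical plan: each task maps to the set of tasks it depends on.
plan = {
    "fetch_q1": set(),
    "fetch_q2": set(),
    "compare": {"fetch_q1", "fetch_q2"},
    "write_answer": {"compare"},
}


def run_task(name: str, inputs: list[str]) -> str:
    """Stand-in for real work: records which inputs the task consumed."""
    return f"{name}({', '.join(sorted(inputs))})"


results: dict[str, str] = {}
order: list[str] = []
# static_order() yields every task after all of its dependencies.
for task in TopologicalSorter(plan).static_order():
    results[task] = run_task(task, [results[dep] for dep in plan[task]])
    order.append(task)

print(order)
print(results["write_answer"])
```

Independent tasks such as the two fetches could also run in parallel, which is one practical advantage of a DAG over a strictly sequential plan.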

Self reflection
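
Self reflection can be sketched as a generate, critique, retry loop: the agent checks its own draft and feeds the critique back into the next attempt. Both `generate` and `critique` below are hard-coded stand-ins for LLM calls, and all names are hypothetical.

```python
def generate(task: str, feedback: list[str]) -> str:
    """Stand-in for an LLM draft (assumption): improves only if told what is missing."""
    draft = "LlamaIndex is a framework"
    if "mention RAG" in feedback:
        draft += " for building RAG and agent applications"
    return draft + "."


def critique(draft: str) -> list[str]:
    """Stand-in for an LLM critic: returns a list of problems, empty if acceptable."""
    return [] if "RAG" in draft else ["mention RAG"]


def reflect_loop(task: str, max_rounds: int = 3) -> tuple[str, int]:
    """Draft, critique, and retry until the critic is satisfied or rounds run out."""
    feedback: list[str] = []
    draft = ""
    for round_no in range(1, max_rounds + 1):
        draft = generate(task, feedback)
        issues = critique(draft)
        if not issues:
            return draft, round_no
        feedback.extend(issues)
    return draft, max_rounds


final_draft, rounds = reflect_loop("Describe LlamaIndex in one sentence")
print(rounds, final_draft)
```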

Tree-based reasoning

Exploration vs exploitation
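
The slides do not name a specific rule for balancing exploration and exploitation. As one illustration, the sketch below uses UCB1, a scoring rule commonly used in Monte Carlo tree search: the first term favors branches that have paid off so far, and the second favors branches that have rarely been tried. The branch statistics are made up.

```python
import math


def ucb1(total_reward: float, visits: int, parent_visits: int, c: float = math.sqrt(2)) -> float:
    """UCB1 score: average reward (exploit) plus a bonus for rarely visited branches (explore)."""
    if visits == 0:
        return math.inf  # always try an unvisited branch first
    exploit = total_reward / visits
    explore = c * math.sqrt(math.log(parent_visits) / visits)
    return exploit + explore


# Hypothetical branches: name -> (total reward so far, number of visits)
children = {"A": (9.0, 10), "B": (1.0, 1), "C": (0.0, 0)}
parent_visits = sum(visits for _, visits in children.values())

best = max(children, key=lambda name: ucb1(*children[name], parent_visits))
print(best)
```

Raising `c` pushes the search toward less-visited branches; lowering it makes the search stick with branches that have already scored well.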

Workflows

bit.ly/li-workflows

Why workflows?

Workflows primer

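import asyncio

from llama_index.core.workflow import StartEvent, StopEvent, Workflow, step
from llama_index.llms.openai import OpenAI

class OpenAIGenerator(Workflow):
    @step()
    async def generate(self, ev: StartEvent) -> StopEvent:
        # Keyword arguments passed to run() arrive on the StartEvent
        query = ev.get("query")
        llm = OpenAI()  # reads OPENAI_API_KEY from the environment
        response = await llm.acomplete(query)
        # Returning a StopEvent ends the workflow; its result is what run() returns
        return StopEvent(result=str(response))

async def main():
    w = OpenAIGenerator(timeout=10, verbose=False)
    result = await w.run(query="What's LlamaIndex?")
    print(result)

# In a notebook, call `await main()` instead
asyncio.run(main())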

bit.ly/li-workflows

Visualization

draw_all_possible_flows()

Workflows enable arbitrarily complex applications

Multi-agent workflow

bit.ly/li-multi-agent-workflow

Multi-agent concierge

github.com/run-llama/multi-agent-concierge

Full-stack Python workflow

bit.ly/li-fullstack-python-workflow

Deploying agents to production

pip install llama-deploy

Agents as microservices

Try out llama-deploy

github.com/run-llama/llama_deploy

Recap

What is LlamaIndex
- LlamaCloud, LlamaHub, create-llama
Why RAG is necessary
How to build RAG in LlamaIndex
Limitations of RAG
Agentic RAG
- Routing, memory, planning, tool use
Reasoning patterns
- Sequential, DAG-based, tree-based
Workflows
- Loops, state, customizability
Deploying workflows

What's next?

bit.ly/llamaindex-discord

Thanks!

Follow me on Twitter/X:

@seldo

Please don't add me on LinkedIn.

bit.ly/llamaindex-activeloop

Multi-agent systems in production (Activeloop)

By seldo
