LlamaIndex and Graph RAG

2024-11-26 Memgraph Webinar

What are we talking about?

What is LlamaIndex
What is RAG
How graph RAG works
Going beyond RAG to agents
Building agentic workflows

What is LlamaIndex?

Python: docs.llamaindex.ai

TypeScript: ts.llamaindex.ai

LlamaParse

cloud.llamaindex.ai

Free for 1000 pages/day!

LlamaCloud

1. Sign up

cloud.llamaindex.ai

2. Get on the waitlist!

bit.ly/llamacloud-waitlist

LlamaHub

llamahub.ai

Data loaders
Embedding models
Vector stores
LLMs
Agent tools
Pre-built strategies
More!

Why LlamaIndex?

Build faster
Skip the boilerplate
Avoid early pitfalls
Get best practices for free
Go from prototype to production

Q: What does LlamaIndex actually do?

A: Agentic RAG

Why RAG is necessary

Ways to do RAG #1:

Semantic search

Ways to do RAG #2:

Text to SQL

Ways to do RAG #3:

Graph RAG

Basic RAG pipeline
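
As a rough illustration of the basic pipeline (retrieve relevant text, then put it in the prompt), here is a minimal sketch. It assumes a toy three-sentence corpus and uses bag-of-words counts as a stand-in for a real embedding model; the names `DOCUMENTS`, `embed`, `retrieve`, and `build_prompt` are hypothetical and not part of LlamaIndex.

```python
import math
import re
from collections import Counter

# Toy corpus; in a real pipeline these would be chunks of your own documents.
DOCUMENTS = [
    "LlamaIndex is a framework for building LLM applications over your data.",
    "Memgraph is an in-memory graph database queried with Cypher.",
    "Retrieval-augmented generation adds retrieved context to the prompt.",
]


def embed(text: str) -> Counter:
    """Stand-in for an embedding model: lowercase bag-of-words counts."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(query: str, k: int = 1) -> list[str]:
    """Return the k documents most similar to the query."""
    q = embed(query)
    ranked = sorted(DOCUMENTS, key=lambda d: cosine(q, embed(d)), reverse=True)
    return ranked[:k]


def build_prompt(query: str) -> str:
    """Assemble the prompt that would be sent to the LLM."""
    context = "\n".join(retrieve(query))
    return f"Context:\n{context}\n\nQuestion: {query}\nAnswer:"


prompt = build_prompt("What is Memgraph?")
print(prompt)
```

The final generation step is left out on purpose: any LLM client could take `prompt` as its input.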

Constructing a graph
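
One way to picture graph construction: text is reduced to (subject, relation, object) triples, which are then stored as nodes and edges. The sketch below assumes the triples have already been extracted (in practice an LLM or parser would do that) and keeps them in a plain dictionary rather than a graph database; all names and data are made up for illustration.

```python
from collections import defaultdict

# Hypothetical triples; an LLM or rule-based extractor would normally produce these.
TRIPLES = [
    ("Alice", "WORKS_AT", "Acme"),
    ("Alice", "KNOWS", "Bob"),
    ("Bob", "WORKS_AT", "Globex"),
]


def build_graph(triples):
    """Group outgoing (relation, object) edges under each subject node."""
    graph = defaultdict(list)
    for subject, relation, obj in triples:
        graph[subject].append((relation, obj))
    return graph


def describe(graph, entity):
    """Render an entity's outgoing edges as short fact strings."""
    return [f"{entity} {rel} {obj}" for rel, obj in graph.get(entity, [])]


graph = build_graph(TRIPLES)
facts = describe(graph, "Alice")
print(facts)
```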

Forms of graph RAG retrieval #1:

Vector retrieval

Forms of graph RAG retrieval #2:

Text to Cypher

Forms of graph RAG retrieval #3:

Synonym retrieval
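
Of the three forms, synonym retrieval is perhaps the least self-explanatory: the query is expanded with related keywords, and any graph node matching one of them is returned. A minimal sketch, assuming a hand-written synonym table in place of the LLM that would normally generate the keywords, and a toy dictionary in place of a graph store (`GRAPH`, `SYNONYMS`, and both functions are hypothetical):

```python
# Toy graph: node name -> list of (relation, object) edges.
GRAPH = {
    "car": [("MADE_BY", "Acme Motors")],
    "doctor": [("WORKS_AT", "City Hospital")],
}

# Assumption: a fixed table stands in for LLM-generated synonyms.
SYNONYMS = {
    "automobile": ["car", "vehicle"],
    "physician": ["doctor"],
}


def expand(query: str) -> set[str]:
    """Return the query's words plus any known synonyms."""
    words = set(query.lower().replace("?", "").split())
    expanded = set(words)
    for word in words:
        expanded.update(SYNONYMS.get(word, []))
    return expanded


def synonym_retrieve(query: str) -> list[tuple[str, str, str]]:
    """Collect edges from every graph node that matches an expanded keyword."""
    hits = []
    for term in sorted(expand(query)):
        for relation, obj in GRAPH.get(term, []):
            hits.append((term, relation, obj))
    return hits


results = synonym_retrieve("Who made the automobile?")
print(results)
```

The query never mentions "car", yet the "car" node is found because "automobile" expands to it.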

Limitations of RAG

Naive RAG failure points #1:

Summarization

Naive RAG failure points #2:

Comparison

Naive RAG failure points #3:

Multi-part questions

RAG is necessary but not sufficient

Two ways to improve RAG:

Improve your data
Improve your querying

What is an agent anyway?

Semi-autonomous software
Accepts a goal
Uses tools to achieve that goal
Exact steps to resolution not specified
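
The four points above can be sketched as a loop: the agent receives a goal, repeatedly picks a tool, and stops when it decides it is done. In this sketch `choose_action` is a hypothetical, hard-coded stand-in for the LLM that would normally make that decision, and the tools are trivial arithmetic functions.

```python
from typing import Callable


def add(a: float, b: float) -> float:
    return a + b


def multiply(a: float, b: float) -> float:
    return a * b


# The tools the agent is allowed to call.
TOOLS: dict[str, Callable[..., float]] = {"add": add, "multiply": multiply}


def choose_action(goal: str, history: list) -> dict:
    """Hypothetical stand-in for an LLM choosing the next step toward the goal."""
    if not history:
        return {"tool": "add", "args": (2, 3)}
    if len(history) == 1:
        return {"tool": "multiply", "args": (history[-1], 4)}
    return {"final": history[-1]}


def run_agent(goal: str, max_steps: int = 5):
    """Loop until the decision function returns a final answer."""
    history = []
    for _ in range(max_steps):
        action = choose_action(goal, history)
        if "final" in action:
            return action["final"]
        result = TOOLS[action["tool"]](*action["args"])
        history.append(result)
    raise RuntimeError("Agent did not finish within max_steps")


answer = run_agent("Compute (2 + 3) * 4")
print(answer)
```

Note that `run_agent` itself never spells out the steps; the sequence comes from the decision function, which is the part an LLM would replace.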

RAG pipeline

⚠️ Single-shot
⚠️ No query understanding/planning
⚠️ No tool use
⚠️ No reflection, error correction
⚠️ No memory (stateless)

Agentic RAG

✅ Multi-turn
✅ Query / task planning layer
✅ Tool interface for external environment
✅ Reflection
✅ Memory for personalization

From simple to advanced agents

Routing
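
Routing means choosing which query engine should handle a request. A minimal sketch, assuming keyword rules in place of the LLM that would normally make the choice; the route names and trigger phrases are invented for illustration.

```python
def route(query: str) -> str:
    """Pick a backend for the query. Keyword rules stand in for an LLM selector."""
    q = query.lower()
    if any(phrase in q for phrase in ("how many", "average", "total")):
        return "sql"
    if any(phrase in q for phrase in ("related to", "connected", "relationship")):
        return "graph"
    return "vector"


for question in (
    "How many orders shipped last month?",
    "How is Alice connected to Bob?",
    "Summarize the onboarding guide",
):
    print(question, "->", route(question))
```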

Conversation memory

Query planning
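
Query planning addresses the comparison and multi-part failure points: a complex question is split into simpler sub-questions that are answered separately and then combined. The sketch below assumes a regular expression in place of an LLM planner and a lookup table in place of a RAG query engine; `plan`, `KNOWLEDGE`, and the figures in it are hypothetical.

```python
import re

# Toy answers; each lookup stands in for one ordinary RAG query.
KNOWLEDGE = {
    "What is the revenue of Acme?": 120,
    "What is the revenue of Globex?": 95,
}


def plan(question: str) -> list[str]:
    """Split 'Compare the <metric> of <A> and <B>' into two sub-questions."""
    match = re.match(r"Compare the (.+) of (\w+) and (\w+)\??$", question)
    if not match:
        return [question]  # nothing to decompose
    metric, first, second = match.groups()
    return [
        f"What is the {metric} of {first}?",
        f"What is the {metric} of {second}?",
    ]


def answer_all(question: str) -> dict:
    """Answer each sub-question independently and collect the results."""
    return {sub: KNOWLEDGE[sub] for sub in plan(question)}


sub_answers = answer_all("Compare the revenue of Acme and Globex")
print(sub_answers)
```

A final synthesis step would hand `sub_answers` back to the LLM to write the actual comparison.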

Tool use

Tools unleash the power of LLMs

Combine agentic strategies and then go further

Routing
Memory
Planning
Tool use

Agentic strategies

Multi-turn
Reasoning
Reflection

Full agent

3 agent reasoning loops

Sequential
DAG-based
Tree-based

Sequential reasoning

DAG-based reasoning

Self reflection

Tree-based reasoning

Exploration vs exploitation

Workflows

bit.ly/li-workflows

Why workflows?

Workflows primer

from llama_index.core.workflow import StartEvent, StopEvent, Workflow, step
from llama_index.llms.openai import OpenAI

class OpenAIGenerator(Workflow):
    @step
    async def generate(self, ev: StartEvent) -> StopEvent:
        query = ev.get("query")
        llm = OpenAI()
        response = await llm.acomplete(query)
        return StopEvent(result=str(response))

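w = OpenAIGenerator(timeout=10, verbose=False)
# top-level await works in a notebook; in a script, call this from an async function run with asyncio.run()
result = await w.run(query="What's LlamaIndex?")
print(result)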

Looping

import random

from llama_index.core.workflow import Event, StartEvent, StopEvent, Workflow, step

# custom events carry data from one step to the next
class FailedEvent(Event):
    error: str

class QueryEvent(Event):
    query: str

class LoopExampleFlow(Workflow):

    @step
    async def answer_query(self, ev: StartEvent | QueryEvent) -> FailedEvent | StopEvent:
        query = ev.query
        # try to answer the query (simulated here with a coin flip)
        random_number = random.randint(0, 1)
        if random_number == 0:
            return FailedEvent(error="Failed to answer the query.")
        else:
            return StopEvent(result="The answer to your query")

    @step
    async def improve_query(self, ev: FailedEvent) -> QueryEvent | StopEvent:
        # improve the query or decide it can't be fixed
        random_number = random.randint(0, 1)
        if random_number == 0:
            # emitting a QueryEvent loops back to answer_query
            return QueryEvent(query="Here's a better query.")
        else:
            return StopEvent(result="Your query can't be fixed.")

l = LoopExampleFlow(timeout=10, verbose=True)
result = await l.run(query="What's LlamaIndex?")
print(result)
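
To make the looping idea concrete without any dependencies, here is a plain-Python sketch of event routing: each step handles one event type and returns the next event, and a small dispatcher keeps going until a stop event appears. This is only an illustration of the concept, not how LlamaIndex implements workflows; every name in it is hypothetical, and the coin flip is replaced by a deterministic "fail once, then succeed" rule.

```python
from dataclasses import dataclass


@dataclass
class Query:
    text: str
    attempt: int = 1


@dataclass
class Failed:
    text: str
    attempt: int


@dataclass
class Stop:
    result: str


def answer_query(ev: Query):
    """Fail on the first attempt, succeed on any later one."""
    if ev.attempt < 2:
        return Failed(ev.text, ev.attempt)
    return Stop(f"Answered '{ev.text}' on attempt {ev.attempt}")


def improve_query(ev: Failed):
    """Rewrite the query and send it back around the loop."""
    return Query(ev.text + " (rephrased)", ev.attempt + 1)


# Each event type is routed to the step that accepts it.
HANDLERS = {Query: answer_query, Failed: improve_query}


def run(event, max_events: int = 10) -> str:
    """Dispatch events to steps until a Stop event is produced."""
    for _ in range(max_events):
        if isinstance(event, Stop):
            return event.result
        event = HANDLERS[type(event)](event)
    raise RuntimeError("No Stop event within max_events")


outcome = run(Query("What's LlamaIndex?"))
print(outcome)
```

The `max_events` cap plays a role similar to the `timeout` argument above: a loop that never reaches a stop event is cut off rather than running forever.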

Visualization

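# requires the llama-index-utils-workflow package; the filename is an arbitrary choice
from llama_index.utils.workflow import draw_all_possible_flows

draw_all_possible_flows(LoopExampleFlow, filename="loop_example_flow.html")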

Keeping state

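from typing import Optional

from llama_index.core import SimpleDirectoryReader, VectorStoreIndex
from llama_index.core.workflow import Context, StartEvent, StopEvent, Workflow, step

class RAGWorkflow(Workflow):
    @step
    async def ingest(self, ctx: Context, ev: StartEvent) -> Optional[StopEvent]:
        # assumes `dataset` is the path of a directory of documents
        dataset_name = ev.dataset
        documents = SimpleDirectoryReader(dataset_name).load_data()
        # keep the index in the shared context so later steps can read it
        # (ctx.set is a coroutine in releases from around this talk; newer releases use ctx.store.set)
        await ctx.set("INDEX", VectorStoreIndex.from_documents(documents=documents))
        return StopEvent(result=f"Indexed {len(documents)} documents.")

    ...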

Customizability

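from typing import Optional, Union

# RetrieverEvent and QueryResult are assumed to be events defined alongside RAGWorkflow
class MyWorkflow(RAGWorkflow):
    @step
    async def rerank(
        self, ctx: Context, ev: Union[RetrieverEvent, StartEvent]
    ) -> Optional[QueryResult]:
        # my custom reranking logic here
        ...

w = MyWorkflow(timeout=60, verbose=True)
result = await w.run(query="Who is Paul Graham?")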

Workflows enable arbitrarily complex applications

Recap

What is LlamaIndex
What is RAG
How graph RAG works
Going beyond RAG to agents
Building agentic workflows

What's next?

bit.ly/llamaindex-discord

Thanks!

Follow me on Bluesky:

@seldo.com

Please don't add me on LinkedIn.

bit.ly/llamaindex-graph-rag