LlamaIndex

and Graph RAG

2024-11-26 Memgraph Webinar

What are we talking about?

  • What is LlamaIndex
  • What is RAG
  • How graph RAG works
  • Going beyond RAG to agents
  • Building agentic workflows

What is LlamaIndex?

Python: docs.llamaindex.ai

TypeScript: ts.llamaindex.ai

LlamaParse

cloud.llamaindex.ai

Free for 1000 pages/day!

LlamaCloud

1. Sign up

cloud.llamaindex.ai

2. Get on the waitlist!

bit.ly/llamacloud-waitlist

LlamaHub

llamahub.ai

  • Data loaders
  • Embedding models
  • Vector stores
  • LLMs
  • Agent tools
  • Pre-built strategies
  • More!

Why LlamaIndex?

  • Build faster
  • Skip the boilerplate
  • Avoid early pitfalls
  • Get best practices for free
  • Go from prototype to production

Q: What does LlamaIndex actually do?

A: Agentic RAG

Why RAG

is necessary

Ways to do RAG #1:

Semantic search

Ways to do RAG #2:

Text to SQL

Ways to do RAG #3:

Graph RAG

Basic RAG pipeline
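
A basic pipeline can be sketched as retrieve, then prompt. This is a minimal illustration, not LlamaIndex code: the bag-of-words `embed` function is a toy stand-in for a real embedding model, and the sample chunks and every function name here are assumptions made for the example.

```python
import math
from collections import Counter


def embed(text: str) -> Counter:
    """Toy stand-in for an embedding model: a bag-of-words count vector."""
    return Counter(text.lower().split())


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(query: str, chunks: list[str], top_k: int = 2) -> list[str]:
    """Rank chunks by similarity to the query and keep the best top_k."""
    q = embed(query)
    ranked = sorted(chunks, key=lambda c: cosine(q, embed(c)), reverse=True)
    return ranked[:top_k]


def build_prompt(query: str, context: list[str]) -> str:
    """Put the retrieved chunks in front of the question for the LLM."""
    joined = "\n".join(f"- {c}" for c in context)
    return f"Context:\n{joined}\n\nQuestion: {query}\nAnswer:"


chunks = [
    "LlamaIndex is a framework for building LLM applications over your data.",
    "Memgraph is an in-memory graph database.",
    "Cypher is a query language for graph databases.",
]
top = retrieve("what is llamaindex", chunks, top_k=1)
prompt = build_prompt("what is llamaindex", top)
print(prompt)
```

The prompt would then be sent to an LLM; that call is left out so the sketch runs without an API key.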

Constructing a graph
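
One way to picture graph construction is as a list of (subject, relation, object) triples indexed by entity. The sketch below assumes the triples have already been extracted (in practice an LLM or a parser would do that); the triples themselves and the helper names are hypothetical.

```python
from collections import defaultdict

# Hypothetical triples; a real system would extract these from documents.
triples = [
    ("LlamaIndex", "SUPPORTS", "Graph RAG"),
    ("Graph RAG", "USES", "Knowledge graph"),
    ("Memgraph", "STORES", "Knowledge graph"),
]


def build_graph(triples):
    """Index each triple under both its subject and its object."""
    graph = defaultdict(list)
    for subj, rel, obj in triples:
        graph[subj].append((subj, rel, obj))
        graph[obj].append((subj, rel, obj))
    return graph


def neighbors(graph, entity):
    """Return every triple that mentions the entity (a one-hop lookup)."""
    return graph.get(entity, [])


graph = build_graph(triples)
facts = neighbors(graph, "Knowledge graph")
for subj, rel, obj in facts:
    print(f"{subj} -[{rel}]-> {obj}")
```

A graph database such as Memgraph would replace the in-memory dictionary; the one-hop lookup is the piece that retrieval builds on.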

Forms of graph RAG retrieval #1:

Vector retrieval

Forms of graph RAG retrieval #2:

Text to Cypher

Forms of graph RAG retrieval #3:

Synonym retrieval

Limitations of RAG

Naive RAG failure points #1:

Summarization

Naive RAG failure points #2:

Comparison

Naive RAG failure points #3:

Multi-part questions

RAG is necessary

but not sufficient

Two ways

to improve RAG:

  1. Improve your data
  2. Improve your querying

What is an agent anyway?

  • Semi-autonomous software
  • Accepts a goal
  • Uses tools to achieve that goal
  • Exact steps to resolution not specified
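
The four bullets above can be sketched as a loop: look at the current state, pick a tool, apply it, stop when the goal is met. This is a toy, assuming a rule-based `choose_tool` in place of an LLM; the tools and the numeric goal are hypothetical and chosen only so the example runs deterministically.

```python
def double(x: int) -> int:
    return x * 2


def increment(x: int) -> int:
    return x + 1


# The tools the agent is allowed to use.
TOOLS = {"double": double, "increment": increment}


def choose_tool(state: int, goal: int) -> str:
    """Hypothetical stand-in for the LLM deciding which tool to call next."""
    return "double" if state * 2 <= goal else "increment"


def run_agent(goal: int, start: int = 1, max_steps: int = 20):
    """Work toward the goal; the sequence of tool calls is decided at runtime."""
    state, trace = start, []
    for _ in range(max_steps):
        if state == goal:  # goal reached, stop
            break
        tool = choose_tool(state, goal)
        state = TOOLS[tool](state)
        trace.append(tool)
    return state, trace


final_state, trace = run_agent(goal=11)
print(final_state, trace)
```

The caller supplies only the goal; the trace of tool calls is worked out step by step inside the loop.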

RAG pipeline

⚠️ Single-shot
⚠️ No query understanding/planning
⚠️ No tool use
⚠️ No reflection, error correction
⚠️ No memory (stateless)

Agentic RAG

✅ Multi-turn
✅ Query / task planning layer
✅ Tool interface for external environment
✅ Reflection
✅ Memory for personalization

From simple to advanced agents

Routing
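
Routing, at its simplest, means choosing one of several query engines per question. In the sketch below the keyword rule in `route` is a hypothetical stand-in for an LLM making that choice, and the two "engines" are placeholder functions.

```python
def route(query: str) -> str:
    """Hypothetical selector: keyword rules standing in for an LLM choice."""
    q = query.lower()
    if any(word in q for word in ("summarize", "summary", "overview")):
        return "summary"
    return "vector"


# Placeholder engines; real ones would query an index.
ENGINES = {
    "summary": lambda q: f"[summary engine] {q}",
    "vector": lambda q: f"[vector engine] {q}",
}


def answer(query: str) -> str:
    """Send the query to whichever engine the router picked."""
    return ENGINES[route(query)](query)


print(answer("Summarize the report"))
print(answer("Who wrote the report?"))
```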

Conversation memory

Query planning
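
Query planning can be pictured as splitting a multi-part question into sub-questions, answering each, and combining the results. The planner below is a deliberately naive assumption (it splits on " and "); a real planner would ask an LLM, and `answer_sub_question` is a placeholder for a RAG query.

```python
def plan(question: str) -> list[str]:
    """Hypothetical planner: split on ' and ' to get sub-questions."""
    parts = [p.strip(" ?") for p in question.split(" and ")]
    return [p + "?" for p in parts if p]


def answer_sub_question(sub_question: str) -> str:
    """Placeholder for a call to a RAG query engine."""
    return f"answer({sub_question})"


def answer_with_plan(question: str) -> str:
    """Answer each sub-question separately, then join the answers."""
    return " | ".join(answer_sub_question(q) for q in plan(question))


steps = plan("What is RAG and what is an agent?")
print(steps)
print(answer_with_plan("What is RAG and what is an agent?"))
```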

Tool use

Tools unleash the power of LLMs

Combine agentic strategies

and then go further

  • Routing
  • Memory
  • Planning
  • Tool use

Agentic strategies

  • Multi-turn
  • Reasoning
  • Reflection

Full agent

3 agent

reasoning loops

  1. Sequential
  2. DAG-based
  3. Tree-based

Sequential reasoning

DAG-based reasoning
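
In a DAG-based plan, each task runs only after the tasks it depends on. A minimal sketch using the standard library's `graphlib`; the task names are hypothetical and each "result" is just a string showing what fed into it.

```python
from graphlib import TopologicalSorter

# Hypothetical plan: each task maps to the set of tasks it depends on.
task_graph = {
    "revenue_2023": set(),
    "revenue_2024": set(),
    "compare": {"revenue_2023", "revenue_2024"},
    "report": {"compare"},
}


def run_plan(task_graph):
    """Execute tasks in dependency order, passing earlier results forward."""
    results = {}
    order = list(TopologicalSorter(task_graph).static_order())
    for task in order:
        inputs = [results[dep] for dep in sorted(task_graph[task])]
        results[task] = f"{task}({', '.join(inputs)})"
    return order, results


order, results = run_plan(task_graph)
print(order)
print(results["report"])
```

The two revenue lookups have no dependency on each other, so a real executor could run them in parallel.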

Self reflection

Tree-based reasoning

Exploration vs exploitation
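
One common way to balance exploration and exploitation in tree search is the UCB1 score. It is shown here as an illustration, assuming each branch tracks a total reward and a visit count; the slide does not say which rule a given agent uses, and the sample numbers are made up.

```python
import math


def ucb1(total_reward: float, visits: int, parent_visits: int,
         c: float = math.sqrt(2)) -> float:
    """Average reward (exploitation) plus a bonus for rarely visited branches."""
    if visits == 0:
        return math.inf  # always try an unvisited branch first
    exploit = total_reward / visits
    explore = c * math.sqrt(math.log(parent_visits) / visits)
    return exploit + explore


def select_child(children: dict, parent_visits: int) -> str:
    """Pick the child with the highest UCB1 score."""
    return max(children, key=lambda name: ucb1(*children[name], parent_visits))


# name -> (total_reward, visits); hypothetical statistics
children = {"a": (9.0, 10), "b": (0.5, 1)}
print(select_child(children, parent_visits=11))
```

With `c=0` the score is pure exploitation and branch "a" wins on average reward; with the default `c`, the rarely visited "b" gets picked instead.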

Workflows

Why workflows?

Workflows primer

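from llama_index.core.workflow import StartEvent, StopEvent, Workflow, step
from llama_index.llms.openai import OpenAI

class OpenAIGenerator(Workflow):
    @step
    async def generate(self, ev: StartEvent) -> StopEvent:
        # Keyword arguments passed to run() arrive on the StartEvent
        query = ev.get("query")
        llm = OpenAI()
        response = await llm.acomplete(query)
        return StopEvent(result=str(response))

# Top-level await works in a notebook; in a script, wrap this in asyncio.run()
w = OpenAIGenerator(timeout=10, verbose=False)
result = await w.run(query="What's LlamaIndex?")
print(result)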

Looping

import random

from llama_index.core.workflow import Event, StartEvent, StopEvent, Workflow, step

# Custom events carry data between steps
class FailedEvent(Event):
    error: str

class QueryEvent(Event):
    query: str

class LoopExampleFlow(Workflow):

    @step
    async def answer_query(self, ev: StartEvent | QueryEvent) -> FailedEvent | StopEvent:
        query = ev.query
        # try to answer the query (a coin flip stands in for real logic)
        random_number = random.randint(0, 1)
        if random_number == 0:
            return FailedEvent(error="Failed to answer the query.")
        else:
            return StopEvent(result="The answer to your query")

    @step
    async def improve_query(self, ev: FailedEvent) -> QueryEvent | StopEvent:
        # improve the query or decide it can't be fixed
        random_number = random.randint(0, 1)
        if random_number == 0:
            # emitting a QueryEvent sends control back to answer_query
            return QueryEvent(query="Here's a better query.")
        else:
            return StopEvent(result="Your query can't be fixed.")

l = LoopExampleFlow(timeout=10, verbose=True)
result = await l.run(query="What's LlamaIndex?")
print(result)
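
Why does returning an event create a loop? Each step declares which event types it accepts, and whatever event a step returns is handed to the step that accepts it. The toy dispatcher below illustrates that idea in plain Python. It is not how LlamaIndex implements workflows, and every name in it is hypothetical; the first attempt is made to fail on purpose so the loop is visible.

```python
from dataclasses import dataclass


@dataclass
class Query:
    text: str
    attempt: int = 1


@dataclass
class Failed:
    text: str
    attempt: int


@dataclass
class Stop:
    result: str


def answer_query(ev: Query):
    # Pretend the first attempt always fails.
    if ev.attempt < 2:
        return Failed(text=ev.text, attempt=ev.attempt)
    return Stop(result=f"answered '{ev.text}' on attempt {ev.attempt}")


def improve_query(ev: Failed):
    # Emitting a Query sends control back to answer_query.
    return Query(text=ev.text + " (rewritten)", attempt=ev.attempt + 1)


# Each event type maps to the step that accepts it.
HANDLERS = {Query: answer_query, Failed: improve_query}


def run(event, max_events: int = 10) -> str:
    """Keep dispatching events until a Stop event appears."""
    for _ in range(max_events):
        if isinstance(event, Stop):
            return event.result
        event = HANDLERS[type(event)](event)
    raise TimeoutError("workflow did not finish")


result = run(Query(text="What's LlamaIndex?"))
print(result)
```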

Visualization

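from llama_index.utils.workflow import draw_all_possible_flows
draw_all_possible_flows(LoopExampleFlow, filename="loop_example_flow.html")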

Keeping state

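from typing import Optional

from llama_index.core import SimpleDirectoryReader, VectorStoreIndex
from llama_index.core.workflow import Context, StartEvent, StopEvent, Workflow, step

class RAGWorkflow(Workflow):
    @step
    async def ingest(self, ctx: Context, ev: StartEvent) -> Optional[StopEvent]:
        dataset_name = ev.dataset
        documents = SimpleDirectoryReader("data").load_data()
        # ctx.set is a coroutine, so it must be awaited; later steps read "INDEX" from the context
        await ctx.set("INDEX", VectorStoreIndex.from_documents(documents=documents))
        return StopEvent(result=f"Indexed {len(documents)} documents.")

    ...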

Customizability

class MyWorkflow(RAGWorkflow):
    @step
    def rerank(
        self, ctx: Context, ev: Union[RetrieverEvent, StartEvent]
    ) -> Optional[QueryResult]:
        # my custom reranking logic here
        ...

w = MyWorkflow(timeout=60, verbose=True)
result = await w.run(query="Who is Paul Graham?")

Workflows enable arbitrarily complex applications

Recap

  • What is LlamaIndex
  • What is RAG
  • How graph RAG works
  • Going beyond RAG to agents
  • Building agentic workflows

What's next?

Thanks!

Follow me on Bluesky:

@seldo.com

Please don't add me on LinkedIn.