GenAI agents in 2024 with LlamaIndex

2024-07-16 YouTube

What are we talking about?

  • What is LlamaIndex
  • Why you should use it
  • What can it do
    • Retrieval augmented generation
    • World class parsing
    • Agents and multi-agent systems

What is LlamaIndex?

Python: docs.llamaindex.ai

TypeScript: ts.llamaindex.ai

LlamaParse

part of cloud.llamaindex.ai

Free for 1000 pages/day!

LlamaCloud

1. Sign up

cloud.llamaindex.ai

 

2. Get on the waitlist

bit.ly/llamacloud-waitlist

LlamaHub

llamahub.ai

  • Data loaders
  • Embedding models
  • Vector stores
  • LLMs
  • Agent tools
  • Pre-built strategies
  • More!
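
For example, a LlamaHub data loader ships as its own package and drops straight into the same pipeline. A minimal sketch using the Wikipedia reader (one of hundreds of integrations):

# pip install llama-index-readers-wikipedia wikipedia
from llama_index.readers.wikipedia import WikipediaReader

# load one or more Wikipedia pages as Documents
documents = WikipediaReader().load_data(pages=["Large language model"])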

Why LlamaIndex?

  • Build faster
  • Skip the boilerplate
  • Avoid early pitfalls
  • Get into production
  • Deliver real value

What can LlamaIndex do for me?

RAG explanation:

bit.ly/li-rag-explained

Loading

RAG, step 1:

from llama_index.core import SimpleDirectoryReader

documents = SimpleDirectoryReader("data").load_data()

Parsing

RAG, step 2:

(LlamaParse: it's really good. Really!)

# must have a LLAMA_CLOUD_API_KEY
# bring in deps
from llama_parse import LlamaParse
from llama_index.core import SimpleDirectoryReader

# set up parser
parser = LlamaParse(
    result_type="markdown"  # "text" also available
)

# use SimpleDirectoryReader to parse our file
file_extractor = {".pdf": parser}
documents = SimpleDirectoryReader(
    input_files=["data/canada.pdf"],
    file_extractor=file_extractor
).load_data()
print(documents)

Embedding

RAG, step 3:

from llama_index.core import Settings
from llama_index.embeddings.huggingface import HuggingFaceEmbedding

Settings.embed_model = HuggingFaceEmbedding(model_name="BAAI/bge-small-en-v1.5")

Storing

RAG, step 4:

from llama_index.core import VectorStoreIndex

index = VectorStoreIndex.from_documents(documents)
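
The index lives in memory by default; persisting it to disk and reloading it later (so you don't re-embed on every run) is only a couple of extra lines. A quick sketch using the default local storage:

from llama_index.core import StorageContext, load_index_from_storage

# save the index to disk
index.storage_context.persist(persist_dir="./storage")

# later: reload it without re-parsing or re-embedding
storage_context = StorageContext.from_defaults(persist_dir="./storage")
index = load_index_from_storage(storage_context)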

Retrieving

RAG, step 5:

retriever = index.as_retriever()
nodes = retriever.retrieve("Who is Paul Graham?")

Querying

RAG, step 6:

query_engine = index.as_query_engine()
response = query_engine.query("Who is Paul Graham?")
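
The response isn't just a string; it also carries the source nodes that were retrieved, which is handy for showing citations:

print(response)
for node in response.source_nodes:
    print(node.score, node.node.get_content()[:100])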

Multi-modal

npx create-llama

LlamaBot

A Slack bot

Agents

Putting together an agent

from llama_index.core.agent import ReActAgent
from llama_index.core.tools import FunctionTool
from llama_index.llms.openai import OpenAI


def multiply(a: float, b: float) -> float:
    """Multiply two numbers and return the product."""
    return a * b


multiply_tool = FunctionTool.from_defaults(fn=multiply)

llm = OpenAI(model="gpt-4o", temperature=0.4)

agent = ReActAgent.from_tools(
    [multiply_tool],
    llm=llm,
    verbose=True,
)
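
With the tool wired up, the agent works like a chat engine (assuming an OPENAI_API_KEY is set; the prompt is just an illustration):

response = agent.chat("What is 2.5 multiplied by 4? Use a tool.")
print(response)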

RAG + Agents

from llama_index.core.tools import QueryEngineTool

budget_tool = QueryEngineTool.from_defaults(
    query_engine,
    name="canadian_budget_2023",
    description="A RAG engine with some basic facts about the 2023 Canadian federal budget",
)

llm = OpenAI(model="gpt-4o", temperature=0.4)

agent = ReActAgent.from_tools(
    [budget_tool],
    llm=llm,
    verbose=True,
)
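
Now the agent answers questions by calling the RAG engine as a tool (the question below is illustrative; it assumes query_engine was built over the budget document from the parsing step):

response = agent.chat("How much was allocated to housing in the 2023 Canadian budget?")
print(response)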

Multi-agent concierge

Orchestration and continuation

Deploying with llama-agents

Multi-agent monitor

Setting up

# create an agent
def get_the_secret_fact() -> str:
    """Returns the secret fact."""
    return "The secret fact is: A baby llama is called a 'Cria'."


tool = FunctionTool.from_defaults(fn=get_the_secret_fact)

worker1 = FunctionCallingAgentWorker.from_tools([tool], llm=OpenAI())
worker2 = FunctionCallingAgentWorker.from_tools([], llm=OpenAI())
agent1 = worker1.as_agent()
agent2 = worker2.as_agent()

Deploying agents

from llama_agents import AgentService, SimpleMessageQueue

message_queue = SimpleMessageQueue()
queue_client = message_queue.client

agent_server_1 = AgentService(
    agent=agent1,
    message_queue=queue_client,
    description="Useful for getting the secret fact.",
    service_name="secret_fact_agent",
    host="127.0.0.1",
    port=8002,
)
agent_server_2 = AgentService(
    agent=agent2,
    message_queue=queue_client,
    description="Useful for getting random facts.",
    service_name="random_fact_agent",
    host="127.0.0.1",
    port=8003,
)

Deploy control plane

from llama_agents import AgentOrchestrator, ControlPlaneServer, ServerLauncher

control_plane = ControlPlaneServer(
    message_queue=queue_client,
    orchestrator=AgentOrchestrator(llm=OpenAI()),
)

launcher = ServerLauncher(
    [agent_server_1, agent_server_2],
    control_plane,
    message_queue,
    additional_consumers=[],
)

launcher.launch_servers()
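
With the servers up, tasks go through the control plane. A minimal sketch using the llama-agents client:

from llama_agents import LlamaAgentsClient

client = LlamaAgentsClient("http://127.0.0.1:8000")
task_id = client.create_task("What is the secret fact?")
# ...give the agents a moment to finish...
result = client.get_task_result(task_id)
print(result)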

Monitor:

llama-agents monitor --control-plane-url http://127.0.0.1:8000

Thanks!

Follow me on Twitter:

@seldo

Please don't add me on LinkedIn.
