2024-07-13 AGI House Hackathon
Python: docs.llamaindex.ai
TypeScript: ts.llamaindex.ai
Free for 1000 pages/day!
1. Sign up:
RAG, step 1:
from llama_index.core import SimpleDirectoryReader
documents = SimpleDirectoryReader("data").load_data()
RAG, step 2:
(LlamaParse: it's really good. Really!)
# must have a LLAMA_CLOUD_API_KEY
# bring in deps
from llama_parse import LlamaParse
from llama_index.core import SimpleDirectoryReader
# set up parser
parser = LlamaParse(
    result_type="markdown"  # "text" also available
)
# use SimpleDirectoryReader to parse our file
file_extractor = {".pdf": parser}
documents = SimpleDirectoryReader(
    input_files=['data/canada.pdf'],
    file_extractor=file_extractor
).load_data()
print(documents)
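The parser above needs LLAMA_CLOUD_API_KEY to be set. A minimal sketch of failing early with a readable message if it is missing; `require_api_key` is a hypothetical helper written for this example, not part of llama_parse:

```python
import os


def require_api_key(name: str = "LLAMA_CLOUD_API_KEY") -> str:
    """Return the key from the environment, or raise a clear error.

    Hypothetical helper for illustration only.
    """
    key = os.environ.get(name, "").strip()
    if not key:
        raise RuntimeError(f"Set {name} before creating the parser")
    return key
```

Calling `require_api_key()` before building the parser turns a missing key into an immediate, explicit error.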
RAG, step 3:
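from llama_index.core import Settings
from llama_index.embeddings.huggingface import HuggingFaceEmbedding
Settings.embed_model = HuggingFaceEmbedding(model_name="BAAI/bge-small-en-v1.5")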
RAG, step 4:
from llama_index.core import VectorStoreIndex
index = VectorStoreIndex.from_documents(documents)
RAG, step 5:
retriever = index.as_retriever()
nodes = retriever.retrieve("Who is Paul Graham?")
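To show roughly what retrieval does, here is a toy sketch in plain Python: rank chunks by cosine similarity to the query vector and keep the top k. This is not LlamaIndex's implementation. The helper names, the three-dimensional vectors, and the chunk texts are all made up for illustration; real embeddings come from the embedding model.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def top_k(query_vec, chunks, k=2):
    """chunks is a list of (text, vector) pairs; return the k most similar texts."""
    ranked = sorted(chunks, key=lambda c: cosine(query_vec, c[1]), reverse=True)
    return [text for text, _ in ranked[:k]]


# made-up vectors standing in for real embeddings
toy_chunks = [
    ("Paul Graham co-founded Y Combinator.", [0.9, 0.1, 0.0]),
    ("The budget covers 2023.", [0.0, 0.2, 0.9]),
    ("He also writes essays.", [0.7, 0.3, 0.1]),
]

# pretend this is the embedding of "Who is Paul Graham?"
results = top_k([1.0, 0.0, 0.0], toy_chunks, k=2)
```

With these toy numbers, `results` holds the two Paul Graham chunks and drops the budget chunk.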
RAG, step 6:
query_engine = index.as_query_engine()
response = query_engine.query("Who is Paul Graham?")
A Slack bot
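# bring in deps
from llama_index.core.agent import ReActAgent
from llama_index.core.tools import FunctionTool
from llama_index.llms.openai import OpenAI
# a plain Python function the agent can call
def multiply(a: float, b: float) -> float:
    """Multiply two numbers and return the product"""
    return a * b
# wrap the function as a tool
multiply_tool = FunctionTool.from_defaults(fn=multiply)
llm = OpenAI(model="gpt-4o", temperature=0.4)
agent = ReActAgent.from_tools(
    [multiply_tool],
    llm=llm,
    verbose=True
)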
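A rough sketch of the idea behind function tools, using only the standard library: the function's name, signature, and docstring are what a model gets to see when deciding which tool to call, and a registry maps the chosen name back to the function. `describe_tool`, `TOOLS`, and `call_tool` are hypothetical names for this example; this is an assumption about the general pattern, not LlamaIndex's internals.

```python
import inspect


def multiply(a: float, b: float) -> float:
    """Multiply two numbers and return the product"""
    return a * b


def describe_tool(fn):
    """Build a plain-dict description of a function (hypothetical helper)."""
    return {
        "name": fn.__name__,
        "signature": str(inspect.signature(fn)),
        "description": inspect.getdoc(fn) or "",
    }


# hypothetical registry mapping tool names to callables
TOOLS = {"multiply": multiply}


def call_tool(name, **kwargs):
    """Dispatch a tool call by name with keyword arguments."""
    return TOOLS[name](**kwargs)


info = describe_tool(multiply)
result = call_tool("multiply", a=3.0, b=4.0)
```

This is why type hints and a clear docstring matter on tool functions: they are most of what the model has to go on.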
# bring in deps
from llama_index.core.tools import QueryEngineTool
# wrap the query engine from RAG step 6 as a tool
budget_tool = QueryEngineTool.from_defaults(
    query_engine,
    name="canadian_budget_2023",
    description="A RAG engine with some basic facts",
)
llm = OpenAI(model="gpt-4o", temperature=0.4)
agent = ReActAgent.from_tools(
    [budget_tool],
    llm=llm,
    verbose=True
)
All resources:
Follow me on Twitter:
@seldo
Please don't add me on LinkedIn.