Advanced RAG Techniques with LlamaIndex
![](https://s3.amazonaws.com/media-p.slid.es/uploads/136956/images/10976327/LlamaLogoSmall.png)
2024-04-16 Unstructured Data Meetup
What are we talking about?
- What is RAG?
- What is LlamaIndex?
- The stages of RAG
- Ingestion
- Indexing
- Storing
- Querying
- Advanced querying strategies x7
- Getting into production
RAG recap
- Retrieve most relevant data
- Augment query with context
- Generate response
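The three steps above can be sketched in plain Python. This is a toy illustration only, not how LlamaIndex implements them: the word-overlap scoring, the `fake_llm` stand-in, the sample documents, and every function name here are assumptions made for the example.

```python
import re

def tokenize(text):
    """Lowercase the text and return its set of word tokens."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))

def retrieve(query, documents, top_k=2):
    """Retrieve: rank documents by how many words they share with the query."""
    query_words = tokenize(query)
    ranked = sorted(
        documents,
        key=lambda doc: len(query_words & tokenize(doc)),
        reverse=True,
    )
    return ranked[:top_k]

def augment(query, context_docs):
    """Augment: put the retrieved context in front of the question."""
    context = "\n".join(f"- {doc}" for doc in context_docs)
    return f"Context:\n{context}\n\nQuestion: {query}\nAnswer using only the context."

def fake_llm(prompt):
    """Generate: hypothetical stand-in for a real LLM call."""
    return f"[an LLM would answer here, given a {len(prompt)}-character prompt]"

# Made-up sample documents for the sketch
documents = [
    "LlamaIndex has open source libraries in Python and TypeScript.",
    "LlamaParse is a PDF parsing service.",
    "Bananas are rich in potassium.",
]

query = "Which languages does LlamaIndex support?"
context_docs = retrieve(query, documents, top_k=1)
prompt = augment(query, context_docs)
response = fake_llm(prompt)
print(response)
```

Only the retrieved document reaches the prompt, which is the point: the model sees a small, relevant slice of the data instead of all of it.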
A solution to limited context windows
You have to be selective
and that's tricky
RAG challenges:
- Accuracy
- Faithfulness
- Recency
- Provenance
How do we do RAG?
1. Keyword search
2. Structured queries
3. Vector search
Vector embeddings
Turning words into numbers
![](https://s3.amazonaws.com/media-p.slid.es/uploads/136956/images/11034530/rag-1.png)
Search by meaning
![](https://s3.amazonaws.com/media-p.slid.es/uploads/136956/images/11034532/rag-2.png)
![](https://s3.amazonaws.com/media-p.slid.es/uploads/136956/images/11034534/rag-3.png)
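A minimal sketch of searching by meaning, assuming hand-made 3-dimensional vectors in place of real embeddings. The words, the numbers, and the function names are all invented for illustration; a real embedding model produces vectors with hundreds or thousands of dimensions.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two vectors: near 1.0 means similar direction."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

# Toy "embeddings" (assumed values, not output of any real model)
embeddings = {
    "dog": [0.9, 0.1, 0.0],
    "puppy": [0.8, 0.3, 0.1],
    "spreadsheet": [0.0, 0.1, 0.9],
}

def rank_by_similarity(query_vector, embeddings):
    """Return the stored words ordered from most to least similar to the query."""
    return sorted(
        embeddings,
        key=lambda word: cosine_similarity(query_vector, embeddings[word]),
        reverse=True,
    )

# Pretend this is the embedding of the query "canine"
query_vector = [0.85, 0.2, 0.05]
ranked = rank_by_similarity(query_vector, embeddings)
print(ranked)
```

The query shares no letters with "dog" or "puppy", yet both rank above "spreadsheet" because their vectors point in a similar direction.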
What is LlamaIndex?
- OSS libraries in Python and TypeScript
- LlamaParse - PDF parsing as a service
- LlamaCloud - managed ingestion service
![](https://s3.amazonaws.com/media-p.slid.es/uploads/136956/images/11159873/Screenshot_2024-02-29_at_2.51.56_PM.png)
Supported LLMs
5 line starter
from llama_index.core import VectorStoreIndex, SimpleDirectoryReader
documents = SimpleDirectoryReader("./data").load_data()
index = VectorStoreIndex.from_documents(documents)
query_engine = index.as_query_engine()
response = query_engine.query("What's up?")
print(response)
LlamaHub
![](https://s3.amazonaws.com/media-p.slid.es/uploads/136956/images/11156431/Screenshot_2024-02-28_at_11.46.55_AM.png)
LlamaParse
part of LlamaCloud
from llama_parse import LlamaParse
from llama_index.core import SimpleDirectoryReader

# Ask LlamaParse to return parsed documents as markdown
parser = LlamaParse(
    result_type="markdown"
)

# Route every .pdf file in ./data through LlamaParse
file_extractor = {".pdf": parser}
reader = SimpleDirectoryReader(
    "./data",
    file_extractor=file_extractor
)
documents = reader.load_data()
Supported embedding models
- OpenAI
- Langchain
- CohereAI
- Qdrant FastEmbed
- Gradient
- Azure OpenAI
- Elasticsearch
- Clarifai
- LLMRails
- Google PaLM
- Jina
- Voyage
...plus everything on Hugging Face!
Supported Vector databases
- Apache Cassandra
- Astra DB
- Azure AI Search
- Azure CosmosDB
- BaiduVector DB
- ChatGPT Retrieval Plugin
- Chroma
- DashVector
- Databricks
- Deeplake
- DocArray
- DuckDB
- DynamoDB
- Elasticsearch
- FAISS
- Jaguar
- LanceDB
- Lantern
- Metal
- MongoDB Atlas
- MyScale
- Milvus / Zilliz
- Neo4jVector
- OpenSearch
- Pinecone
- Postgres
- pgvecto.rs
- Qdrant
- Redis
- Rockset
- Simple
- SingleStore
- Supabase
- Tair
- TiDB
- TencentVectorDB
- Timescale
- Typesense
- Upstash
- Weaviate
Advanced query strategies
SubQuestionQueryEngine
![](https://s3.amazonaws.com/media-p.slid.es/uploads/136956/images/11159876/Screenshot_2024-02-29_at_2.59.03_PM.png)
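The idea behind a sub-question engine can be sketched without any library: break a complex question into simpler ones, send each to the data source that can answer it, then combine the answers. The sketch below is plain Python, not the `SubQuestionQueryEngine` API. The `decompose` function is a hard-coded, hypothetical stand-in for the LLM step, and the sources and figures are made up.

```python
def make_lookup_tool(facts):
    """Build a toy query function over a dict of keyword -> answer."""
    def query(question):
        for keyword, answer in facts.items():
            if keyword in question.lower():
                return answer
        return "unknown"
    return query

# One "tool" per data source (made-up companies and figures)
tools = {
    "company_a": make_lookup_tool({"revenue": "company_a revenue: 100"}),
    "company_b": make_lookup_tool({"revenue": "company_b revenue: 40"}),
}

def decompose(question, tool_names):
    """Hypothetical stand-in for the LLM: emit one sub-question per source."""
    return [(name, f"What is the revenue of {name}?") for name in tool_names]

def sub_question_query(question, tools):
    """Answer each sub-question with its own tool and collect the results."""
    sub_answers = []
    for tool_name, sub_question in decompose(question, list(tools)):
        answer = tools[tool_name](sub_question)
        sub_answers.append((sub_question, answer))
    # A real engine would hand these pairs to an LLM to write the final answer.
    return sub_answers

results = sub_question_query(
    "Compare the revenue of company_a and company_b", tools
)
for sub_question, answer in results:
    print(sub_question, "->", answer)
```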
Problems with precision
Small-to-big retrieval
![](https://s3.amazonaws.com/media-p.slid.es/uploads/136956/images/11159887/Screenshot_2024-02-29_at_3.06.42_PM.png)
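A rough sketch of small-to-big retrieval, assuming sentences as the small chunks and whole paragraphs as the big ones: match the query against small chunks for precision, then hand the parent chunk to the LLM for context. Word overlap stands in for vector similarity here, and the data and names are invented for the example.

```python
import re

def tokenize(text):
    """Lowercase the text and return its set of word tokens."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))

# "Big" chunks: whole paragraphs, keyed by an id (made-up content)
parents = {
    "p1": "LlamaParse parses PDFs. It returns markdown. It is part of LlamaCloud.",
    "p2": "Vector search finds text by meaning. Embeddings turn words into numbers.",
}

# "Small" chunks: single sentences, each remembering which parent it came from
small_chunks = []
for parent_id, text in parents.items():
    for sentence in re.split(r"(?<=\.)\s+", text):
        small_chunks.append({"text": sentence, "parent_id": parent_id})

def small_to_big(query, small_chunks, parents):
    """Match against small chunks, but return the matching chunk's parent."""
    query_words = tokenize(query)
    best = max(small_chunks, key=lambda c: len(query_words & tokenize(c["text"])))
    return best["text"], parents[best["parent_id"]]

matched, context = small_to_big(
    "What turns words into numbers?", small_chunks, parents
)
print("matched:", matched)
print("context:", context)
```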
Precision through preprocessing
Metadata support
- Apache Cassandra
- Astra DB
- Azure AI Search
- BaiduVector DB
- Chroma
- DashVector
- Databricks
- Deeplake
- DocArray
- DuckDB
- Elasticsearch
- Qdrant
- Redis
- Simple
- SingleStore
- Supabase
- Tair
- TiDB
- TencentVectorDB
- Timescale
- Typesense
- Weaviate
- Jaguar
- LanceDB
- Lantern
- Metal
- MongoDB Atlas
- MyScale
- Milvus / Zilliz
- OpenSearch
- Pinecone
- Postgres
- pgvecto.rs
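Metadata filtering can be pictured as narrowing the candidate set before any similarity ranking happens. The sketch below uses plain dictionaries; the node layout, the metadata keys, and the `filter_by_metadata` helper are assumptions for illustration, not the filter API of LlamaIndex or of any vector store listed above.

```python
# Made-up nodes: identical-looking text, distinguishable only by metadata
nodes = [
    {"text": "Revenue grew in Q4.", "metadata": {"year": 2021, "source": "annual_report"}},
    {"text": "Revenue grew in Q4.", "metadata": {"year": 2023, "source": "annual_report"}},
    {"text": "Revenue grew in Q4.", "metadata": {"year": 2023, "source": "press_release"}},
]

def filter_by_metadata(nodes, **required):
    """Keep only nodes whose metadata matches every required key/value pair."""
    return [
        node
        for node in nodes
        if all(node["metadata"].get(key) == value for key, value in required.items())
    ]

# Only these candidates would go on to vector similarity ranking
candidates = filter_by_metadata(nodes, year=2023, source="annual_report")
print(candidates)
```

Vector similarity alone cannot tell these three nodes apart, since their text is the same. The metadata attached at ingestion time is what makes the precise match possible.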
Hybrid Search
Hybrid search support
- Azure AI Search
- BaiduVector DB
- DashVector
- Elasticsearch
- Jaguar
- Lantern
- MyScale
- OpenSearch
- Pinecone
- Postgres
- pgvecto.rs
- Qdrant
- TencentVectorDB
- Weaviate
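Hybrid search runs a keyword search and a vector search, then merges the two result lists. One common way to merge is reciprocal rank fusion, sketched below. This is an assumption chosen for illustration: each store in the list above has its own fusion method, and the document ids and rankings here are made up.

```python
def reciprocal_rank_fusion(rankings, k=60):
    """Fuse ranked lists: each document scores 1 / (k + rank) per list it appears in."""
    scores = {}
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    # Highest combined score first
    return sorted(scores, key=scores.get, reverse=True)

# Pretend results from the two searches, best match first
keyword_ranking = ["doc_b", "doc_a", "doc_c"]
vector_ranking = ["doc_a", "doc_d", "doc_b"]

fused = reciprocal_rank_fusion([keyword_ranking, vector_ranking])
print(fused)
```

Documents that both searches rank well (`doc_a`, `doc_b`) rise to the top, while documents found by only one search stay in the list further down.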
Text to SQL
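A minimal sketch of the text-to-SQL flow using Python's built-in `sqlite3`: give the model the schema and the question, get SQL back, run it. The `fake_text_to_sql` function is a hypothetical stand-in that returns a fixed query where a real system would call an LLM, and the table and rows are made up.

```python
import sqlite3

def fake_text_to_sql(question, schema):
    """Hypothetical stand-in for an LLM that writes SQL from a question and schema."""
    # A real system would prompt an LLM with both arguments; this returns a fixed query.
    return "SELECT name FROM cities ORDER BY population DESC LIMIT 1"

# In-memory database with made-up rows
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE cities (name TEXT, population INTEGER)")
conn.executemany(
    "INSERT INTO cities VALUES (?, ?)",
    [("Alpha", 100), ("Beta", 250), ("Gamma", 50)],
)

schema = "cities(name TEXT, population INTEGER)"
sql = fake_text_to_sql("Which city has the largest population?", schema)
rows = conn.execute(sql).fetchall()
print(rows)
```

In a real deployment, generated SQL should be treated as untrusted input, for example by running it over a read-only connection.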
Multi-document agents
SECinsights.ai
![](https://s3.amazonaws.com/media-p.slid.es/uploads/136956/images/10991915/Screenshot_2023-12-11_at_9.32.45_PM.png)
Composability
"2024 is the year of LlamaIndex in production"
– Shawn "swyx" Wang, Latent.Space podcast
npx create-llama
What next?
![](https://s3.amazonaws.com/media-p.slid.es/uploads/136956/images/10976327/LlamaLogoSmall.png)
Follow me on Twitter: @seldo
Advanced RAG techniques lightning talk
By seldo