Enterprise Retrieval-Augmented Generation with LlamaIndex
![](https://s3.amazonaws.com/media-p.slid.es/uploads/136956/images/10976327/LlamaLogoSmall.png)
2024-02-08 Streamlit at GitHub
What are we talking about?
- RAG recap
- Enterprise challenges of RAG
- How do we do RAG?
- The stages of RAG
- How LlamaIndex helps
- Looking forward
RAG recap
- Retrieve most relevant data
- Augment query with context
- Generate response
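The three steps above can be sketched in plain Python. This is a toy illustration, not LlamaIndex code: the corpus, the word-overlap scoring, and `fake_llm` are all made-up stand-ins so the example runs without a model or API key.

```python
import re
from typing import Callable, List

# Hypothetical toy corpus; in practice this would be your own documents.
DOCUMENTS = [
    "LlamaIndex is a framework for connecting data to LLMs.",
    "Streamlit is a library for building data apps in Python.",
    "Vector search finds text by meaning rather than exact words.",
]


def tokenize(text: str) -> set:
    """Lowercase the text and return its set of word tokens."""
    return set(re.findall(r"\w+", text.lower()))


def retrieve(query: str, documents: List[str], top_k: int = 1) -> List[str]:
    """Retrieve: rank documents by shared words (a stand-in for real retrieval)."""
    query_words = tokenize(query)
    scored = [(len(query_words & tokenize(doc)), doc) for doc in documents]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [doc for _, doc in scored[:top_k]]


def augment(query: str, context: List[str]) -> str:
    """Augment: put the retrieved context in front of the question."""
    context_block = "\n".join(f"- {chunk}" for chunk in context)
    return f"Context:\n{context_block}\n\nQuestion: {query}\nAnswer:"


def generate(prompt: str, llm: Callable[[str], str]) -> str:
    """Generate: hand the augmented prompt to a language model."""
    return llm(prompt)


def fake_llm(prompt: str) -> str:
    """Placeholder model: echoes the first context line instead of calling an LLM."""
    for line in prompt.splitlines():
        if line.startswith("- "):
            return line[2:]
    return "I don't know."


question = "What is LlamaIndex?"
context = retrieve(question, DOCUMENTS)
prompt = augment(question, context)
answer = generate(prompt, fake_llm)
print(answer)
```

Swapping `fake_llm` for a real model call and `retrieve` for a real index is where a framework comes in.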
A solution to limited context windows
You have to be selective
and that's tricky
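A minimal sketch of what "being selective" can look like: greedily keep the best-scoring chunks that still fit in a fixed budget. The scores, the chunk texts, and the whitespace word count used as a token estimate are all assumptions for illustration; real tokenizers count differently.

```python
from typing import List, Tuple


def select_within_budget(
    scored_chunks: List[Tuple[float, str]], max_tokens: int
) -> List[str]:
    """Greedily keep the highest-scoring chunks that fit in the budget."""
    selected = []
    used = 0
    for _, text in sorted(scored_chunks, key=lambda pair: pair[0], reverse=True):
        cost = len(text.split())  # crude token estimate: whitespace-separated words
        if used + cost <= max_tokens:
            selected.append(text)
            used += cost
    return selected


# Hypothetical relevance scores paired with chunk text.
chunks = [
    (0.9, "alpha beta gamma delta"),
    (0.8, "one two three four five six"),
    (0.5, "x y"),
]
picked = select_within_budget(chunks, max_tokens=6)
print(picked)
```

Note that the second-best chunk is skipped because it does not fit, which is one reason selection is tricky: the best set is not always the top few by score.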
Enterprise challenges of RAG:
- Accuracy
- Faithfulness
- Recency
- Provenance
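Recency and provenance both come down to keeping metadata next to each chunk of text. The sketch below is one possible shape, not a LlamaIndex API: the `Chunk` class, the file names, and the dates are hypothetical.

```python
from dataclasses import dataclass
from datetime import date
from typing import List


@dataclass
class Chunk:
    text: str
    source: str    # where the text came from (provenance)
    updated: date  # when the source was last updated (recency)


def drop_stale(chunks: List[Chunk], cutoff: date) -> List[Chunk]:
    """Recency: keep only chunks updated on or after the cutoff date."""
    return [chunk for chunk in chunks if chunk.updated >= cutoff]


def cite(answer: str, chunks: List[Chunk]) -> str:
    """Provenance: append the sources that the answer was built from."""
    sources = sorted({chunk.source for chunk in chunks})
    return f"{answer} [sources: {', '.join(sources)}]"


# Hypothetical documents: an old and a new version of the same policy.
chunks = [
    Chunk("Remote work is not allowed.", "handbook-2021.pdf", date(2021, 3, 1)),
    Chunk("Remote work is allowed.", "handbook-2024.pdf", date(2024, 1, 15)),
]
fresh = drop_stale(chunks, cutoff=date(2023, 1, 1))
cited = cite("Remote work is allowed.", fresh)
print(cited)
```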
How do we do RAG?
1. Keyword search
2. Structured queries
3. Vector search
Vector embeddings
Turning words into numbers
![](https://s3.amazonaws.com/media-p.slid.es/uploads/136956/images/11034530/rag-1.png)
Search by meaning
![](https://s3.amazonaws.com/media-p.slid.es/uploads/136956/images/11034532/rag-2.png)
![](https://s3.amazonaws.com/media-p.slid.es/uploads/136956/images/11034534/rag-3.png)
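A small sketch of searching by meaning, assuming the text has already been turned into vectors. The 3-dimensional vectors here are made up by hand for illustration; a real embedding model produces hundreds or thousands of dimensions. Cosine similarity is one common way to compare them.

```python
import math
from typing import Dict, List

# Made-up "embeddings": the first two texts point in a similar direction
# because they mean similar things, even though they share no keywords.
EMBEDDINGS: Dict[str, List[float]] = {
    "How do I reset my password?": [0.9, 0.1, 0.0],
    "Steps to recover account access": [0.8, 0.2, 0.1],
    "Quarterly revenue grew 12%": [0.0, 0.1, 0.9],
}


def cosine_similarity(a: List[float], b: List[float]) -> float:
    """Cosine of the angle between two vectors: 1.0 means same direction."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def search(
    query_vector: List[float], embeddings: Dict[str, List[float]], top_k: int = 2
) -> List[str]:
    """Return the top_k texts whose vectors are closest to the query vector."""
    ranked = sorted(
        embeddings.items(),
        key=lambda item: cosine_similarity(query_vector, item[1]),
        reverse=True,
    )
    return [text for text, _ in ranked[:top_k]]


# Pretend this is the embedding of "I forgot my login" (hypothetical values).
query_vector = [0.85, 0.15, 0.05]
results = search(query_vector, EMBEDDINGS)
print(results)
```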
Hybrid approaches
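One way to combine keyword and vector results is reciprocal rank fusion (RRF). The talk does not say which hybrid method it has in mind, so treat this as an illustrative choice; the document IDs and the two input rankings are hypothetical.

```python
from collections import defaultdict
from typing import Dict, List


def reciprocal_rank_fusion(rankings: List[List[str]], k: int = 60) -> List[str]:
    """Merge ranked lists: each document earns 1 / (k + rank) per list it appears in."""
    scores: Dict[str, float] = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    return sorted(scores, key=lambda doc_id: scores[doc_id], reverse=True)


# Hypothetical results from two separate retrievers for the same query.
keyword_ranking = ["doc_a", "doc_b", "doc_c"]
vector_ranking = ["doc_b", "doc_d", "doc_a"]

fused = reciprocal_rank_fusion([keyword_ranking, vector_ranking])
print(fused)
```

Documents that both retrievers rank well rise to the top, without needing the two scoring scales to be comparable.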
![](https://s3.amazonaws.com/media-p.slid.es/uploads/136956/images/10976351/stages.png)
Supported LLMs
LlamaHub
![](https://s3.amazonaws.com/media-p.slid.es/uploads/136956/images/10991876/Screenshot_2023-12-11_at_8.53.29_PM.png)
Ingestion pipeline
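The idea of an ingestion pipeline, roughly: documents pass through a series of transformations before they are indexed. The sketch below is plain Python, not the LlamaIndex API; the dict-based node shape, the function names, and the chunk sizes are assumptions for illustration.

```python
from typing import Any, Callable, Dict, List

Node = Dict[str, Any]
Transformation = Callable[[List[Node]], List[Node]]


def split_into_chunks(
    nodes: List[Node], chunk_size: int = 5, overlap: int = 1
) -> List[Node]:
    """Split each node's text into overlapping windows of words."""
    step = chunk_size - overlap
    chunks = []
    for node in nodes:
        words = node["text"].split()
        for start in range(0, len(words), step):
            window = words[start:start + chunk_size]
            # Copy the node's metadata onto every chunk made from it.
            chunks.append({**node, "text": " ".join(window)})
            if start + chunk_size >= len(words):
                break  # this window already reached the end of the text
    return chunks


def add_word_count(nodes: List[Node]) -> List[Node]:
    """Attach a simple piece of metadata to every node."""
    return [{**node, "word_count": len(node["text"].split())} for node in nodes]


def run_pipeline(nodes: List[Node], transformations: List[Transformation]) -> List[Node]:
    """Apply each transformation in order to the full list of nodes."""
    for transform in transformations:
        nodes = transform(nodes)
    return nodes


# Hypothetical input document.
docs = [{"text": "one two three four five six seven eight nine", "source": "demo.txt"}]
nodes = run_pipeline(docs, [split_into_chunks, add_word_count])
for node in nodes:
    print(node)
```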
Supported embedding models
- OpenAI
- Langchain
- CohereAI
- Qdrant FastEmbed
- Gradient
- Azure OpenAI
- Elasticsearch
- Clarifai
- LLMRails
- Google PaLM
- Jina
- Voyage
...plus everything on Hugging Face!
Supported Vector databases
- Apache Cassandra
- Astra DB
- Azure Cognitive Search
- Azure CosmosDB
- ChatGPT Retrieval Plugin
- Chroma
- DashVector
- Deeplake
- DocArray
- DynamoDB
- Elasticsearch
- FAISS
- LanceDB
- Lantern
- Metal
- MongoDB Atlas
- MyScale
- Milvus / Zilliz
- Neo4jVector
- OpenSearch
- Pinecone
- Postgres
- pgvecto.rs
- Qdrant
- Redis
- Rockset
- SingleStore
- Supabase
- Tair
- TencentVectorDB
- Timescale
- Typesense
- Weaviate
Retrieval
Agentic strategies
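Routing is one simple example of an agentic strategy: something decides which tool should handle each query. In the sketch below a keyword rule stands in for the LLM that would normally make that decision; the tool names and functions are hypothetical, not LlamaIndex classes.

```python
from typing import Callable, Dict, List

Tool = Callable[[str], str]


def lookup_tool(query: str) -> str:
    """Pretend to answer a specific factual question."""
    return f"lookup result for: {query}"


def summarize_tool(query: str) -> str:
    """Pretend to summarize a whole document."""
    return f"summary for: {query}"


def keyword_chooser(query: str, tool_names: List[str]) -> str:
    """Stand-in for an LLM decision: pick a tool name based on the query."""
    wanted = "summarize" if "summar" in query.lower() else "lookup"
    return wanted if wanted in tool_names else tool_names[0]


def route(query: str, tools: Dict[str, Tool], choose: Callable[[str, List[str]], str]) -> str:
    """Ask the chooser for a tool name, then run that tool on the query."""
    tool_name = choose(query, list(tools))
    return tools[tool_name](query)


tools = {"lookup": lookup_tool, "summarize": summarize_tool}
first = route("Summarize the contract", tools, keyword_chooser)
second = route("Who signed the contract?", tools, keyword_chooser)
print(first)
print(second)
```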
That's a lot of stuff!
Let's do it in 5 lines of code:
from llama_index import VectorStoreIndex, SimpleDirectoryReader
documents = SimpleDirectoryReader("data").load_data()
index = VectorStoreIndex.from_documents(documents)
query_engine = index.as_query_engine()
response = query_engine.query("What did the author do growing up?")
print(response)
npx create-llama
Create Llama Templates
SECinsights.ai
LlamaBot
LlamaHub (again)
"2024 is the year of LlamaIndex in production"
– Shawn "swyx" Wang, Latent.Space podcast
LlamaIndex in production
- Datastax
- OpenBB
- Springworks
- Gunderson Dettmer
- Jasper
- Replit
- Red Hat
- Clearbit
- Berkeley
- W&B
- Instabase
Case study:
Gunderson Dettmer
![](https://s3.amazonaws.com/media-p.slid.es/uploads/136956/images/11101563/gunderson.png)
Recap
Retrieve, Augment, Generate
- Challenges:
  - Accuracy
  - Faithfulness
  - Recency
  - Provenance
- The stages of RAG:
  - Loading
  - Indexing
  - Storing
  - Retrieval
  - Synthesis
  - Processing
What now?
![](https://s3.amazonaws.com/media-p.slid.es/uploads/136956/images/10976327/LlamaLogoSmall.png)
Follow me on Twitter: @seldo
Enterprise RAG with LlamaIndex
By seldo