Enterprise Retrieval-Augmented Generation with LlamaIndex

2024-02-06 Rockset webinar

What are we talking about?

  • RAG recap
  • Why do we RAG?
  • How do we RAG?
  • The stages of RAG
  • How LlamaIndex helps
  • Advanced querying strategies
  • Looking forward

RAG recap

Why RAG?

  • Retrieve most relevant data
  • Augment query with context
  • Generate response
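The three steps above can be sketched in plain Python. This is a toy illustration, not how LlamaIndex implements them: retrieval here is simple word overlap rather than embeddings, and `fake_llm` is a hypothetical placeholder for a real model call.

```python
import re

def tokenize(text):
    """Lowercase a string and return its set of word tokens."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))

def retrieve(query, documents, top_k=2):
    """Rank documents by word overlap with the query (stand-in for a real retriever)."""
    query_tokens = tokenize(query)
    scored = [(len(query_tokens & tokenize(doc)), doc) for doc in documents]
    scored = [pair for pair in scored if pair[0] > 0]  # drop documents with no overlap
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [doc for _, doc in scored[:top_k]]

def augment(query, context_docs):
    """Build a prompt that puts the retrieved context ahead of the question."""
    context = "\n".join(f"- {doc}" for doc in context_docs)
    return f"Context:\n{context}\n\nQuestion: {query}\nAnswer:"

def generate(prompt, llm):
    """Hand the augmented prompt to whatever callable plays the role of the LLM."""
    return llm(prompt)

def fake_llm(prompt):
    """Hypothetical placeholder for a model call; it only reports the prompt size."""
    return f"(model response to a {len(prompt)}-character prompt)"

documents = [
    "LlamaIndex is a framework for connecting data to LLMs.",
    "Rockset is a real-time analytics database.",
    "Llamas are domesticated South American camelids.",
]
query = "What is LlamaIndex?"
top_docs = retrieve(query, documents, top_k=1)
prompt = augment(query, top_docs)
answer = generate(prompt, fake_llm)
print(prompt)
print(answer)
```

Only the retrieved document reaches the prompt, which is the point of the retrieve step: the model sees a small, relevant slice of the data instead of all of it.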

A solution to limited context windows

Benefits of RAG: accuracy

Benefits of RAG: faithfulness

Benefits of RAG: recency

Benefits of RAG: provenance

How do we RAG?

  • Vector search
  • Keyword search
  • Structured queries

LlamaHub

Ingestion pipeline

Supported embedding models

  • OpenAI 
  • LangChain
  • CohereAI 
  • Qdrant FastEmbed 
  • Gradient 
  • Azure OpenAI
  • Elasticsearch 
  • Clarifai
  • LLMRails 
  • Google PaLM 
  • Jina 
  • Voyage 

...plus everything on Hugging Face!

Supported Vector databases

  • Apache Cassandra
  • Astra DB
  • Azure Cognitive Search
  • Azure CosmosDB
  • ChatGPT Retrieval Plugin
  • Chroma
  • DashVector
  • Deeplake
  • DocArray
  • DynamoDB
  • Elasticsearch
  • FAISS
  • LanceDB
  • Lantern
  • Metal
  • MongoDB Atlas
  • MyScale
  • Milvus / Zilliz
  • Neo4jVector
  • OpenSearch
  • Pinecone
  • Postgres
  • pgvecto.rs
  • Qdrant
  • Redis
  • Rockset
  • SingleStore
  • Supabase
  • Tair
  • TencentVectorDB
  • Timescale
  • Typesense
  • Weaviate

Retrieval

Retrieval: metadata filtering
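A minimal sketch of the idea behind metadata filtering, assuming each chunk carries a dictionary of metadata. The `Node` class and `filter_by_metadata` helper are hypothetical names for illustration, not the LlamaIndex API, and the sample records are made up.

```python
from dataclasses import dataclass, field

@dataclass
class Node:
    """A chunk of text plus the metadata attached to it at ingestion time."""
    text: str
    metadata: dict = field(default_factory=dict)

def filter_by_metadata(nodes, required):
    """Keep only nodes whose metadata matches every key/value pair in `required`."""
    return [
        node for node in nodes
        if all(node.metadata.get(key) == value for key, value in required.items())
    ]

# Made-up sample data for the illustration.
nodes = [
    Node("Q4 revenue grew 12%.", {"year": 2023, "doc_type": "10-K"}),
    Node("Q4 revenue grew 8%.", {"year": 2022, "doc_type": "10-K"}),
    Node("Press release about a new product.", {"year": 2023, "doc_type": "press"}),
]

# Narrow the candidate set before any similarity search runs.
candidates = filter_by_metadata(nodes, {"year": 2023, "doc_type": "10-K"})
print([node.text for node in candidates])
```

Filtering first means the similarity search only competes among chunks that could plausibly answer the question, which helps when many documents look alike.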

Retrieval: hybrid search
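Hybrid search combines a vector ranking with a keyword ranking. One common way to merge the two is reciprocal rank fusion, sketched below. The slides do not say which fusion method is used, so treat this choice as an assumption. The document IDs are invented.

```python
def reciprocal_rank_fusion(rankings, k=60):
    """Fuse several ranked lists of document IDs into one list.

    Each document scores sum(1 / (k + rank)) over the lists it appears in,
    with rank starting at 1. k=60 is a commonly used default.
    """
    scores = {}
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    # Highest fused score first.
    return sorted(scores, key=lambda doc_id: scores[doc_id], reverse=True)

vector_ranking = ["doc_a", "doc_b", "doc_c"]   # e.g. from embedding similarity
keyword_ranking = ["doc_b", "doc_d", "doc_a"]  # e.g. from a keyword index
fused = reciprocal_rank_fusion([vector_ranking, keyword_ranking])
print(fused)
```

Documents that both retrievers rank highly rise to the top, while a document found by only one retriever still stays in the list.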

Retrieval: text-to-SQL
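A rough sketch of the text-to-SQL flow using only the standard library: a model turns the question plus the table schema into SQL, and the SQL runs against the database. `fake_text_to_sql` is a hypothetical stand-in that returns a canned query instead of calling a model. The table contents are illustrative, and the read-only guard is an added assumption rather than something the slides describe.

```python
import sqlite3

def fake_text_to_sql(question, schema):
    """Hypothetical stand-in for an LLM call; returns a canned query for the demo."""
    return "SELECT name FROM cities ORDER BY population DESC LIMIT 1"

def run_readonly_query(connection, sql):
    """Refuse anything that is not a single SELECT statement, then run it."""
    statement = sql.strip().rstrip(";")
    if not statement.lower().startswith("select") or ";" in statement:
        raise ValueError("only single SELECT statements are allowed")
    return connection.execute(statement).fetchall()

# In-memory database with illustrative rows.
connection = sqlite3.connect(":memory:")
connection.execute("CREATE TABLE cities (name TEXT, population INTEGER)")
connection.executemany(
    "INSERT INTO cities VALUES (?, ?)",
    [("Tokyo", 13960000), ("Toronto", 2930000), ("Berlin", 3645000)],
)

schema = "cities(name TEXT, population INTEGER)"
sql = fake_text_to_sql("Which city has the highest population?", schema)
rows = run_readonly_query(connection, sql)
print(rows)
```

Generated SQL is untrusted input, so some check before execution (or a read-only database role) is worth having in any real deployment.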

Retrieval: text-to-Pandas

Postprocessing
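Postprocessing happens between retrieval and generation: the retrieved chunks are filtered or reordered before they go into the prompt. The sketch below shows two simple steps, a similarity cutoff and a top-n trim. The function names and scores are hypothetical, and these are only two of many possible postprocessing steps.

```python
def apply_similarity_cutoff(scored_nodes, cutoff):
    """Drop retrieved chunks whose similarity score falls below `cutoff`."""
    return [(text, score) for text, score in scored_nodes if score >= cutoff]

def keep_top_n(scored_nodes, n):
    """Keep the n highest-scoring chunks, best first."""
    return sorted(scored_nodes, key=lambda pair: pair[1], reverse=True)[:n]

# Made-up retrieval results as (text, similarity score) pairs.
retrieved = [
    ("chunk a", 0.91),
    ("chunk b", 0.42),
    ("chunk c", 0.77),
    ("chunk d", 0.80),
]

relevant = apply_similarity_cutoff(retrieved, cutoff=0.75)
final = keep_top_n(relevant, n=2)
print(final)
```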

Agentic strategies

SubQuestionQueryEngine

Multi-document agents

Recursive retrieval

Composability

Supported LLMs

That's a lot of stuff!

Let's do it in 5 lines of code:

from llama_index import VectorStoreIndex, SimpleDirectoryReader

documents = SimpleDirectoryReader("data").load_data()
index = VectorStoreIndex.from_documents(documents)
query_engine = index.as_query_engine()
response = query_engine.query("What did the author do growing up?")
print(response)

npx create-llama

Create Llama Templates

SECinsights.ai

LlamaBot

LlamaHub (again)

"2024 is the year of LlamaIndex in production"

– Shawn "swyx" Wang, Latent.Space podcast

LlamaIndex in production

  • Datastax
  • OpenBB
  • Springworks
  • Gunderson Dettmer
  • Jasper
  • Replit
  • Red Hat
  • Clearbit
  • Berkeley
  • W&B
  • Instabase

Case study: Gunderson Dettmer

What's coming for LlamaIndex in 2024?

What's coming for the industry in 2024?

People

Recap

  • What RAG is
  • Why we do it
  • How we do it
  • The stages of RAG
  • What's coming next

What now?

Follow me on Twitter: @seldo

Enterprise RAG with LlamaIndex

By seldo
