and creating custom chat-bots that can run locally
With local-first applications, you get the speed and responsiveness of a local application while retaining many of the desirable features of client/server systems.
Use a "production ready"
application
Make a chat-bot
Fine-tune inputs
to "base" model
Fine-tune model /
Train new model
Realistically, one could pick and choose elements from each approach, and model selection/creation could be considered an entirely separate phase of chat-bot creation.
Ollama
Summary: Get up and running with Llama 3, Mistral, Gemma, and other large language models.
# Install ollama
curl -fsSL https://ollama.com/install.sh | sh
# Start ollama server
ollama serve
# Download and run a model
ollama pull llama3
ollama run llama3
# Chat with model
>>> Tell me a joke
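Once `ollama serve` is running, the same chat can be driven programmatically through Ollama's local HTTP API (it listens on port 11434 by default). A minimal sketch using only the Python standard library; it assumes the server is running and the `llama3` model has been pulled:

```python
import json
import urllib.request

OLLAMA_URL = "http://localhost:11434/api/generate"  # Ollama's default local endpoint


def build_payload(model: str, prompt: str) -> dict:
    """Build a non-streaming generate request for the Ollama API."""
    return {"model": model, "prompt": prompt, "stream": False}


def ask(model: str, prompt: str) -> str:
    """Send the prompt to the local Ollama server and return the response text."""
    data = json.dumps(build_payload(model, prompt)).encode("utf-8")
    req = urllib.request.Request(
        OLLAMA_URL, data=data, headers={"Content-Type": "application/json"}
    )
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read())["response"]


if __name__ == "__main__":
    print(ask("llama3", "Tell me a joke"))
```

With `stream` set to `False` the server returns one JSON object containing the whole reply, which keeps the client code to a single request/response pair.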
llamafile
Summary: Distribute and run large language models with a single file.
GPT4All
Summary: Open-source large language models that run locally on your CPU and nearly any GPU.
Pinokio
Summary: Browser that lets you install, run, and programmatically control ANY application, automatically.
Jan
Summary: Open-source alternative to ChatGPT that runs 100% offline on your computer.
LM Studio
Summary: Discover, download, and run local LLMs.
AnythingLLM
Summary: The all-in-one AI application: any LLM, any document, any agent, fully private.
[Diagram: RAG, prompt engineering, fine-tuning, and embedding models compared along two axes: "Requires External Context" and "Requires Changes to Model".]
[Diagram: RAG pipeline. Custom data and documents are converted to embeddings and stored in a vector database; the user prompt retrieves matching context, and the prompt + context are sent to the large language model.]

>>> Tell me a joke
Why don't eggs tell jokes?
(wait for it...)
Because they'd crack each other up!
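The RAG flow can be sketched end to end in plain Python. This is a toy illustration, not a real system: the bag-of-words "embedding", the in-memory document list, and the helper names all stand in for a real embedding model and vector database:

```python
import math
from collections import Counter


def embed(text: str) -> Counter:
    """Toy embedding: a bag-of-words term-frequency vector.
    A real pipeline would call an embedding model here."""
    return Counter(text.lower().split())


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse term-frequency vectors."""
    dot = sum(a[t] * b[t] for t in a)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


def retrieve(query: str, documents: list[str], k: int = 1) -> list[str]:
    """Rank documents by similarity to the query and return the top k as context."""
    q = embed(query)
    ranked = sorted(documents, key=lambda d: cosine(q, embed(d)), reverse=True)
    return ranked[:k]


def build_prompt(query: str, context: list[str]) -> str:
    """Combine retrieved context with the user prompt before sending it to the LLM."""
    return "Context:\n" + "\n".join(context) + f"\n\nQuestion: {query}\nAnswer:"


docs = [
    "Ollama runs large language models locally.",
    "A vector database stores document embeddings for retrieval.",
]
prompt = build_prompt("What stores embeddings?", retrieve("What stores embeddings?", docs))
```

The final `prompt` string is what actually reaches the model: the user's question plus whatever context retrieval found, which is the whole trick behind RAG.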
Popular Libraries for RAG
Popular prompt engineering techniques

Zero-shot:

<question>?

vs. few-shot:

Q: <Question>?
A: <Answer>
Q: <Question>?
A: <Answer>
Q: <Question>?
A:
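The few-shot Q/A template above can be assembled programmatically before being sent to a model. A minimal sketch; the helper name is illustrative:

```python
def few_shot_prompt(examples: list[tuple[str, str]], question: str) -> str:
    """Build a few-shot prompt: worked Q/A pairs followed by the new question."""
    lines = []
    for q, a in examples:
        lines.append(f"Q: {q}")
        lines.append(f"A: {a}")
    lines.append(f"Q: {question}")
    lines.append("A:")  # left open so the model completes the answer
    return "\n".join(lines)


prompt = few_shot_prompt(
    [("What is 2 + 2?", "4"), ("What is 3 + 5?", "8")],
    "What is 7 + 6?",
)
```

Ending the prompt with a bare `A:` is the key detail: the model's continuation of that line becomes the answer to the new question.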
"…around half of the improvement in language models over the past four years comes from training them on more data."
(Will We Run Out of Data? An Analysis of the Limits of Scaling Datasets in Machine Learning)