About Me
- Frontend Intern @ Gametize
- Founder @ Computer Vision + Augmented Reality startup
- AI Research @ Handshakes
- Interests: Making small-ish models do cool things
Agenda
- What can AI do?
- How to use AI?
- How to use AI better?
- How to keep up with AI trends?
What can AI do?
Text
- Make decisions on text (BERT, T5, etc)
- Classification
- Generate text (LLMs)
- Chat
- Content Generation
- Natural Language Interface
- Agents (with tools access)
Classification
- Positive/Negative Sentiment for feedback
- "I really enjoyed this activity" -> positive
- Grading open ended quiz answers
- "What is phishing?"
- Some reasonable answer -> correct
- Some wrong/nonsense answer -> wrong
- "What is phishing?"
- Classify users into groups
- Personality/Skill levels
Chat
- Make the entire experience chat
- Chat adventure game with quiz/challenges
- Personalized learning
- Focus on difficult concepts
- Answer questions
- Help users clarify concepts
- Point to the right documentation
Content Generation
- Generate reading material
- Generate question + answer pairs from material
- Dynamic reading material + challenges/quizes
- adapt to skill/interests
- Provide real time feedback on answers
Natural Langauge Interface
- Convert natural language query to SQL/DSL
- "Find users with the most number of badges collected in the past X days"
- Convert natural language into programmatic API
- "I want to use the app in dark mode" -> set_darkmode(True)
Agents
- Model chooses the sequence of actions to take
- Planning, reflection, self-critique, tools
- Access to tools (model decides when and how to use)
- Wolfram Alpha
- Database
- ... your internal tool
- Enabling complex, generalized behaviour
- ... this is at the bleeding edge now, libraries for this are not mature
Other Modalities
- Image
- Object Detection, Facial Recognition
- Text + Image -> QA on Images
- Generate Images
- Sound
- Speech to Text
- Text to Speech
- Video
- Treat as Sequence of Images
- Text to Video
Difficulty (for model):
Generating >> Making decisions
Just like speaking a new language is harder than understanding it
What can AI NOT do?
What can AI NOT do?
- It cannot read your mind
- It cannot work with vague instructions
- It cannot figure out why you want to do something
- It cannot replace you/your job
- ... but it will drastically change the way you work
- it will change the economics of what is valuable
How to use AI?
Glossary
- Size of model: Number of parameters
- 100M - 3B - 7B - 175B - 1.8T
- Token: Represent text with numbers
- 1 word ~ 1.3 token
- Context Length: Number of tokens (input + output)
- 512 - 4096 - 32k - 1M
- Able to use doesn't mean able to use well
Choose your model
Proprietary Models
- GPT-4 by OpenAI
- Claude Opus by Anthrophic
- Gemini Pro by Google
Open Models
- Mixtral/Mistral by Mistral
- Llama2 by Meta
- Gemma by Google
- ... community finetunes
Recommendation
- Start with the best models to validate and prototype
- General usecases
- Claude Opus > GPT4 > Gemini Pro
- Long Inputs (RAG)
- Gemini Pro 1.5 (~1M context length)
- Text (700k words), code (30k), sound (11 hrs), video (1hr),
- Not available yet
- ... but I've tested it and it works very well
- Gemini Pro 1.5 (~1M context length)
Retrieval Augmented Generation
- Ground AI outputs with external information
- Dealing with long documents is tricky
- Longer context length models should help
- query -> retrieve relevant context -> input: query + context
- LlamaIndex is a good place to start
Proprietary Models
- Easy to get started
- Higher capabilities (but pricier)
- Questionable privacy practices
- Might be too "woke"
- No control over deprecation schedules
Open Models
- More complex to use, train, deploy
- Use smaller models -> lower capabilities but faster/cheaper
- Finetune for your usecase
- More expensive at low volumes (pay by GPU/h, not API call)
- Stable
Choose your model
Newer models might not always be better (for you)
- "Is ChatGPT getting worse?"
- ChatGPT updates regularly
- OpenAI API models are stable
- ... as long as they are available
- generally deprecated after 1 year
Newer models might not always be better (for you)
Proprietary Models
- Easy to get started
- Higher capabilities
- Questionable privacy practices
- Might be too "woke"
- No control over deprecation schedules
Open Models
- Finetune for your usecase
- Use smaller models -> might be cheaper
- More complex to use, train, deploy
Choose your model
Open Models as an API
- TogetherAI
- HuggingFace
- Anyscale
- Easy for Developers (just a API call)
- Cheap (pay per use)
- Stable
- Can easily change providers
Choose your model
Recommendation
-
No code
- Proprietary models products
-
Prototyping/Need high capabilities (reasoning/planning etc)
- Claude Opus, GPT4, Gemini Pro
-
Need long context length
- Gemini Pro 1.5 (not available publically yet)
-
Integrate AI into software products
- Open models as a service
-
Train custom AI models & high volume
- Finetune open models
How to use AI better?
aka Prompt Engineering
Glossary
- Prompt: Input to model
- System Prompt: Meta instructions on how to respond
- User Prompt: User input to the model
- Assistant Prompt: Model's previous output (useful for conversation style)
- Prompt Template: Special way to arrange the above prompts in a way that is understandable by the model
- Check your model, but taken care of in APIs
Prompts
- System Prompt: You are a helpful assistant
- User Prompt: Hi!
- Assistant Prompt: Hello, how many I help you?
- User Prompt: Can you tell me the temperature in Singapore
- Assistant Prompt: The average temperature in Singapore is 25°C to 31°C
- User Prompt: What about rainfall?
- Assistant Prompt: The average annual rainfall in Singapore is 2340 mm.
Prompts
- System Prompt: Only reply in emojis
- User Prompt: Hi!
- Assistant Prompt: 👋
- User Prompt: I want to know the recipe to make cookies
- Assistant Prompt: 🧈🥚🥄🍚🧂🍫🔪🥄🧑🍳👩🍳🤲🧈📦🕒🌡️🔥🆒🍪
How to Prompt
- Setup model to not fail
- How you would give instructions to an intern
- Mimic how a human will complete the task
- Create structure, show examples
Teach a bot to fish
aka Few shot Learning
- Show examples in the prompt of how to answer
System Prompt: You are a helpful assistant
Human Prompt: Classify if this sentence is positive/negative: "I really enjoyed this activity"
Assistant Prompt: Positive
Human Prompt: Classify if this sentence is positive/negative: "This was a great waste of time"
Assistant Prompt: Negative
Human Prompt: Classify if this sentence is positive/negative: "I learnt many new things today"
Assistant Prompt:
Tell the model who to be
- System Prompt: `You are a world class programmer`
- System Prompt: `Assume that I have Javascript experience but no Python experience`
Create an Internal Monologue
"Lets think step by step before answering."
- Summarise this story into the key plot points.
- Outline the key players in the story. Who are the characters?
- List the major plot points are who was involved?
- For each plot point, what were the consequences?
- For each of the consequences, see if any are missing from the plot points, and list them
- Resummarise the story using the plot points
Ask the model to prompt itself
- Works unreasonably well
- Metaprompt by Claude
- Compile LLMs using DSPy
Domain Knowledge >> AI Knowledge
How to keep up with AI trends?
How to keep up with AI trends?
- What I personally use
- Twitter
- Focus on high signal/noise ratio
- My follows are a good place to start
-
Machine Learning SG Meetup
- Mix of latest research + practical tips
- Good community of other researchers/builders
- Youtube
- Twitter
Thank you!
AI For Engagement
By vivekkalyan
AI For Engagement
- 71