SANS
GERARD
Google Developer Expert
Developer Evangelist
International Speaker
Spoken 199 times in 43 countries
Google AI Learning Path
Vertex AI
Complexity
Features
1
2
Google AI Ecosystem
VertexAI
AI Platform
Gemini for Workspace
AI Assistant
AI Studio
AI Playground
Gemini
Foundational
Models
Gemini
Chatbot
Specialised training
Multimodal medical model
Med-Palm
Opening the world
Gemini Ultra benchmarks
Gemini for Open Source: Gemma
Responsible AI
Reduce Biases
Safe
Accountable to people
Designed with Privacy
Scientific Excellence
Follow all Principles
Socially Beneficial
Not for Surveillance
Not Weaponised
Not Unlawful
Not Harmful
Vertex AI
Global
Google Cloud AI Platform
Vertex AI
Complexity
Features
Scaling Generative AI
Foundational Models
Voice
text-speech speech-text
Medical
medlm-medium medlm-large
Code
code-bison codechat-bison code-gecko
Multimodal
gemini-pro gemini-pro-vision
gemini-ultra*
1 Million tokens
Gemini 1.0 Pro 128K tokens
(100 pages)
Gemini 1.5 Pro 1M
(800 pages)
Gemini for Google Workspace
Protection: IP infringements
All Google Services
DuetAI
Generated Outputs
VertexAI
Training Data
Stochastic parrot or AGI?
Training: guessing the next word
Adjust model predictions using output
Wikipedia
Christopher  is
Christopher Columbus was
Input
Output
Christopher Columbus discovered  America
Christopher Columbus discovered America  in
Christopher Columbus discovered America in  1492
Christopher Columbus discovered America in 1492 Â .
Christopher Columbus discovered America in 1492.
Christopher Columbus discovered America in 1492.
Christopher Columbus discovered America in 1492.
Christopher Columbus discovered America in 1492.
Christopher Columbus discovered America in 1492.
Christopher Columbus discovered America in 1492 .
Hyper-dimensional Graph
who
is
America
Columbus
discover
Latent Space
Word Embedding
Word Embedding
A prompt will put you in a certain area of the latent space
Note the density of data points and noise ratios
AI generated text is...
Biased
Non-factual
Inaccurate
1+1= 3
Grounding: reducing hallucinations
VectorDB Embeddings
Google Knowledge Graph
Google Search
Fact Checking
From idea to code
AI Studio
AI Playground
Gemini
API
Gemini
Fine-tuning
Digital art for everyone
Imagen 2: unlocking visual creativity
Generative AI for creatives
magazine style 4k photorealistic,
modern red armchair
natural lighting
Portrait of a french bulldog
at the beach,
85mm f/2.8
Assortment
of delicious,
freshly-baked donuts
Prompt
Imagen 2
Image inpainting and upscaling
Original + mask
Imagen 2
Automatic image captioning
Input
Caption
Explore images via chat with VQA
Input
Question
AI-driven interior design
AI-driven interior design
AI-driven interior design
AI-driven interior design
AI-driven interior design
Global
First steps in Generative AI
Vertex AI
Complexity
Features
Paid access for Gemini Ultra
Imagine
Image Generation
Google Lens
Google Search
Learn
Listen Response
Share
+40 Languages
Be Creative
C++, Go, Java, Javascript, Python and Typescript
Code
Generate
complex graphics
Plot
Gemini extensions!
Access your GMail.
Do More
Deep integration with YouTube.
Save Time
 US-only
Generative AI for Developers
Vertex AI
Complexity
Features
Your sandbox for prompts
GoogleAI
VertexAI
Pro 1.0
Pro 1.5
Ultra 1.0
Nano 1.0
1h
video
11h
audio
30K
LOC
800
pages
1 Million tokens
Foundational Models
Embeddings
models/embedding-001
Gemini Pro
gemini-pro
Gemini Pro Vision
gemini-pro-vision
New multi-modal architecture
Computer Vision tasks
Source: V7 Labs
Visual Training Datasets
A dog running
Digits 0-9
Ant
French cat
MNIST
60K
10 classes
COCO
330K
80 classes
ImageNet
14MÂ
Image + caption
LAION (Web)
5B
Image + text
Granular Visual Patches (ViT)
Query Patch
Detail
Image
Visual Attention Mechanism
"ginger fur"
"standing on a stone in the garden"
"well-groomed"
Image
Features
Attention
A new generalist Computer Vision
Visual Chat
VQA
Multi-turn
Reasoning
Extract Data
Handwriting
Data entry
OCR
Metadata
Identify
Recognition
Captioning
Categorising
Structure
Elements
Relationships
Hierarchies
Time/Space
Tracking
Activity
Causality
3D/4D
Computer Vision use-cases
Monday to Friday from 6:30 to 13:00 from 16:30 to 20:00
Extract the text for the opening hours and consolidate them in a single paragraph in English
Prompt
Output
Multimodal example
HORARI DILLUNS A DIVENDRES DE 6'30H. A
Extract Text
OCR
Image Input
Task
Raw Data
13H 16'30H. A 20'00H
Extract Text
Hand-written OCR
Reason
Mash-up fragments
HORARI DILLUNS A DIVENDRES DE 6'30H. A 13H 16'30H. A 20'00H
Translate
Catalan to English
Monday to Friday from 6:30 to 13:00 from 16:30 to 20:00
Vision Examples
Multimodal: better understanding
Breaking language barriers
Advanced OCR: complex layouts
Emerging features: mirrored text
Gemini models landscape
Access to Gemini
Mini-Gemini Chatbot Demo
Building a mini-Gemini Chatbot
By Gerard Sans
Building a mini-Gemini Chatbot
In this talk, you will learn how to build a mini-Gemini Chatbot using Google's latest Generative AI using Google AI Studio, Gemini Pro model and Angular. Google AI Studio is a tool to build the new wave of Generative AI applications using Gemini foundational models. We will be introducing the Gemini models to build the foundations of a Gemini Chatbot and explore advanced features like AI Agents, the ability to use tools and call APIs; RAG, or Retrieval Augmented Generation to improve grounding and extend Gemini training data cut off to include external data and more! Google Gemini era is here.
- 1,045