Gerard Sans | Axiom 🇬🇧
Founder of Axiom Masterclass, professional training // Forging skills for the new era of AI. GDE in AI, Cloud & Angular. Building London's tech & art nexus @nextai_london. Speaker | MC | Trainer.
In this immersive workshop, you’ll learn how to craft high-quality faceless video content using Google’s latest multimodal tools. You’ll move from concept and prompt iteration, through image and layout design, to animated video generation and voice synthesis via Gemini, Imagen, and Veo. Along the way, you’ll see how to layer in intelligence, like tool integration, retrieval-augmented generation (RAG), and API calls, to make your short reels context-aware and dynamic, ready for real publishing.
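To give a flavour of the pipeline, here is a minimal sketch of the image-generation step using the @google/genai SDK; the model id, config fields and file handling are illustrative assumptions rather than the workshop's exact code, and the Veo video step follows a similar request-then-poll pattern.

```ts
// Minimal sketch (Node + TypeScript) of the image-generation step with the
// @google/genai SDK. Model id and config values are assumptions; check the
// current Gemini API docs before using them.
import { GoogleGenAI } from "@google/genai";
import { writeFileSync } from "node:fs";

const ai = new GoogleGenAI({ apiKey: process.env.GEMINI_API_KEY });

async function generateSceneImage(prompt: string): Promise<void> {
  // Imagen: a text prompt in, one or more images out.
  const result = await ai.models.generateImages({
    model: "imagen-3.0-generate-002", // assumed model id
    prompt,
    config: { numberOfImages: 1, aspectRatio: "9:16" }, // vertical reel format
  });

  const imageBytes = result.generatedImages?.[0]?.image?.imageBytes; // base64 string
  if (imageBytes) {
    writeFileSync("scene.png", Buffer.from(imageBytes, "base64"));
  }
}

generateSceneImage("Moody macro shot of espresso pouring, cinematic lighting");
```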
Generative AI has shifted gears. In this talk we explore how Gemini 3 Pro now supports multimodal generation (text, image, audio, video) and real-time agents. You’ll see how Google AI Studio connects the latest model families into a unified workflow of “vibe coding” rather than writing boilerplate. We’ll also dive into the voice-first capabilities made possible via the Gemini Live API: real-time, bidirectional voice and video conversations, tone-aware responses, tool integration and session memory. Together we’ll look at prompt-to-code flows, media generation, voice-first use cases and the next generation of MCP-driven agentic AI.
Generative AI has shifted gears. In this talk we explore how Google AI Studio now supports multimodal generation (text, image, audio, video) and real-time agents. You’ll see how Studio connects the latest model families (such as Gemini 2.5 and the expected Gemini 3) into a unified workflow of “vibe coding” rather than writing boilerplate. We’ll also dive into the voice-first capabilities made possible via the Gemini Live API: real-time, bidirectional voice and video conversations, tone-aware responses, tool integration and session memory. Together we’ll look at prompt-to-code flows, media generation, voice-first use cases and the next generation of MCP-driven agentic AI.
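As a taste of what the unified SDK looks like, here is a rough multimodal sketch with @google/genai: an image plus a text instruction in, text out. The model id and part shapes are assumptions to be checked against the current Gemini API docs.

```ts
// Rough sketch of a multimodal Gemini call with @google/genai: one image and
// one text instruction in, text out. Model id is an assumption.
import { GoogleGenAI } from "@google/genai";
import { readFileSync } from "node:fs";

const ai = new GoogleGenAI({ apiKey: process.env.GEMINI_API_KEY });

const imageBase64 = readFileSync("storyboard.png").toString("base64");

const response = await ai.models.generateContent({
  model: "gemini-2.5-flash", // assumed model id
  contents: [
    { inlineData: { mimeType: "image/png", data: imageBase64 } },
    { text: "Describe this storyboard frame and suggest a voice-over line." },
  ],
});

console.log(response.text);
```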
The launch of Alexa+ has sparked renewed excitement around the next generation of AI voice assistants powered by generative AI. With Gemini 2.5 and the new Gemini Live API, together with the power of MCP, developers now have the tools to build voice-driven AI agents that seamlessly integrate into web applications, backend services, and third-party APIs.

In this talk we will go beyond simple chatbot interactions to explore how AI agents can power real-world automation—in this case, running an entire robot cafe. We’ll walk through building a voice-first assistant capable of executing complex workflows using MCP, streaming real-time audio, querying databases, and interacting with external services. This marks a shift from "ask and respond" to a more dynamic "talk, show, and act" experience.

You might assume taking a coffee order is straightforward, but even a basic interaction involves more than 15 distinct states. These include greeting the customer, handling the order flow, confirming selections, applying offer codes, managing exceptions, and supporting cancellations or changes. Behind the scenes, the AI agent using MCP coordinates with multiple systems to fetch menu data, validate inputs, and trigger robotic actions. You’ll learn how to stream microphone data, integrate with Gemini voice responses, and use the GenAI SDK to connect everything together using MCP.

Instead of a traditional chat UI, this project creates a fully voice-automated, hands-free experience where the assistant doesn’t just chat—it runs the operation. Join us for a deep dive into the future of AI automation using MCP — where natural voice is the interface, and the AI agent takes care of the rest, including your fancy choice of coffee!
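For a flavour of how the agent takes action, here is a hedged sketch of the tool-calling side using @google/genai function declarations; place_order is a hypothetical backend operation and the model id is an assumption, with the MCP transport and audio streaming left out.

```ts
// Hedged sketch: Gemini decides when to call a (hypothetical) place_order
// function exposed by our cafe backend. MCP wiring and audio are omitted.
import { GoogleGenAI, Type } from "@google/genai";

const ai = new GoogleGenAI({ apiKey: process.env.GEMINI_API_KEY });

const placeOrderDeclaration = {
  name: "place_order", // hypothetical backend operation
  description: "Place a coffee order with the robot barista",
  parameters: {
    type: Type.OBJECT,
    properties: {
      drink: { type: Type.STRING, description: "Drink name from the menu" },
      size: { type: Type.STRING, enum: ["small", "medium", "large"] },
      offerCode: { type: Type.STRING, description: "Optional promo code" },
    },
    required: ["drink", "size"],
  },
};

const response = await ai.models.generateContent({
  model: "gemini-2.5-flash", // assumed model id
  contents: "I'd like a large oat-milk latte, please.",
  config: { tools: [{ functionDeclarations: [placeOrderDeclaration] }] },
});

// If the model decided to call the tool, hand its arguments to the backend.
for (const call of response.functionCalls ?? []) {
  console.log(call.name, call.args); // e.g. place_order { drink: "latte", size: "large" }
}
```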
The launch of Alexa+ has sparked renewed excitement around the next generation of AI voice assistants powered by generative AI. With Gemini 2.0 and the new Gemini Live API, developers now have the tools to build voice-driven AI agents that seamlessly integrate into web applications, backend services, and third-party APIs.

In this workshop we will go beyond simple chatbot interactions to explore how AI agents can power real-world automation—in this case, running an entire robot cafe. We’ll walk through building a voice-first assistant capable of executing complex workflows, streaming real-time audio, querying databases, and interacting with external services. This marks a shift from "ask and respond" to a more dynamic "talk, show, and act" experience.

You might assume taking a coffee order is straightforward, but even a basic interaction involves more than 15 distinct states. These include greeting the customer, handling the order flow, confirming selections, applying offer codes, managing exceptions, and supporting cancellations or changes. Behind the scenes, the AI agent coordinates with multiple systems to fetch menu data, validate inputs, and trigger robotic actions.

On the frontend, we’ll build an Angular client from scratch that handles real-time audio input and output. You’ll learn how to stream microphone data, integrate with Gemini voice responses, and use the GenAI SDK to connect everything together. We’ll also cover how to tap into core Gemini capabilities like code execution, grounding search, and function calling—enabling a true AI agent that can manage workflows, make decisions, and take action in real time.

Instead of a traditional chat UI, this project creates a fully voice-automated, hands-free experience where the assistant doesn’t just chat—it runs the operation. Join us for a deep dive into the future of AI automation — where natural voice is the interface, and the AI agent takes care of the rest, including your fancy choice of coffee!
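The audio side can be sketched roughly like this with the @google/genai Live API; the model id, audio format and message shapes are assumptions and should be checked against the current Live API docs, and the Angular microphone-capture code is omitted.

```ts
// Very rough sketch of opening a Gemini Live API session and forwarding
// microphone audio with @google/genai. Model id, audio format and message
// shapes are assumptions; the mic-capture plumbing is not shown.
import { GoogleGenAI, Modality } from "@google/genai";

const ai = new GoogleGenAI({ apiKey: "YOUR_API_KEY" }); // in production, mint short-lived tokens server-side

const session = await ai.live.connect({
  model: "gemini-2.0-flash-live-001", // assumed Live model id
  config: { responseModalities: [Modality.AUDIO] },
  callbacks: {
    onopen: () => console.log("session open"),
    onmessage: (message) => {
      // Audio chunks come back to be queued into an AudioContext for playback.
      console.log("server message", message);
    },
    onerror: (e) => console.error(e),
    onclose: () => console.log("session closed"),
  },
});

// Called from the mic capture loop: forward 16 kHz PCM chunks as they arrive.
function onMicChunk(base64Pcm: string): void {
  session.sendRealtimeInput({
    audio: { data: base64Pcm, mimeType: "audio/pcm;rate=16000" },
  });
}
```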
During this session, we will introduce you to the latest Gemini AI models, including Gemini 2.5 Pro for advanced reasoning and decision-making, Gemini 2.0 Flash for image generation, and Gemini 2.0 Flash Live for real-time, voice-driven AI agents. These models highlight the power of multimodal AI capable of understanding and generating text, audio, images, and video, taking advantage of massive context for maximum performance. You’ll also see how to integrate tools like Google Search, Retrieval-Augmented Generation (RAG), function calling, and external APIs to build powerful, context-aware applications.
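As an example of the tool integrations mentioned above, here is a hedged sketch of grounding an answer with Google Search via @google/genai; the model id and response fields are assumptions based on the public Gemini API docs.

```ts
// Hedged sketch of grounding a Gemini answer with Google Search.
import { GoogleGenAI } from "@google/genai";

const ai = new GoogleGenAI({ apiKey: process.env.GEMINI_API_KEY });

const response = await ai.models.generateContent({
  model: "gemini-2.5-pro", // assumed model id
  contents: "What changed in the latest Gemini model releases?",
  config: { tools: [{ googleSearch: {} }] }, // let the model ground via Search
});

console.log(response.text);
// Grounding metadata (sources, search queries) is attached to the candidate.
console.log(response.candidates?.[0]?.groundingMetadata);
```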
In this session, you will learn about Google's Generative AI tools and ecosystem, including Google AI Studio and the Gemini chatbot. Google AI Studio is a tool for developers to build the new wave of Generative AI applications using Gemini foundational models. We will cover a general overview, run some demos using the just-released Gemini 1.5 Pro model and answer your questions about the future of AI!
In this talk, get an exclusive first look at Google's groundbreaking Gemini 1.5 in action! This multimodal language model features a Mixture of Experts (MoE) architecture and a revolutionary 1 million token context window, allowing it to understand complex inputs with exceptional depth. We'll explore live demos showcasing how this translates to vastly improved AI assistance for users and its impact on RAG systems.
In this talk, you will learn how to build a mini-Gemini Chatbot using Google's latest Generative AI tooling: Google AI Studio, the Gemini Pro model and Angular. Google AI Studio is a tool to build the new wave of Generative AI applications using Gemini foundational models. We will introduce the Gemini models to build the foundations of a Gemini Chatbot and explore advanced features like AI agents with the ability to use tools and call APIs; RAG (Retrieval-Augmented Generation) to improve grounding and extend Gemini beyond its training data cut-off with external data; and more! The Google Gemini era is here.
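A minimal sketch of the chatbot core with @google/genai might look like this; the model id is an assumption and the Angular UI around it is omitted.

```ts
// Minimal multi-turn chat sketch with @google/genai: the chat object keeps
// the conversation history for us. Model id is an assumption.
import { GoogleGenAI } from "@google/genai";

const ai = new GoogleGenAI({ apiKey: process.env.GEMINI_API_KEY });

const chat = ai.chats.create({ model: "gemini-2.5-flash" }); // assumed model id

const first = await chat.sendMessage({ message: "Hi! What can you do?" });
console.log(first.text);

const second = await chat.sendMessage({ message: "Summarise that in one line." });
console.log(second.text); // history from the first turn is carried over
```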
What if I told you that in just the last 2 years, AIs have learned to write in Shakespeare's style, to write code and even refactor it, and to generate images on the spot? A new generation of AIs is here to change the way we do things. Learn to code with ChatGPT by describing the solution and building it step by step.
In this workshop we will create a full-stack GraphQL app, starting from the server and building our way up to the client! We will cover everything you need to successfully adopt GraphQL across your stack, from client to backend, including tooling and best practices. You will learn how to build and design a GraphQL server, starting by defining the GraphQL schema using types and relations. Moving to the client side, we will create a simple client to demonstrate common usage. As we implement the different features we will introduce GraphQL query syntax, including queries, mutations, aliases, fragments and directives. At this point we will review how client and server communicate, what tooling is available to track usage and improve performance, and how to add authorisation and authentication. Finally we will focus on designing real-time features and sharing best practices to improve performance and leverage scalability.
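To make the server half concrete, here is a small illustrative sketch with Apollo Server: a schema with two related types, a query and a mutation backed by in-memory data. The domain model is a placeholder, not the workshop's actual example.

```ts
// Illustrative GraphQL server sketch: schema (types + relation), a query and
// a mutation, served with Apollo Server over in-memory data.
import { ApolloServer } from "@apollo/server";
import { startStandaloneServer } from "@apollo/server/standalone";

const typeDefs = /* GraphQL */ `
  type Author {
    id: ID!
    name: String!
    posts: [Post!]!      # relation resolved on demand
  }
  type Post {
    id: ID!
    title: String!
    author: Author!
  }
  type Query {
    posts: [Post!]!
  }
  type Mutation {
    addPost(title: String!, authorId: ID!): Post!
  }
`;

const authors = [{ id: "1", name: "Ada" }];
const posts = [{ id: "1", title: "Hello GraphQL", authorId: "1" }];

const resolvers = {
  Query: { posts: () => posts },
  Mutation: {
    addPost: (_: unknown, args: { title: string; authorId: string }) => {
      const post = { id: String(posts.length + 1), ...args };
      posts.push(post);
      return post;
    },
  },
  Post: {
    author: (post: { authorId: string }) => authors.find((a) => a.id === post.authorId),
  },
  Author: {
    posts: (author: { id: string }) => posts.filter((p) => p.authorId === author.id),
  },
};

const server = new ApolloServer({ typeDefs, resolvers });
const { url } = await startStandaloneServer(server, { listen: { port: 4000 } });
console.log(`GraphQL server ready at ${url}`);
```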
This is a talk on the OpenAI project and how close they are to creating powerful artificial general intelligence. Elon Musk said that "artificial general intelligence is likely to overtake humans in the next five years". We'll learn about the latest iteration of their powerful language model, GPT-3, and some of its incredible applications, including text and code generation using their invite-only APIs. GPT-3 can generate text in English, but it can also generate source code in JavaScript, Python or even SQL from natural language!
I'm here to tell you about Web3 and its potential in the future. We're all creators at heart, aren't we? And the prospect of doing away with intermediary corporations is a very exciting one. I also want to talk about Web3 London, a meetup group I just created where we explore the Web3 universe, bringing people together to share their experiences.
In this talk, we’ll explore a secret API design pattern: GraphQL directives and how to use them without breaking a sweat. Best of all, you will be able to add new features to your schema without having to change your existing code. A fantastic technique to add to your GraphQL toolset!
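As a taste of the pattern, here is a hedged sketch of a custom @upper schema directive applied with a @graphql-tools schema transform, so the existing resolver stays untouched; the directive name and field are illustrative.

```ts
// Hedged sketch: declare a custom @upper directive in the schema and apply it
// via a schema transform, without changing the existing resolver code.
import { makeExecutableSchema } from "@graphql-tools/schema";
import { mapSchema, getDirective, MapperKind } from "@graphql-tools/utils";
import { defaultFieldResolver, GraphQLSchema } from "graphql";

const typeDefs = /* GraphQL */ `
  directive @upper on FIELD_DEFINITION

  type Query {
    greeting: String @upper   # new behaviour added purely in the schema
  }
`;

const resolvers = {
  Query: { greeting: () => "hello from graphql" }, // unchanged resolver
};

// Wrap every field annotated with @upper so its result is upper-cased.
function upperDirectiveTransformer(schema: GraphQLSchema): GraphQLSchema {
  return mapSchema(schema, {
    [MapperKind.OBJECT_FIELD]: (fieldConfig) => {
      const directive = getDirective(schema, fieldConfig, "upper")?.[0];
      if (!directive) return fieldConfig;
      const { resolve = defaultFieldResolver } = fieldConfig;
      fieldConfig.resolve = async (source, args, context, info) => {
        const result = await resolve(source, args, context, info);
        return typeof result === "string" ? result.toUpperCase() : result;
      };
      return fieldConfig;
    },
  });
}

const schema = upperDirectiveTransformer(makeExecutableSchema({ typeDefs, resolvers }));
// `schema` can now be handed to Apollo Server, GraphQL Yoga, etc.
```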
In this talk I will present the state of the art for Web3, decentralised architectures and tools. You will learn how simple it is to create your own decentralised app using your current coding skills and JavaScript! Ethereum is also the platform that runs smart contracts: applications that run exactly as programmed without any possibility of downtime, censorship, fraud or third-party interference. Join to learn all the potential of the Ethereum blockchain!
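To show how far your existing JavaScript skills take you, here is a minimal sketch using ethers.js to read chain data and call a read-only contract function; the RPC endpoint, contract address and ABI fragment are placeholders.

```ts
// Minimal sketch of talking to Ethereum from JavaScript/TypeScript with
// ethers.js: read the latest block, then call a view function on a contract.
import { ethers } from "ethers";

const provider = new ethers.JsonRpcProvider("https://rpc.example.org"); // placeholder RPC endpoint

const blockNumber = await provider.getBlockNumber();
console.log("latest block:", blockNumber);

// Read-only call against a deployed smart contract (ERC-20 style name()).
const abi = ["function name() view returns (string)"];
const token = new ethers.Contract("0x0000000000000000000000000000000000000000", abi, provider); // placeholder address
console.log("token name:", await token.name());
```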