Machine Learning to Agentic AI

A Journey in the Transformation of Technology

From datasets and feature engineering to context-based embeddings in the Transformer

Slide 1: The Arc of AI — Why This Journey Matters

  • AI has evolved from rule-based systems → statistical ML → deep learning → foundation models → agentic systems
  • Each leap was not just technical — it changed what problems we could solve
  • Today, AI can reason, plan, use tools, and act autonomously
  • This presentation traces that transformation, step by step

Slide 2: The Machine Learning Era (2010–2015)

The Foundation

  • ML was dominated by supervised learning with hand-crafted features
  • Key workflow: collect data → engineer features → train model → evaluate

What Was Possible

Task                    Approach
Spam detection          Naive Bayes, SVMs
Image classification    HOG features + SVM
Recommendation          Matrix factorization
Speech                  HMM + GMM

The Bottleneck

Feature engineering was the art — and the limitation. A model was only as good as the human intuition behind its features.
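The era's workflow can be sketched end to end in plain Python: hand-crafted features (here, simple word counts, standing in for the "engineer features" step) feeding a Naive Bayes spam classifier. The corpus, tokenizer, and labels below are toy assumptions invented for illustration, not a real dataset.

```python
import math
from collections import Counter

# Toy labeled corpus — the "collect data" step (illustrative only)
train = [
    ("win cash prize now", "spam"),
    ("free prize claim now", "spam"),
    ("meeting agenda for monday", "ham"),
    ("lunch on monday with the team", "ham"),
]

def features(text):
    # Hand-crafted feature extraction: lowercase word counts
    return Counter(text.lower().split())

# "Train": accumulate word counts per class
counts = {"spam": Counter(), "ham": Counter()}
docs = Counter()
for text, label in train:
    counts[label].update(features(text))
    docs[label] += 1

vocab = set(w for c in counts.values() for w in c)

def predict(text):
    # Multinomial Naive Bayes with Laplace smoothing
    scores = {}
    for label in counts:
        score = math.log(docs[label] / sum(docs.values()))  # log prior
        total = sum(counts[label].values())
        for w, n in features(text).items():
            p = (counts[label][w] + 1) / (total + len(vocab))
            score += n * math.log(p)
        scores[label] = score
    return max(scores, key=scores.get)

print(predict("claim your free cash prize"))  # spam
```

Swap the word-count features for character n-grams or TF-IDF weights and accuracy changes — exactly the human-intuition bottleneck the slide describes.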

Slide 3: The Deep Learning Revolution (2012–2016)

The Turning Point: AlexNet (2012)

  • Won the ImageNet competition, cutting the top-5 error rate from ~26% to 15.3% — a stunning leap
  • Used Convolutional Neural Networks (CNNs) on GPUs
  • The machine learned its own features — no manual engineering

What Deep Learning Unlocked

  • CNNs → Image recognition, object detection, medical imaging
  • RNNs / LSTMs → Sequential data, language modeling, time series
  • GANs → Generative image synthesis
  • Word2Vec / GloVe → Semantic word embeddings

The New Paradigm

"Instead of telling the model what to look for, we give it enough data and let it figure it out."

Slide 4: The Limits of RNNs — The Problem Transformers Solved

Why RNNs Struggled with Language

  • Processed tokens sequentially — slow, hard to parallelize
  • Vanishing gradients — long-range dependencies were lost
  • Could not easily model: "The animal didn't cross the street because it was too tired" — what does "it" refer to?

The Core Problem

Input: "The bank by the river was ..."
                  ↑
         What does "bank" mean?
         Context is spread across the sentence.

RNNs forget early context. Language needs global context awareness.

Slide 5: "Attention Is All You Need" — The Transformer (2017)

The Paper That Changed Everything

  • Published by Vaswani et al. at Google Brain, NeurIPS 2017
  • Introduced the Transformer architecture — no recurrence, no convolutions
  • Built entirely on the self-attention mechanism

The Key Insight: Self-Attention

  • Every token can attend to every other token simultaneously
  • The model learns which words are relevant to which — dynamically
"The animal didn't cross the street because it was too tired"
                                              ↑
                              Attention links "it" → "animal"

Why It Was Revolutionary

  • Parallelizable → massive GPU utilization
  • Scalable → more data + more compute = better model
  • Context-aware → understands meaning, not just patterns

Slide 6: How the Transformer Works — A Brief Deep Dive

Architecture Overview

Input Text
    ↓
[Tokenization + Positional Encoding]
    ↓
[Multi-Head Self-Attention Layer] × N
    ↓
[Feed-Forward Layer]
    ↓
[Layer Normalization + Residual Connections]
    ↓
Output Probabilities

Three Key Mechanisms

1. Tokenization + Embeddings

  • Text → tokens → vectors in high-dimensional space
  • Similar meanings cluster together geometrically
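The geometric clustering can be illustrated with cosine similarity over toy 3-dimensional vectors (the vectors below are invented for illustration; real embeddings have hundreds or thousands of dimensions):

```python
import math

def cosine(a, b):
    # Cosine similarity: dot product divided by the product of norms
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.hypot(*a) * math.hypot(*b))

# Invented toy embeddings — directions chosen so related words align
king  = [0.9, 0.8, 0.1]
queen = [0.8, 0.9, 0.2]
apple = [0.1, 0.2, 0.9]

print(cosine(king, queen))  # high — similar meanings point the same way
print(cosine(king, apple))  # low — unrelated words diverge
```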

2. Self-Attention (Q, K, V)

  • Query (Q): "What am I looking for?"
  • Key (K): "What do I contain?"
  • Value (V): "What do I contribute?"
  • Attention(Q, K, V) = softmax(QKᵀ / √d_k) · V
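The mechanism can be traced in pure Python for a single query attending over three key/value pairs (the 2-d vectors are toy values invented for illustration):

```python
import math

def softmax(xs):
    # Subtract the max for numerical stability
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    s = sum(exps)
    return [e / s for e in exps]

def attention(q, keys, values):
    d = len(q)
    # Scaled dot-product scores: (q · k) / sqrt(d)
    scores = [sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d)
              for k in keys]
    weights = softmax(scores)
    # Output: values mixed in proportion to their attention weights
    dim = len(values[0])
    out = [sum(w * v[i] for w, v in zip(weights, values)) for i in range(dim)]
    return out, weights

q      = [1.0, 0.0]
keys   = [[1.0, 0.0], [0.0, 1.0], [0.5, 0.5]]
values = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]
out, weights = attention(q, keys, values)
print(weights)  # largest weight on the first key — it best matches the query
```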

3. Positional Encoding

  • Since there's no recurrence, position is injected via sinusoidal signals
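A minimal sketch of the sinusoidal scheme from the paper — sin/cos pairs at geometrically spaced frequencies, so each position gets a unique, smoothly varying signature:

```python
import math

def positional_encoding(pos, d_model):
    # PE(pos, 2i)   = sin(pos / 10000^(2i / d_model))
    # PE(pos, 2i+1) = cos(pos / 10000^(2i / d_model))
    pe = []
    for i in range(0, d_model, 2):
        freq = pos / (10000 ** (i / d_model))
        pe.append(math.sin(freq))
        pe.append(math.cos(freq))
    return pe[:d_model]

print(positional_encoding(0, 4))  # [0.0, 1.0, 0.0, 1.0] — position zero
```

The encoding is added to each token's embedding, so the same word at different positions enters the attention layers as a different vector.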

Slide 7: Large Language Models (LLMs) — Scale Changes Everything

The Scaling Hypothesis

More parameters + more data + more compute = emergent capabilities

Key Milestones

Year     Model                                  Parameters
2018     BERT (Google)                          340M
2020     GPT-3 (OpenAI)                         175B
2022     ChatGPT (instruction-tuned GPT-3.5)    —
2023     GPT-4, Claude, Llama                   100B–1T+
2024–25  Claude 3.5/4, GPT-4o, Gemini           multimodal, reasoning

What Emerged at Scale

  • In-context learning (few-shot prompting)
  • Chain-of-thought reasoning
  • Code generation, math, logical deduction
  • Instruction following (RLHF)

Slide 8: Retrieval-Augmented Generation (RAG)

The Problem LLMs Have

  • Knowledge is frozen at training time
  • Cannot access private/internal data
  • Prone to hallucination on specific facts

RAG: The Solution

User Query
    ↓
[Embedding Model] → Vector Search → [Knowledge Base / Documents]
    ↓
Relevant Context Retrieved
    ↓
Context + Query → [LLM] → Grounded Answer

Why RAG Matters

  • Combines the reasoning power of LLMs with up-to-date, verifiable data
  • Powers enterprise AI, customer support bots, document Q&A
  • Reduces hallucinations by grounding responses in retrieved facts
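The pipeline above can be sketched with a toy retriever: bag-of-words vectors stand in for the embedding model, cosine similarity stands in for the vector search, and the final LLM call is left as a prompt string. The documents and query below are invented for illustration.

```python
import math
from collections import Counter

# Toy knowledge base (invented documents)
docs = [
    "Our refund policy allows returns within 30 days.",
    "The office is closed on public holidays.",
    "Shipping takes 3 to 5 business days.",
]

def embed(text):
    # Stand-in for a real embedding model: bag-of-words counts
    words = text.lower().replace("?", " ").replace(".", " ").split()
    return Counter(words)

def cosine(a, b):
    dot = sum(a[w] * b[w] for w in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb)

def retrieve(query, k=1):
    # "Vector search": rank documents by similarity to the query
    q = embed(query)
    ranked = sorted(docs, key=lambda d: cosine(q, embed(d)), reverse=True)
    return ranked[:k]

query = "How many days for a refund return?"
context = retrieve(query)[0]
# In a real system, this grounded prompt now goes to the LLM:
prompt = f"Context: {context}\nQuestion: {query}"
print(context)
```

A production system would replace `embed` with a trained embedding model and `docs` with a vector database, but the retrieve-then-ground shape is the same.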

Slide 9: Model Context Protocol (MCP)

The Problem

  • AI models are isolated — they can't talk to your tools, databases, or APIs natively
  • Every integration required custom code

MCP: A Standard Interface

  • Introduced by Anthropic (2024) as an open protocol
  • Like USB-C for AI — a universal connector between LLMs and external tools
LLM (Claude, GPT...)
        ↓
   [MCP Client]
        ↓
   [MCP Server]  ←→  [File System / GitHub / Slack / Database / APIs]

What It Enables

  • AI models that can read files, query databases, call APIs
  • Standardized tool definitions — write once, works with any MCP-compatible model
  • Foundation for truly capable agentic systems
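The "write once" idea — tools described as data (name, description, JSON-Schema-style input spec) that any compatible client can discover and invoke — can be illustrated with a toy registry. This is an invented sketch of the concept, not the real MCP SDK or its wire protocol.

```python
# Illustrative sketch only — NOT the real MCP SDK. It shows the core
# idea: tool definitions as data, discoverable and callable by clients.
TOOLS = {}

def tool(name, description, input_schema):
    """Register a function as a discoverable tool."""
    def register(fn):
        TOOLS[name] = {
            "description": description,
            "inputSchema": input_schema,   # JSON-Schema-style spec
            "handler": fn,
        }
        return fn
    return register

@tool("add", "Add two numbers",
      {"type": "object",
       "properties": {"a": {"type": "number"}, "b": {"type": "number"}}})
def add(a, b):
    return a + b

def list_tools():
    # What a client would fetch to discover server capabilities
    return {name: t["description"] for name, t in TOOLS.items()}

def call_tool(name, **kwargs):
    # Dispatch a client's invocation to the registered handler
    return TOOLS[name]["handler"](**kwargs)

print(list_tools())
print(call_tool("add", a=2, b=3))  # 5
```

Real MCP runs this discover/invoke exchange over JSON-RPC between a client embedded in the AI application and servers that wrap file systems, databases, or APIs.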

Slide 10: The Rise of Agentic AI

From "Answer Bot" to "Action Taker"

Generation           Capability
LLM (2020)           Generate text responses
LLM + RAG (2022)     Answer with retrieved knowledge
LLM + Tools (2023)   Call APIs, run code, search web
Agentic AI (2024+)   Plan, act, self-correct, collaborate

What Makes AI "Agentic"?

  1. Memory — short-term (context), long-term (vector store)
  2. Tools — ability to take actions (search, code, write files)
  3. Planning — multi-step reasoning toward a goal
  4. Self-evaluation — assess own outputs and retry if needed

"An agent doesn't just answer — it acts."

Slide 11: Types of Agentic AI Patterns

Pattern 1: Prompt Chaining

Task → [LLM Step 1] → Output 1 → [LLM Step 2] → Output 2 → Final Result
  • Sequential pipeline of prompts
  • Each step refines or transforms the previous output
  • Use case: Draft → Edit → Translate → Summarize
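With the LLM call stubbed out as a plain function, prompt chaining reduces to function composition (the stub and step prompts below are invented placeholders for real model calls):

```python
def llm(prompt):
    # Stub standing in for a real LLM call (invented for illustration)
    return f"[{prompt}]"

def draft(topic):
    return llm(f"Draft a paragraph about {topic}")

def edit(text):
    return llm(f"Tighten this text: {text}")

def summarize(text):
    return llm(f"Summarize in one line: {text}")

# Each step consumes the previous step's output
result = summarize(edit(draft("agentic AI")))
print(result)
```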

Pattern 2: Routing

Input → [Router LLM] → classify → [Specialist Agent A]
                               → [Specialist Agent B]
                               → [Specialist Agent C]
  • A classifier decides which expert agent handles the request
  • Use case: Customer support triage, multi-domain Q&A

Pattern 3: Parallelization

                    ┌→ [Agent A: Research] ─┐
Input → [Splitter] ─┤→ [Agent B: Code]     ├→ [Aggregator] → Result
                    └→ [Agent C: Verify]   ─┘
  • Tasks are decomposed and run concurrently
  • Use case: Market research, multi-source analysis

Pattern 4: Orchestrator-Worker

[Orchestrator Agent]
    ↓ assigns sub-tasks
[Worker 1] [Worker 2] [Worker 3]
    ↓ results
[Orchestrator] → synthesizes → Final Output
  • Central planner delegates to specialized workers
  • Use case: Software development, complex research reports

Pattern 5: Evaluator-Optimizer

[Generator Agent] → Output → [Evaluator Agent]
        ↑                           ↓
        └──── Feedback / Retry ─────┘
                    ↓ (when quality threshold met)
               Final Output
  • Built-in quality control loop
  • Use case: Code review, content quality assurance, test generation
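The evaluator-optimizer loop can be sketched with stub agents: a generator drafts, an evaluator scores and feeds back, and the loop retries until a quality threshold or a round limit is hit. Both agents below are invented stand-ins for LLM calls.

```python
def generate(task, feedback=None):
    # Stub generator: incorporates feedback on retry
    draft = f"Answer to '{task}'"
    if feedback:
        draft += f" (revised: {feedback})"
    return draft

def evaluate(draft):
    # Stub evaluator: real systems would use an LLM judge or tests
    score = 0.9 if "revised" in draft else 0.4
    feedback = None if score >= 0.8 else "add detail"
    return score, feedback

def run(task, threshold=0.8, max_rounds=3):
    feedback = None
    for _ in range(max_rounds):
        draft = generate(task, feedback)
        score, feedback = evaluate(draft)
        if score >= threshold:
            return draft          # quality threshold met
    return draft                  # give up after max_rounds

print(run("summarize the report"))
```

The round limit matters in practice: without it, a generator that never satisfies the evaluator loops (and bills) forever.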

Slide 12: Frameworks for Building Agentic AI

LangChain

  • The most widely adopted agentic AI framework
  • Provides abstractions for: chains, agents, memory, tools, retrievers
  • Large ecosystem of integrations (100+ tools)
  • Best for: Rapid prototyping, RAG pipelines, tool-using agents
from langchain.agents import initialize_agent, AgentType
from langchain.chat_models import ChatAnthropic

# search_tool and calculator_tool are Tool instances defined elsewhere
agent = initialize_agent(
    tools=[search_tool, calculator_tool],
    llm=ChatAnthropic(model="claude-sonnet-4-6"),
    agent=AgentType.ZERO_SHOT_REACT_DESCRIPTION,
)
agent.run("Research the latest trends in AI and summarize them")

CrewAI

  • Framework for multi-agent collaboration
  • Models agents as a "crew" with defined roles, goals, and backstories
  • Built on top of LangChain
  • Best for: Complex workflows requiring specialized, collaborating agents
from crewai import Agent, Task, Crew

researcher = Agent(role="Researcher", goal="Find key AI trends",
                   backstory="Tracks the AI landscape daily")
writer     = Agent(role="Writer", goal="Write a clear summary",
                   backstory="Turns research into plain language")

research = Task(description="Research current Agentic AI trends",
                expected_output="A bullet list of key trends",
                agent=researcher)
write    = Task(description="Write a short article from the research",
                expected_output="A three-paragraph summary",
                agent=writer)

crew = Crew(agents=[researcher, writer], tasks=[research, write])
crew.kickoff()

Other Notable Frameworks

Framework             Strength
AutoGen (Microsoft)   Multi-agent conversations
LlamaIndex            Advanced RAG & data pipelines
Semantic Kernel       Enterprise .NET/Python integration
AutoGen Studio        Visual multi-agent builder

Slide 13: Great Minds Shaping the Agentic AI Era

Andrej Karpathy

  • Former Director of AI at Tesla, co-founder of OpenAI
  • Coined the term "Vibe Coding" — the experience of coding with AI where you describe intent and the model handles implementation
  • Famously said he felt "10% behind" after Code Agents emerged — highlighting how fast the field is moving
  • Creator of nanoGPT, minbpe — education tools for understanding transformers from scratch
  • Advocate for deeply understanding the fundamentals before relying on abstractions

Other Influential Voices

Person            Contribution
Geoffrey Hinton   Godfather of deep learning, neural nets
Yann LeCun        CNNs, Meta Chief AI Scientist
Sam Altman        Driving GPT/ChatGPT to mass adoption
Dario Amodei      Anthropic CEO, AI safety focus
Ilya Sutskever    OpenAI co-founder, scaling laws
Harrison Chase    Creator of LangChain

Slide 14: The Transformation — Then vs. Now

Dimension            ML Era (2012)              Agentic AI Era (2025)
Input                Structured datasets        Natural language
Feature engineering  Manual, domain expertise   Learned automatically
Model role           Predict a label            Reason, plan, and act
Human involvement    Every step                 High-level goal setting
Knowledge            Frozen in weights          Dynamic via RAG + tools
Output               Classification / number    Code, reports, decisions
Collaboration        Single model               Multi-agent systems

Slide 15: Where We Are Heading

The Near Future of Agentic AI

  • Autonomous software engineers — agents that open GitHub issues, write code, run tests, submit PRs
  • Personal AI assistants with persistent memory and long-horizon planning
  • Multi-modal agents that can see, hear, speak, and act
  • Agent-to-agent economies — agents hiring other agents to complete sub-tasks

Key Open Challenges

  1. Reliability — agents still make mistakes and hallucinate
  2. Safety — how do we control autonomous actions?
  3. Evaluation — how do we measure agentic performance?
  4. Cost — multi-agent pipelines can be expensive
  5. Trust — when do we let AI act without human approval?

Slide 16: Summary — The Full Journey

2010: Feature Engineering + SVMs
         ↓
2012: Deep Learning — CNNs (AlexNet)
         ↓
2014: RNNs, LSTMs — Sequential Learning
         ↓
2017: Transformer — "Attention Is All You Need"
         ↓
2018–2020: Large Language Models — BERT, GPT-3
         ↓
2022: Instruction Tuning + RLHF — ChatGPT
         ↓
2023: LLMs + Tools + RAG — Grounded AI
         ↓
2024: MCP + Agentic Patterns — Action-taking AI
         ↓
2025+: Multi-Agent Systems — Collaborative AI

"We went from teaching machines to recognize cats, to building systems that can think, plan, and act in the world."

Thank You

Key Takeaways:

  1. Deep learning eliminated manual feature engineering
  2. Transformers made language a first-class problem for AI
  3. Scale unlocked emergent reasoning capabilities
  4. RAG + MCP + Tools make AI grounded and actionable
  5. Agentic patterns define how AI systems are architected today
  6. Frameworks like LangChain and CrewAI make this accessible

Presentation prepared February 2026
