semistructured.ai

Elevator Pitch

  • Small AI models (LLMs)
  • Running on end user devices (phone/laptop)
  • Enterprise-grade audit and monitoring

The Problem

  • AI is expensive
  • An Nvidia H100 GPU card costs ~$34k
  • Serving a mid-size model takes 8-16x H100s
  • For one customer!

Unit Economics of AI

  • Suppose your 16x H100 hardware costs ~$750k
  • Let's say the hardware has a 2-year service life
  • Define a typical user session as ~10 minutes
  • At 100% utilization over 2 years, that is ~1.05m compute-minutes, or ~105k 10-minute sessions served one at a time
  • So our baseline is roughly $7 for a 10-minute user session
  • If your model is bigger, or sessions are longer, $$$
  • If utilization is lower, $$$
  • OpenAI loses money on their $200/month pro plan
  • My wild guess is that a Deep Research session costs ~$20
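The unit economics above can be checked in a few lines. This is a sketch using the deck's own figures; it assumes one session at a time with no batching, so treat it as a single-stream upper bound on per-session cost:

```python
# Back-of-envelope serving cost from the deck's assumptions:
# a $750k 16x H100 node, 2-year service life, 10-minute sessions,
# 100% utilization, one session served at a time.
HARDWARE_COST_USD = 750_000
SERVICE_LIFE_MIN = 2 * 365 * 24 * 60   # minutes in 2 years = 1,051,200
SESSION_MIN = 10

sessions = SERVICE_LIFE_MIN / SESSION_MIN          # ~105k sessions
cost_per_session = HARDWARE_COST_USD / sessions    # ~$7.13

print(f"{sessions:,.0f} sessions, ${cost_per_session:.2f} per session")
```

Bigger models, longer sessions, or sub-100% utilization all push this number up; batching many concurrent users is the main lever that pulls it down.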

Scaling Laws for AI

  • AI is not getting cheaper every year like transistors
  • Gains are coming from more hardware and more data
  • Algorithmic breakthroughs bend the cost curve down
  • Hardware advances have been very slow
  • SOTA labs focus on "PhD-level" models at any cost
  • Other research directions focus on lower-cost models

Open-Source Models

  • AI also has an active academic research community
  • Research models are often freely available
  • Meta, DeepSeek, and others have given models away
  • Why?
    • Published research attracts the best talent
    • Free models erode the lead of the big players

Edge Hardware

  • Modern phones and laptops also have GPU chips
  • The lowest-cost iPhone 16e can run small AI models
  • So can most laptops, many Android phones
  • No one has figured out how to use this commercially
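A rough sanity check on "small models fit on phones": weight storage is roughly parameter count times bits per weight. This is a sketch; the 8 GB RAM figure for an iPhone 16e-class device and the 4-bit quantization level are illustrative assumptions, not claims from this deck:

```python
# Estimate the weight footprint of a quantized model and compare it
# to a phone-class memory budget.
def model_size_gb(params_billions: float, bits_per_weight: float) -> float:
    """Approximate weight storage in GB (ignores KV cache and runtime overhead)."""
    bytes_total = params_billions * 1e9 * bits_per_weight / 8
    return bytes_total / 1e9

# A 3B-parameter model quantized to 4 bits per weight:
size = model_size_gb(3, 4)        # ~1.5 GB of weights
PHONE_RAM_GB = 8                  # assumed modern-phone memory budget

print(f"~{size:.1f} GB of weights vs {PHONE_RAM_GB} GB RAM")
```

Even with runtime overhead, a few-billion-parameter model at 4-bit precision leaves headroom on current phone hardware, which is what makes on-device serving plausible.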

deck

By Richard Whaling
