GenAI Applications
From PoC
to Production
- Hi, I am Johann.
- Old fart, bought the book about agents in 1986
- AI believer through two AI winters
- My playgrounds: https://huggingface.co/mayflowergmbh
https://github.com/mayflower/
- Founder / CTO / Investor / Advisor
- My company: Mayflower GmbH
Phase I
Information & Orientation
Be more productive by using 500 tools!
Phase II
Experiments &
first Investments
How long does it
take to create your first AI Prototype?
One Day?
One Week?
One Month?
It's fast, because somebody else does the actual work.
Domain specific
RAG +
Other APIs
- Python web frontend with 1000 LoC
- LlamaIndex, Langchain, ...
- Remote or local VectorDB
- External APIs
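Why the prototype is quick can be seen in a toy version of the retrieval step. This is only a sketch under loud assumptions: the documents are invented, the "embedding" is a bag-of-words count standing in for a real embedding model, and the in-memory list stands in for a vector DB. LlamaIndex or Langchain would replace most of this with library calls.

```python
import math
import re
from collections import Counter

# Hypothetical domain documents; a real prototype would load these into a vector DB.
DOCS = [
    "Invoices are archived for ten years in the finance system.",
    "Vacation requests must be approved by the team lead.",
    "The VPN client is installed from the internal software portal.",
]

def embed(text: str) -> Counter:
    """Toy embedding: bag-of-words term counts (assumption, not a real model)."""
    return Counter(re.findall(r"[a-z]+", text.lower()))

def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse term-count vectors."""
    dot = sum(a[t] * b[t] for t in a)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0

def retrieve(query: str, k: int = 1) -> list[str]:
    """Return the k documents most similar to the query."""
    q = embed(query)
    return sorted(DOCS, key=lambda d: cosine(q, embed(d)), reverse=True)[:k]

def build_prompt(query: str) -> str:
    """Stuff the retrieved context into a prompt for whatever LLM is used."""
    context = "\n".join(retrieve(query, k=1))
    return f"Answer using only this context:\n{context}\n\nQuestion: {query}"

print(build_prompt("Who approves vacation requests?"))
```

Everything hard (embedding quality, chunking, the model itself) is hidden behind the stand-ins, which is exactly why the first version takes a day.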
Phase III
Get it into Production
Production
- Environments & ALM: Dev, Integration, Prod, IaC
- Monitoring, Alerting, Observability
- Authentication, Encryption, Compliance, GDPR
"Hey, it's just a python container."
RAG is easy to prototype and
hard to deliver.
- Are all relevant answers found?
- Are all answers correct?
- Do we import properly?
- Do we embed reliably?
- Constant tweaking & tuning
- Reranking, Query augmentation
- Test datasets, regressions with RAGAS etc.
- RAFT with high read rates
RAG Reality
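The "test datasets, regressions" point can be sketched as a plain hit-rate check that runs on every change. The corpus, the questions, the keyword retriever and the baseline value below are all hypothetical, and this is not the RAGAS API; frameworks like RAGAS add richer metrics on top of the same idea.

```python
def hit_rate_at_k(retrieve, test_set, k=3):
    """Share of questions whose expected document id appears in the top-k results."""
    hits = sum(1 for question, expected_id in test_set if expected_id in retrieve(question, k))
    return hits / len(test_set)

# Hypothetical corpus, keyed by document id.
CORPUS = {
    "doc-hr": "vacation requests are approved by the team lead",
    "doc-it": "the vpn client is installed from the software portal",
    "doc-fin": "invoices are archived for ten years",
}

def keyword_retrieve(question, k):
    """Stand-in retriever: rank documents by the number of shared words."""
    words = set(question.lower().split())
    ranked = sorted(CORPUS, key=lambda doc_id: len(words & set(CORPUS[doc_id].split())), reverse=True)
    return ranked[:k]

# Hypothetical test dataset: (question, id of the document that must be found).
TEST_SET = [
    ("who approves vacation requests", "doc-hr"),
    ("where do i get the vpn client", "doc-it"),
    ("how long are invoices archived", "doc-fin"),
]

BASELINE = 1.0  # assumed hit rate of the last accepted release
score = hit_rate_at_k(keyword_retrieve, TEST_SET, k=1)
if score < BASELINE:
    raise SystemExit(f"retrieval regression: {score:.2f} < {BASELINE:.2f}")
print(f"hit rate@1 = {score:.2f}")
```

Swapping `keyword_retrieve` for the real retriever turns every tweak to chunking, embedding or reranking into a measurable change instead of a gut feeling.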
LLMOps is neither Ops nor MLOps.
- Prompt management
- Prompt performance
- Prompt logging & datasets for model drift / validation
- TokenOps
- LLM/agent observability: LangSmith, LangFuse
- Agent orchestration/LangGraph
- Traceability
LLMOps
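Prompt management, prompt logging and traceability boil down to one habit: every model call goes through a wrapper that records what was sent and what came back. The sketch below assumes a hypothetical versioned prompt registry and a fake model function; tools such as LangSmith or LangFuse provide this as a service, with their own APIs that are not shown here.

```python
import hashlib
import json
import time

# Hypothetical versioned prompt registry (prompt management).
PROMPTS = {"summarize@v2": "Summarize the following text in one sentence:\n{text}"}

def fake_llm(prompt: str) -> str:
    """Stand-in for a real model call: echoes the start of the last prompt line."""
    return prompt.splitlines()[-1][:40]

def traced_call(prompt_id: str, log: list, llm=fake_llm, **variables) -> str:
    """Render a registered prompt, call the model, and append a trace record."""
    prompt = PROMPTS[prompt_id].format(**variables)
    started = time.perf_counter()
    output = llm(prompt)
    log.append({
        "prompt_id": prompt_id,
        "prompt_sha": hashlib.sha256(prompt.encode()).hexdigest()[:12],
        "variables": variables,
        "output": output,
        "latency_ms": round((time.perf_counter() - started) * 1000, 2),
        # Crude whitespace count, not a real tokenizer.
        "approx_tokens": len(prompt.split()) + len(output.split()),
    })
    return output

trace_log = []
traced_call("summarize@v2", trace_log, text="LLMOps is neither Ops nor MLOps.")
print(json.dumps(trace_log[0], indent=2))
```

The logged records double as the dataset for later validation: replaying them against a new model or prompt version shows drift before users do.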
LLM Security
GenAI FinOps:
- TokenOps
- Prompt Compression
- GPT-Caching
- LLM Routing/Martian
- Reranking to shorten RAG
- Contextual Compression
- ...
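The cheapest token is the one never sent. A minimal sketch of response caching, assuming exact-match lookup on a normalized prompt; the class name and the counting model are made up for illustration, and semantic caches go further by matching on embedding similarity instead of identical text.

```python
import hashlib

class PromptCache:
    """Exact-match cache in front of an LLM callable (illustrative only)."""

    def __init__(self, llm):
        self.llm = llm
        self.store = {}
        self.hits = 0
        self.misses = 0

    def complete(self, prompt: str) -> str:
        # Normalize whitespace and case so trivial variations share one entry.
        key = hashlib.sha256(prompt.strip().lower().encode()).hexdigest()
        if key in self.store:
            self.hits += 1
            return self.store[key]
        self.misses += 1
        self.store[key] = self.llm(prompt)
        return self.store[key]

calls = []

def counting_llm(prompt: str) -> str:
    """Stand-in model that records how often it is actually invoked."""
    calls.append(prompt)
    return f"answer to: {prompt.strip()}"

cache = PromptCache(counting_llm)
cache.complete("What is our refund policy?")
cache.complete("what is our refund policy?  ")  # served from the cache
print(cache.hits, cache.misses, len(calls))
```

The hit and miss counters are the first FinOps metric: they show how much of the traffic is paid for twice.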
Small Language Models
Fast & Cheap
Phase IV
Integrated AI
Systems
AI is not copying and pasting to and from
a chat window.
- Hundreds of integrations already exist
- Including toolkits for
- Office365
- GoogleDocs
- Apify
- AWS Lambda
- Robocorp
- Atlassian
AI is not the automation of a single small step
within a large process.
Scaling AI
- AI agents don't mind
- working 1000 times in parallel
- doing a stupid job 1000 times in a row, in 5 minutes
- 50 or 5,000,000 customer requests a day?
- Did you evaluate 3 or 300 offers?
- Integrate 2 or 20 shops per month?
- Evaluate 10 or all job offers?
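The jump from 3 to 300 offers is mostly a question of fanning the same agent step out in parallel. A sketch with a thread pool, assuming a hypothetical `evaluate_offer` step and invented offer data; a real step would call an LLM or an external API, which is I/O-bound and therefore a reasonable fit for threads.

```python
from concurrent.futures import ThreadPoolExecutor

def evaluate_offer(offer: dict) -> dict:
    """Hypothetical agent step; a real one would call an LLM or an API."""
    return {"id": offer["id"], "accepted": offer["price"] <= 100}

# Invented data: 300 offers with prices from 50 to 349.
offers = [{"id": i, "price": 50 + i} for i in range(300)]

# pool.map keeps the input order, so results line up with offers.
with ThreadPoolExecutor(max_workers=32) as pool:
    results = list(pool.map(evaluate_offer, offers))

accepted = [r["id"] for r in results if r["accepted"]]
print(len(results), len(accepted))
```

In practice the limit is not the code but rate limits and token budgets, so `max_workers` is the knob to tune against the provider's quota.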
Portfolio Level AI
- Agents as universal problem solvers per business domain
- Every piece of functionality and data is an API, and every API is an agent tool
- Knowledge becomes a first-class citizen
- Learning loops and long-term memory result in constant adaptation
- Dashboards everywhere: from human time saved to failed self-critiques
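Treating every API as an agent tool can be sketched as a small registry: plain functions are registered with a description and parameter list, and a dispatcher executes the tool calls a model emits. The decorator, the order lookup and the tool-call shape below are assumptions for illustration, not the interface of any particular agent framework.

```python
import inspect

TOOLS = {}

def tool(fn):
    """Register a plain function as an agent tool with description and parameters."""
    TOOLS[fn.__name__] = {
        "call": fn,
        "description": inspect.getdoc(fn) or "",
        "parameters": list(inspect.signature(fn).parameters),
    }
    return fn

@tool
def get_order_status(order_id: str) -> str:
    """Look up the status of an order (hypothetical business API)."""
    return {"A-1": "shipped"}.get(order_id, "unknown")

def dispatch(tool_call: dict):
    """Execute a tool call shaped like {"name": ..., "arguments": {...}}."""
    entry = TOOLS[tool_call["name"]]
    return entry["call"](**tool_call["arguments"])

print(TOOLS["get_order_status"]["description"])
print(dispatch({"name": "get_order_status", "arguments": {"order_id": "A-1"}}))
```

The descriptions and parameter lists are what an agent would be shown to decide which tool to use, so a well-documented API is already most of a tool.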
From PoC to Production
By Johann-Peter Hartmann
Phases of AI development in startups and companies.