Agent Name Service (ANS) in Action

A DNS-like Trust Layer for Secure, Scalable AI-Agent Deployments on Kubernetes

Akshay Mittal
PhD Researcher • IEEE Senior Member
MLOps World | GenAI Summit 2025 • Austin, Texas

Note: Good morning everyone! I'm excited to be here at MLOps World in Austin - my home city. I'm Akshay Mittal, an IEEE Senior Member and PhD researcher focusing on cloud-native AI security. Today I'll be presenting key results from my ongoing dissertation research on solving the critical gap in agent trust infrastructure for AI systems.

FROM MODELS TO AUTONOMOUS AGENTS

Traditional ML Pipeline:


Data → Train → Deploy → Monitor
👤 Human-supervised at every step

Agentic AI Reality:


🤖 Concept Drift Detection Agent → 🤖 Auto-Retraining Agent
→ 🤖 Deployment Agent → 🤖 Monitoring Agent

The Challenge: Who are these agents? Can we trust them?

Note: We're witnessing a fundamental shift in how we architect MLOps workflows. My research focuses on the agentic transition—where autonomous systems manage entire lifecycles and coordinate amongst themselves. But when orchestration is agent-driven, a new trust gap emerges: how do we securely identify, discover, and authorize these agents?

THE TRUST PROBLEM IN AGENT ECOSYSTEMS

Current Reality:

  • ❌ No uniform mechanism to discover AI agents
  • ❌ Lack of cryptographic authentication between agents
  • ❌ Missing capability verification and governance
  • ❌ Security gaps in agent-to-agent communication

At global scale: 1 compromised agent = cascading failures

Note: My dissertation work has quantified how trust breakdowns trigger cascading failures in cloud-native environments. Existing identity solutions weren't designed for autonomous workflows where systems spawn and destroy agents dynamically. We need an agent-native, cryptographically verifiable trust layer.

DNS vs ANS - The Missing Trust Layer

--

DNS (1987)

Domain Names → IP Addresses

  • ✓ Hierarchical naming
  • ✓ Distributed resolution
  • ✗ No capability verification
  • ✗ No cryptographic trust

--

ANS (2025)

Agent Names → Verified Capabilities + Trust

  • ✓ DNS-inspired hierarchy
  • ✓ Distributed resolution
  • ✓ Capability verification
  • ✓ PKI-based cryptographic trust

Note: DNS solved internet scalability by managing names; ANS solves agent ecosystem scalability by managing trust. The difference: ANS adds cryptographic verification, capability attestation, and governance support for agents.

ANS PROTOCOL DESIGN

Naming Convention:


protocol://AgentID.Capability.Provider.v[Version].Extension

Real Examples:


a2a://alerter.security-monitoring.research-lab.v2.prod
mcp://validator.concept-drift-detection.ml-platform.v1.hipaa
acp://remediator.helm-deployment-fix.devsecops-team.v3.staging
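
To make the convention concrete, here is a minimal parsing sketch in Python (the regular expression and the parse_ans_name helper are illustrative, not part of the ANS specification):

import re
from dataclasses import dataclass

# Pattern for protocol://AgentID.Capability.Provider.v[Version].Extension
ANS_NAME = re.compile(
    r"^(?P<protocol>[a-z0-9]+)://"
    r"(?P<agent_id>[a-z0-9-]+)\."
    r"(?P<capability>[a-z0-9-]+)\."
    r"(?P<provider>[a-z0-9-]+)\."
    r"v(?P<version>\d+)\."
    r"(?P<extension>[a-z0-9-]+)$"
)

@dataclass
class AnsName:
    protocol: str
    agent_id: str
    capability: str
    provider: str
    version: int
    extension: str

def parse_ans_name(name: str) -> AnsName:
    """Split an ANS name into its semantic components."""
    match = ANS_NAME.match(name)
    if not match:
        raise ValueError(f"not a valid ANS name: {name}")
    parts = match.groupdict()
    parts["version"] = int(parts["version"])
    return AnsName(**parts)

print(parse_ans_name("a2a://alerter.security-monitoring.research-lab.v2.prod"))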

Benefits:

  • Self-describing agent capabilities
  • Version-aware routing
  • Provider trust verification
  • Environment-specific deployment

Note: ANS borrows DNS hierarchy but adds semantic meaning. Each component tells us something critical: what protocol, what the agent does, who provides it, what version, and what environment. This enables intelligent routing - agents can discover exactly what they need.

CRYPTOGRAPHIC TRUST FOUNDATION

Components:

  • 🔐 Decentralized Identifiers (DIDs) → Globally unique, cryptographically verifiable
  • 📜 Verifiable Credentials (VCs) → Capability attestations and authorizations
  • 🏛️ Certificate Authority (CA) → Issues and manages agent certificates
  • Registration Authority (RA) → Validates agent registration and capabilities

Trust Chain: Root CA → Intermediate CA → Agent Certificate → Capability Proof
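
A minimal sketch of the chain-of-trust check, assuming a toy credential model (real deployments verify X.509 or DID signatures; here an issuer link stands in for a signature):

from dataclasses import dataclass, field

@dataclass
class Credential:
    subject: str                         # e.g. a DID such as "did:ans:alerter"
    issuer: str                          # who vouches for this credential
    claims: dict = field(default_factory=dict)

def chain_is_trusted(leaf, chain, trusted_roots):
    """Walk issuer links from the agent credential up to a trusted root CA."""
    current, seen = leaf, set()
    while current.issuer not in trusted_roots:
        if current.issuer in seen:       # guard against cycles
            return False
        seen.add(current.issuer)
        current = chain.get(current.issuer)
        if current is None:              # broken chain: unknown issuer
            return False
    return True

root = "did:ans:root-ca"
intermediate = Credential("did:ans:intermediate-ca", issuer=root)
agent_cert = Credential("did:ans:alerter", issuer="did:ans:intermediate-ca",
                        claims={"capability": "security-monitoring"})

chain = {intermediate.subject: intermediate}
print(chain_is_trusted(agent_cert, chain, trusted_roots={root}))   # True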

Note: PKI provides the cryptographic foundation for agent trust. Every agent gets a unique cryptographic identity. Verifiable credentials prove what an agent can actually do. Like mTLS for microservices, but capability-aware.

ZERO-KNOWLEDGE CAPABILITY PROOFS

--

Traditional Approach ❌


Agent → "I can access sensitive database"
Verifier → "Show me your database password"
❌ Secrets exposed during verification

--

ANS Zero-Knowledge Approach ✅


Agent → "I can prove I have database access
without revealing credentials"
Verifier → "Prove it cryptographically"
✅ Capability verified, secrets protected

Use Case: Agent proves model retraining capability without exposing API keys
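
A full zero-knowledge construction does not fit on a slide, but the core property, proving possession of a credential without transmitting it, can be sketched as a signed challenge. This is a simplified stand-in using the pyca/cryptography Ed25519 API, not the actual ANS proof system:

import os
from cryptography.exceptions import InvalidSignature
from cryptography.hazmat.primitives.asymmetric.ed25519 import Ed25519PrivateKey

# Agent side: a private key bound to the agent's capability credential.
agent_key = Ed25519PrivateKey.generate()
public_key = agent_key.public_key()        # published with the agent's ANS registration

# Verifier side: a fresh random challenge for every verification request.
challenge = os.urandom(32)

# The agent signs the challenge; the private key (the "secret") never leaves the agent.
proof = agent_key.sign(challenge)

try:
    public_key.verify(proof, challenge)
    print("capability proof accepted, no secret revealed")
except InvalidSignature:
    print("proof rejected")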

Note: Zero-knowledge proofs are game-changing for agent security. Agents can prove capabilities without revealing sensitive credentials. Critical for enterprise systems - we can verify system access without exposing keys. This enables secure capability delegation between agents.

MULTI-PROTOCOL SUPPORT

Supported Standards:

  • 🔄 A2A (Agent-to-Agent) - Google's emerging standard
  • 📡 MCP (Model Context Protocol) - Anthropic's framework
  • 🏢 ACP (Agent Communication Protocol) - IBM's enterprise protocol
  • 🔧 Custom Protocols - Extensible plugin architecture

Benefits:

  • Protocol-agnostic discovery
  • Future-proof architecture
  • Seamless migration between standards
  • Vendor-neutral approach
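
One way the extensible plugin architecture could be shaped (class and method names are illustrative, not taken from the reference implementation): each protocol registers an adapter, and resolution dispatches on the URI scheme.

from abc import ABC, abstractmethod

class ProtocolAdapter(ABC):
    """Plugin interface: each supported protocol supplies its own resolver."""
    scheme: str

    @abstractmethod
    def resolve(self, ans_name: str) -> dict:
        """Return endpoint and verified capability metadata for an ANS name."""

class A2AAdapter(ProtocolAdapter):
    scheme = "a2a"
    def resolve(self, ans_name: str) -> dict:
        return {"name": ans_name, "endpoint": "https://a2a.example.internal", "protocol": "a2a"}

class MCPAdapter(ProtocolAdapter):
    scheme = "mcp"
    def resolve(self, ans_name: str) -> dict:
        return {"name": ans_name, "endpoint": "https://mcp.example.internal", "protocol": "mcp"}

ADAPTERS = {a.scheme: a for a in (A2AAdapter(), MCPAdapter())}

def resolve(ans_name: str) -> dict:
    scheme = ans_name.split("://", 1)[0]
    adapter = ADAPTERS.get(scheme)
    if adapter is None:
        raise ValueError(f"no adapter registered for protocol '{scheme}'")
    return adapter.resolve(ans_name)

print(resolve("mcp://validator.concept-drift-detection.ml-platform.v1.hipaa"))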

Note: ANS doesn't lock you into one communication protocol. We support all major emerging standards - A2A, MCP, ACP. This is critical as the agent ecosystem is still evolving. Organizations can migrate protocols without rebuilding their entire agent infrastructure.

KUBERNETES-NATIVE ARCHITECTURE

Core Components:

  • 📋 ANS Registry → Custom Resource Definitions (CRDs)
  • 🚪 Admission Controller → Policy validation at deployment
  • 🔒 Service Mesh Integration → Istio/Linkerd mTLS
  • 🏠 Namespace Isolation → Multi-tenant agent deployment

Agent Lifecycle:


Register → Validate → Deploy → Authenticate → Monitor → Rotate
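
As a sketch of the registration step, an agent can be declared as a custom resource and created with the official Kubernetes Python client. The ANSAgent CRD group, kind, and spec fields below are hypothetical placeholders for whatever schema your registry defines:

from kubernetes import client, config

# Hypothetical ANSAgent custom resource; group, kind, and spec fields are
# illustrative, not the reference implementation's actual schema.
agent_manifest = {
    "apiVersion": "ans.example.io/v1alpha1",
    "kind": "ANSAgent",
    "metadata": {"name": "concept-drift-validator", "namespace": "ml-agents"},
    "spec": {
        "ansName": "mcp://validator.concept-drift-detection.ml-platform.v1.hipaa",
        "capabilities": ["concept-drift-detection"],
        "issuer": "research-lab-trusted-ca",
    },
}

config.load_kube_config()   # or config.load_incluster_config() inside the cluster
api = client.CustomObjectsApi()
api.create_namespaced_custom_object(
    group="ans.example.io",
    version="v1alpha1",
    namespace="ml-agents",
    plural="ansagents",
    body=agent_manifest,
)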

Note: ANS is built Kubernetes-native from the ground up. CRDs define agent metadata and capabilities declaratively. Admission controllers enforce policy before agents start. Service mesh provides transport security while ANS handles capability trust.

GITOPS INTEGRATION WORKFLOW

Pipeline Steps:

  1. Code Commit → Agent definition + Helm chart with ANS metadata
  2. Policy Validation → OPA policies enforce security requirements
  3. Certificate Provisioning → Sigstore integration for keyless signing
  4. Automated Deployment → ArgoCD/Flux deploys verified agents
  5. Runtime Verification → Continuous policy enforcement

Result: Declarative, auditable, reversible agent deployments
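
A minimal pre-sync gate, assuming agent manifests carry ANS metadata under spec (the required field names are illustrative): the pipeline refuses to hand a manifest to ArgoCD/Flux if that metadata is missing.

import sys
import yaml  # PyYAML

# Illustrative CI-side check; the full policy set lives in OPA.
REQUIRED_FIELDS = ["ansName", "capabilities", "issuer"]

def validate_manifest(path: str) -> list:
    """Return the list of required ANS metadata fields missing from the manifest."""
    with open(path) as f:
        manifest = yaml.safe_load(f)
    spec = manifest.get("spec", {})
    return [name for name in REQUIRED_FIELDS if name not in spec]

if __name__ == "__main__":
    missing = validate_manifest(sys.argv[1])
    if missing:
        print(f"rejecting deployment, missing ANS metadata: {missing}")
        sys.exit(1)
    print("ANS metadata present; handing off to ArgoCD")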

Note: GitOps ensures agent deployments are declarative and auditable. Every agent deployment goes through policy validation. Sigstore provides keyless signing for supply chain security. Failed policy checks prevent dangerous agents from starting.

POLICY-AS-CODE GOVERNANCE

OPA Policy Example:



# Only certified agents can access production data
allow {
    input.agent.certificate.issuer == "research-lab-trusted-ca"
    input.agent.capabilities["data-access"] == true
    input.environment == "production"
    input.agent.security_clearance >= 3
}

Policy Categories:

  • 🔐 Access Control → RBAC policies for agent interactions
  • 💾 Resource Limits → CPU/memory constraints
  • 🌐 Network Policies → Micro-segmentation rules
  • 📋 Compliance → Industry-specific governance
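
At runtime, a service can ask OPA for a decision over its REST data API. The input document below mirrors the fields the example policy checks; the package path ans/authz is illustrative and depends on where the rule is loaded (assumes a local OPA on port 8181 and the requests library):

import requests

# Input document mirroring the fields the policy above evaluates.
decision_input = {
    "input": {
        "agent": {
            "certificate": {"issuer": "research-lab-trusted-ca"},
            "capabilities": {"data-access": True},
            "security_clearance": 3,
        },
        "environment": "production",
    }
}

resp = requests.post("http://localhost:8181/v1/data/ans/authz/allow", json=decision_input)
print(resp.json())  # {"result": true} when every condition is met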

Note: OPA provides fine-grained, auditable policy enforcement. Policies are version-controlled and tested like application code. We can enforce compliance requirements at the platform level. Dynamic policies adapt to changing security requirements.

🎬 LIVE DEMONSTRATION

Demo Environment:

  • 🏗️ 3-node Kubernetes cluster (EKS)
  • 📊 ANS Registry with 50+ registered agents
  • 🔄 ArgoCD GitOps pipeline
  • 🛡️ OPA Gatekeeper with security policies
  • 📈 Prometheus + Grafana monitoring
  • 🔐 Sigstore certificate authority

Demo Scenario: New Concept Drift Detection Agent

Note: Let's see ANS in action with a real production scenario. I've prepared a live environment that mirrors our research setup. We'll simulate a concept drift detection scenario that shows how agents discover, authenticate, and orchestrate securely.

LIVE DEMO WORKFLOW

PLACEHOLDER FOR LIVE DEMONSTRATION

Step 1: Agent Registration


kubectl apply -f concept-drift-agent.yaml

# Watch: Real-time registration in ANS registry

Step 2: Policy Validation

  • OPA validates agent capabilities
  • Certificate provisioning via Sigstore
  • Authentication handshake establishment

Step 3: Orchestrated Workflow


Drift detected → Auto-retrainer triggered → Notifications sent
Complete workflow: <30 seconds

Note: [LIVE DEMO SECTION - 3 minutes] Watch how quickly agents discover and authenticate each other. Notice the automatic policy enforcement - no manual approval needed. The entire workflow completes in under 30 seconds with full cryptographic verification at each step.

RESEARCH RESULTS & BENCHMARKS

Performance Metrics from Research Deployments:

  • 📊 1,000+ daily agent interactions in experimental environments
  • <50ms average authentication latency
  • 99.9% certificate validation success rate
  • 🛡️ 100% policy compliance (zero false positives)
  • 🔍 Sub-second agent discovery times
  • 🚫 Zero successful agent impersonation attempts in testing

Research Impact:

  • 30% faster agent deployment vs manual processes
  • 95% reduction in misconfigured deployments
  • Zero downtime during certificate rotation

Note: These metrics come from our research testbed and academic collaborations. Sub-50ms authentication latency enables real-time agent orchestration. The 95% reduction in misconfigurations translates to more reliable autonomous systems. Zero-downtime certificate rotation ensures continuous operations.

AGENT ORCHESTRATION PATTERNS

--

Pattern: Concept Drift Response

Workflow:

  1. 📊 Monitoring Agent → Detects 15% performance degradation
  2. 🔍 Validation Agent → Confirms drift using statistical tests
  3. 🔄 Retraining Agent → Triggers automated model update
  4. 📢 Notification Agent → Alerts ML team via Slack

ANS Role:

  • Ensures only certified agents trigger retraining
  • Provides complete audit trail for compliance
  • Enables secure capability delegation
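
A compressed sketch of how this chain could be wired once every hop is resolved and verified through ANS (the agent names follow the earlier convention; resolve, verify, and invoke are placeholders for the registry lookup, credential check, and protocol call):

DRIFT_PIPELINE = [
    "a2a://monitor.performance-degradation.ml-platform.v1.prod",
    "a2a://validator.concept-drift-detection.ml-platform.v1.prod",
    "a2a://retrainer.model-update.ml-platform.v2.prod",
    "a2a://notifier.slack-alerts.ml-platform.v1.prod",
]

def run_drift_response(resolve, verify, invoke):
    """Each hop is resolved and its capability verified before it is invoked."""
    context = {"event": "performance_degradation", "delta": 0.15}
    for name in DRIFT_PIPELINE:
        agent = resolve(name)             # ANS lookup: endpoint + credential
        if not verify(agent):             # certificate and capability proof check
            raise PermissionError(f"refusing to delegate to unverified agent {name}")
        context = invoke(agent, context)  # call the agent over its protocol (a2a here)
    return context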

--

Pattern: Security Remediation

Automated Security Pipeline:

  1. 🕵️ Scanner Agent → Detects Helm chart misconfiguration
  2. 🧠 Policy Agent → Evaluates fixes against security standards
  3. 🔧 Remediation Agent → Automatically patches vulnerabilities
  4. ✅ Validation Agent → Confirms fixes don't break functionality

Research Example:

  • Detected: Missing resource limits in 47 test deployments
  • Fixed: Automatically added CPU/memory constraints
  • Time: 3 minutes, versus a 2-3 day manual process

Note: These patterns demonstrate autonomous agent coordination enabled by ANS trust infrastructure. The concept drift pattern has shown 40% faster response times in our research. Security remediation eliminates human error while maintaining audit trails.

YOUR ANS IMPLEMENTATION JOURNEY

Phase 1: Foundation (Weeks 1-2)

  • ✅ Deploy ANS in development environment
  • ✅ Configure basic OPA policies
  • ✅ Set up Sigstore certificate authority
  • ✅ Register first test agent

Phase 2: Integration (Weeks 3-4)

  • ✅ Integrate with existing GitOps pipeline
  • ✅ Migrate first production agent workload
  • ✅ Implement monitoring and alerting
  • ✅ Train team on ANS operations

Phase 3: Production Scale (Weeks 5-8)

  • ✅ Deploy to production with canary rollout
  • ✅ Scale to additional agent types
  • ✅ Optimize performance and security policies
  • ✅ Establish governance processes

Note: You can start small and scale incrementally. Phase 1 can be completed in a weekend with our open-source tools. By week 8, you'll have production-grade agent security that scales with your AI infrastructure.

OPEN SOURCE RESOURCES & COMMUNITY

🔗 ANS Reference Implementation

github.com/ruvnet/Agent-Name-Service

📦 Production-Ready Resources:

  • Helm charts for Kubernetes deployment
  • OPA policy templates for security governance
  • Agent implementation examples in multiple languages
  • Terraform modules for cloud deployment
  • Complete documentation and step-by-step tutorials

🌟 OWASP GenAI Security Project:

  • ANS v1.0 specification and standards
  • Security best practices and threat modeling
  • Research collaboration opportunities

💬 Community:

  • #ans-community in MLOps World Slack
  • Monthly research collaboration calls

Note: Everything demonstrated today is open source and available for research and production use. The OWASP collaboration ensures security standards are community-driven, and an active research community is contributing to standards, implementations, and best practices.

KEY TAKEAWAYS

Security: Cryptographic agent identity and capability verification

Scale: Handle 1,000+ agent interactions with sub-second latency

Governance: Policy-as-code enforcement with complete audit trails

Future-Proof: Protocol-agnostic design supports evolving standards

Research Contributions:

  • Formal trust model for agent ecosystems
  • Open-source reference implementation
  • Production benchmarks and performance analysis

Next Steps:

  1. 🚀 Try the demo: github.com/ans-demo
  2. 💬 Join research community: #ans-community
  3. 📧 Research collaboration: akshay.mittal@research.edu
  4. 🔗 Academic networking: /in/akshaymittal143

Note: ANS addresses real security gaps in autonomous AI systems that will become critical as agent adoption accelerates. The research provides formal foundations while the open-source implementation enables immediate practical application.

THANK YOU & Q&A

Akshay Mittal
PhD Researcher • IEEE Senior Member

📧 Research Contact: akshay.mittal@research.edu
💼 Professional Network: linkedin.com/in/akshaymittal143
🐙 Open Source: github.com/akshaykokane
🗨️ Community: #ans-community (MLOps World Slack)
📚 Research: IEEE Senior Member Profile

"Let's build the trust layer for autonomous AI together"

Questions & Discussion Welcome

Note: Thank you for your attention! I'm excited to answer your questions about the research, implementation details, security implications, or potential collaborations. Whether it's technical architecture, performance benchmarks, or research methodology - let's discuss how ANS can advance secure agent ecosystems.
