Testing LLM Algorithms While AI Tests Us
We may be fine tuning models, but they are coarse tuning us.
– Future Realization?
Principal Technology Strategist
Rob Ragan is a seasoned expert with 20 years experience in IT and 15 years professional experience in cybersecurity. He is currently a Principal Architect & Researcher at Bishop Fox, where he focuses on creating pragmatic solutions for clients and technology. Rob has also delved into Large Language Models (LLM) and their security implications, and his expertise spans a broad spectrum of cybersecurity domains.
Rob is a recognized figure in the security community and has spoken at conferences like Black Hat, DEF CON, and RSA. He is also a contributing author to "Hacking Exposed Web Applications 3rd Edition" and has been featured in Dark Reading and Wired.
Before joining Bishop Fox, Rob worked as a Software Engineer at Hewlett-Packard's Application Security Center and made significant contributions at SPI Dynamics.
'god from the machine'
The term was coined from the conventions of ancient Greek theater, where actors who were playing gods were brought on stage using a machine.
The experiment presents Mary, a scientist who lives in a black-and-white world. Mary possesses extensive knowledge about color through physical descriptions but lacks the actual perceptual experience of color. Although she has learned all there is to know about color, she has never personally encountered it. The main question of this thought experiment is whether Mary will acquire new knowledge when she steps outside of her colorless world and experiences seeing in color.
🤑🤑🤑
🤑🤑
🤑
💸💸💸
💸💸
💸
Architecture Security Assessment & Threat Modeling
Defining related components, trust boundaries, and intended attacks of the overall ML system design
Application Testing & Source Code Review
Vulnerability assessment, penetration testing, and secure coding review of the App+Cloud implementation
Red Team MLOps Controls & IR Plan
TTX, attack graphing, cloud security review, infrastructure security controls, and incident response capabilities testing with live fire exercises
Align Security Objectives with Business Requirements
Defining expected behavior & catching fraudulent behavior
Having non-repudiation for incident investigation
When executing GenAI (untrusted) code: vWASM & rWASM
GenAi is for asking the
right questions.
Security of GenAi is gracefully handling the wrong questions.
Output grounded in a sense of truth: 🚫Hallucinations
Basic usage prompts: Simple math and print commands to test basic capabilities
Hallucination test prompts: Invalid hash calculations to check for code execution, not just hallucination.
RCE prompts without jailbreak: Test echo strings and basic system commands like ls, id, etc.
RCE prompts with LLM jailbreak: Insert phrases to ignore previous constraints e.g. "ignore all previous requests".
RCE prompts with code jailbreak: Try subclass sandbox escapes like ().__class__.__mro__[-1]..
Network prompts: Use curl to connect back to attacker machine.
Backdoor prompts: Download and execute reverse shell scripts from attacker.
Output hijacking prompts: Modify app code to always return fixed messages.
API key stealing prompts: Modify app code to log and send entered API keys.
PDF: Demystifying RCE Vulnerabilities in LLM Integrated Apps
LLM Vulnerability Scanner: garak
LLM Vulnerability Scanner: garak
LLM Vulnerability Scanner: garak
LLM Vulnerability Scanner: garak
Counterfit: a CLI that provides a generic automation layer for assessing the security of ML models - counterfit github
HackAPrompt-AICrowd-Submissions: AutomatedLLMAttacker
MultiModalV
MultiModalV
Nvidia Red Team: Intro
Visualize the starting state and end goals: