Voice AI Systems Are Vulnerable to Hidden Audio Attacks

May 18, 2026 · ai security books · Source ↗

TLDR

IEEE S&P paper shows imperceptible adversarial audio clips can hijack large audio-language models with 79-96% success across 13 models.

Technique dubbed AudioHijack embeds inaudible malicious instructions in audio clips; attacks are context-agnostic and reusable against the same model.
Tested against 13 open models plus commercial services from Microsoft and Mistral; demonstrated web searches, file downloads, and email exfiltration.
Training an attack signal takes ~30 minutes; once trained it works regardless of what the legitimate user says alongside the audio.
Real-world vectors include poisoned YouTube/music clips, voice notes, and live Zoom audio fed to AI transcription services.
Common defenses failed: few-shot instruction examples reduced success by only 7%, self-reflection caught only 28% of attacks.