Voice AI Systems Are Vulnerable to Hidden Audio Attacks


TLDR

  • An IEEE S&P paper shows that imperceptible adversarial audio clips can hijack large audio-language models, with 79-96% attack success across 13 models.

Key Takeaways

  • Technique dubbed AudioHijack embeds inaudible malicious instructions in audio clips; attacks are context-agnostic and reusable against the same model.
  • Tested against 13 open models plus commercial services from Microsoft and Mistral; demonstrated web searches, file downloads, and email exfiltration.
  • Training an attack signal takes ~30 minutes; once trained, it works regardless of what the legitimate user says alongside the audio.
  • Real-world vectors include poisoned YouTube/music clips, voice notes, and live Zoom audio fed to AI transcription services.
  • Common defenses failed: few-shot instruction examples reduced attack success by only 7%, and self-reflection caught only 28% of attacks.

Hacker News Comment Review

  • No substantive HN discussion yet.
