The author reproduced CVE-2026-31431 (Copy Fail) in a lab inside rootless Podman, traced it with strace and eBPF, and confirmed that user namespace UID mapping blocks host privilege escalation.
Key Takeaways
The exploit abuses AF_ALG (authencesn cipher) to write 4 bytes at a time into the kernel page cache of /usr/bin/su, bypassing file permissions entirely.
The embedded payload is a golfed ELF (section headers stripped), not raw shellcode; it calls setuid(0) then execve("/bin/sh") with a clean exit(0) fallback.
Rootless Podman user namespaces map container UID 0 to host UID 1000; setuid(0) succeeds inside the namespace but carries no host privilege.
The kernel’s secureexec transition hides the setuid(0) call from strace when ptrace is attached to a SUID binary; bpftrace on a host-side tracepoint captures it cleanly.
Vulnerable kernel range: all of 6.17.x; the fix was backported into stable starting at 6.19.12.
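The UID-mapping takeaway can be made concrete with a small sketch. It parses text in the /proc/<pid>/uid_map format (inside start, outside start, length) and translates a container UID to the UID it corresponds to outside the namespace. The sample mapping is hypothetical: the 0 to 1000 entry mirrors the takeaway above, while the 100000 subordinate range is an assumption typical of rootless setups, not something stated in the article.

```python
from typing import List, Optional, Tuple

UidMap = List[Tuple[int, int, int]]


def parse_uid_map(text: str) -> UidMap:
    """Parse uid_map text into (inside_start, outside_start, length) tuples."""
    entries: UidMap = []
    for line in text.splitlines():
        fields = line.split()
        if len(fields) == 3:
            entries.append((int(fields[0]), int(fields[1]), int(fields[2])))
    return entries


def to_outside_uid(inside_uid: int, entries: UidMap) -> Optional[int]:
    """Translate a UID inside the namespace to the UID outside it, if mapped."""
    for inside_start, outside_start, length in entries:
        if inside_start <= inside_uid < inside_start + length:
            return outside_start + (inside_uid - inside_start)
    return None  # unmapped UIDs have no identity outside the namespace


# Hypothetical mapping for illustration; read /proc/self/uid_map for real values.
SAMPLE_UID_MAP = """\
         0       1000          1
         1     100000      65536
"""

if __name__ == "__main__":
    entries = parse_uid_map(SAMPLE_UID_MAP)
    print(to_outside_uid(0, entries))   # 1000: container root is an ordinary host user
    print(to_outside_uid(33, entries))  # 100032: falls in the assumed subordinate range
```

Under this mapping, a successful setuid(0) inside the container still resolves to UID 1000 on the host, which is why the payload gains nothing outside the namespace.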
Hacker News Comment Review
The page-cache write primitive itself still works inside rootless containers; only the specific exploit payload is neutered. Any fd pointing to a shared resource could be a new attack vector.
Shared page cache across containers using the same base image layers is a real risk: a malicious CI job could corrupt a cached binary affecting sibling containers, even without host root.
Commenters debated whether blocking AF_ALG in the default seccomp profile is practical, with pushback that blanket socket blocking is too broad a policy for general-purpose containers.
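For context on the seccomp debate, here is a rough sketch of what a family-specific rule could look like, as opposed to blocking socket() wholesale. It builds one rule entry in the Docker/Podman-style seccomp profile JSON layout as I understand it; the field names, where the entry belongs in a full profile, and how it interacts with existing socket rules are assumptions to verify against your runtime's documentation. The helper name is hypothetical.

```python
import json

# Linux address-family number for AF_ALG (the value of socket.AF_ALG on Linux).
AF_ALG = 38


def deny_af_alg_rule() -> dict:
    """Build a seccomp rule entry that rejects socket() only when arg 0 is AF_ALG.

    Assumption: the runtime accepts Docker/Podman-style profile entries with
    "names", "action", and "args" fields.
    """
    return {
        "names": ["socket"],
        "action": "SCMP_ACT_ERRNO",
        "args": [
            # index 0 is the first socket() argument, the address family
            {"index": 0, "value": AF_ALG, "op": "SCMP_CMP_EQ"},
        ],
    }


if __name__ == "__main__":
    print(json.dumps(deny_af_alg_rule(), indent=2))
```

Because the rule matches only on the address-family argument, other socket families are left alone. Whether that is narrow enough for general-purpose containers is the point the commenters disagreed on.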
Notable Comments
@amluto: “The write-to-RO-page-cache primitive STILL WORKED” – flags that the container's protection is exploit-specific, not primitive-specific.
@netheril96: Notes that Copy Fail can corrupt /etc/ssl/certs to enable MitM attacks across containers sharing an image, bypassing both the rootless and CapabilityBoundingSet mitigations.
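Since the comments stress that the primitive matters more than the payload, it can help to check whether a given container exposes the AF_ALG interface at all. The sketch below is a benign probe: it only tries to create an AF_ALG socket and bind it to an algorithm name, then closes it. The full template string is an assumption, since the article names only the authencesn cipher, and the function name and result labels are hypothetical.

```python
import errno
import socket

# Assumption: a plausible full template name; the source only says "authencesn".
ALG_NAME = "authencesn(hmac(sha256),cbc(aes))"


def probe_af_alg(alg_type: str = "aead", alg_name: str = ALG_NAME) -> str:
    """Report whether an AF_ALG socket can be created and bound to alg_name."""
    family = getattr(socket, "AF_ALG", None)
    if family is None:
        return "unsupported"  # not Linux, or Python built without AF_ALG
    try:
        sock = socket.socket(family, socket.SOCK_SEQPACKET, 0)
    except OSError as exc:
        # e.g. a seccomp filter or a kernel without AF_ALG support
        return "socket-denied:" + errno.errorcode.get(exc.errno, str(exc.errno))
    with sock:
        try:
            sock.bind((alg_type, alg_name))
        except OSError as exc:
            # e.g. ENOENT when the algorithm is not available
            return "bind-failed:" + errno.errorcode.get(exc.errno, str(exc.errno))
    return "reachable"


if __name__ == "__main__":
    print(probe_af_alg())
```

A "reachable" result says only that the interface is exposed to the process, not that the kernel is vulnerable; the kernel version still decides that.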