Zuckerberg 'Personally Authorized and Encouraged' Meta's Copyright Infringement

TLDR

  • Five publishers and Scott Turow sued Meta and Zuckerberg, alleging Meta torrented 267 TB of pirated material including LibGen to train Llama.

Key Takeaways

  • Lawsuit filed May 5 in SDNY by Hachette, Macmillan, McGraw Hill, Elsevier, and Cengage alongside author Scott Turow as a proposed class action.
  • Meta reportedly discussed a $200M dataset licensing budget in early 2023, then abruptly halted licensing after the matter was escalated to Zuckerberg; one employee noted that licensing even once would undermine a fair-use defense.
  • The suit alleges Meta stripped copyright management information (CMI) from ingested works to conceal its training sources, a claim that goes beyond the usual fair-use dispute.
  • Meta’s prior fair-use win (Judge Chhabria, June 2025, Silverman/Diaz case) covered transformative training use, but the Anthropic precedent found pirating the source material itself is separately infringing.
  • Meta did sign licensing deals with African-language publishers and news outlets (Fox News, CNN, USA Today), undermining a “no market exists” fair-use argument.

Hacker News Comment Review

  • Commenters are split on fair use: some see AI training as clearly transformative, others argue deliberately torrenting LibGen and stripping CMI puts Meta outside fair-use protection regardless of the training-use question.
  • The personal liability angle draws significant attention. Naming Zuckerberg directly is seen as a meaningful legal move, though corporate liability would exist either way. The Aaron Swartz comparisons are pointed and recurring.
  • Several builders report first-hand evidence of aggressive Meta scraping, with some resorting to ASN-level blocks after Meta ignored robots.txt across rotating netblocks.

Notable Comments

  • @jcalvinowens: Blocked Meta’s ASN on his personal cgit server after Meta's requests filled hundreds of megabytes of logs while rotating netblocks to defeat IP limiting.
  • @ben_w: Cites Anthropic settlement (~$3K per pirated work, $1.5B total on 500K works) as the damages precedent most relevant here.
