Website streamed live directly from a model

TLDR

  • Flipbook renders every browser page as a real-time AI-generated image; navigation works by clicking image regions, with no HTML or code layer underneath.

Key Takeaways

  • Every page is a pixel-rendered image; text, layout, and navigation hotspots are all produced by the image model, with no HTML layer underneath.
  • Clicking any image region generates a new image exploring that element in more depth; the entire navigation graph is computed on demand (see the sketch after this list).
  • Content is sourced from agentic web search plus the model's world knowledge; expected accuracy is comparable to ChatGPT, Gemini, or Claude.
  • A live video-stream feature combines a custom, optimized video generation model with the image system to animate pages and create seamless transitions.
  • Roadmap: real data integration, interactive actions, and data storage inside Flipbook; the goal is to replace separate apps and websites entirely.
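
To make the navigation mechanism concrete, here is a minimal client-side sketch of the click-to-drill-down loop. Flipbook's actual API is not public, so the /api/generate-page endpoint, the request and response shapes, and the pageContext field are all assumptions; only the overall flow (normalized click coordinates in, next rendered frame out) follows the description above.

```ts
// Hypothetical sketch of Flipbook-style navigation: every click on the
// current page image is forwarded to the model, which answers with the
// next fully rendered frame. Endpoint and payload shapes are assumed.
const page = document.querySelector<HTMLImageElement>("#page")!;
let pageContext = "homepage"; // hypothetical running description of the current page

page.addEventListener("click", async (e) => {
  const rect = page.getBoundingClientRect();
  // Normalize the click to [0, 1] so coordinates are resolution-independent.
  const x = (e.clientX - rect.left) / rect.width;
  const y = (e.clientY - rect.top) / rect.height;

  // Ask the image model for a new page exploring whatever sits under the
  // click; there is no HTML layer, the response is just the next frame.
  const res = await fetch("/api/generate-page", { // hypothetical endpoint
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify({ pageContext, click: { x, y } }),
  });
  const { imageUrl, nextContext } = await res.json(); // hypothetical response shape

  page.src = imageUrl;       // swap in the newly generated frame
  pageContext = nextContext; // carry context forward for the next step
});
```

Because there is no DOM underneath, the model alone decides which regions act as hotspots; the client only reports coordinates and swaps frames.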

Hacker News Comment Review

  • Reliability split: one commenter generated a detailed car-suspension torque-spec diagram with accurate figures and clickable drill-down components; another got fabricated, incoherent output on basic counting and an emoji's history.
  • Gemini 429 rate-limit errors surfaced directly in the UI during the traffic spike, confirming Gemini as the image backend and making per-query inference costs visible to end users (see the retry sketch after this list).
  • Skeptics frame it as too unreliable for any factual use case; defenders see the click-to-drill-down image paradigm as closer to a living illustrated reference manual than a chatbot interface.
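
Since the 429s reached end users directly, here is a hedged sketch of the retry-with-backoff wrapper that would normally absorb transient rate limits before they surface in the UI; per the thread, Flipbook showed the raw Gemini error instead. The endpoint name and call shape reuse the assumptions from the earlier sketch.

```ts
// Hypothetical retry wrapper: back off exponentially (with jitter) on 429
// responses instead of rendering the provider error to the user.
async function generateWithBackoff(body: unknown, maxRetries = 5): Promise<Response> {
  for (let attempt = 0; attempt < maxRetries; attempt++) {
    const res = await fetch("/api/generate-page", { // hypothetical endpoint
      method: "POST",
      headers: { "Content-Type": "application/json" },
      body: JSON.stringify(body),
    });
    if (res.status !== 429) return res; // success, or a non-rate-limit error
    // Exponential backoff with jitter before retrying the rate-limited call.
    const delayMs = 2 ** attempt * 500 + Math.random() * 250;
    await new Promise((resolve) => setTimeout(resolve, delayMs));
  }
  throw new Error("rate-limited after retries");
}
```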

Notable Comments

  • @giobox: generated correct torque specs for a specific car suspension with clickable components; called it “a living version of a classic illustrated Haynes workshop manual.”
  • @totallygeeky: fabricated the seahorse emoji backstory entirely and produced incomprehensible counting diagrams; argues the unreliability means the product is only usable when you cannot detect its errors.
