Open source memory layer so any AI agent can do what Claude.ai and ChatGPT do


TLDR

  • Stash is an open-source, self-hosted MCP memory layer for any AI agent, backed by PostgreSQL + pgvector with a 6-stage synthesis pipeline.

Key Takeaways

  • Ships as a Docker Compose stack (3 commands): PostgreSQL with the pgvector extension, plus an MCP server exposing 28 tools that cover recall, goals, causal links, and contradiction resolution.
  • Memory is organized via hierarchical namespaces (e.g. /projects/restaurant-saas, /self/limits); recursive reads pull entire subtrees automatically.
  • Pipeline stages go beyond raw retrieval: Episodes → Facts → Relationships → Causal Links → Patterns → Contradictions, plus goal inference and failure-pattern extraction.
  • Model-agnostic by design: works with Claude, ChatGPT, Ollama, OpenRouter, vLLM, Groq, or any OpenAI-compatible endpoint; memory survives model switches.
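  • STASH_VECTOR_DIM must be set before the first run and cannot be changed afterward: pgvector fixes the embedding dimension when the schema is initialized, so the value has to match the output size of the embedding model in use.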
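
As a rough illustration of the recursive namespace reads mentioned above, here is a minimal in-memory sketch. It is not Stash's implementation or API: the function name read_namespace, the dict-based store, and the sample memories are all hypothetical, and the only behavior taken from the summary is that reading a namespace can pull in its whole subtree.

```python
from typing import Dict, List


def read_namespace(store: Dict[str, List[str]], path: str, recursive: bool = True) -> List[str]:
    """Return memories stored at `path`, plus everything beneath it when recursive.

    Hypothetical helper for illustration only; Stash's real MCP tools may differ.
    """
    root = path.rstrip("/") or "/"
    # Match on a trailing slash so "/projects" does not swallow "/projects-archive".
    prefix = "/" if root == "/" else root + "/"
    results: List[str] = []
    for namespace in sorted(store):
        is_exact = namespace == root
        is_descendant = recursive and namespace.startswith(prefix)
        if is_exact or is_descendant:
            results.extend(store[namespace])
    return results


# Made-up sample data using the namespace shapes quoted in the summary.
store = {
    "/projects/restaurant-saas": ["Uses PostgreSQL"],
    "/projects/restaurant-saas/billing": ["Avoid Stripe"],
    "/projects-archive": ["Old notes"],
    "/self/limits": ["No production access"],
}

print(read_namespace(store, "/projects"))
print(read_namespace(store, "/projects/restaurant-saas", recursive=False))
```

Matching on the path plus a trailing slash, rather than a bare string prefix, is the design choice that keeps sibling namespaces with similar names out of the subtree.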

Hacker News Comment Review

  • Core skepticism: commenters argue Stash is effectively RAG under a different name – pgvector plus MCP recall/remember functions – with no benchmarks proving retrieval is actually better than a folder of markdown files and grep.
  • The passive vs. active memory distinction drew pointed pushback. Claude.ai’s background summarization, which needs no write calls from the agent, is seen as meaningfully different from, and more reliable than, requiring the agent to call remember explicitly; the elaborate pipeline is moot if the agent forgets to invoke it.
  • Stale memory contamination is an underexplored failure mode: past decisions (e.g. “avoid Stripe”) could bias unrelated future sessions for weeks, and there is no described mechanism for expiry or relevance decay.
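
To make the relevance-decay concern concrete, the sketch below shows one way an age penalty could be applied at recall time. It is an assumption for illustration, not something Stash is described as doing: the Memory class, decayed_score, rank, the exponential form, and the 30-day half-life are all hypothetical.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Memory:
    text: str
    similarity: float  # retrieval similarity to the current query, in [0, 1]
    age_days: float    # time since the memory was written


def decayed_score(memory: Memory, half_life_days: float = 30.0) -> float:
    """Halve a memory's weight every `half_life_days` (hypothetical policy)."""
    return memory.similarity * 0.5 ** (memory.age_days / half_life_days)


def rank(memories: List[Memory], half_life_days: float = 30.0) -> List[Memory]:
    """Order memories by decayed score, highest first."""
    return sorted(memories, key=lambda m: decayed_score(m, half_life_days), reverse=True)


# Made-up example: an old, strongly matching decision vs. a recent, weaker match.
memories = [
    Memory("Avoid Stripe", similarity=0.9, age_days=90),
    Memory("Customer asked for card payments", similarity=0.6, age_days=3),
]

for m in rank(memories):
    print(f"{decayed_score(m):.3f}  {m.text}")
```

Under these assumed numbers the three-month-old decision falls below the recent note despite its higher raw similarity, which is the kind of behavior the commenters found missing from the description.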

Notable Comments

  • @aprilnya: Distinguishes Stash’s explicit store/recall model from Claude.ai’s passive background summarization, calling the latter “much, much better.”
  • @jFriedensreich: Flags memory rot and cross-session contamination as an underaddressed failure mode for long-running projects.
  • @cush: “There’s no proof in the site this is better than RAG or even a folder of memory files and grep” – questions whether any evaluation exists.
