Memory
Persistent memory systems for context-aware agents.
Memory Architecture
Riven Agents use a tiered memory system that balances speed, capacity, and shareability. Memory allows agents to maintain context across sessions, learn from past interactions, and coordinate with other agents.
The three memory tiers — short-term, long-term, and shared — can be configured independently and combined to match your agent's requirements.
Memory is optional. Stateless agents work without any memory configuration, but enabling memory significantly improves performance on recurring tasks.
Short-term Memory
Short-term memory holds the agent's current conversation context and active task state. It is stored in-memory and scoped to a single session. When the session ends, short-term memory is discarded unless explicitly persisted.
Short-term memory uses a sliding window strategy with configurable token limits. When the window fills, older messages are summarized and compressed to make room for new context.
```yaml
memory:
  short_term:
    backend: conversation
    max_tokens: 32000
    summarization: true
    summary_threshold: 24000
```
Short-term Memory Options
| Parameter | Type | Default | Description |
|---|---|---|---|
| `backend` | string | `conversation` | Memory backend type |
| `max_tokens` | integer | 32000 | Maximum token window size |
| `summarization` | boolean | `false` | Enable automatic summarization of older context |
| `summary_threshold` | integer | 24000 | Token count at which summarization triggers |
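The sliding-window behavior described above can be sketched in a few lines. This is a simplified illustration, not Riven's implementation; `count_tokens` and `summarize` are placeholder helpers standing in for the backend's tokenizer and summarization model:

```python
# Sketch of the sliding-window summarization strategy.
# count_tokens and summarize are illustrative stand-ins, not Riven APIs.

def count_tokens(text: str) -> int:
    # Crude stand-in: a real backend would use the model's tokenizer.
    return len(text.split())

def summarize(messages: list[str]) -> str:
    # Stand-in for a model-generated summary of older context.
    return f"summary({len(messages)} messages)"

def maybe_compress(messages: list[str],
                   summary_threshold: int = 24000) -> list[str]:
    """Summarize the oldest half of the window once it crosses the threshold."""
    total = sum(count_tokens(m) for m in messages)
    if total < summary_threshold:
        return messages  # under the threshold; keep the window as-is
    half = len(messages) // 2
    # Replace the oldest messages with a single compressed summary entry.
    return [summarize(messages[:half])] + messages[half:]
```

Because summarization is lossy, keep `summary_threshold` comfortably below `max_tokens` so compression happens before the window is forced to drop context outright.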
Long-term Memory
Long-term memory provides persistent storage using a vector database. Agents store important facts, decisions, and outcomes as embeddings that can be retrieved through semantic search in future sessions.
Riven supports pgvector as the default vector store backend, leveraging your existing PostgreSQL infrastructure. Embeddings are generated using the platform's embedding models and indexed for fast retrieval.
```yaml
memory:
  long_term:
    backend: pgvector
    connection: postgresql://agent:secret@db:5432/memory
    embedding_model: text-embedding-3-small
    dimensions: 1536
    retrieval:
      strategy: similarity
      top_k: 10
      score_threshold: 0.75
```
Vector Dimensions
The dimensions parameter must match the output dimensions of the embedding model you select:
| Model | Dimensions | Notes |
|---|---|---|
| `text-embedding-3-small` | 1536 | Default. Good balance of quality and cost |
| `text-embedding-3-large` | 3072 | Higher quality, 2x storage cost |
| `riven-embed-v1` (self-hosted) | 768 | Runs on the AI platform, no external API calls |
Retrieval Strategies
- `similarity` — Pure cosine similarity search. Returns the `top_k` most similar memories above `score_threshold`.
- `mmr` (Maximal Marginal Relevance) — Balances relevance with diversity to avoid returning near-duplicate memories.
- `hybrid` — Combines vector similarity with BM25 keyword search for better recall on exact terms.
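To make the difference between the first two strategies concrete, here is a minimal pure-Python sketch of similarity search and MMR over raw embedding vectors. It models the behavior described above and is not Riven's implementation:

```python
import math

def cosine(a: list[float], b: list[float]) -> float:
    """Cosine similarity between two embedding vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb)

def similarity_search(query, memories, top_k=10, score_threshold=0.75):
    """similarity strategy: top_k entries above score_threshold."""
    scored = [(cosine(query, m), i) for i, m in enumerate(memories)]
    scored = [(s, i) for s, i in scored if s >= score_threshold]
    scored.sort(reverse=True)
    return [i for _, i in scored[:top_k]]

def mmr_search(query, memories, top_k=10, lambda_=0.5):
    """mmr strategy: trade off relevance against redundancy with
    already-selected results, so near-duplicates are penalized."""
    candidates = list(range(len(memories)))
    selected: list[int] = []
    while candidates and len(selected) < top_k:
        def mmr_score(i):
            relevance = cosine(query, memories[i])
            redundancy = max((cosine(memories[i], memories[j])
                              for j in selected), default=0.0)
            return lambda_ * relevance - (1 - lambda_) * redundancy
        best = max(candidates, key=mmr_score)
        selected.append(best)
        candidates.remove(best)
    return selected
```

With a low `lambda_`, MMR will skip a memory that is nearly identical to one already selected in favor of a less similar but more diverse entry, which plain similarity search never does.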
Limitations
- Storage — Each agent's long-term memory is limited to 100,000 entries by default. Configure `max_entries` to adjust.
- Embedding latency — Writing to long-term memory incurs embedding generation latency (typically 50-200 ms per entry, depending on the model).
- Index rebuild — Adding more than 10,000 entries in a single session triggers an automatic HNSW index rebuild, which may briefly increase query latency.
- Connection pooling — pgvector connections count against your PostgreSQL connection limit. Use PgBouncer for agents with high write throughput.
Shared Memory
Shared memory is a cross-agent knowledge base that enables collaboration. When one agent learns something valuable — like a deployment procedure or a codebase pattern — it can write to shared memory so other agents benefit from that knowledge.
Shared memory supports namespace isolation, access controls, and conflict resolution. Agents can read from multiple shared namespaces but write only to namespaces they have permission for.
- Namespaces — Organize shared knowledge by team, project, or domain. An agent can subscribe to relevant namespaces.
- Access Control — Read and write permissions are managed through the authorization service using OpenFGA policies.
- Versioning — Shared memory entries are versioned, so agents always see the latest knowledge and can trace how it evolved.
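The namespace, access-control, and versioning rules above can be illustrated with a small in-process sketch. The `SharedMemory` class and its methods are hypothetical, shown only to make the read/write semantics concrete:

```python
# Hypothetical sketch of namespace-scoped shared memory.
# The class and method names are illustrative, not Riven's API.

class SharedMemory:
    def __init__(self, read_namespaces: set[str], write_namespace: str):
        # An agent may read several namespaces but write to exactly one.
        self.read_namespaces = read_namespaces
        self.write_namespace = write_namespace
        self._store: dict[str, dict[str, list[str]]] = {}  # ns -> key -> versions

    def write(self, namespace: str, key: str, value: str) -> int:
        if namespace != self.write_namespace:
            raise PermissionError(f"no write access to namespace {namespace!r}")
        versions = self._store.setdefault(namespace, {}).setdefault(key, [])
        versions.append(value)  # entries are versioned, never overwritten
        return len(versions)    # returns the new version number

    def read(self, namespace: str, key: str) -> str:
        if namespace not in self.read_namespaces:
            raise PermissionError(f"no read access to namespace {namespace!r}")
        return self._store[namespace][key][-1]  # latest version wins
```

In a real deployment these checks are enforced by the authorization service via OpenFGA policies and the entries live in the knowledge-base backend, not in process memory.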
Memory Configuration
Memory backends are configured in the agent's YAML configuration. You can mix and match backends to create the right memory profile for your use case.
```yaml
agent:
  name: ops-agent
  model: claude-opus-4-6
  memory:
    short_term:
      backend: conversation
      max_tokens: 32000
      summarization: true
    long_term:
      backend: pgvector
      connection: postgresql://agent:secret@db:5432/memory
      embedding_model: text-embedding-3-small
      dimensions: 1536
    shared:
      backend: knowledge-base
      namespaces:
        - team-platform
        - org-runbooks
      write_namespace: team-platform
```
For production deployments, use connection pooling (e.g., PgBouncer) for the pgvector backend and set appropriate `top_k` limits to control retrieval latency.
Next Steps
- Skills — Composable capabilities for agents.
- Tool Use — Connect agents to external systems via MCP.
- Agents Overview — Architecture and lifecycle of Riven Agents.