Memory
Persistent memory systems for context-aware agents.
Memory Architecture
Riven Agents use a tiered memory system that balances speed, capacity, and shareability. Memory allows agents to maintain context across sessions, learn from past interactions, and coordinate with other agents.
The three memory tiers — short-term, long-term, and shared — can be configured independently and combined to match your agent's requirements.
Memory is optional. Stateless agents work without any memory configuration, but enabling memory significantly improves performance on recurring tasks.
Short-term Memory
Short-term memory holds the agent's current conversation context and active task state. It is stored in-memory and scoped to a single session. When the session ends, short-term memory is discarded unless explicitly persisted.
Short-term memory uses a sliding window strategy with configurable token limits. When the window fills, older messages are summarized and compressed to make room for new context.
```yaml
memory:
  short_term:
    backend: conversation
    max_tokens: 32000
    summarization: true
    summary_threshold: 24000
```
Short-term Memory Options
| Parameter | Type | Default | Description |
|---|---|---|---|
| `backend` | string | `conversation` | Memory backend type |
| `max_tokens` | integer | 32000 | Maximum token window size |
| `summarization` | boolean | `false` | Enable automatic summarization of older context |
| `summary_threshold` | integer | 24000 | Token count at which summarization triggers |
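The sliding-window behavior described above can be sketched in a few lines. This is a simplified illustration, not Riven's implementation; `count_tokens` and `summarize` are placeholder helpers standing in for the backend's tokenizer and summarization model:

```python
# Sketch of the sliding-window summarization strategy.
# count_tokens and summarize are illustrative stand-ins, not Riven APIs.

def count_tokens(text: str) -> int:
    # Crude stand-in: a real backend would use the model's tokenizer.
    return len(text.split())

def summarize(messages: list[str]) -> str:
    # Stand-in for a model-generated summary of older context.
    return f"summary({len(messages)} messages)"

def maybe_compress(messages: list[str],
                   summary_threshold: int = 24000) -> list[str]:
    """Summarize the oldest half of the window once it crosses the threshold."""
    total = sum(count_tokens(m) for m in messages)
    if total < summary_threshold:
        return messages  # under the threshold; keep the window as-is
    half = len(messages) // 2
    # Replace the oldest messages with a single compressed summary entry.
    return [summarize(messages[:half])] + messages[half:]
```

Because summarization is lossy, keep `summary_threshold` comfortably below `max_tokens` so compression happens before the window is forced to drop context outright.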
Long-term Memory
Long-term memory provides persistent storage using a vector database. Agents store important facts, decisions, and outcomes as embeddings that can be retrieved through semantic search in future sessions.
Riven supports pgvector as the default vector store backend, leveraging your existing PostgreSQL infrastructure. Embeddings are generated using the platform's embedding models and indexed for fast retrieval.
```yaml
memory:
  long_term:
    backend: pgvector
    connection: postgresql://agent:secret@db:5432/memory
    embedding_model: text-embedding-3-small
    dimensions: 1536
    retrieval:
      strategy: similarity
      top_k: 10
      score_threshold: 0.75
```
Vector Dimensions
The dimensions parameter must match the output dimensions of the embedding model you select:
| Model | Dimensions | Notes |
|---|---|---|
| `text-embedding-3-small` | 1536 | Default. Good balance of quality and cost |
| `text-embedding-3-large` | 3072 | Higher quality, 2x storage cost |
| `riven-embed-v1` (self-hosted) | 768 | Runs on the AI platform, no external API calls |
Retrieval Strategies
- `similarity` — Pure cosine similarity search. Returns the `top_k` most similar memories above `score_threshold`.
- `mmr` (Maximal Marginal Relevance) — Balances relevance with diversity to avoid returning near-duplicate memories.
- `hybrid` — Combines vector similarity with BM25 keyword search for better recall on exact terms.
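To make the difference between the first two strategies concrete, here is a minimal pure-Python sketch of similarity search and MMR over raw embedding vectors. It models the behavior described above and is not Riven's implementation:

```python
import math

def cosine(a: list[float], b: list[float]) -> float:
    """Cosine similarity between two embedding vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb)

def similarity_search(query, memories, top_k=10, score_threshold=0.75):
    """similarity strategy: top_k entries above score_threshold."""
    scored = [(cosine(query, m), i) for i, m in enumerate(memories)]
    scored = [(s, i) for s, i in scored if s >= score_threshold]
    scored.sort(reverse=True)
    return [i for _, i in scored[:top_k]]

def mmr_search(query, memories, top_k=10, lambda_=0.5):
    """mmr strategy: trade off relevance against redundancy with
    already-selected results, so near-duplicates are penalized."""
    candidates = list(range(len(memories)))
    selected: list[int] = []
    while candidates and len(selected) < top_k:
        def mmr_score(i):
            relevance = cosine(query, memories[i])
            redundancy = max((cosine(memories[i], memories[j])
                              for j in selected), default=0.0)
            return lambda_ * relevance - (1 - lambda_) * redundancy
        best = max(candidates, key=mmr_score)
        selected.append(best)
        candidates.remove(best)
    return selected
```

With a low `lambda_`, MMR will skip a memory that is nearly identical to one already selected in favor of a less similar but more diverse entry, which plain similarity search never does.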
Limitations
- Storage — Each agent's long-term memory is limited to 100,000 entries by default. Configure `max_entries` to adjust.
- Embedding latency — Writing to long-term memory incurs embedding generation latency (typically 50-200 ms per entry, depending on the model).
- Index rebuild — Adding more than 10,000 entries in a single session triggers an automatic HNSW index rebuild, which may briefly increase query latency.
- Connection pooling — pgvector connections count against your PostgreSQL connection limit. Use PgBouncer for agents with high write throughput.
Shared Memory
Shared memory is a cross-agent knowledge base that enables collaboration. When one agent learns something valuable — like a deployment procedure or a codebase pattern — it can write to shared memory so other agents benefit from that knowledge.
Shared memory supports namespace isolation, access controls, and conflict resolution. Agents can read from multiple shared namespaces but write only to namespaces they have permission for.
- Namespaces — Organize shared knowledge by team, project, or domain. An agent can subscribe to relevant namespaces.
- Access Control — Read and write permissions are managed through the authorization service using OpenFGA policies.
- Versioning — Shared memory entries are versioned, so agents always see the latest knowledge and can trace how it evolved.
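The namespace, access-control, and versioning rules above can be illustrated with a small in-process sketch. The `SharedMemory` class and its methods are hypothetical, shown only to make the read/write semantics concrete:

```python
# Hypothetical sketch of namespace-scoped shared memory.
# The class and method names are illustrative, not Riven's API.

class SharedMemory:
    def __init__(self, read_namespaces: set[str], write_namespace: str):
        # An agent may read several namespaces but write to exactly one.
        self.read_namespaces = read_namespaces
        self.write_namespace = write_namespace
        self._store: dict[str, dict[str, list[str]]] = {}  # ns -> key -> versions

    def write(self, namespace: str, key: str, value: str) -> int:
        if namespace != self.write_namespace:
            raise PermissionError(f"no write access to namespace {namespace!r}")
        versions = self._store.setdefault(namespace, {}).setdefault(key, [])
        versions.append(value)  # entries are versioned, never overwritten
        return len(versions)    # returns the new version number

    def read(self, namespace: str, key: str) -> str:
        if namespace not in self.read_namespaces:
            raise PermissionError(f"no read access to namespace {namespace!r}")
        return self._store[namespace][key][-1]  # latest version wins
```

In a real deployment these checks are enforced by the authorization service via OpenFGA policies and the entries live in the knowledge-base backend, not in process memory.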
Memory Configuration
Memory backends are configured in the agent's YAML configuration. You can mix and match backends to create the right memory profile for your use case.
```yaml
agent:
  name: ops-agent
  model: claude-opus-4-6
  memory:
    short_term:
      backend: conversation
      max_tokens: 32000
      summarization: true
    long_term:
      backend: pgvector
      connection: postgresql://agent:secret@db:5432/memory
      embedding_model: text-embedding-3-small
      dimensions: 1536
    shared:
      backend: knowledge-base
      namespaces:
        - team-platform
        - org-runbooks
      write_namespace: team-platform
```
For production deployments, use connection pooling (e.g., PgBouncer) for the pgvector backend and set appropriate `top_k` limits to control retrieval latency.
Next Steps
- Skills — Composable capabilities for agents.
- Tool Use — Connect agents to external systems via MCP.
- Agents Overview — Architecture and lifecycle of Riven Agents.