Architecture

Three-tier services, three storage layers, the search pipeline, and the knowledge integration engine.

Orion is a three-tier system: a standalone FastAPI REST API, a standalone MCP server, and a React frontend — each running as its own Docker service and communicating over HTTP.

System overview

Clients (Claude Code, Cursor, CLI, Web)
  │
  ├── MCP (28 tools) ─────────────────────────────────────┐
  │   http://localhost:8787/mcp                            │
  │                                                        ▼
  │                                           ┌────────────────────────┐
  │                                           │  orion-mcp (port 8787) │
  │                                           │  FastMCP — stateless   │
  │                                           │  HTTP client → API     │
  │                                           └───────────┬────────────┘
  │                                                       │ HTTP
  └── REST /api/v1/* (101 endpoints) ─────────────────────┤
      http://localhost:8000                               │
                                                          ▼
                                         ┌────────────────────────────────┐
                                         │  orion-api (port 8000)         │
                                         │  FastAPI — all business logic  │
                                         │                                │
                                         │  search · stardust · graph     │
                                         │  orientation · calibration     │
                                         │  contradiction · synthesis     │
                                         │  planet_assignment · audit     │
                                         │                                │
                                         │  ┌───────┐ ┌───────┐ ┌──────┐ │
                                         │  │ Redis │ │Chroma │ │  PG  │ │
                                         │  └───────┘ └───────┘ └──────┘ │
                                         └────────────────────────────────┘

The MCP server is stateless: it holds no data and no database connections. All 28 tools are thin wrappers that call the REST API. This means every Orion feature is accessible through REST alone — the MCP layer is purely a convenience interface for AI agents.
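A minimal sketch of what "thin wrapper" means here. The endpoint path, tool name, and payload shape below are illustrative assumptions, not the actual Orion API; the point is only that an MCP tool holds no state and forwards every invocation as one HTTP request to the REST API.

```python
import json
from urllib import request

API_BASE = "http://localhost:8000/api/v1"  # orion-api, per the diagram above


def build_api_request(endpoint: str, payload: dict) -> request.Request:
    """Build the REST call an MCP tool forwards to. The tool itself
    holds no data and no database connections: one invocation becomes
    exactly one HTTP request against orion-api."""
    return request.Request(
        url=f"{API_BASE}/{endpoint.lstrip('/')}",
        data=json.dumps(payload).encode(),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


# A hypothetical brain.think tool is then nothing more than:
def brain_think(content: str) -> request.Request:
    return build_api_request("stardust", {"content": content})
```

Because the wrapper adds nothing beyond transport, anything a tool can do is equally reachable with plain `curl` against the REST API.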

Storage layers

Redis — hot cache

Working memory: recently written or accessed stardust, active sessions, and computed results.

  • Per-region TTLs (empathetic: 1h → strategic: 7d)
  • Session tracking with 5-minute idle timeout
  • Dashboard and strength score caching (15 min / 1 hr)
  • Synthesis result caching (30 min)
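The cache lifetimes above can be collected into one table. Only the values stated in the list (empathetic 1h, strategic 7d, 5-minute idle timeout, and the dashboard/strength/synthesis TTLs) come from this page; any other region's TTL would be an assumption and is omitted.

```python
from datetime import timedelta

# Per-region TTLs. Only the two endpoints of the range are documented
# here (empathetic: 1h -> strategic: 7d); other regions are not listed.
REGION_TTLS = {
    "empathetic": timedelta(hours=1),
    "strategic": timedelta(days=7),
}

SESSION_IDLE_TIMEOUT = timedelta(minutes=5)
DASHBOARD_CACHE_TTL = timedelta(minutes=15)
STRENGTH_SCORE_CACHE_TTL = timedelta(hours=1)
SYNTHESIS_CACHE_TTL = timedelta(minutes=30)


def expire_seconds(region: str) -> int:
    """Seconds to pass to Redis EXPIRE for a stardust key in `region`."""
    return int(REGION_TTLS[region].total_seconds())
```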

ChromaDB — semantic vectors

Embedding-based similarity search. Every stardust record is embedded and stored in a collection partitioned by galaxy_id × region.

  • Collections: orion_{galaxy_id}_{region} (7 per galaxy)
  • Distance metric: cosine (HNSW index)
  • Embedding providers: Ollama (nomic-embed-text) or Google (text-embedding-004)
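The partitioning scheme maps directly onto collection names. A sketch, using the `orion_{galaxy_id}_{region}` pattern stated above; the client call in the comment shows how cosine distance is typically selected in ChromaDB via HNSW metadata.

```python
def collection_name(galaxy_id: str, region: str) -> str:
    """One ChromaDB collection per galaxy × region partition
    (7 per galaxy, one per region)."""
    return f"orion_{galaxy_id}_{region}"


# With the chromadb client, cosine distance is chosen at collection
# creation time via HNSW metadata, e.g.:
#
#   client.get_or_create_collection(
#       collection_name(galaxy_id, region),
#       metadata={"hnsw:space": "cosine"},
#   )
```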

PostgreSQL — structural spine

All relational data: the Galaxy hierarchy, knowledge graph, agent identities, audit logs, and user accounts.

  • 2 Alembic migrations (auto-applied on API startup)
  • Async via asyncpg (PostgreSQL) or aiosqlite (SQLite for local dev)
  • JSONB columns with GIN indexes for tags and metadata
  • Connection pooling: 10 base + 20 overflow

Search pipeline

Search uses Reciprocal Rank Fusion (RRF) to combine three signal sources:

Query
  ├─→ Redis cache scan (keyword substring match)
  ├─→ ChromaDB semantic search (per planet × per region)
  │     ├─→ Semantic ranking (cosine similarity)
  │     ├─→ Recency ranking (similarity × 1/(1 + days_old))
  │     └─→ Confidence ranking (similarity × stored_confidence)
  └─→ RRF fusion: score(d) = Σ 1/(k + rank + 1), k=60
        └─→ Deduplicate → Enrich from Postgres → Return

RRF operates on rank positions, not raw scores — so it fuses rankings from completely different scoring systems without normalization. See the RRF blog post for benchmarks and tuning details.
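The fusion step is small enough to show in full. This is a direct transcription of the formula above (score(d) = Σ 1/(k + rank + 1) with k=60 and 0-based ranks), not Orion's actual implementation:

```python
from collections import defaultdict


def rrf_fuse(rankings: list[list[str]], k: int = 60) -> list[str]:
    """Reciprocal Rank Fusion: each document's score is the sum of
    1 / (k + rank + 1) over every ranked list it appears in, where
    rank is its 0-based position in that list. Only positions matter,
    so lists from incomparable scoring systems (keyword match, cosine
    similarity, recency-weighted, confidence-weighted) fuse cleanly
    without score normalization."""
    scores: dict[str, float] = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking):
            scores[doc_id] += 1.0 / (k + rank + 1)
    return sorted(scores, key=scores.get, reverse=True)
```

A document ranked highly by several lists beats one ranked first by a single list, which is what makes the fusion robust to any one signal misfiring.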

Knowledge integration engine

Every brain.think call triggers an integration pipeline (steps 0–8):

brain.think("FastAPI replaced Flask for the API layer")
  │
  ├─→ 0. Route to Planet (6 strategies + inbox fallback)
  ├─→ 1. Write stardust record to Postgres
  ├─→ 2. Embed content → upsert to ChromaDB
  ├─→ 3. Extract entities (regex-based, zero LLM)
  ├─→ 4. Extract typed relationships (USES, REPLACES, DEPENDS_ON, ...)
  ├─→ 5. Process supersession chains (archive old, create SUPERSEDES edges)
  ├─→ 6. Update entity backlinks
  ├─→ 7. Update agent expertise profile
  └─→ 8. Log integration event to Nebula

All steps are zero-LLM by default — entity and relationship extraction use regex patterns. The optional LLM pass adds ~130ms but pushes entity coverage from ~60% to ~90%.
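To make "regex-based, zero LLM" concrete, here is a toy sketch of steps 3–4. The two patterns below are illustrative assumptions; Orion's real pattern set and relationship vocabulary are not shown on this page.

```python
import re

# Illustrative patterns only -- the real extraction rules are richer.
ENTITY_RE = re.compile(r"\b[A-Z][A-Za-z0-9]+\b")  # capitalized names
RELATION_RES = {
    "REPLACES": re.compile(r"(\w+) replaced (\w+)", re.IGNORECASE),
    "USES": re.compile(r"(\w+) uses (\w+)", re.IGNORECASE),
}


def extract(content: str) -> tuple[list[str], list[tuple[str, str, str]]]:
    """Zero-LLM entity + typed-relationship extraction: pure regex,
    so it runs in microseconds and never calls a model."""
    entities = sorted(set(ENTITY_RE.findall(content)))
    relationships = [
        (m.group(1), rel_type, m.group(2))
        for rel_type, pattern in RELATION_RES.items()
        for m in pattern.finditer(content)
    ]
    return entities, relationships
```

On the example input from the diagram, `"FastAPI replaced Flask for the API layer"`, this yields the entities FastAPI and Flask plus a (FastAPI, REPLACES, Flask) edge, which step 5 would then turn into a supersession chain.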

Total pipeline latency: ~200ms on a laptop (with LLM), ~70ms without.

Data model

Galaxy
├── Sun (7 sections)
├── Agent Identities
├── Knowledge Graph (entities + relationships)
├── Users (auth, roles, planet assignments)
└── Planets
      └── Biomes (SEED → ACTIVE → MATURE → DORMANT → ARCHIVED)
            ├── Stardust (content, region, gravity, confidence)
            └── Entities (name, type, tier 1–3)

Key tables: galaxies, planets, biomes, stardust, entities, entity_relationships, entity_backlinks, agent_identities, routing_log, graph_path_cache. See the source code for full schemas.