Overview

Memory systems in OrbitAI enable agents to retain and recall information across conversations and executions. By providing different types of memory—short-term, long-term, and entity memory—agents can maintain context, learn from past interactions, and build sophisticated knowledge about users, tasks, and environments.

  • Contextual: Maintain conversation context across multiple turns
  • Persistent: Store information across sessions and executions
  • Intelligent: Retrieve relevant memories using semantic search
  • Entity-Aware: Track and remember people, places, and organizations
  • Configurable: Fine-tune memory behavior with detailed configuration
  • Efficient: Automatic pruning and compression for optimal performance

Key Capabilities

  • Three memory types: short-term memory for conversations, long-term memory for persistence, and entity memory for tracking named entities, usable independently or together.
  • Semantic retrieval: embedding-based semantic search surfaces the most relevant information for the current task, ensuring agents have access to pertinent past knowledge.
  • Automatic maintenance: built-in compression, summarization, and pruning prevent memory bloat while retaining important information.
  • Persistence: long-term memory persists to disk, allowing agents to maintain knowledge across application restarts and sessions.
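
For example, retrieving relevant memories is a single semantic query. A minimal sketch, assuming memory is a MemoryStorage instance (the protocol is documented later on this page):

// Semantic lookup: returns the stored items most similar to the query
let relevant = try await memory.retrieve(
    query: "user's preferred email schedule",  // natural-language query
    limit: 3,                                  // top-3 most similar items
    threshold: 0.75                            // minimum similarity score
)
for item in relevant {
    print("\(item.key): \(item.value)")
}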

Memory Architecture

Memory System
    ├── Memory Types
    │   ├── Short-Term Memory
    │   │   ├── Conversation Context
    │   │   ├── Task Sequence State
    │   │   └── Session-Scoped
    │   │
    │   ├── Long-Term Memory
    │   │   ├── Persistent Storage
    │   │   ├── Cross-Session Knowledge
    │   │   └── Historical Data
    │   │
    │   └── Entity Memory
    │       ├── Named Entity Tracking
    │       ├── Relationship Graphs
    │       └── Entity Attributes

    ├── Memory Storage
    │   ├── MemoryStorage (Interface)
    │   ├── In-Memory Store
    │   ├── Persistent Store
    │   └── Embedding Index

    ├── Memory Configuration
    │   ├── Storage Limits
    │   ├── Persistence Settings
    │   ├── Embedding Configuration
    │   ├── Retrieval Thresholds
    │   └── Maintenance Rules

    └── Memory Operations
        ├── Store
        ├── Retrieve (Semantic Search)
        ├── Update
        ├── Delete
        ├── Compress
        └── Prune

Memory Types

OrbitAI provides three complementary memory systems, each designed for specific use cases:

Short-Term Memory

Retains information within a single conversation or task sequence. Ideal for maintaining context in multi-turn interactions.
Configuration Parameter: memory: Bool (default: false)
Scope: Current session only (in-memory)
Use Cases:
  • Multi-turn conversations where context is needed
  • Sequential task workflows with dependencies
  • Conversational agents that reference prior exchanges
  • Building context progressively within a session
Example:
let conversationalAgent = Agent(
    role: "Customer Support Assistant",
    purpose: "Help customers with their inquiries",
    context: """
    Friendly support agent who remembers the conversation
    and can reference previous topics discussed.
    """,
    memory: true  // Enable short-term memory
)

// Usage scenario:
// User: "I need help with my order"
// Agent: "I'd be happy to help! Can you provide your order number?"
// User: "It's 12345"
// Agent: "Thank you! Let me look up order 12345 for you..."
//       [Agent remembers the order number from previous message]
Benefits:
✅ Fast Access: In-memory storage for quick retrieval
✅ Context Continuity: Maintains conversation flow
✅ No Persistence Overhead: Clears when session ends
✅ Privacy Friendly: Data doesn’t persist after session

Long-Term Memory

Persists information across sessions and executions. Perfect for tracking user preferences, historical data, and learning over time.
Configuration Parameter: longTermMemory: Bool (default: false)
Scope: Cross-session (persistent storage)
Use Cases:
  • User preference tracking and personalization
  • Historical interaction analysis
  • Learning from past successes and failures
  • Building knowledge bases over time
Example:
let personalAssistant = Agent(
    role: "Personal Assistant",
    purpose: "Provide personalized help based on user history",
    context: """
    Remembers user preferences, past requests, and habits
    to provide increasingly personalized assistance.
    """,
    longTermMemory: true  // Enable persistent memory
)

// Usage scenario:
// Session 1:
// User: "I prefer emails in the morning at 8 AM"
// Agent: "Got it! I'll remember to send summaries at 8 AM"
//       [Stores preference in long-term memory]
//
// Session 2 (next day):
// Agent: "Good morning! Here's your 8 AM email summary"
//       [Retrieved preference from previous session]
Benefits:
✅ Cross-Session: Remembers across app restarts
✅ User Personalization: Learns preferences over time
✅ Historical Context: Access to past interactions
✅ Knowledge Accumulation: Builds expertise progressively

Entity Memory

Tracks and remembers named entities—people, places, organizations, products—and their attributes and relationships.
Configuration Parameter: entityMemory: Bool (default: false)
Scope: Entity graph (in-memory or persistent)
Use Cases:
  • Customer relationship management (CRM)
  • Knowledge graph construction
  • Entity relationship tracking
  • Person/place/organization awareness
Example:
let crmAgent = Agent(
    role: "Sales Assistant",
    purpose: "Manage customer relationships and track interactions",
    context: """
    Expert at remembering customer details, preferences,
    and interaction history to provide personalized service.
    """,
    entityMemory: true  // Enable entity tracking
)

// Usage scenario:
// User: "John Smith from Acme Corp called about the Enterprise plan"
// Agent: "I'll note that John Smith (Acme Corp) is interested in Enterprise"
//       [Creates entities: Person(John Smith), Organization(Acme Corp)]
//       [Links: John Smith WORKS_AT Acme Corp]
//       [Links: Acme Corp INTERESTED_IN Enterprise Plan]
//
// Later:
// User: "What did John want?"
// Agent: "John Smith from Acme Corp was interested in the Enterprise plan"
//       [Retrieved entity relationships]
Benefits:
✅ Entity Recognition: Automatically identifies entities
✅ Relationship Tracking: Maintains entity connections
✅ Structured Knowledge: Organized entity graphs
✅ Contextual Awareness: Understands entity context

Memory Type Combinations

You can enable multiple memory types simultaneously for sophisticated agents:
let sophisticatedAgent = Agent(
    role: "Executive Assistant",
    purpose: "Comprehensive personal and professional assistance",
    context: """
    Advanced assistant with full memory capabilities:
    - Remembers conversations (short-term)
    - Learns preferences over time (long-term)
    - Tracks contacts and relationships (entity)
    """,
    memory: true,           // Short-term: conversation context
    longTermMemory: true,   // Long-term: persistent preferences
    entityMemory: true      // Entity: people and organizations
)
Recommendation: Start with short-term memory only, then add long-term and entity memory as specific needs arise. Each type adds overhead, so enable only what you need.

Memory Configuration

The MemoryConfiguration object provides fine-grained control over memory behavior, storage, and performance characteristics.

Configuration Parameters

maxMemoryItems (Int, default: 100)
Maximum number of memory items to store before automatic pruning.
Range: 10-10000. Recommendation: 50-100 for most use cases.

persistencePath (String, default: "./memory")
File system path where long-term memory is persisted.
Examples:
  • "./memory" - Default location
  • "./data/agent-memory" - Custom directory
  • "~/Documents/OrbitMemory" - User directory

embeddingModel (String, default: "text-embedding-ada-002")
Embedding model used for semantic memory retrieval.
Options:
  • "text-embedding-ada-002" - OpenAI (high quality)
  • "text-embedding-3-small" - OpenAI (efficient)
  • "text-embedding-3-large" - OpenAI (highest quality)

similarityThreshold (Double, default: 0.7)
Minimum similarity score (0.0-1.0) for memory retrieval.
Range: 0.0 (retrieve all) to 1.0 (exact match only). Recommendation:
  • 0.6-0.7 - Broad retrieval
  • 0.75-0.8 - Balanced (recommended)
  • 0.85-0.95 - Precise retrieval

compressionEnabled (Bool, default: false)
Enable automatic memory compression and summarization. When enabled, older memory items are automatically summarized to reduce storage while retaining key information.

autoSummarize (Bool, default: false)
Automatically summarize memory items after a threshold. Works with compressionEnabled to create concise memory representations.

pruneOldItems (Bool, default: false)
Automatically remove the oldest memory items when maxMemoryItems is reached.
Strategy: FIFO (First In, First Out) or importance-based ranking (see the sketch below).
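
The two pruning strategies can be sketched as follows. This is illustrative only; selectItemsToPrune is a hypothetical helper, not part of the OrbitAI API:

// Hypothetical helper showing FIFO vs. importance-based pruning
func selectItemsToPrune(
    _ items: [MemoryItem],
    keepingAtMost limit: Int,
    byImportance: Bool
) -> [MemoryItem] {
    guard items.count > limit else { return [] }
    let overflow = items.count - limit
    let ordered: [MemoryItem]
    if byImportance {
        // Importance proxy: least-accessed, least-recently-used first
        ordered = items.sorted {
            ($0.accessCount, $0.lastAccessed) < ($1.accessCount, $1.lastAccessed)
        }
    } else {
        // FIFO: oldest items first
        ordered = items.sorted { $0.timestamp < $1.timestamp }
    }
    return Array(ordered.prefix(overflow))
}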

Configuration Examples

Basic Memory Configuration

let basicConfig = MemoryConfiguration(
    maxMemoryItems: 100,
    persistencePath: "./memory",
    similarityThreshold: 0.75
)

let agent = Agent(
    role: "Assistant",
    purpose: "General assistance with memory",
    context: "Helpful assistant",
    memory: true,
    longTermMemory: true,
    memoryConfig: basicConfig
)

High-Quality Memory Configuration

let highQualityConfig = MemoryConfiguration(
    maxMemoryItems: 200,
    persistencePath: "./memory/high-quality",
    embeddingModel: "text-embedding-3-large",  // Best quality
    similarityThreshold: 0.85,                 // Precise retrieval
    compressionEnabled: false,                 // Keep full detail
    pruneOldItems: false                       // Keep all items
)

let premiumAgent = Agent(
    role: "Premium Assistant",
    purpose: "High-quality personalized assistance",
    context: "Premium service with excellent memory",
    memory: true,
    longTermMemory: true,
    memoryConfig: highQualityConfig
)

Memory-Efficient Configuration

let efficientConfig = MemoryConfiguration(
    maxMemoryItems: 50,                        // Smaller limit
    persistencePath: "./memory/compact",
    embeddingModel: "text-embedding-3-small",  // Efficient model
    similarityThreshold: 0.7,                  // Broader retrieval
    compressionEnabled: true,                  // Auto-compress
    autoSummarize: true,                       // Summarize old items
    pruneOldItems: true                        // Remove old entries
)

let efficientAgent = Agent(
    role: "Efficient Assistant",
    purpose: "Memory-efficient assistance",
    context: "Optimized for resource constraints",
    memory: true,
    longTermMemory: true,
    memoryConfig: efficientConfig
)

Large-Scale Memory Configuration

let largeScaleConfig = MemoryConfiguration(
    maxMemoryItems: 1000,                      // Large capacity
    persistencePath: "./memory/large-scale",
    embeddingModel: "text-embedding-3-large",
    similarityThreshold: 0.80,
    compressionEnabled: true,                  // Manage size
    autoSummarize: true,                       // Keep summaries
    pruneOldItems: true                        // Auto-prune
)

let knowledgeAgent = Agent(
    role: "Knowledge Assistant",
    purpose: "Manage large knowledge bases",
    context: "Expert with extensive memory",
    memory: true,
    longTermMemory: true,
    entityMemory: true,
    memoryConfig: largeScaleConfig
)

Enabling and Disabling Memory

Memory can be configured at both agent and orbit levels, providing flexibility for different architectural patterns.

Agent-Level Memory

Enable memory for individual agents to give them context retention capabilities:
// Agent with all memory types
let fullMemoryAgent = Agent(
    role: "Personal Assistant",
    purpose: "Comprehensive personal assistance",
    context: "Assistant with full memory capabilities",
    memory: true,           // Short-term memory
    longTermMemory: true,   // Persistent memory
    entityMemory: true,     // Entity tracking
    memoryConfig: memoryConfig
)

// Agent with selective memory
let selectiveAgent = Agent(
    role: "Conversational Agent",
    purpose: "Handle conversations",
    context: "Conversational interface",
    memory: true,           // Only short-term
    longTermMemory: false,  // No persistence
    entityMemory: false     // No entity tracking
)

// Agent with no memory
let statelessAgent = Agent(
    role: "Simple Responder",
    purpose: "Answer single questions",
    context: "Stateless question answering",
    memory: false,          // No memory overhead
    longTermMemory: false,
    entityMemory: false
)
Agent-level memory is ideal when:
  • Different agents have different memory needs
  • Some agents need persistence, others don’t
  • Fine-grained control over memory usage
  • Each agent maintains separate context
Example:
let researcher = Agent(
    role: "Researcher",
    memory: true,         // Needs conversation context
    longTermMemory: true  // Stores findings
)

let calculator = Agent(
    role: "Calculator",
    memory: false  // Stateless calculations
)

Orbit-Level Memory

Enable memory at the orbit level for shared memory across all agents:
let sharedMemoryOrbit = try await Orbit.create(
    name: "Collaborative Workflow",
    agents: [agent1, agent2, agent3],
    tasks: tasks,
    memory: true,           // Shared short-term memory
    longTermMemory: true,   // Shared persistent memory
    entityMemory: true,     // Shared entity tracking
    memoryConfig: memoryConfig
)
Orbit-level memory creates a shared memory space accessible to all agents in the orbit:
let orbit = try await Orbit.create(
    name: "Team Collaboration",
    agents: [researcher, analyst, writer],
    tasks: [researchTask, analysisTask, writingTask],
    memory: true  // All agents share this memory
)

// Execution flow:
// 1. Researcher stores findings in memory
// 2. Analyst accesses researcher's findings
// 3. Writer uses both researcher and analyst memory
// All agents see the same shared memory
Benefits:
  • Agents can build on each other’s work
  • Context flows naturally through workflow
  • Reduces redundant information storage
  • Natural collaboration pattern

Disabling Memory

Explicitly disable memory when not needed to save resources:
// Explicitly no memory
let noMemoryAgent = Agent(
    role: "Stateless Worker",
    purpose: "Process independent tasks",
    context: "Simple task processor",
    memory: false,
    longTermMemory: false,
    entityMemory: false
)

// Default is no memory (can omit parameters)
let defaultAgent = Agent(
    role: "Default Agent",
    purpose: "Basic task execution",
    context: "Simple agent"
    // memory defaults to false
)
Memory adds overhead: Each enabled memory type consumes CPU, memory, and storage. Disable memory when agents don’t need to retain context between interactions.

Memory Configuration Best Practices

Start Simple

Begin with short-term memory only. Add long-term and entity memory as specific needs emerge.
// Phase 1: Start here
memory: true

// Phase 2: Add if needed
memory: true,
longTermMemory: true

// Phase 3: Add if needed
memory: true,
longTermMemory: true,
entityMemory: true

Match Use Case

Enable memory types that match your use case:
  • Chatbot: memory: true
  • Personal assistant: memory: true, longTermMemory: true
  • CRM: all three types
  • API worker: memory: false
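
As abbreviated Swift sketches (role-only initializers, as in the agent-level examples above; add purpose and context as needed):

// Hedged, abbreviated pairings of use case to memory flags
let chatbot = Agent(role: "Chatbot", memory: true)
let personalAssistant = Agent(role: "Personal Assistant", memory: true, longTermMemory: true)
let crmAgent = Agent(role: "CRM Agent", memory: true, longTermMemory: true, entityMemory: true)
let apiWorker = Agent(role: "API Worker", memory: false)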

Configure Limits

Always set appropriate memory limits:
let config = MemoryConfiguration(
    maxMemoryItems: 100,    // Set limit
    compressionEnabled: true,  // Auto-compress
    pruneOldItems: true     // Auto-prune
)

Monitor Usage

Track memory usage and adjust configuration:
// Log memory stats
print("Memory items: \(memoryStorage.count)")
print("Storage size: \(memoryStorage.sizeInBytes)")

// Adjust if needed
if memoryStorage.count > 80 {
    config.maxMemoryItems = 50
}

Memory Operations

Memory systems provide several operations for storing, retrieving, and managing information.

Accessing Memory

Memory is accessed through the TaskExecutionContext:
public struct TaskExecutionContext: Sendable {
    /// Previous task outputs
    public var taskOutputs: [TaskOutput]

    /// Orbit-level inputs
    public var inputs: Metadata

    /// Shared memory storage
    public var memory: MemoryStorage?

    /// Knowledge base access
    public var knowledgeBase: KnowledgeBase?

    /// Available tools
    public var availableTools: [String]
}
Agents automatically use memory through the execution context—no manual memory operations needed in most cases.

Memory Storage Interface

The MemoryStorage protocol defines memory operations:
public protocol MemoryStorage: Sendable {
    /// Store a memory item
    func store(
        key: String,
        value: String,
        metadata: [String: String]?
    ) async throws

    /// Retrieve memories by semantic search
    func retrieve(
        query: String,
        limit: Int,
        threshold: Double?
    ) async throws -> [MemoryItem]

    /// Get specific memory by key
    func get(key: String) async throws -> MemoryItem?

    /// Update existing memory
    func update(
        key: String,
        value: String,
        metadata: [String: String]?
    ) async throws

    /// Delete memory item
    func delete(key: String) async throws

    /// Clear all memories
    func clear() async throws

    /// Get memory count
    var count: Int { get async }

    /// Persist to storage
    func persist() async throws

    /// Load from storage
    func load() async throws
}

Memory Item Structure

public struct MemoryItem: Codable, Sendable {
    public let key: String
    public let value: String
    public let metadata: [String: String]
    public let timestamp: Date
    public let embedding: [Double]?
    public var accessCount: Int
    public var lastAccessed: Date
}

Automatic Memory Management

OrbitAI automatically manages memory without manual intervention:
// Memory is automatically:
// 1. Stored when agents generate information
// 2. Retrieved when relevant to current task
// 3. Updated when information changes
// 4. Persisted when orbit completes
// 5. Pruned when limits are reached

let orbit = try await Orbit.create(
    name: "Automated Memory",
    agents: [agent],
    tasks: [task],
    memory: true,
    longTermMemory: true,
    memoryConfig: MemoryConfiguration(
        maxMemoryItems: 100,
        compressionEnabled: true,
        pruneOldItems: true  // Automatic management
    )
)

let output = try await orbit.run()
// Memory automatically persisted on completion

Manual Memory Operations

For advanced use cases, you can manually interact with memory:
import OrbitAI

// Custom task with manual memory access
let customTask = ORTask(
    description: "Store and retrieve specific information",
    expectedOutput: "Memory operation results",
    customHandler: { context in
        guard let memory = context.memory else {
            return "Memory not available"
        }

        // Store information
        try await memory.store(
            key: "user_preference",
            value: "Morning emails at 8 AM",
            metadata: ["category": "preferences", "priority": "high"]
        )

        // Retrieve by semantic search
        let results = try await memory.retrieve(
            query: "email preferences",
            limit: 5,
            threshold: 0.75
        )

        // Get specific item
        if let preference = try await memory.get(key: "user_preference") {
            print("Found preference: \(preference.value)")
        }

        // Update existing memory
        try await memory.update(
            key: "user_preference",
            value: "Morning emails at 7 AM",  // Updated time
            metadata: ["category": "preferences", "priority": "high"]
        )

        return "Memory operations completed"
    }
)

Memory Lifecycle

Memory Lifecycle in Orbit Execution
    ├── 1. Initialization
    │   ├── Load existing memories (if long-term enabled)
    │   └── Initialize in-memory storage

    ├── 2. During Execution
    │   ├── Agent generates output → Stored in memory
    │   ├── Agent needs context → Retrieved from memory
    │   ├── Information updates → Memory updated
    │   └── Memory limit reached → Automatic pruning

    ├── 3. Retrieval Process
    │   ├── Query generated from task context
    │   ├── Query embedded using embedding model
    │   ├── Similarity search against memory embeddings
    │   ├── Results filtered by threshold
    │   └── Top-k results returned to agent

    └── 4. Completion
        ├── Memory compression (if enabled)
        ├── Memory persistence (if long-term enabled)
        └── Memory cleanup (short-term cleared)
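
The similarity search in step 3 is typically cosine similarity between the query embedding and each stored embedding. A minimal sketch (the same helper reappears in the Custom Memory Storage example below):

// Cosine similarity between two embedding vectors, in [-1, 1];
// higher means more similar
func cosineSimilarity(_ a: [Double], _ b: [Double]) -> Double {
    guard a.count == b.count, !a.isEmpty else { return 0 }
    let dot = zip(a, b).reduce(0) { $0 + $1.0 * $1.1 }
    let normA = (a.reduce(0) { $0 + $1 * $1 }).squareRoot()
    let normB = (b.reduce(0) { $0 + $1 * $1 }).squareRoot()
    guard normA > 0, normB > 0 else { return 0 }
    return dot / (normA * normB)
}

Items scoring at or above similarityThreshold are kept, sorted by score, and the top-k returned to the agent.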

Advanced Memory Setup

Custom Memory Storage

Implement custom memory storage for specialized backends:
import OrbitAI

final class CustomMemoryStorage: MemoryStorage {
    private var storage: [String: MemoryItem] = [:]
    private let database: Database  // Custom database

    var count: Int {
        get async { storage.count }
    }

    func store(
        key: String,
        value: String,
        metadata: [String: String]?
    ) async throws {
        let item = MemoryItem(
            key: key,
            value: value,
            metadata: metadata ?? [:],
            timestamp: Date(),
            embedding: try await generateEmbedding(for: value),
            accessCount: 0,
            lastAccessed: Date()
        )

        storage[key] = item

        // Store in custom database
        try await database.insert(item)
    }

    func retrieve(
        query: String,
        limit: Int,
        threshold: Double?
    ) async throws -> [MemoryItem] {
        let queryEmbedding = try await generateEmbedding(for: query)
        let threshold = threshold ?? 0.7

        // Semantic search
        let scored = storage.values.compactMap { item -> (MemoryItem, Double)? in
            guard let embedding = item.embedding else { return nil }
            let similarity = cosineSimilarity(queryEmbedding, embedding)
            return similarity >= threshold ? (item, similarity) : nil
        }

        // Sort by similarity and take top-k
        return scored
            .sorted { $0.1 > $1.1 }
            .prefix(limit)
            .map { $0.0 }
    }

    func get(key: String) async throws -> MemoryItem? {
        storage[key]
    }

    func update(
        key: String,
        value: String,
        metadata: [String: String]?
    ) async throws {
        guard let existing = storage[key] else {
            throw MemoryError.itemNotFound
        }

        // MemoryItem's key fields are immutable (`let`), so build a
        // replacement item and re-embed the updated value
        let item = MemoryItem(
            key: key,
            value: value,
            metadata: metadata ?? existing.metadata,
            timestamp: existing.timestamp,
            embedding: try await generateEmbedding(for: value),
            accessCount: existing.accessCount,
            lastAccessed: Date()
        )

        storage[key] = item
        try await database.update(item)
    }

    func delete(key: String) async throws {
        storage.removeValue(forKey: key)
        try await database.delete(key: key)
    }

    func clear() async throws {
        storage.removeAll()
        try await database.deleteAll()
    }

    func persist() async throws {
        // Custom persistence logic
        try await database.saveAll(Array(storage.values))
    }

    func load() async throws {
        // Custom loading logic
        let items = try await database.loadAll()
        storage = Dictionary(uniqueKeysWithValues: items.map { ($0.key, $0) })
    }

    private func generateEmbedding(for text: String) async throws -> [Double] {
        // Call your embedding provider here and return the vector.
        // Stubbed so the example compiles; wire up a real API in production.
        fatalError("generateEmbedding(for:) not implemented")
    }

    private func cosineSimilarity(_ a: [Double], _ b: [Double]) -> Double {
        // Standard cosine similarity: dot(a, b) / (|a| * |b|)
        guard a.count == b.count, !a.isEmpty else { return 0 }
        let dot = zip(a, b).reduce(0) { $0 + $1.0 * $1.1 }
        let normA = (a.reduce(0) { $0 + $1 * $1 }).squareRoot()
        let normB = (b.reduce(0) { $0 + $1 * $1 }).squareRoot()
        guard normA > 0, normB > 0 else { return 0 }
        return dot / (normA * normB)
    }
}

Memory with Vector Databases

Integrate with vector databases for scalable memory:
import OrbitAI
import Pinecone  // Example vector database

final class VectorMemoryStorage: MemoryStorage {
    private let pinecone: PineconeClient
    private let indexName: String
    private let embeddingModel: EmbeddingModel

    init(
        apiKey: String,
        indexName: String,
        embeddingModel: EmbeddingModel
    ) {
        self.pinecone = PineconeClient(apiKey: apiKey)
        self.indexName = indexName
        self.embeddingModel = embeddingModel
    }

    func store(
        key: String,
        value: String,
        metadata: [String: String]?
    ) async throws {
        // Generate embedding
        let embedding = try await embeddingModel.embed(text: value)

        // Store in Pinecone
        try await pinecone.upsert(
            index: indexName,
            vectors: [
                Vector(
                    id: key,
                    values: embedding,
                    metadata: metadata ?? [:]
                )
            ]
        )
    }

    func retrieve(
        query: String,
        limit: Int,
        threshold: Double?
    ) async throws -> [MemoryItem] {
        // Generate query embedding
        let queryEmbedding = try await embeddingModel.embed(text: query)

        // Query Pinecone
        let results = try await pinecone.query(
            index: indexName,
            vector: queryEmbedding,
            topK: limit,
            includeMetadata: true
        )

        // Convert to MemoryItems
        return results.matches.compactMap { match in
            // Skip results below the threshold (if one was provided)
            if let threshold, match.score < threshold { return nil }

            return MemoryItem(
                key: match.id,
                value: match.metadata["value"] as? String ?? "",
                metadata: match.metadata as? [String: String] ?? [:],
                timestamp: Date(),
                embedding: match.values,
                accessCount: 0,
                lastAccessed: Date()
            )
        }
    }

    // Implement other methods...
}

Memory Caching

Add caching layer for frequently accessed memories:
final class CachedMemoryStorage: MemoryStorage {
    private let underlying: MemoryStorage
    private var cache: [String: (item: MemoryItem, expiry: Date)] = [:]
    private let cacheExpiry: TimeInterval = 300  // 5 minutes

    init(underlying: MemoryStorage) {
        self.underlying = underlying
    }

    func get(key: String) async throws -> MemoryItem? {
        // Check cache first
        if let cached = cache[key],
           cached.expiry > Date() {
            return cached.item
        }

        // Fetch from underlying storage
        if let item = try await underlying.get(key: key) {
            // Update cache
            cache[key] = (item, Date().addingTimeInterval(cacheExpiry))
            return item
        }

        return nil
    }

    func store(
        key: String,
        value: String,
        metadata: [String: String]?
    ) async throws {
        // Store in underlying
        try await underlying.store(key: key, value: value, metadata: metadata)

        // Invalidate cache
        cache.removeValue(forKey: key)
    }

    func retrieve(
        query: String,
        limit: Int,
        threshold: Double?
    ) async throws -> [MemoryItem] {
        // For semantic search, bypass cache
        return try await underlying.retrieve(
            query: query,
            limit: limit,
            threshold: threshold
        )
    }

    // Implement other methods with cache invalidation...
}

Memory Compression

Implement custom compression strategies:
final class CompressingMemoryStorage: MemoryStorage {
    private let underlying: MemoryStorage
    private let llm: LLMProvider
    private let compressionThreshold: Int = 50

    init(underlying: MemoryStorage, llm: LLMProvider) {
        self.underlying = underlying
        self.llm = llm
    }

    func store(
        key: String,
        value: String,
        metadata: [String: String]?
    ) async throws {
        var finalValue = value

        // Compress long values
        if value.count > compressionThreshold {
            let prompt = """
            Summarize the following information concisely while retaining key details:

            \(value)
            """

            let summary = try await llm.generateResponse(
                for: LLMRequest(messages: [.user(prompt)])
            )

            finalValue = summary.content

            // Mark as compressed
            var newMetadata = metadata ?? [:]
            newMetadata["compressed"] = "true"
            newMetadata["original_length"] = "\(value.count)"

            try await underlying.store(
                key: key,
                value: finalValue,
                metadata: newMetadata
            )
        } else {
            try await underlying.store(
                key: key,
                value: finalValue,
                metadata: metadata
            )
        }
    }

    // Delegate other methods to underlying storage
}

Memory vs Context Window

Understanding when to use memory systems versus the LLM’s context window:
Memory Systems:
  • Store information externally (outside context window)
  • Retrieve relevant data as needed via semantic search
  • Not limited by token count
  • Slower access (retrieval step required)
  • Best for: Large knowledge bases, long-term retention
Example:
let agent = Agent(
    role: "Knowledge Assistant",
    purpose: "Answer questions using large knowledge base",
    longTermMemory: true,  // Store 1000s of items
    memoryConfig: MemoryConfiguration(
        maxMemoryItems: 5000  // Far exceeds context window
    )
)
Context Window Management: When respectContextWindow: true, OrbitAI automatically prunes old messages when approaching the model’s token limit while retaining system messages and recent context.
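
A hedged sketch of combining both, assuming respectContextWindow is accepted by the Agent initializer as the note above suggests:

// Large external memory plus automatic context-window pruning
let hybridAgent = Agent(
    role: "Long-Running Assistant",
    purpose: "Sustain very long conversations",
    context: "Keeps recent turns in the context window, older facts in memory",
    memory: true,
    longTermMemory: true,
    respectContextWindow: true,  // prune old messages near the token limit
    memoryConfig: MemoryConfiguration(maxMemoryItems: 1000)
)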

Best Practices

Memory Configuration Best Practices

Enable Selectively

Only enable memory types you actually need.

Good:
// Chatbot needs conversation memory
memory: true,
longTermMemory: false

Bad:
// Overkill for simple chatbot
memory: true,
longTermMemory: true,
entityMemory: true

Set Appropriate Limits

Configure memory limits based on use case:
  • Quick tasks: 20-50 items
  • General agents: 50-100 items
  • Knowledge workers: 100-500 items
  • Large scale: 500-5000 items
let config = MemoryConfiguration(
    maxMemoryItems: 100  // Match your needs
)

Use Compression

Enable compression for long-running agents:
let config = MemoryConfiguration(
    compressionEnabled: true,
    autoSummarize: true,
    pruneOldItems: true
)
Benefits:
  • Reduced storage usage
  • Better retrieval performance
  • Automatic maintenance

Tune Similarity Threshold

Adjust threshold based on precision needs:
  • Broad retrieval: 0.6-0.7
  • Balanced: 0.75-0.8 (recommended)
  • Precise: 0.85-0.95
let config = MemoryConfiguration(
    similarityThreshold: 0.75  // Start here
)

Organize by Path

Use descriptive persistence paths:
// Good organization
persistencePath: "./memory/agents/assistant"
persistencePath: "./memory/agents/researcher"
persistencePath: "./memory/orbits/workflow-1"

// Avoids conflicts and aids debugging

Monitor Memory Growth

Track and manage memory usage:
// Log memory stats periodically
print("Items: \(memory.count)")
print("Size: \(memory.sizeInBytes)")

// Adjust configuration if needed
if memory.count > 80% of max {
    // Increase limit or enable pruning
}

Architecture Best Practices

Design agents with appropriate memory for their role:
// Stateless worker - no memory
let workerAgent = Agent(
    role: "Data Processor",
    purpose: "Process data batches",
    memory: false
)

// Conversational - short-term only
let chatAgent = Agent(
    role: "Chat Assistant",
    purpose: "Engage in conversations",
    memory: true,
    longTermMemory: false
)

// Personal assistant - full memory
let personalAgent = Agent(
    role: "Personal Assistant",
    purpose: "Personalized assistance",
    memory: true,
    longTermMemory: true,
    entityMemory: true
)

Performance Best Practices

Embeddings can be expensive. Optimize usage:
// Cache embeddings
private var embeddingCache: [String: [Double]] = [:]

func getEmbedding(for text: String) async throws -> [Double] {
    if let cached = embeddingCache[text] {
        return cached
    }

    let embedding = try await embeddingModel.embed(text: text)
    embeddingCache[text] = embedding
    return embedding
}

// Use efficient embedding models
let config = MemoryConfiguration(
    embeddingModel: "text-embedding-3-small"  // Cheaper than large
)

// Batch embedding requests
let embeddings = try await embeddingModel.embedBatch(texts: texts)
Reduce memory retrieval latency:
// Retrieve fewer items
let results = try await memory.retrieve(
    query: query,
    limit: 5,  // Top 5 instead of 10
    threshold: 0.8  // Higher threshold = fewer results
)

// Use caching for frequent queries
let cachedMemory = CachedMemoryStorage(underlying: memory)

// Pre-load critical memories at startup
let criticalKeys = ["user_preferences", "system_config"]
for key in criticalKeys {
    _ = try await memory.get(key: key)  // Warms cache
}
Properly manage memory throughout lifecycle:
// Initialize
let orbit = try await Orbit.create(
    name: "Workflow",
    agents: agents,
    tasks: tasks,
    memory: true,
    longTermMemory: true,
    memoryConfig: config
)

// Execute
let output = try await orbit.run()
// Memory automatically persisted

// Cleanup when done
if temporaryOrbit {
    try await orbit.memory?.clear()
    try await orbit.memory?.deletePersistedFiles()
}

// Periodic maintenance
Task {
    while isRunning {
        try await Task.sleep(nanoseconds: 3600 * 1_000_000_000)  // 1 hour
        try await memory.compress()  // Compress old memories
        try await memory.prune()     // Remove stale items
    }
}

Security and Privacy Best Practices

Memory contains sensitive data. Implement appropriate security measures.
// 1. Secure persistence paths
let config = MemoryConfiguration(
    persistencePath: "./memory/encrypted"  // Use encrypted storage
)

// 2. Implement data retention policies
let retentionConfig = MemoryConfiguration(
    maxMemoryItems: 100,
    pruneOldItems: true,  // Auto-delete old data
    // Custom: Delete after 30 days
)

// 3. Filter sensitive data
final class SecureMemoryStorage: MemoryStorage {
    private let underlying: MemoryStorage

    init(underlying: MemoryStorage) {
        self.underlying = underlying
    }

    func store(
        key: String,
        value: String,
        metadata: [String: String]?
    ) async throws {
        // Filter sensitive information
        let filtered = filterSensitiveData(value)

        // Encrypt before storage
        let encrypted = try encrypt(filtered)

        try await underlying.store(
            key: key,
            value: encrypted,
            metadata: metadata
        )
    }

    private func filterSensitiveData(_ text: String) -> String {
        var filtered = text

        // Remove credit card numbers
        filtered = filtered.replacingOccurrences(
            of: #"\d{4}[- ]?\d{4}[- ]?\d{4}[- ]?\d{4}"#,
            with: "[REDACTED]",
            options: .regularExpression
        )

        // Remove email addresses
        filtered = filtered.replacingOccurrences(
            of: #"[A-Z0-9._%+-]+@[A-Z0-9.-]+\.[A-Z]{2,}"#,
            with: "[EMAIL]",
            options: [.regularExpression, .caseInsensitive]
        )

        // Remove SSN
        filtered = filtered.replacingOccurrences(
            of: #"\d{3}-\d{2}-\d{4}"#,
            with: "[SSN]",
            options: .regularExpression
        )

        return filtered
    }

    private func encrypt(_ text: String) throws -> String {
        // Plug in real encryption here (e.g. CryptoKit AES-GCM); stubbed
        fatalError("encrypt(_:) not implemented")
    }

    // Forward the remaining MemoryStorage requirements to `underlying`...
}

// 4. Implement access controls
final class AccessControlledMemory: MemoryStorage {
    private let underlying: MemoryStorage

    init(underlying: MemoryStorage) {
        self.underlying = underlying
    }

    func retrieve(
        query: String,
        limit: Int,
        threshold: Double?
    ) async throws -> [MemoryItem] {
        // Check permissions
        guard await hasReadPermission() else {
            throw MemoryError.accessDenied
        }

        return try await underlying.retrieve(
            query: query,
            limit: limit,
            threshold: threshold
        )
    }

    private func hasReadPermission() async -> Bool {
        // Check the caller's permissions here (stubbed for the example)
        true
    }

    // Forward the remaining MemoryStorage requirements to `underlying`...
}

Troubleshooting

Common Memory Issues

Symptom: Application uses excessive RAM or disk space.

Causes:
  • maxMemoryItems set too high
  • Compression disabled
  • No pruning enabled
  • Large embeddings cached
  • Memory not cleared between sessions
Diagnosis:
// Check memory usage
print("Memory items: \(await memory.count)")
print("Storage size: \(await memory.sizeInBytes)")
print("Cache size: \(embeddingCache.count)")

// Profile memory allocations
let stats = await memory.statistics()
print("Average item size: \(stats.averageItemSize)")
print("Total embeddings: \(stats.embeddingCount)")
Solutions:
// 1. Reduce memory limit
let config = MemoryConfiguration(
    maxMemoryItems: 50  // Down from 100
)

// 2. Enable compression
let config = MemoryConfiguration(
    compressionEnabled: true,
    autoSummarize: true,
    pruneOldItems: true
)

// 3. Clear memory periodically
Task {
    try await Task.sleep(nanoseconds: 3600 * 1_000_000_000)
    try await memory.clear()
}

// 4. Use more efficient embedding model
let config = MemoryConfiguration(
    embeddingModel: "text-embedding-3-small"  // Smaller vectors
)

// 5. Disable memory if not needed
let agent = Agent(
    role: "Worker",
    memory: false  // No memory overhead
)
Symptom: Long-term memory doesn’t persist across sessions.

Causes:
  • longTermMemory not enabled
  • Invalid persistencePath
  • Insufficient disk permissions
  • Application crashes before persistence
  • Memory not explicitly persisted
Diagnosis:
// Check configuration
print("Long-term memory enabled: \(agent.longTermMemory)")
print("Persistence path: \(config.persistencePath)")

// Check file system
let path = config.persistencePath
let fileManager = FileManager.default

if !fileManager.fileExists(atPath: path) {
    print("Error: Persistence path doesn't exist")
}

if !fileManager.isWritableFile(atPath: path) {
    print("Error: No write permission")
}

// Check for persisted files
let files = try fileManager.contentsOfDirectory(atPath: path)
print("Persisted files: \(files)")
Solutions:
// 1. Enable long-term memory
let agent = Agent(
    role: "Assistant",
    longTermMemory: true  // Must be enabled
)

// 2. Use valid persistence path
let config = MemoryConfiguration(
    persistencePath: "./memory"  // Valid path
)

// Create directory if needed
try FileManager.default.createDirectory(
    atPath: "./memory",
    withIntermediateDirectories: true
)

// 3. Explicitly persist before exit
let orbit = try await Orbit.create(/*...*/)
let output = try await orbit.run()

// Persist memory
try await orbit.memory?.persist()

// 4. Handle graceful shutdown
// 4. Handle graceful shutdown
// (C signal handlers cannot capture context or run async work,
// so use a DispatchSource-based handler instead)
signal(SIGINT, SIG_IGN)
let sigintSource = DispatchSource.makeSignalSource(signal: SIGINT, queue: .main)
sigintSource.setEventHandler {
    Task {
        try? await orbit.memory?.persist()
        exit(0)
    }
}
sigintSource.resume()
Symptom: Relevant memories not retrieved, or irrelevant memories returned.

Causes:
  • similarityThreshold too high or too low
  • Poor embedding model
  • Query doesn’t match stored content
  • Insufficient memory items stored
  • Embedding generation issues
Diagnosis:
// Test retrieval with known items
try await memory.store(
    key: "test",
    value: "The user prefers emails in the morning",
    metadata: nil
)

let results = try await memory.retrieve(
    query: "email preferences",
    limit: 10,
    threshold: 0.5  // Lower threshold for testing
)

print("Retrieved \(results.count) items")
for (index, item) in results.enumerated() {
    print("\(index + 1). \(item.key): \(item.value)")
    if let embedding = item.embedding {
        print("   Embedding dimensions: \(embedding.count)")
    }
}
Solutions:
// 1. Adjust similarity threshold
let config = MemoryConfiguration(
    similarityThreshold: 0.7  // Start lower, tune up
)

// Test different thresholds
for threshold in [0.5, 0.6, 0.7, 0.8, 0.9] {
    let results = try await memory.retrieve(
        query: query,
        limit: 5,
        threshold: threshold
    )
    print("Threshold \(threshold): \(results.count) results")
}

// 2. Use better embedding model
let config = MemoryConfiguration(
    embeddingModel: "text-embedding-3-large"  // Higher quality
)

// 3. Improve query formulation
// Bad query: "emails"
// Good query: "user email preferences and notification settings"

// 4. Store more context
try await memory.store(
    key: "email_prefs",
    value: """
    User email preferences:
    - Timing: Morning emails at 8 AM
    - Frequency: Daily summaries
    - Format: HTML with images
    """,  // Richer context for retrieval
    metadata: ["category": "preferences"]
)

// 5. Verify embeddings are generated
if let item = try await memory.get(key: "test"),
   item.embedding == nil {
    print("Warning: No embedding generated")
    // Check embedding model configuration
}
Symptom: Memory inconsistencies or conflicts between agents.

Causes:
  • Multiple agents writing to same keys
  • Orbit memory vs agent memory confusion
  • Concurrent writes without coordination
  • Stale memory reads
Diagnosis:
// Log memory operations
final class LoggingMemoryStorage: MemoryStorage {
    private let underlying: MemoryStorage
    private let currentAgent: String

    init(underlying: MemoryStorage, currentAgent: String) {
        self.underlying = underlying
        self.currentAgent = currentAgent
    }

    func store(
        key: String,
        value: String,
        metadata: [String: String]?
    ) async throws {
        print("[STORE] Key: \(key) by \(currentAgent)")
        try await underlying.store(key: key, value: value, metadata: metadata)
    }

    func get(key: String) async throws -> MemoryItem? {
        let item = try await underlying.get(key: key)
        print("[GET] Key: \(key) by \(currentAgent) - Found: \(item != nil)")
        return item
    }

    // Forward the remaining MemoryStorage requirements to `underlying`...
}
Solutions:
// 1. Use orbit-level memory for shared state
let orbit = try await Orbit.create(
    name: "Shared State",
    agents: agents,
    tasks: tasks,
    memory: true  // Shared memory (no conflicts)
)

// 2. Use unique keys per agent
try await memory.store(
    key: "\(agent.id)_preference",  // Agent-specific
    value: preference,
    metadata: ["agent": agent.role]
)

// 3. Implement versioning
try await memory.store(
    key: "shared_state",
    value: newValue,
    metadata: [
        "version": "\(currentVersion + 1)",
        "updated_by": agent.id,
        "timestamp": "\(Date().timeIntervalSince1970)"
    ]
)

// 4. Use metadata for coordination
if let existing = try await memory.get(key: "resource") {
    if existing.metadata["locked_by"] != nil {
        // Resource locked, wait or skip
    } else {
        // Acquire lock (note: not atomic; use real locking for strict safety)
        try await memory.update(
            key: "resource",
            value: existing.value,
            metadata: ["locked_by": agent.id]
        )
    }
}
Symptom: Memory storage/retrieval is slow.

Causes:
  • Large number of memory items
  • Expensive embedding generation
  • Slow disk I/O
  • No caching
  • Inefficient similarity search
Diagnosis:
// Measure operation times
let storeStart = Date()
try await memory.store(key: "test", value: "data", metadata: nil)
print("Store time: \(Date().timeIntervalSince(storeStart))s")

let retrieveStart = Date()
let results = try await memory.retrieve(query: "test", limit: 10, threshold: 0.7)
print("Retrieve time: \(Date().timeIntervalSince(retrieveStart))s")

// Profile embedding generation
let embedStart = Date()
let embedding = try await embeddingModel.embed(text: "test")
print("Embedding time: \(Date().timeIntervalSince(embedStart))s")
Solutions:
// 1. Implement caching
let cachedMemory = CachedMemoryStorage(underlying: memory)

// 2. Use vector database for large scale
let vectorMemory = VectorMemoryStorage(
    apiKey: apiKey,
    indexName: "fast-memory",
    embeddingModel: embeddingModel
)

// 3. Batch operations
let items = try await memory.retrieveBatch(
    queries: queries,  // Multiple queries at once
    limit: 10
)

// 4. Use faster embedding model
let config = MemoryConfiguration(
    embeddingModel: "text-embedding-3-small"  // Faster
)

// 5. Reduce memory size
let config = MemoryConfiguration(
    maxMemoryItems: 50,  // Smaller = faster
    compressionEnabled: true
)

// 6. Async prefetching
Task.detached {
    // Prefetch likely needed memories
    try await memory.retrieve(
        query: predictedQuery,
        limit: 5,
        threshold: 0.8
    )
}

Debugging Memory

Create a debug utility for memory inspection:
final class MemoryDebugger {
    let memory: MemoryStorage

    init(memory: MemoryStorage) {
        self.memory = memory
    }

    func printAllMemories() async throws {
        print("━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━")
        print("Memory Contents")
        print("━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━")

        let count = await memory.count
        print("Total items: \(count)")

        // Retrieve all memories (use broad threshold)
        let allMemories = try await memory.retrieve(
            query: "",
            limit: count,
            threshold: 0.0
        )

        for (index, item) in allMemories.enumerated() {
            print("\n[\(index + 1)] \(item.key)")
            print("   Value: \(item.value)")
            print("   Timestamp: \(item.timestamp)")
            print("   Access count: \(item.accessCount)")
            print("   Last accessed: \(item.lastAccessed)")
            print("   Metadata: \(item.metadata)")
            if let embedding = item.embedding {
                print("   Embedding: \(embedding.count) dimensions")
            }
        }

        print("━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━")
    }

    func testRetrieval(query: String) async throws {
        print("━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━")
        print("Testing Retrieval: \"\(query)\"")
        print("━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━")

        for threshold in [0.5, 0.6, 0.7, 0.8, 0.9] {
            let results = try await memory.retrieve(
                query: query,
                limit: 10,
                threshold: threshold
            )

            print("\nThreshold \(threshold): \(results.count) results")
            for (index, item) in results.prefix(3).enumerated() {
                print("  \(index + 1). \(item.key): \(item.value.prefix(50))...")
            }
        }

        print("━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━")
    }

    func statistics() async -> MemoryStatistics {
        // Compute aggregate stats (item count, sizes, embedding counts);
        // depends on your MemoryStatistics definition, so stubbed here
        fatalError("statistics() not implemented")
    }
}

// Usage
let debugger = MemoryDebugger(memory: orbit.memory!)
try await debugger.printAllMemories()
try await debugger.testRetrieval(query: "user preferences")

Pro Tip: Start with short-term memory only (memory: true). Monitor your agent’s behavior and add long-term or entity memory only when you have a specific need for persistence or entity tracking. Each memory type adds complexity and overhead.