Overview

OrbitAI provides a sophisticated Large Language Model integration layer that enables seamless interaction with multiple AI providers through a unified interface.

  • Multi-Provider Support: OpenAI, Anthropic, and an extensible architecture for custom providers
  • Intelligent Routing: Automatic provider selection based on latency and success rate
  • Unified Interface: Single API for all LLM operations regardless of provider
  • Advanced Features: Streaming, tool calling, structured output, and caching
  • Performance Monitoring: Built-in metrics tracking and health monitoring
  • Type Safety: Full Swift type safety with compile-time guarantees

Key Benefits

  • Switch between providers without code changes. Use the same API regardless of whether you’re calling OpenAI, Anthropic, or a custom provider.
  • Built-in redundancy and error recovery. If a provider fails or is rate-limited, requests automatically route to backup providers.
  • Route to optimal providers based on cost and performance criteria. Balance quality with budget constraints automatically.
  • Simple, consistent API across all providers, with full Swift type safety and comprehensive error handling.

Architecture

OrbitAI’s LLM system consists of several key components working together:
LLMManager (Central Coordinator)
    ├── Provider Registry
    ├── Request Routing
    ├── Response Caching
    ├── Metrics Collection
    └── Health Monitoring

LLMProviders (Implementation Layer)
    ├── OpenAIProvider
    ├── AnthropicProvider
    └── Custom Providers

Models & Schemas (Type Safety)
    ├── OpenAIModel enum
    ├── AnthropicModel enum
    ├── ChatMessage
    ├── LLMRequest/Response
    └── ToolSchema

LLMManager: The Central Hub

The LLMManager actor serves as the central coordination point for all LLM operations:
  • Provider Management: Register, configure, and manage multiple LLM providers
  • Request Routing: Intelligently route requests to optimal providers
  • Response Caching: Optional in-memory caching for improved performance
  • Metrics & Health: Track performance and monitor provider health
  • Thread Safety: Actor-based design ensures safe concurrent access
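
All of these come together behind a single call. A minimal sketch, assuming a manager created with one of the factory methods shown in the next section:

let request = LLMRequest(
    messages: [
        .system("You are a concise assistant."),
        .user("Summarize the benefits of provider abstraction in one sentence.")
    ]
)

// The manager routes to the default (or selected) provider, applies caching
// and metrics if enabled, and returns a unified LLMResponse.
let response = try await manager.generateCompletion(request: request)
print(response.content)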

Setting Up LLM Providers

1. Configure Environment Variables

Set up your API keys as environment variables:
.env
OPENAI_API_KEY=your-openai-api-key-here
ANTHROPIC_API_KEY=your-anthropic-api-key-here
OPENAI_ORG=your-organization-id  # Optional
Never commit your API keys to version control. Always use environment variables or secure key storage.
2. Initialize LLM Manager

Use factory methods for quick setup:
import OrbitAI

// OpenAI with default configuration
let manager = try await LLMManager.createWithOpenAI(
    apiKey: "your-openai-key",
    model: .gpt4o
)

// OpenAI with advanced configuration
let advancedManager = try await LLMManager.createWithOpenAI(
    apiKey: "your-openai-key",
    model: .gpt4o,
    cacheTTL: 300,           // 5 minutes
    cacheCapacity: 256,      // Max cached responses
    enableCaching: true,
    defaultTimeout: 30,      // 30 seconds
    enableMetrics: true,
    organization: "org_123",
    maxAttempts: 3,          // Retry attempts
    logRetries: true
)
3. Register Providers

For more control, manually register providers:
let manager = LLMManager(
    enableMetrics: true,
    enableCaching: true,
    defaultTimeout: 30
)

// Register OpenAI provider
let openAIProvider = try OpenAIProvider(
    model: .gpt4o,
    apiKey: "your-openai-key"
)
await manager.registerProvider(openAIProvider, asDefault: true)

// Register Anthropic provider as fallback
let anthropicProvider = AnthropicProvider(
    model: .claude35Sonnet,
    apiKey: "your-anthropic-key"
)
await manager.registerProvider(anthropicProvider)
4. Environment-Based Setup

Use automatic environment configuration:
// Automatically uses OPENAI_API_KEY, OPENAI_ORG, etc.
let provider = try OpenAIProvider.fromEnvironment(
    model: .gpt4o
)
let manager = LLMManager()
await manager.registerProvider(provider, asDefault: true)

Provider Configuration

Standard Configuration

let openAIProvider = try OpenAIProvider(
    model: .gpt4o,                    // Model selection
    apiKey: "your-api-key",
    apiBase: "https://api.openai.com/v1",
    organization: "org_123",           // Optional
    maxAttempts: 3,                   // Retry attempts
    logRetries: false,                // Log retry attempts
    allowInsecureAPIBase: false       // Allow HTTP endpoints
)

Available Models

// Latest and most capable models
.gpt5          // GPT-5 (when available)
.gpt41         // GPT-4.1
.gpt4o         // GPT-4o (recommended)
.gpt4oMini     // Cost-effective variant

// Reasoning models
.gptO3         // o3
.gptO3Mini     // o3 Mini
.gptO3Pro      // o3 Pro
.gptO4Mini     // o4 Mini

// Shortcuts
.default       // Currently gpt4oMini
.highQuality   // Currently gpt5
.economic      // Currently gptO3Mini

AI Proxy Configuration

For using AI proxy services:
let proxyProvider = try OpenAIProvider(
    model: .gpt4o,
    apiKey: "",  // Empty if using proxy auth
    aiProxyPartialKey: "your-partial-key",
    clientID: "your-client-id"
)

Integration with Agents

Agent-Level Configuration

Agents can be configured to use specific LLM providers:
// Using typed provider ID (recommended)
let researchAgent = Agent(
    role: "Senior Research Analyst",
    purpose: "Conduct comprehensive market research",
    context: "Expert in data analysis with 10 years experience",
    llmID: .openAI,  // Type-safe provider selection
    tools: ["web_search", "data_analysis"],
    verbose: true
)
The llmID parameter uses type-safe enums to ensure you’re referencing valid providers configured in your system.

Agent Factory

Use AgentFactory for pre-configured agents:
// Research agent with specific LLM
let researcher = AgentFactory.createResearchAgent(
    goal: "Analyze AI market trends for Q4 2024",
    llmID: .openAI,
    tools: ["web_search", "data_analysis", "report_generator"]
)

// Writing agent with different LLM for creative tasks
let writer = AgentFactory.createWritingAgent(
    goal: "Write compelling product descriptions",
    llmID: .anthropic,  // Anthropic for creative writing
    tools: ["content_optimization"]
)

Orbit-Level Integration

Orbits automatically handle LLM provider setup:
let orbit = try await Orbit.create(
    name: "Content Creation Pipeline",
    agents: [researcher, writer, editor],
    tasks: [researchTask, writingTask, editingTask],
    process: .sequential,
    verbose: true
)

// Orbit automatically creates and configures LLMManager
// Uses environment variables: OPENAI_API_KEY, ANTHROPIC_API_KEY
For manual configuration:
let orbit = try await Orbit.create(
    agents: agents,
    tasks: tasks
)

// Configure custom LLM provider
let customProvider = try OpenAIProvider(
    model: .gpt4o,
    apiKey: customKey,
    apiBase: "https://custom-endpoint.com/v1"
)
await orbit.configureLLMProvider(customProvider, asDefault: true)

Custom Endpoints & Local Models

Configure OrbitAI to use Ollama for local models:
let ollamaProvider = try OpenAIProvider(
    model: .gpt4o,  // Use any model enum
    apiKey: "ollama", // Ollama doesn't require real API key
    apiBase: "http://localhost:11434/v1",
    allowInsecureAPIBase: true  // Allow HTTP for local development
)

await manager.registerProvider(ollamaProvider)
Ollama provides a local, OpenAI-compatible API endpoint, making integration seamless.
LM Studio exposes a similar OpenAI-compatible local endpoint:
let lmStudioProvider = try OpenAIProvider(
    model: .gpt4oMini,
    apiKey: "lm-studio",
    apiBase: "http://localhost:1234/v1",
    allowInsecureAPIBase: true
)
Implement the LLMProvider protocol for custom APIs:
public actor CustomModelProvider: LLMProvider, LLMProviderIdentifiable {
    public let providerName: String = "CustomModel"
    public let providerID: LLMProviderID = .custom
    public let modelName: String
    public let maxTokens: Int = 4096
    public let supportsStreaming: Bool = true
    public let supportsToolCalling: Bool = false

    private let endpoint: String
    private let apiKey: String

    public init(modelName: String, endpoint: String, apiKey: String) {
        self.modelName = modelName
        self.endpoint = endpoint
        self.apiKey = apiKey
    }

    public func generateCompletion(
        messages: [ChatMessage],
        temperature: Double?,
        maxTokens: Int?,
        tools: [ToolSchema]?
    ) async throws -> LLMResponse {
        // Implement your custom API integration here:
        //   1. Convert `messages` into your API's request format
        //   2. Send the HTTP request to `endpoint`, authenticating with `apiKey`
        //   3. Parse the response and return an LLMResponse
        fatalError("Replace with your custom API integration")
    }
}
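
Once implemented, a custom provider is registered like the built-in ones. A sketch (the model name and endpoint below are placeholders):

// Register the custom provider alongside OpenAI/Anthropic.
let custom = CustomModelProvider(
    modelName: "my-model-v1",                 // placeholder
    endpoint: "https://example.com/v1/chat",  // placeholder
    apiKey: "your-api-key"
)
await manager.registerProvider(custom)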

Best Practices

Provider Selection

  • GPT-4o: Best for complex reasoning, analysis, and high-quality output. Use when accuracy and capability are paramount.
  • GPT-4o Mini: Best for simple tasks, summaries, and classification. Cost-effective for high-volume operations.
  • Claude 3.5 Sonnet: Best for creative writing and long-form content. Excellent for content generation.
  • Claude 3.5 Haiku: Best for fast responses and simple queries. Optimized for speed and efficiency.
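
These guidelines can be encoded as a simple selection helper. A sketch, where TaskKind is a hypothetical enum of your own, not part of OrbitAI:

// Hypothetical task categories; map each one to a model enum from above.
enum TaskKind {
    case analysis, highVolume, creativeWriting, quickQuery
}

func preferredOpenAIModel(for task: TaskKind) -> OpenAIModel {
    switch task {
    case .analysis:        return .gpt4o      // accuracy and capability
    case .highVolume:      return .gpt4oMini  // cost-effective at scale
    case .creativeWriting: return .gpt4o      // or prefer an Anthropic provider
    case .quickQuery:      return .gpt4oMini  // fast, simple responses
    }
}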

Intelligent Routing

Use ProviderSelectionCriteria for automatic provider selection:
let highQualityCriteria = ProviderSelectionCriteria(
    preferredModel: OpenAIModel.gpt4o,
    maxLatency: 2.0,
    minSuccessRate: 0.95,
    requiresStreaming: false,
    requiresTools: true
)

let economicCriteria = ProviderSelectionCriteria(
    preferredModel: OpenAIModel.gpt4oMini,
    maxCostPerToken: 0.001,
    requiresStreaming: true
)

// Route based on task requirements
let response = try await manager.routeRequest(
    request: request,
    criteria: highQualityCriteria
)

Context Management

Effective System Messages

let agent = Agent(
    role: "Financial Analyst",
    purpose: "Analyze financial data and provide investment insights",
    context: """
    You are a senior financial analyst with 15 years of experience in equity research.
    Your expertise includes:
    - Financial statement analysis
    - Market trend identification
    - Risk assessment
    - Investment recommendations

    Always provide specific, actionable insights backed by data.
    Use appropriate financial terminology and explain complex concepts clearly.
    """,
    temperature: 0.3  // Lower temperature for analytical tasks
)
Use lower temperatures (0.1-0.3) for analytical tasks and higher temperatures (0.7-0.9) for creative tasks.
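
For contrast, a creative-writing agent might use the same initializer with a higher temperature (a sketch reusing the parameters shown above):

// Higher temperature for open-ended, creative output.
let copywriter = Agent(
    role: "Copywriter",
    purpose: "Write engaging, on-brand marketing copy",
    context: "Experienced brand copywriter with a vivid, playful style.",
    temperature: 0.8  // More variation for creative tasks
)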

Performance Optimization

Caching Strategy

let manager = LLMManager(
    enableCaching: true,
    cacheTTL: 300,      // 5 minutes for stable responses
    cacheCapacity: 512  // Balance memory usage
)

// Cache-friendly requests
let request = LLMRequest(
    messages: normalizedMessages,  // Consistent formatting
    temperature: 0.0,  // Deterministic for better cache hits
    bypassCache: false
)

Streaming for Real-Time

let stream = try await manager.generateStreamingCompletion(request: request)

for try await chunk in stream {
    if let content = chunk.content {
        // Update UI incrementally
        await MainActor.run {
            textView.text += content
        }
    }

    if let usage = chunk.usageMetrics {
        // Track token usage
        updateMetrics(usage)
    }
}

Advanced Features

Force LLMs to return structured JSON responses:
let productSchema = JSONSchema(
    type: .object,
    properties: [
        "name": JSONSchema(type: .string, description: "Product name"),
        "price": JSONSchema(type: .number, description: "Price in USD"),
        "category": JSONSchema(type: .string, description: "Product category"),
        "features": JSONSchema(
            type: .array,
            items: .init(value: JSONSchema(type: .string)),
            description: "List of key features"
        )
    ],
    required: ["name", "price", "category"]
)

let response = try await manager.generateStructuredCompletion(
    request: LLMRequest(messages: [.user("Create a product for wireless headphones")]),
    schema: productSchema,
    preferNative: true
)
Define tools for LLM use:
let webSearchTool = ToolSchema(
    function: FunctionSchema(
        name: "web_search",
        description: "Search the web for current information",
        parameters: JSONSchema(
            type: .object,
            properties: [
                "query": JSONSchema(
                    type: .string,
                    description: "Search query"
                ),
                "num_results": JSONSchema(
                    type: .integer,
                    description: "Number of results (1-10)"
                )
            ],
            required: ["query"]
        )
    )
)

let requestWithTools = LLMRequest(
    messages: [
        .system("You can search the web for current information."),
        .user("What are the latest developments in AI this week?")
    ],
    tools: [webSearchTool]
)
Process multiple requests concurrently:
func processBatch(_ inputs: [String]) async throws -> [String] {
    return try await withThrowingTaskGroup(of: (Int, String).self) { group in
        for (index, input) in inputs.enumerated() {
            group.addTask {
                let request = LLMRequest(messages: [.user(input)])
                let response = try await manager.generateCompletion(request: request)
                return (index, response.content)
            }
        }

        var results = Array(repeating: "", count: inputs.count)
        for try await (index, result) in group {
            results[index] = result
        }
        return results
    }
}
Multi-turn conversations with context:
actor ConversationManager {
    private var messages: [ChatMessage] = []
    private let manager: LLMManager
    private let maxContextTokens: Int

    init(manager: LLMManager, maxContextTokens: Int = 8000) {
        self.manager = manager
        self.maxContextTokens = maxContextTokens
    }

    func addUserMessage(_ content: String) async throws -> String {
        messages.append(.user(content))

        // Prune context if needed
        await pruneContextIfNeeded()

        let response = try await manager.generateCompletion(
            request: LLMRequest(messages: messages)
        )

        messages.append(.assistant(response.content))
        return response.content
    }

    private func pruneContextIfNeeded() async {
        let tokenCount = try? await manager.countTokens(
            request: LLMRequest(messages: messages)
        )

        if let count = tokenCount, count > maxContextTokens {
            // Keep system message and recent messages
            let systemMessages = messages.filter { $0.role == .system }
            let recentMessages = Array(messages.suffix(10))
            messages = systemMessages + recentMessages
        }
    }
}

Troubleshooting

Problem: OrbitAIError.configuration("OPENAI_API_KEY missing")
Solutions:
// Verify environment variable
print(ProcessInfo.processInfo.environment["OPENAI_API_KEY"] ?? "Not set")

// Pass API key explicitly
let provider = try OpenAIProvider(
    model: .gpt4o,
    apiKey: "your-actual-api-key-here"
)
Verify your API key format. OpenAI keys start with sk-, Anthropic keys start with sk-ant-.
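
A quick prefix check can catch obviously malformed keys before any request is made (a hypothetical helper, not part of OrbitAI):

// Hypothetical sanity check on key format.
func looksLikeValidKey(_ key: String, provider: String) -> Bool {
    switch provider {
    case "anthropic": return key.hasPrefix("sk-ant-")
    case "openai":    return key.hasPrefix("sk-") && !key.hasPrefix("sk-ant-")
    default:          return !key.isEmpty
    }
}
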
Problem: OrbitAIError.llmRateLimitExceeded
Solutions:
// Implement backoff strategy
do {
    let response = try await manager.generateCompletion(request: request)
} catch let error as OrbitAIError {
    if case .llmRateLimitExceeded(let message) = error {
        // Parse Retry-After header
        if let retryAfter = parseRetryAfter(message) {
            try await Task.sleep(for: .seconds(retryAfter))
            // ...then retry the request
        }
    }
}
Monitor rate limits:
let metrics = await manager.getProviderMetrics()
for (provider, metric) in metrics {
    print("\(provider): \(metric.successRate * 100)% success rate")
}
Problem: Requests timing out
Solutions:
// Increase timeout for complex requests
let response = try await manager.generateCompletion(
    request: request,
    timeout: 60  // Increase from default 30s
)

// Use streaming for long responses
let stream = try await manager.generateStreamingCompletion(request: request)
Diagnostics:
// Check provider metrics
let metrics = await manager.getProviderMetrics(id: .openAI)
print("Average latency: \(metrics?.averageLatency ?? 0)s")
print("P95 latency: \(metrics?.p95Latency ?? 0)s")
print("Success rate: \((metrics?.successRate ?? 0) * 100)%")

// Check cache performance
print("Cache size: \(manager.cacheSize())")
Solutions:
  • Clear stale cache entries
  • Use consistent message formatting (see the sketch below)
  • Consider provider switching for better performance
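
For the formatting point, a small normalization pass before building requests helps identical prompts hit the cache. A hypothetical helper, assuming the ChatMessage(role:content:) initializer used in the sanitization example below:

// Hypothetical normalization: identical prompts should produce identical
// messages so cached responses can be reused.
func normalized(_ messages: [ChatMessage]) -> [ChatMessage] {
    messages.map { message in
        let content = message.content
            .trimmingCharacters(in: .whitespacesAndNewlines)
            .replacingOccurrences(of: #"\s+"#, with: " ", options: .regularExpression)
        return ChatMessage(role: message.role, content: content)
    }
}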

Security Considerations

API Key Management

Never hardcode API keys in your source code. Use environment variables or secure key storage solutions.
import Security

class SecureAPIKeyManager {
    private let service = "com.yourapp.orbitai"

    func storeAPIKey(_ key: String, for provider: String) -> Bool {
        let data = key.data(using: .utf8)!

        let query: [String: Any] = [
            kSecClass as String: kSecClassGenericPassword,
            kSecAttrService as String: service,
            kSecAttrAccount as String: provider,
            kSecValueData as String: data
        ]

        SecItemDelete(query as CFDictionary)
        let status = SecItemAdd(query as CFDictionary, nil)
        return status == errSecSuccess
    }

    func retrieveAPIKey(for provider: String) -> String? {
        let query: [String: Any] = [
            kSecClass as String: kSecClassGenericPassword,
            kSecAttrService as String: service,
            kSecAttrAccount as String: provider,
            kSecReturnData as String: true
        ]

        var result: AnyObject?
        let status = SecItemCopyMatching(query as CFDictionary, &result)

        guard status == errSecSuccess,
              let data = result as? Data,
              let key = String(data: data, encoding: .utf8) else {
            return nil
        }

        return key
    }
}
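
At startup, the stored key can then feed a provider directly, for example:

// Retrieve the key from the Keychain and configure a provider with it.
// "openai" must match the account name used with storeAPIKey.
let keyManager = SecureAPIKeyManager()
guard let openAIKey = keyManager.retrieveAPIKey(for: "openai") else {
    fatalError("No OpenAI API key stored in the Keychain")
}

let provider = try OpenAIProvider(model: .gpt4o, apiKey: openAIKey)
await manager.registerProvider(provider, asDefault: true)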

Request Sanitization

extension ChatMessage {
    func sanitized() -> ChatMessage {
        let sanitizedContent = content
            .replacingOccurrences(of: #"<script.*?</script>"#, with: "", options: .regularExpression)
            .replacingOccurrences(of: #"<.*?>"#, with: "", options: .regularExpression)
            .trimmingCharacters(in: .whitespacesAndNewlines)

        return ChatMessage(role: role, content: sanitizedContent)
    }
}

Next Steps

For additional support and advanced use cases, consult the GitHub Discussions or check out the Issue Tracker.