Overview

Telemetry in OrbitAI provides real-time monitoring, performance metrics, and usage analytics across agents, tasks, and orbits. Track token consumption, execution times, API calls, tool usage, and costs to optimize performance, manage budgets, and debug issues effectively.

Usage Metrics

Track token usage, API calls, and request success rates

Performance Monitoring

Monitor execution times and identify bottlenecks

Cost Tracking

Calculate and monitor LLM and API costs

Tool Analytics

Measure tool usage, execution time, and success rates

Real-Time Updates

Monitor live execution status and progress

Custom Integration

Integrate with external analytics and monitoring systems

Key Capabilities

Telemetry data is collected at multiple levels—orbit, task, agent, and tool—providing both aggregated overview metrics and granular detail for deep analysis.
Metrics are collected automatically during execution without any manual instrumentation. All usage data, timing information, and performance metrics are captured seamlessly.
Telemetry is enabled by default and requires no setup. Access comprehensive metrics immediately after execution through simple API calls.
Integrate custom telemetry managers to export data to your preferred analytics platform, logging service, or monitoring dashboard.
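
Since telemetry is on by default, aggregated metrics are available straight from the run result; a minimal sketch (assuming the orbit and result APIs shown later in this guide):
let result = try await orbit.run()

// Aggregated metrics are populated automatically, with no extra setup
print("Tokens used: \(result.usageMetrics.totalTokens)")
print("Execution time: \(result.executionTime)s")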

Telemetry Architecture

Telemetry Data Flow
    ├── Orbit Level
    │   ├── Total execution time
    │   ├── Aggregated token usage
    │   ├── Total API calls
    │   └── Per-task metrics
    │       │
    │       ├── Task Level
    │       │   ├── Task execution time
    │       │   ├── Task token usage
    │       │   ├── Task API calls
    │       │   ├── Validation results
    │       │   └── Tools used
    │       │       │
    │       │       └── Tool Level
    │       │           ├── Tool name
    │       │           ├── Execution time
    │       │           ├── Success status
    │       │           ├── Input size
    │       │           └── Output size
    │       │
    │       └── Agent Level
    │           ├── Execution count
    │           ├── Average execution time
    │           └── Cumulative metrics

    └── External Integration
        ├── TelemetryManager
        ├── Custom Analytics
        ├── Monitoring Dashboards
        └── Alerting Systems

Usage Metrics

OrbitAI tracks comprehensive usage metrics to help you understand resource consumption and API usage patterns.

UsageMetrics Structure

public struct UsageMetrics: Codable, Sendable {
    public let promptTokens: Int          // Tokens in prompts
    public let completionTokens: Int      // Tokens in responses
    public let totalTokens: Int           // Total token usage
    public let successfulRequests: Int    // Successful API calls
    public let totalRequests: Int         // Total API calls made
}
  • Token Metrics
  • API Call Metrics
  • Access Patterns
Token Usage Tracking
let result = try await orbit.run()
let metrics = result.usageMetrics

print("Token Usage:")
print("  Prompt tokens: \(metrics.promptTokens)")
print("  Completion tokens: \(metrics.completionTokens)")
print("  Total tokens: \(metrics.totalTokens)")
Understanding Token Counts:
  • Prompt tokens: Input sent to LLM (system messages, user input, context, tools)
  • Completion tokens: LLM-generated output (responses, tool calls, reasoning)
  • Total tokens: Sum of prompt and completion tokens
Why it matters:
  • Track API costs (billed per token)
  • Optimize prompt efficiency
  • Monitor context window usage
  • Identify verbose agents
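
The API Call Metrics view (second tab above) can be derived from the same struct; a quick sketch computing request-level statistics from the UsageMetrics fields:
// Reusing `metrics` from the snippet above
let failedRequests = metrics.totalRequests - metrics.successfulRequests
let successRate = metrics.totalRequests > 0
    ? Double(metrics.successfulRequests) / Double(metrics.totalRequests) * 100
    : 100.0

print("API Calls:")
print("  Total requests: \(metrics.totalRequests)")
print("  Successful: \(metrics.successfulRequests)")
print("  Failed: \(failedRequests)")
print("  Success rate: \(String(format: "%.1f", successRate))%")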

Accessing Usage Metrics

1

Execute Orbit

Run your orbit to generate telemetry data:
let orbit = try await Orbit.create(
    name: "Analytics Workflow",
    agents: agents,
    tasks: tasks
)

let result = try await orbit.run()
2

Access Aggregated Metrics

Get orbit-wide metrics from the output:
let metrics = result.usageMetrics

print("=== Orbit Metrics ===")
print("Total tokens: \(metrics.totalTokens)")
print("Prompt tokens: \(metrics.promptTokens)")
print("Completion tokens: \(metrics.completionTokens)")
print("API calls: \(metrics.totalRequests)")
print("Successful: \(metrics.successfulRequests)")
3

Analyze Per-Task Metrics

Drill down into individual task performance:
for (index, taskOutput) in result.taskOutputs.enumerated() {
    print("\n=== Task \(index + 1) ===")
    print("Description: \(taskOutput.description)")

    if let taskMetrics = taskOutput.usageMetrics {
        print("Tokens: \(taskMetrics.totalTokens)")
        print("Requests: \(taskMetrics.totalRequests)")
        print("Success rate: \(taskMetrics.successfulRequests)/\(taskMetrics.totalRequests)")
    }
}
4

Review Agent Statistics

Check agent-level cumulative metrics:
let agents = await orbit.getAgents()

for agent in agents {
    let agentMetrics = await agent.totalUsageMetrics
    let execCount = await agent.executionCount
    let avgTime = await agent.averageExecutionTime

    print("\n=== \(agent.role) ===")
    print("Executions: \(execCount)")
    print("Avg time: \(String(format: "%.2f", avgTime))s")
    print("Total tokens: \(agentMetrics.totalTokens)")
    print("Avg tokens/exec: \(agentMetrics.totalTokens / max(1, execCount))")
}

Performance Monitoring

Track execution times and identify performance bottlenecks across your agent workflows.

Execution Time Metrics

  • Orbit Execution Time
  • Task Execution Time
  • Agent Performance
Total Workflow Duration
let result = try await orbit.run()

print("Workflow Performance:")
print("  Total execution: \(result.executionTime)s")

// Break down by tasks
var totalTaskTime: TimeInterval = 0
for (index, taskOutput) in result.taskOutputs.enumerated() {
    if let task = orbit.tasks[safe: index],
       let execTime = task.executionTime {
        totalTaskTime += execTime
        print("  Task \(index + 1): \(String(format: "%.2f", execTime))s")
    }
}

// Calculate overhead (orchestration, validation, etc.)
let overhead = result.executionTime - totalTaskTime
print("  Orchestration overhead: \(String(format: "%.2f", overhead))s")
Components:
  • Task execution time (agent processing)
  • Tool execution time
  • Orchestration overhead (task coordination, validation)
  • Sequential vs parallel timing
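
The Agent Performance view (third tab above) relies on the agent-level accessors shown earlier; a short sketch that flags consistently slow agents (the 30-second threshold is an arbitrary example):
let agents = await orbit.getAgents()

for agent in agents {
    let execCount = await agent.executionCount
    let avgTime = await agent.averageExecutionTime

    // Flag agents averaging more than 30s per execution
    if avgTime > 30 {
        print("🐢 \(agent.role): \(String(format: "%.2f", avgTime))s avg over \(execCount) executions")
    }
}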

Identifying Bottlenecks

1

Sort Tasks by Execution Time

Find the slowest tasks:
let result = try await orbit.run()

// Create task timing array
let taskTimings = result.taskOutputs.enumerated().compactMap { (index, output) -> (Int, TimeInterval)? in
    guard let task = orbit.tasks[safe: index],
          let execTime = task.executionTime else {
        return nil
    }
    return (index, execTime)
}

// Sort by execution time (descending)
let sortedByTime = taskTimings.sorted { $0.1 > $1.1 }

print("⚠️ Slowest Tasks:")
for (index, time) in sortedByTime.prefix(5) {
    if let task = orbit.tasks[safe: index] {
        let percentage = (time / result.executionTime) * 100
        print("  \(index + 1). \(task.description)")
        print("     Time: \(String(format: "%.2f", time))s (\(String(format: "%.1f", percentage))% of total)")
    }
}
2

Analyze Tool Performance

Identify slow or failing tools:
var toolStats: [String: (count: Int, totalTime: TimeInterval, failures: Int)] = [:]

for taskOutput in result.taskOutputs {
    for toolUsage in taskOutput.toolsUsed {
        if var stats = toolStats[toolUsage.toolName] {
            stats.count += 1
            stats.totalTime += toolUsage.executionTime
            if !toolUsage.success {
                stats.failures += 1
            }
            toolStats[toolUsage.toolName] = stats
        } else {
            toolStats[toolUsage.toolName] = (
                1,
                toolUsage.executionTime,
                toolUsage.success ? 0 : 1
            )
        }
    }
}

print("\n⚠️ Tool Performance Issues:")
for (tool, stats) in toolStats.sorted(by: { $0.value.totalTime > $1.value.totalTime }) {
    let avgTime = stats.totalTime / Double(stats.count)
    let failureRate = Double(stats.failures) / Double(stats.count) * 100

    if avgTime > 5.0 || failureRate > 10 {
        print("  \(tool):")
        print("    Avg time: \(String(format: "%.2f", avgTime))s")
        print("    Failure rate: \(String(format: "%.1f", failureRate))%")
        print("    Calls: \(stats.count)")
    }
}
3

Calculate Task Efficiency

Compare actual vs expected performance:
for (index, taskOutput) in result.taskOutputs.enumerated() {
    guard let task = orbit.tasks[safe: index],
          let execTime = task.executionTime else {
        continue
    }

    let expectedTime = task.maxExecutionTime ?? 60.0
    let efficiency = (expectedTime / execTime) * 100

    if efficiency < 50 {
        print("⚠️ Task \(index + 1) inefficient:")
        print("   Expected: <\(expectedTime)s")
        print("   Actual: \(String(format: "%.2f", execTime))s")
        print("   Efficiency: \(String(format: "%.0f", efficiency))%")

        // Analyze why
        if let metrics = taskOutput.usageMetrics {
            print("   Tokens: \(metrics.totalTokens)")
            print("   API calls: \(metrics.totalRequests)")
        }
        print("   Tools used: \(taskOutput.toolsUsed.count)")
    }
}

Real-Time Monitoring

Monitor orbit execution in real-time:
import Foundation

// Start orbit asynchronously
Task {
    try await orbit.run()
}

// Monitor while running
while await orbit.isRunning() {
    let status = await orbit.getExecutionStatus()

    print("\r⏳ Progress: \(status.completionPercentage)% ", terminator: "")
    print("| Active: \(status.activeTasks) ", terminator: "")
    print("| Completed: \(status.completedTasks)/\(status.totalTasks) ", terminator: "")
    print("| Failed: \(status.failedTasks)", terminator: "")

    try await Task.sleep(for: .seconds(1))
}

print("\n✅ Complete!")
Execution Status Fields:
  • queuedTasks: Tasks waiting to execute
  • activeTasks: Currently executing tasks
  • completedTasks: Successfully completed tasks
  • failedTasks: Failed tasks
  • totalTasks: Total number of tasks
  • completionPercentage: Progress (0-100)

Tool Analytics

Track tool usage, performance, and success rates across your workflow.

ToolUsage Structure

public struct ToolUsage: Codable, Sendable {
    public let toolName: String           // Name of the tool used
    public let executionTime: TimeInterval // Tool execution duration
    public let success: Bool              // Execution success status
    public let inputSize: Int             // Size of tool input
    public let outputSize: Int            // Size of tool output
}

Tool Performance Analysis

  • Basic Tool Stats
  • Per-Tool Analysis
  • Tool Efficiency
let result = try await orbit.run()

// Collect all tool usages
var allTools: [ToolUsage] = []
for taskOutput in result.taskOutputs {
    allTools.append(contentsOf: taskOutput.toolsUsed)
}

print("Tool Usage Summary:")
print("  Total tool calls: \(allTools.count)")
print("  Unique tools: \(Set(allTools.map { $0.toolName }).count)")
print("  Successful: \(allTools.filter { $0.success }.count)")
print("  Failed: \(allTools.filter { !$0.success }.count)")

// Total time spent in tools
let totalToolTime = allTools.reduce(0) { $0 + $1.executionTime }
print("  Total tool execution time: \(String(format: "%.2f", totalToolTime))s")

Tool Usage Patterns

Identify how tools are being used:
// Which tasks use which tools?
for (index, taskOutput) in result.taskOutputs.enumerated() {
    guard let task = orbit.tasks[safe: index] else { continue }

    if !taskOutput.toolsUsed.isEmpty {
        print("\nTask \(index + 1): \(task.description)")
        print("  Tools: \(taskOutput.toolsUsed.map { $0.toolName }.joined(separator: ", "))")

        // Tool execution sequence
        print("  Execution order:")
        for (i, toolUsage) in taskOutput.toolsUsed.enumerated() {
            let status = toolUsage.success ? "✓" : "✗"
            print("    \(i + 1). \(toolUsage.toolName) (\(String(format: "%.2f", toolUsage.executionTime))s) \(status)")
        }
    }
}

// Tool correlation analysis
print("\n🔗 Tool Correlation:")
print("  (Which tools are often used together?)")

var toolPairs: [String: Int] = [:]
for taskOutput in result.taskOutputs {
    let tools = taskOutput.toolsUsed.map { $0.toolName }
    for i in 0..<tools.count {
        for j in (i+1)..<tools.count {
            let pair = "\(tools[i]) + \(tools[j])"
            toolPairs[pair, default: 0] += 1
        }
    }
}

for (pair, count) in toolPairs.sorted(by: { $0.value > $1.value }).prefix(5) {
    print("  \(pair): \(count) times")
}

Cost Tracking

Calculate and monitor costs associated with LLM usage and external API calls.

LLM Cost Calculation

  • OpenAI Pricing
  • Claude Pricing
  • Multi-Model Costs
Calculate costs for OpenAI models:
func calculateOpenAICost(
    metrics: UsageMetrics,
    model: String
) -> Double {
    // Pricing per 1M tokens (as of 2024)
    let pricing: [String: (input: Double, output: Double)] = [
        "gpt-4o": (2.50, 10.00),
        "gpt-4o-mini": (0.15, 0.60),
        "gpt-4-turbo": (10.00, 30.00),
        "gpt-3.5-turbo": (0.50, 1.50)
    ]

    guard let price = pricing[model] else {
        return 0.0
    }

    let inputCost = Double(metrics.promptTokens) / 1_000_000 * price.input
    let outputCost = Double(metrics.completionTokens) / 1_000_000 * price.output

    return inputCost + outputCost
}

// Usage
let result = try await orbit.run()
let cost = calculateOpenAICost(
    metrics: result.usageMetrics,
    model: "gpt-4o"
)

print("💰 Estimated Cost: $\(String(format: "%.4f", cost))")

Budget Management

Implement cost controls and budget tracking:
final class BudgetTracker {
    let dailyLimit: Double
    let monthlyLimit: Double

    private var dailyCost: Double = 0
    private var monthlyCost: Double = 0
    private var lastResetDate: Date = Date()

    init(dailyLimit: Double, monthlyLimit: Double) {
        self.dailyLimit = dailyLimit
        self.monthlyLimit = monthlyLimit
    }

    func trackExecution(metrics: UsageMetrics, model: String) throws {
        resetIfNeeded()

        let cost = calculateOpenAICost(metrics: metrics, model: model)

        // Check limits
        if dailyCost + cost > dailyLimit {
            throw BudgetError.dailyLimitExceeded(
                current: dailyCost,
                limit: dailyLimit,
                attempted: cost
            )
        }

        if monthlyCost + cost > monthlyLimit {
            throw BudgetError.monthlyLimitExceeded(
                current: monthlyCost,
                limit: monthlyLimit,
                attempted: cost
            )
        }

        // Update tracking
        dailyCost += cost
        monthlyCost += cost

        print("💰 Budget Status:")
        print("  Daily: $\(String(format: "%.2f", dailyCost))/$\(String(format: "%.2f", dailyLimit))")
        print("  Monthly: $\(String(format: "%.2f", monthlyCost))/$\(String(format: "%.2f", monthlyLimit))")
    }

    private func resetIfNeeded() {
        let calendar = Calendar.current
        let now = Date()

        // Reset daily if new day
        if !calendar.isDate(lastResetDate, inSameDayAs: now) {
            dailyCost = 0
        }

        // Reset monthly if new month
        if !calendar.isDate(lastResetDate, equalTo: now, toGranularity: .month) {
            monthlyCost = 0
        }

        lastResetDate = now
    }

    enum BudgetError: Error {
        case dailyLimitExceeded(current: Double, limit: Double, attempted: Double)
        case monthlyLimitExceeded(current: Double, limit: Double, attempted: Double)
    }
}

// Usage
let budgetTracker = BudgetTracker(
    dailyLimit: 10.00,    // $10/day
    monthlyLimit: 200.00  // $200/month
)

do {
    let result = try await orbit.run()
    try budgetTracker.trackExecution(
        metrics: result.usageMetrics,
        model: "gpt-4o"
    )
} catch BudgetTracker.BudgetError.dailyLimitExceeded(let current, let limit, let attempted) {
    print("⚠️ Daily budget exceeded!")
    print("  Current: $\(current)")
    print("  Limit: $\(limit)")
    print("  Attempted: $\(attempted)")
} catch {
    print("⚠️ Execution failed: \(error)")
}

Cost Optimization

Strategies to reduce costs:

Optimize Prompts

Reduce token usage with concise prompts:
// Before: Verbose (150 tokens)
context: """
You are a highly skilled and experienced
professional content writer with many years
of expertise in creating engaging content...
"""

// After: Concise (30 tokens)
context: "Expert content writer"

// Savings: 80% fewer tokens

Use Cheaper Models

Choose appropriate model for task complexity:
// Simple tasks: use cheaper model
let simpleAgent = Agent(
    role: "Data Formatter",
    llm: .gpt4oMini  // 94% cheaper
)

// Complex tasks: use premium model
let complexAgent = Agent(
    role: "Strategic Analyst",
    llm: .gpt4o  // Better reasoning
)

Cache Responses

Enable LLM caching for repeated queries:
let llmManager = LLMManager(
    enableCaching: true,
    cacheTTL: 3600  // 1 hour
)

// Repeated queries use cache
// Saves API calls and costs

Batch Processing

Process multiple items in one request:
// Instead of 10 separate calls
for item in items {
    await agent.process(item)  // 10 API calls
}

// Batch process
await agent.processBatch(items)  // 1 API call

Custom Telemetry Integration

Integrate OrbitAI with your existing analytics and monitoring infrastructure.

TelemetryManager Protocol

public protocol TelemetryManager {
    // Lifecycle events
    func orbitStarted(orbitId: String, name: String)
    func orbitCompleted(orbitId: String, output: OrbitOutput)
    func orbitFailed(orbitId: String, error: Error)

    // Task events
    func taskStarted(taskId: String, description: String)
    func taskCompleted(taskId: String, output: TaskOutput)
    func taskFailed(taskId: String, error: Error)

    // Agent events
    func agentExecuted(agentId: String, role: String, metrics: UsageMetrics)

    // Tool events
    func toolInvoked(toolName: String, parameters: [String: Any])
    func toolCompleted(toolName: String, usage: ToolUsage)

    // Custom events
    func logEvent(name: String, properties: [String: Any])
    func logMetric(name: String, value: Double, tags: [String: String])
}

Custom Implementation Example

  • Analytics Integration
  • Logging Integration
  • Metrics Platform
import OrbitAI

final class AnalyticsTelemetryManager: TelemetryManager {
    private let analyticsService: AnalyticsService

    init(analyticsService: AnalyticsService) {
        self.analyticsService = analyticsService
    }

    func orbitStarted(orbitId: String, name: String) {
        analyticsService.track(
            event: "orbit_started",
            properties: [
                "orbit_id": orbitId,
                "orbit_name": name,
                "timestamp": Date().timeIntervalSince1970
            ]
        )
    }

    func orbitCompleted(orbitId: String, output: OrbitOutput) {
        analyticsService.track(
            event: "orbit_completed",
            properties: [
                "orbit_id": orbitId,
                "orbit_name": output.orbitName,
                "execution_time": output.executionTime,
                "total_tokens": output.usageMetrics.totalTokens,
                "total_tasks": output.taskOutputs.count,
                "timestamp": Date().timeIntervalSince1970
            ]
        )

        // Track as metric
        analyticsService.recordMetric(
            name: "orbit_execution_time",
            value: output.executionTime,
            tags: ["orbit_name": output.orbitName]
        )

        analyticsService.recordMetric(
            name: "orbit_token_usage",
            value: Double(output.usageMetrics.totalTokens),
            tags: ["orbit_name": output.orbitName]
        )
    }

    func orbitFailed(orbitId: String, error: Error) {
        analyticsService.track(
            event: "orbit_failed",
            properties: [
                "orbit_id": orbitId,
                "error": error.localizedDescription,
                "timestamp": Date().timeIntervalSince1970
            ]
        )

        // Alert on failures
        analyticsService.incrementCounter(
            "orbit_failures",
            tags: ["error_type": String(describing: type(of: error))]
        )
    }

    func taskCompleted(taskId: String, output: TaskOutput) {
        analyticsService.track(
            event: "task_completed",
            properties: [
                "task_id": taskId,
                "description": output.description,
                "tokens": output.usageMetrics.totalTokens,
                "tools_used": output.toolsUsed.count
            ]
        )
    }

    func toolCompleted(toolName: String, usage: ToolUsage) {
        analyticsService.recordMetric(
            name: "tool_execution_time",
            value: usage.executionTime,
            tags: [
                "tool": toolName,
                "success": String(usage.success)
            ]
        )
    }

    // Implement other protocol methods...
}

// Usage
let analytics = AnalyticsTelemetryManager(
    analyticsService: myAnalyticsService
)

let orbit = try await Orbit.create(
    name: "Monitored Workflow",
    agents: agents,
    tasks: tasks,
    telemetryManager: analytics
)
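
A logging-oriented manager (the Logging Integration tab above) conforms to the same protocol; a minimal sketch using Apple's os.Logger for a subset of the events, with placeholder subsystem and category values:
import OrbitAI
import os

final class LoggingTelemetryManager: TelemetryManager {
    // Subsystem and category are placeholders; use your app's identifiers
    private let logger = Logger(subsystem: "com.example.app", category: "orbit-telemetry")

    func orbitStarted(orbitId: String, name: String) {
        logger.info("Orbit started: \(name, privacy: .public) [\(orbitId, privacy: .public)]")
    }

    func orbitCompleted(orbitId: String, output: OrbitOutput) {
        logger.info("Orbit \(output.orbitName, privacy: .public) completed: \(output.usageMetrics.totalTokens) tokens, \(output.executionTime, format: .fixed(precision: 2))s")
    }

    func orbitFailed(orbitId: String, error: Error) {
        logger.error("Orbit failed: \(error.localizedDescription, privacy: .public)")
    }

    func taskFailed(taskId: String, error: Error) {
        logger.error("Task \(taskId, privacy: .public) failed: \(error.localizedDescription, privacy: .public)")
    }

    // Implement the remaining protocol methods as needed...
}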

Step Callbacks

Track execution progress with step callbacks:
let orbit = try await Orbit.create(
    name: "Monitored Workflow",
    agents: agents,
    tasks: tasks,
    stepCallback: "onStepComplete"
)

// Define callback
func onStepComplete(step: ExecutionStep) {
    print("📍 Step completed:")
    print("  Orbit: \(step.orbitId)")
    print("  Task: \(step.taskDescription)")
    print("  Agent: \(step.agentRole)")
    print("  Duration: \(String(format: "%.2f", step.duration))s")
    print("  Progress: \(step.progressPercentage)%")

    // Custom telemetry
    telemetryManager.logEvent(
        name: "step_completed",
        properties: [
            "orbit_id": step.orbitId,
            "task_id": step.taskId,
            "agent_id": step.agentId,
            "duration": step.duration,
            "progress": step.progressPercentage
        ]
    )

    // Update UI/dashboard
    updateDashboard(step: step)
}

Best Practices

Telemetry Configuration

Enable by Default

Always collect telemetry in production:
let orbit = try await Orbit.create(
    name: "Production Workflow",
    agents: agents,
    tasks: tasks,
    usageMetrics: true  // Default: true
)
Why:
  • Debug production issues
  • Track costs
  • Monitor performance
  • Analyze usage patterns

Aggregate Metrics

Use orbit-level metrics for overview:
// Use the pre-aggregated orbit-level total
let total = result.usageMetrics.totalTokens

// Instead of summing per-task metrics manually
var total = 0
for task in result.taskOutputs {
    total += task.usageMetrics.totalTokens
}
Benefits:
  • Cleaner code
  • Already aggregated
  • No calculation overhead

Archive Metrics

Store telemetry data for historical analysis:
struct ExecutionRecord: Codable {
    let date: Date
    let orbitName: String
    let executionTime: TimeInterval
    let tokens: Int
    let cost: Double
    let tasks: Int
}

func archiveMetrics(_ output: OrbitOutput) {
    let record = ExecutionRecord(
        date: output.completedAt,
        orbitName: output.orbitName,
        executionTime: output.executionTime,
        tokens: output.usageMetrics.totalTokens,
        cost: calculateCost(output.usageMetrics),
        tasks: output.taskOutputs.count
    )

    database.save(record)
}

Set Alerts

Alert on anomalies:
func checkMetrics(_ metrics: UsageMetrics) {
    // Alert on high token usage
    if metrics.totalTokens > 50000 {
        alerting.send(
            "High token usage: \(metrics.totalTokens)"
        )
    }

    // Alert on high failure rate (skip when no requests were made)
    guard metrics.totalRequests > 0 else { return }

    let failureRate = 1.0 - (
        Double(metrics.successfulRequests) /
        Double(metrics.totalRequests)
    )

    if failureRate > 0.1 {  // >10% failures
        alerting.send(
            "High failure rate: \(failureRate * 100)%"
        )
    }
}

Performance Monitoring

Measure baseline performance for comparison:
struct PerformanceComparison {
    // Deltas returned by PerformanceBaseline.compare(to:)
    let timeChange: TimeInterval
    let timeChangePercent: Double
    let tokenChange: Int
    let tokenChangePercent: Double
}

struct PerformanceBaseline {
    let orbitName: String
    let avgExecutionTime: TimeInterval
    let avgTokens: Int
    let avgTasks: Int

    func compare(to output: OrbitOutput) -> PerformanceComparison {
        let timeDelta = output.executionTime - avgExecutionTime
        let tokenDelta = output.usageMetrics.totalTokens - avgTokens

        return PerformanceComparison(
            timeChange: timeDelta,
            timeChangePercent: (timeDelta / avgExecutionTime) * 100,
            tokenChange: tokenDelta,
            tokenChangePercent: Double(tokenDelta) / Double(avgTokens) * 100
        )
    }
}

// Establish baseline
var executions: [OrbitOutput] = []
for _ in 0..<10 {
    let output = try await orbit.run()
    executions.append(output)
}

let baseline = PerformanceBaseline(
    orbitName: orbit.name,
    avgExecutionTime: executions.map { $0.executionTime }.reduce(0, +) / Double(executions.count),
    avgTokens: executions.map { $0.usageMetrics.totalTokens }.reduce(0, +) / executions.count,
    avgTasks: executions[0].taskOutputs.count
)

// Compare new executions
let newOutput = try await orbit.run()
let comparison = baseline.compare(to: newOutput)

if comparison.timeChangePercent > 50 {
    print("⚠️ Execution time increased by \(comparison.timeChangePercent)%")
}
Identify which tasks need optimization:
struct TaskProfile {
    let description: String
    var executions: Int = 0
    var totalTime: TimeInterval = 0
    var totalTokens: Int = 0

    var avgTime: TimeInterval {
        totalTime / Double(max(1, executions))
    }

    var avgTokens: Double {
        Double(totalTokens) / Double(max(1, executions))
    }
}

var taskProfiles: [String: TaskProfile] = [:]

// Track over multiple runs
for _ in 0..<10 {
    let result = try await orbit.run()

    for (index, taskOutput) in result.taskOutputs.enumerated() {
        guard let task = orbit.tasks[safe: index],
              let execTime = task.executionTime else {
            continue
        }

        let key = task.description
        if var profile = taskProfiles[key] {
            profile.executions += 1
            profile.totalTime += execTime
            profile.totalTokens += taskOutput.usageMetrics.totalTokens
            taskProfiles[key] = profile
        } else {
            taskProfiles[key] = TaskProfile(
                description: task.description,
                executions: 1,
                totalTime: execTime,
                totalTokens: taskOutput.usageMetrics.totalTokens
            )
        }
    }
}

// Analyze profiles
print("\n📊 Task Performance Profiles:")
for (_, profile) in taskProfiles.sorted(by: { $0.value.avgTime > $1.value.avgTime }) {
    print("\nTask: \(profile.description)")
    print("  Executions: \(profile.executions)")
    print("  Avg time: \(String(format: "%.2f", profile.avgTime))s")
    print("  Avg tokens: \(Int(profile.avgTokens))")
}

Troubleshooting

Common Issues

Symptom: usageMetrics is nil or has zero values.
Causes:
  • Metrics collection disabled
  • Task didn’t execute
  • LLM provider doesn’t return usage data
Diagnosis:
let result = try await orbit.run()

// Check if metrics exist
if result.usageMetrics.totalTokens == 0 {
    print("⚠️ No metrics collected")

    // Check task outputs
    for (index, taskOutput) in result.taskOutputs.enumerated() {
        print("Task \(index): \(taskOutput.usageMetrics.totalTokens) tokens")
    }
}
Solutions:
// 1. Ensure metrics enabled
let orbit = try await Orbit.create(
    name: "Workflow",
    agents: agents,
    tasks: tasks,
    usageMetrics: true  // Explicitly enable
)

// 2. Check task execution
for task in orbit.tasks {
    print("Task status: \(task.status)")
    if task.status == .failed {
        print("  Error: \(task.result?.error ?? "Unknown")")
    }
}

// 3. Verify LLM configuration
let llmManager = LLMManager(
    enableMetrics: true  // Enable LLM metrics
)
Symptom: Token counts don’t match expectations or LLM provider reports.
Causes:
  • Different tokenization methods
  • System messages not counted
  • Tool descriptions included/excluded
Diagnosis:
// Compare with manual calculation
let estimatedTokens = estimateTokens(text)
let reportedTokens = metrics.promptTokens

let difference = abs(estimatedTokens - reportedTokens)
let percentDiff = Double(difference) / Double(reportedTokens) * 100

if percentDiff > 10 {
    print("⚠️ Token count mismatch: \(percentDiff)%")
    print("  Estimated: \(estimatedTokens)")
    print("  Reported: \(reportedTokens)")
}

func estimateTokens(_ text: String) -> Int {
    // Rough estimate: ~4 characters per token
    return text.count / 4
}
Solutions:
  • Use LLM provider’s token count (most accurate)
  • Include tool definitions in estimates
  • Account for system messages
  • Use provider’s tokenizer for accuracy
Symptom: Telemetry collection uses excessive memory or CPU.
Causes:
  • Storing too much telemetry data in memory
  • Complex analytics calculations
  • Not archiving historical data
Solutions:
// 1. Archive and clear regularly
func archiveAndClear() {
    // Archive to disk/database
    database.archive(telemetryData)

    // Clear in-memory data
    telemetryData.removeAll()
}

// Run periodically
Timer.scheduledTimer(
    withTimeInterval: 3600,  // Every hour
    repeats: true
) { _ in
    archiveAndClear()
}

// 2. Use sampling for high-frequency events
var eventCount = 0

func logEvent(_ event: TelemetryEvent) {
    eventCount += 1

    // Only log every 100th event
    if eventCount % 100 == 0 {
        telemetryManager.logEvent(event)
    }
}

// 3. Disable detailed tracking for production
#if DEBUG
let detailedTracking = true
#else
let detailedTracking = false
#endif
Symptom: Adding telemetry slows down execution.
Causes:
  • Synchronous telemetry calls
  • Network I/O to analytics service
  • Complex calculations in callback
Solutions:
// 1. Make telemetry async
final class AsyncTelemetryManager: TelemetryManager {
    private let queue = DispatchQueue(
        label: "com.app.telemetry",
        qos: .utility
    )

    func orbitCompleted(orbitId: String, output: OrbitOutput) {
        // Dispatch to background queue
        queue.async {
            self.sendToAnalytics(output)
        }

        // Don't block main execution
    }

    private func sendToAnalytics(_ output: OrbitOutput) {
        // Network call, calculations, etc.
    }
}

// 2. Batch telemetry events
final class BatchingTelemetryManager: TelemetryManager {
    private var eventBatch: [TelemetryEvent] = []
    private let batchSize = 100

    func logEvent(_ event: TelemetryEvent) {
        eventBatch.append(event)

        if eventBatch.count >= batchSize {
            flushBatch()
        }
    }

    private func flushBatch() {
        // Snapshot and clear before handing off, so new events aren't lost or sent twice
        let batch = eventBatch
        eventBatch.removeAll()

        Task.detached {
            await self.sendBatch(batch)
        }
    }
}

// 3. Use local logging instead of network
final class LocalTelemetryManager: TelemetryManager {
    private let logger = Logger()

    func orbitCompleted(orbitId: String, output: OrbitOutput) {
        // Fast local logging
        logger.info("Orbit completed: \(output.orbitName)")

        // Sync to remote later
        syncQueue.add(output)
    }
}


Pro Tip: Set up automated daily reports that summarize your telemetry data. Track total costs, token usage trends, and performance metrics to catch issues early and optimize continuously.