Knowledge Bases in OrbitAI enable agents to access and retrieve information from external documents and data sources. Using Retrieval Augmented Generation (RAG), agents can ground their responses in factual information, answer questions about specific domains, and provide accurate, contextual assistance based on your organization’s knowledge.
RAG-Enabled: Retrieval Augmented Generation for grounded responses
Multi-Format: Support for PDF, Markdown, JSON, and text documents
Semantic Search: Embedding-based retrieval finds relevant information
Automatic: Agents automatically query knowledge during execution
RAG combines the power of LLMs with factual information from your knowledge base. Agents automatically retrieve relevant context from documents and use it to generate accurate, grounded responses.
Semantic Document Search
Using vector embeddings, the knowledge base performs semantic search to find relevant information even when queries don’t exactly match document text. This enables natural language queries over your documents.
Automatic Context Injection
When agents execute tasks, relevant knowledge is automatically retrieved and injected into the LLM context, requiring no manual intervention or query operations.
Multi-Document Synthesis
Agents can synthesize information from multiple documents simultaneously, creating comprehensive answers that draw from your entire knowledge base.
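To make the idea of semantic search concrete, here is a minimal sketch of embedding-based ranking. It is not OrbitAI's implementation: the `ToyChunk` type, the `search` function, the sample sentences, and the three-dimensional vectors are all made up for illustration, and real embeddings come from an embedding model and have hundreds or thousands of dimensions.

```swift
import Foundation

/// Cosine similarity between two equal-length vectors.
func cosineSimilarity(_ a: [Double], _ b: [Double]) -> Double {
    precondition(a.count == b.count, "vectors must have the same dimension")
    let dot = zip(a, b).map { pair in pair.0 * pair.1 }.reduce(0, +)
    let normA = sqrt(a.map { $0 * $0 }.reduce(0, +))
    let normB = sqrt(b.map { $0 * $0 }.reduce(0, +))
    guard normA > 0, normB > 0 else { return 0 }
    return dot / (normA * normB)
}

/// Hypothetical stand-in for a stored document chunk and its embedding.
struct ToyChunk {
    let text: String
    let embedding: [Double]
}

/// Ranks chunks by similarity to the query embedding and keeps at most `limit`.
func search(query: [Double], in chunks: [ToyChunk], limit: Int) -> [(text: String, score: Double)] {
    let scored = chunks.map { (text: $0.text, score: cosineSimilarity(query, $0.embedding)) }
    return Array(scored.sorted { $0.score > $1.score }.prefix(limit))
}

// Made-up sample data; the vectors are hand-picked, not produced by a model.
let chunks = [
    ToyChunk(text: "Items can be returned within 30 days.", embedding: [0.9, 0.1, 0.0]),
    ToyChunk(text: "The API uses token authentication.", embedding: [0.0, 0.2, 0.9]),
    ToyChunk(text: "Refunds are issued to the original payment method.", embedding: [0.8, 0.3, 0.1]),
]
let queryEmbedding = [1.0, 0.2, 0.0]  // stands in for an embedded "How do I send something back?"
let top = search(query: queryEmbedding, in: chunks, limit: 2)
for hit in top {
    print(hit.text)
}
```

The query shares no keywords with the two return-related sentences, yet they rank first because their vectors point in a similar direction. That is the property that lets natural language queries match documents without exact wording.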
OrbitAI Execution Flow with Knowledge Base
├── User Request
│   └── "What are the agent capabilities?"
│
├── Orbit Orchestrator
│   └── Routes to appropriate agent
│
├── Agent Execution
│   ├── Analyzes task requirements
│   ├── Formulates knowledge query
│   └── Triggers knowledge retrieval
│
├── Knowledge Base Query
│   ├── Embeds query
│   ├── Searches vector index
│   ├── Retrieves relevant chunks
│   └── Returns context
│
├── LLM Processing
│   ├── Context: Retrieved knowledge
│   ├── Task: User request
│   ├── Agent: Role and purpose
│   └── Generates response
│
└── Response
    └── "Agents have tools, memory, and knowledge..."
Knowledge retrieval happens automatically during agent execution. You don’t need to manually query the knowledge base—agents do it for you based on task requirements.
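As a rough picture of what "automatic context injection" amounts to, the sketch below assembles a prompt from an agent role, a task, and retrieved chunks. The `buildPrompt` function and the prompt layout are assumptions made for illustration; OrbitAI's internal prompt format is not documented here and may differ.

```swift
/// Builds the text sent to the model: role first, then retrieved context, then the task.
/// Hypothetical helper; not part of the OrbitAI API.
func buildPrompt(role: String, task: String, retrievedChunks: [String]) -> String {
    // Number each chunk so the model can refer back to its sources.
    let context = retrievedChunks.isEmpty
        ? "(no relevant knowledge found)"
        : retrievedChunks.enumerated()
            .map { "[\($0.offset + 1)] \($0.element)" }
            .joined(separator: "\n")
    return """
    Role: \(role)

    Context:
    \(context)

    Task: \(task)
    """
}

let prompt = buildPrompt(
    role: "Knowledge Expert",
    task: "What is our return policy?",
    retrievedChunks: ["First retrieved chunk", "Second retrieved chunk"]
)
print(prompt)
```

Numbering the chunks is one simple way to let the response carry source references, which the task descriptions in this guide ask for.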
let knowledgeAgent = Agent(
    role: "Knowledge Expert",
    purpose: "Provide expert answers using company knowledge",
    context: """
    Expert assistant with deep knowledge of:
    - Company policies
    - Product specifications
    - Customer FAQs
    - Technical documentation
    """,
    knowledgeSources: [
        // Company policies
        "./knowledge/policies/employee-handbook.pdf",
        "./knowledge/policies/code-of-conduct.md",

        // Product information
        "./knowledge/products/catalog.json",
        "./knowledge/products/specifications.pdf",

        // Support documentation
        "./knowledge/support/faq.txt",
        "./knowledge/support/troubleshooting.md"
    ]
)
2. Create Tasks
Create tasks that leverage the knowledge base:
let queryTask = ORTask(
    description: """
    Answer the user's question using information from the knowledge base.
    Provide accurate, detailed responses with specific references.
    """,
    expectedOutput: "Comprehensive answer with source references",
    agent: knowledgeAgent
)
3. Execute
Run the orbit. Knowledge is queried automatically:
let orbit = try await Orbit.create(
    name: "Knowledge Query System",
    agents: [knowledgeAgent],
    tasks: [queryTask],
    inputs: ["user_query": "What is our return policy?"]
)

let result = try await orbit.run()

// Agent automatically queried knowledge base
// Response grounded in return policy document
print(result.output)
// Poor organization
knowledgeSources: [
    "./doc1.pdf",
    "./doc2.pdf",
    "./doc3.pdf",
    "./doc4.pdf",
    "./doc5.pdf"  // What are these?
]

// Good organization
knowledgeSources: [
    // Product documentation
    "./knowledge/products/user-guide.pdf",
    "./knowledge/products/api-reference.md",

    // Company policies
    "./knowledge/policies/employee-handbook.pdf",
    "./knowledge/policies/security-policy.md",

    // Support resources
    "./knowledge/support/faq.txt",
    "./knowledge/support/troubleshooting.md",

    // Data resources
    "./knowledge/data/product-catalog.json",
    "./knowledge/data/pricing.json"
]
Organization Best Practice: Use a clear directory structure that mirrors your knowledge domains. This makes maintenance easier and helps with debugging retrieval issues.
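One practical payoff of a directory-per-domain layout is that sources can be grouped mechanically when you audit what an agent has access to. The helper below is a hypothetical sketch (`groupByDomain` is not an OrbitAI function); it treats the parent directory of each path as its knowledge domain.

```swift
/// Groups source paths by their parent directory name, e.g. "products" or "policies".
/// Hypothetical debugging helper; not part of the OrbitAI API.
func groupByDomain(_ sources: [String]) -> [String: [String]] {
    var groups: [String: [String]] = [:]
    for path in sources {
        // Drop the leading "." so "./doc1.pdf" counts as a top-level file.
        let parts = path.split(separator: "/").map(String.init).filter { $0 != "." }
        // The parent directory is the second-to-last component.
        let domain = parts.count >= 2 ? parts[parts.count - 2] : "(root)"
        groups[domain, default: []].append(path)
    }
    return groups
}

let groups = groupByDomain([
    "./knowledge/products/user-guide.pdf",
    "./knowledge/products/api-reference.md",
    "./knowledge/policies/security-policy.md",
    "./doc1.pdf",
])
for (domain, paths) in groups.sorted(by: { $0.key < $1.key }) {
    print("\(domain): \(paths.count) source(s)")
}
```

Files that land under "(root)" are the ones whose purpose cannot be read from their location, which is exactly the situation the "poor organization" example illustrates.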
let supportAgent = Agent(
    role: "Customer Support Agent",
    purpose: "Help customers with issues using knowledge base",
    context: "Friendly support agent with comprehensive product knowledge",
    knowledgeSources: [
        // Product documentation
        "./kb/products/user-manual.pdf",
        "./kb/products/quick-start-guide.pdf",

        // Troubleshooting guides
        "./kb/support/common-issues.md",
        "./kb/support/error-codes.txt",
        "./kb/support/troubleshooting-steps.md",

        // FAQ database
        "./kb/faq/general-faq.txt",
        "./kb/faq/technical-faq.md",

        // Policy documents
        "./kb/policies/return-policy.pdf",
        "./kb/policies/warranty-info.pdf"
    ]
)

let supportTask = ORTask(
    description: """
    Answer the customer's question using the knowledge base.
    Provide clear, accurate information with specific references.
    If information is not in knowledge base, say so clearly.
    """,
    expectedOutput: "Helpful answer with source references"
)
Legal Document Analysis
let legalAgent = Agent(
    role: "Legal Research Assistant",
    purpose: "Research legal documents and provide analysis",
    context: "Legal expert with access to contracts and case law",
    knowledgeSources: [
        // Contracts
        "./legal/contracts/vendor-agreements/",
        "./legal/contracts/employment/",

        // Policies
        "./legal/policies/data-privacy.pdf",
        "./legal/policies/terms-of-service.md",

        // Case references
        "./legal/cases/precedents.json",
        "./legal/cases/summaries.md",

        // Regulations
        "./legal/regulations/compliance-requirements.pdf"
    ]
)
Product Recommendation Engine
let recommendationAgent = Agent(
    role: "Product Advisor",
    purpose: "Recommend products based on customer needs",
    context: "Expert advisor with complete product knowledge",
    knowledgeSources: [
        // Product catalog
        "./products/catalog.json",
        "./products/specifications.pdf",

        // Reviews and ratings
        "./products/reviews.json",
        "./products/customer-feedback.txt",

        // Comparison guides
        "./products/comparison-charts.md",
        "./products/buying-guides.pdf",

        // Inventory information
        "./products/availability.json",
        "./products/pricing.json"
    ]
)
Technical Documentation Assistant
let techDocsAgent = Agent(
    role: "Technical Writer Assistant",
    purpose: "Help with technical documentation queries",
    context: "Expert in API documentation and technical writing",
    knowledgeSources: [
        // API documentation
        "./docs/api/rest-api.md",
        "./docs/api/graphql-api.md",
        "./docs/api/webhooks.md",

        // SDK documentation
        "./docs/sdks/swift-sdk.md",
        "./docs/sdks/python-sdk.md",

        // Architecture guides
        "./docs/architecture/system-design.pdf",
        "./docs/architecture/deployment.md",

        // Code examples
        "./docs/examples/quickstart.md",
        "./docs/examples/tutorials/",

        // Changelog
        "./docs/changelog.md"
    ]
)
Healthcare Information System
let healthcareAgent = Agent(
    role: "Medical Information Assistant",
    purpose: "Provide medical information from approved sources",
    context: "Medical assistant with access to clinical guidelines",
    knowledgeSources: [
        // Clinical guidelines
        "./medical/guidelines/treatment-protocols.pdf",
        "./medical/guidelines/diagnostic-criteria.md",

        // Drug information
        "./medical/drugs/formulary.json",
        "./medical/drugs/interactions.pdf",

        // Procedures
        "./medical/procedures/standard-procedures.md",
        "./medical/procedures/safety-protocols.pdf",

        // Patient education
        "./medical/education/condition-guides.pdf",
        "./medical/education/preventive-care.md"
    ]
)
Medical Disclaimer: Healthcare applications require careful validation and should not replace professional medical advice. Always ensure compliance with healthcare regulations (HIPAA, etc.).
Combine knowledge bases with memory systems for powerful agents:
let hybridAgent = Agent(
    role: "Adaptive Assistant",
    purpose: "Provide personalized help using both knowledge and memory",
    context: """
    Intelligent assistant that:
    - Uses knowledge base for factual information
    - Uses memory for user preferences and history
    - Combines both for personalized, accurate responses
    """,

    // Memory for user interactions
    memory: true,
    longTermMemory: true,

    // Knowledge base for factual information
    knowledgeSources: [
        "./knowledge/product-docs.pdf",
        "./knowledge/company-info.md"
    ]
)
How it works:
User Query: "What are the features of Product X?"
    ↓
Agent Processing:
├── Memory Check: "User previously asked about Product X pricing"
├── Knowledge Query: "Product X features from documentation"
└── Synthesis: Personalized response combining both
    ↓
Response: "Product X has features A, B, C (from knowledge base).
          Based on your previous interest in pricing (from memory),
          you might also want to know that it's available at..."
Use Case: Personalized Support
let personalizedSupport = Agent(
    role: "Personal Support Agent",
    purpose: "Provide personalized customer support",
    context: "Support agent with knowledge and memory",

    // Remember customer history
    memory: true,
    longTermMemory: true,

    // Access support knowledge
    knowledgeSources: [
        "./support/faq.txt",
        "./support/troubleshooting.md"
    ]
)

// First interaction
// User: "How do I reset my password?"
// Agent uses: Knowledge base for procedure
// Agent stores: User asked about password reset

// Second interaction (later)
// User: "I'm still having issues"
// Agent uses: Memory (knows about password reset)
//           + Knowledge base (troubleshooting steps)
// Response: Contextual help for password reset issues
Combine knowledge bases with tools for action-oriented agents:
let actionableAgent = Agent(
    role: "Executive Assistant",
    purpose: "Answer questions and take actions",
    context: "Assistant with knowledge and capabilities to act",

    // Knowledge for information
    knowledgeSources: [
        "./calendar-policies.md",
        "./company-contacts.json"
    ],

    // Tools for actions
    tools: [
        "apple_calendar",  // Create calendar events
        "send_email",      // Send emails
        "web_search"       // Search for info not in KB
    ]
)

let task = ORTask(
    description: """
    Check the company contacts in the knowledge base for John's email,
    then send him an email about the meeting using the send_email tool,
    and create a calendar event using the calendar tool based on the
    meeting policies in the knowledge base.
    """,
    expectedOutput: "Confirmation of email sent and event created"
)
Agent workflow:
Query knowledge base: Find John’s email in contacts
Query knowledge base: Check meeting policies for defaults
// Low threshold (0.6-0.7): Broad retrieval
// More results, some may be less relevant
let broadConfig = KnowledgeConfiguration(
    similarityThreshold: 0.65
)

// Medium threshold (0.75-0.8): Balanced
// Good balance of recall and precision
let balancedConfig = KnowledgeConfiguration(
    similarityThreshold: 0.75  // Recommended
)

// High threshold (0.85-0.95): Precise
// Fewer results, high relevance
let preciseConfig = KnowledgeConfiguration(
    similarityThreshold: 0.90
)
Testing approach:
// Test different thresholds
for threshold in [0.6, 0.7, 0.75, 0.8, 0.85, 0.9] {
    let results = try await kb.query(
        query: testQuery,
        threshold: threshold
    )
    print("Threshold \(threshold): \(results.count) results")
    // Evaluate quality and adjust
}
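The effect of the sweep can be previewed offline on a list of similarity scores. The sketch below uses made-up scores and a hypothetical `countsByThreshold` helper, and it assumes a result is kept when its score is greater than or equal to the threshold; whether OrbitAI treats the boundary as inclusive is not stated in this guide.

```swift
/// Counts how many scored results survive each similarity threshold.
/// Assumes "score >= threshold" keeps a result.
func countsByThreshold(scores: [Double], thresholds: [Double]) -> [(threshold: Double, kept: Int)] {
    thresholds.map { t in (threshold: t, kept: scores.filter { $0 >= t }.count) }
}

let sampleScores = [0.93, 0.88, 0.79, 0.74, 0.68, 0.61, 0.52]  // made-up similarity scores
let sweep = countsByThreshold(
    scores: sampleScores,
    thresholds: [0.6, 0.7, 0.75, 0.8, 0.85, 0.9]
)
for row in sweep {
    print("threshold \(row.threshold): \(row.kept) kept")
}
```

On this sample the count drops from 6 results at 0.6 to 1 result at 0.9. A sharp drop between two neighbouring thresholds is a useful signal for where to inspect result quality by hand.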
Limit Result Count
Retrieve optimal number of results:
// Too few (1-2): May miss relevant info
let tooFew = try await kb.query(query: query, limit: 1)

// Optimal (3-10): Good context without overload
let optimal = try await kb.query(query: query, limit: 5)

// Too many (20+): Context overload, slower
let tooMany = try await kb.query(query: query, limit: 25)
Guidelines:
Quick answers: 3-5 results
Comprehensive analysis: 5-10 results
Research tasks: 10-20 results
Monitor context window: Don’t exceed LLM limits
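One way to respect the context window regardless of the result limit is to trim ranked results to a token budget before they reach the model. The sketch below is an illustration only: `estimatedTokens` and `fitToBudget` are hypothetical helpers, and the "about 4 characters per token" rule is a rough heuristic, not a real tokenizer.

```swift
/// Rough token estimate, assuming about 4 characters per token (heuristic only).
func estimatedTokens(_ text: String) -> Int {
    max(1, (text.count + 3) / 4)
}

/// Keeps chunks in ranked order until adding the next one would exceed `tokenBudget`.
func fitToBudget(_ rankedChunks: [String], tokenBudget: Int) -> [String] {
    var kept: [String] = []
    var used = 0
    for chunk in rankedChunks {
        let cost = estimatedTokens(chunk)
        // Stop at the first chunk that does not fit, so ranking order is preserved.
        if used + cost > tokenBudget { break }
        kept.append(chunk)
        used += cost
    }
    return kept
}

let ranked = [
    String(repeating: "a", count: 40),  // about 10 tokens
    String(repeating: "b", count: 80),  // about 20 tokens
    String(repeating: "c", count: 40),  // about 10 tokens
]
let fitted = fitToBudget(ranked, tokenBudget: 30)
print("kept \(fitted.count) of \(ranked.count) chunks")
```

Stopping at the first chunk that does not fit keeps the highest-ranked material and avoids silently skipping a relevant chunk in favour of a smaller, less relevant one.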
Query Formulation
Formulate effective queries:
Poor queries:
"product"  // Too vague
"x"        // Too short
"aslkdjf"  // Nonsense
Good queries:
"product features and specifications"
"return policy for damaged items"
"API authentication methods"
Best practice:
// Let agents formulate queries naturally
let task = ORTask(
    description: """
    Answer the user's question about our return policy.
    Be specific about damaged items versus change of mind.
    """,
    expectedOutput: "Detailed return policy explanation"
)
// Agent will formulate appropriate knowledge query
Caching Strategies
Cache frequently accessed knowledge:
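import Foundation

// An actor serializes access to the cache, so concurrent queries cannot race on it
actor CachedKnowledgeBase {
    // Cached results, plus the limit they were fetched with and when they were stored
    private struct Entry {
        let results: [KnowledgeResult]
        let limit: Int
        let storedAt: Date
    }

    private var queryCache: [String: Entry] = [:]
    private let cacheExpiry: TimeInterval = 3600  // 1 hour

    func query(
        query: String,
        limit: Int,
        threshold: Double
    ) async throws -> [KnowledgeResult] {
        // Key on the threshold as well as the query text: results fetched
        // at one threshold are not valid for another
        let key = "\(threshold)|\(query)"

        // Check cache: the entry must be fresh and fetched with a large enough limit
        if let cached = queryCache[key],
           Date().timeIntervalSince(cached.storedAt) < cacheExpiry,
           cached.limit >= limit {
            return Array(cached.results.prefix(limit))
        }

        // Query knowledge base (performQuery wraps the underlying lookup and is not shown)
        let results = try await performQuery(
            query: query,
            limit: limit,
            threshold: threshold
        )

        // Cache results
        queryCache[key] = Entry(results: results, limit: limit, storedAt: Date())
        return results
    }

    // Periodic cache cleanup: drop entries older than cacheExpiry
    func cleanupCache() {
        let now = Date()
        queryCache = queryCache.filter {
            now.timeIntervalSince($0.value.storedAt) < cacheExpiry
        }
    }
}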
# Store knowledge in version control
git add knowledge/
git commit -m "Update product documentation"

# Tag knowledge versions
git tag -a kb-v1.2 -m "Knowledge base v1.2"
Benefits:
Track document changes
Rollback if needed
Coordinate with code releases
Validation
Validate knowledge base setup:
func validateKnowledgeBase() async throws {
    // Check files exist
    for source in knowledgeSources {
        guard FileManager.default.fileExists(
            atPath: source
        ) else {
            throw ValidationError.fileNotFound(source)
        }
    }

    // Test retrieval
    let testQuery = "test query"
    let results = try await kb.query(
        query: testQuery,
        limit: 1
    )

    guard !results.isEmpty else {
        throw ValidationError.noResults
    }

    print("✓ Knowledge base validated")
}
// Test chunk sizes
let chunkSizes = [256, 512, 1024, 2048]

for size in chunkSizes {
    let config = KnowledgeConfiguration(chunkSize: size)
    let kb = try await KnowledgeBase(
        sources: testSources,
        config: config
    )

    // Measure retrieval quality
    let results = try await kb.query(query: testQuery)
    print("Chunk size \(size): \(results.count) results")
    // Evaluate quality manually
}
Guidelines:
Small (256-512): Precise retrieval, technical docs
Medium (512-1024): Balanced, general use
Large (1024-2048): Broader context, narratives
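To see what a chunk size actually controls, the sketch below splits text into fixed-size windows. It makes several assumptions that this guide does not confirm: size is measured in whitespace-separated words (OrbitAI's `chunkSize` unit may be tokens or characters), neighbouring chunks share an `overlap`, and `chunkWords` is a hypothetical helper rather than a library function.

```swift
/// Splits text into chunks of at most `size` words, with `overlap` words
/// shared between neighbouring chunks. Hypothetical helper for illustration.
func chunkWords(_ text: String, size: Int, overlap: Int) -> [String] {
    precondition(size > 0 && overlap >= 0 && overlap < size, "need 0 <= overlap < size")
    let words = text.split(whereSeparator: { $0.isWhitespace })
    var chunks: [String] = []
    var start = 0
    while start < words.count {
        let end = min(start + size, words.count)
        chunks.append(words[start..<end].joined(separator: " "))
        if end == words.count { break }
        // Step back by `overlap` so context spanning a boundary is not lost.
        start = end - overlap
    }
    return chunks
}

let demoChunks = chunkWords("a b c d e f g h i j", size: 4, overlap: 1)
for chunk in demoChunks {
    print(chunk)
}
```

Smaller windows produce more chunks that each cover less text, which is why small sizes favour precise retrieval and large sizes favour broader context.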
2. Tune Retrieval Parameters
Optimize retrieval for your use case:
let config = KnowledgeConfiguration(
    retrievalLimit: 5,          // Start small
    similarityThreshold: 0.75,  // Adjust based on quality
    rerankingEnabled: true      // Enable for better results
)
Performance tips:
Lower retrievalLimit = faster, may miss information
Higher similarityThreshold = fewer but better results
Enable reranking for quality, disable for speed
3. Implement Caching
Cache frequently accessed queries:
let config = KnowledgeConfiguration(
    enableCache: true,
    cacheExpiry: 3600,    // 1 hour
    cacheStrategy: .lru,  // Least Recently Used
    maxCacheSize: 1000    // Max cached queries
)
Cache strategies:
LRU: Good for varied queries
LFU: Good for repeated queries
TTL: Good for time-sensitive data
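For readers unfamiliar with the LRU strategy, here is a minimal sketch of the eviction rule. It is not OrbitAI's cache: `LRUCache` is a hypothetical type written for illustration, and it tracks recency with a plain array, which is simple to read but takes linear time per access.

```swift
/// A small least-recently-used cache: when full, the entry that was
/// read or written longest ago is evicted. Illustrative only.
struct LRUCache<Key: Hashable, Value> {
    private var storage: [Key: Value] = [:]
    private var order: [Key] = []  // least recently used first
    let capacity: Int

    init(capacity: Int) {
        precondition(capacity > 0, "capacity must be positive")
        self.capacity = capacity
    }

    /// Returns the cached value and marks the key as most recently used.
    mutating func get(_ key: Key) -> Value? {
        guard let value = storage[key] else { return nil }
        touch(key)
        return value
    }

    /// Stores a value, evicting the least recently used entry if the cache is full.
    mutating func set(_ key: Key, _ value: Value) {
        if storage[key] == nil && storage.count == capacity {
            let evicted = order.removeFirst()
            storage[evicted] = nil
        }
        storage[key] = value
        touch(key)
    }

    /// Moves the key to the most-recently-used end of the order list.
    private mutating func touch(_ key: Key) {
        if let index = order.firstIndex(of: key) { order.remove(at: index) }
        order.append(key)
    }
}

var cache = LRUCache<String, Int>(capacity: 2)
cache.set("return policy", 1)
cache.set("warranty", 2)
_ = cache.get("return policy")  // "warranty" is now the least recently used
cache.set("shipping", 3)        // evicts "warranty"
```

LFU would instead evict the entry with the fewest reads, and TTL would evict by age alone, which is why the three strategies suit different query patterns.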
4. Choose Efficient Embedding Model
Balance quality vs. performance:
// Development: Fast and cheap
embeddingModel: "text-embedding-3-small"

// Production: Balanced
embeddingModel: "text-embedding-ada-002"

// High-quality: Best results
embeddingModel: "text-embedding-3-large"
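5. Batch Processing
Process documents in batches:
let sources: [String] = [
    // ... large list of sources
]
let batchSize = 10

// Note: chunked(into:) is not part of the Swift standard library;
// this assumes a helper extension on Array is available
for batch in sources.chunked(into: batchSize) {
    try await knowledgeBase.addSourcesBatch(batch)
    // Process in manageable chunks
}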
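The batching loop relies on a `chunked(into:)` method, which the Swift standard library does not provide. A minimal sketch of such a helper is shown here as an assumption about what the example intends; OrbitAI may ship its own version, and the swift-algorithms package offers a similar `chunks(ofCount:)`.

```swift
extension Array {
    /// Splits the array into consecutive slices of at most `size` elements.
    /// The last slice may be shorter.
    func chunked(into size: Int) -> [[Element]] {
        precondition(size > 0, "size must be positive")
        return stride(from: 0, to: count, by: size).map { start in
            // Swift.min is spelled out to avoid clashing with Array's own min() method.
            Array(self[start..<Swift.min(start + size, count)])
        }
    }
}

let batches = Array(1...7).chunked(into: 3)
print(batches)
```

With a batch size of 3, seven items split into two full batches and one final batch holding the remainder.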
6. Lazy Loading
Load documents on-demand:
let config = KnowledgeConfiguration(
    loadingStrategy: .lazy,  // Load when needed
    preloadPriority: [       // Preload critical docs
        "./critical-docs.pdf"
    ]
)
Complete implementation of a knowledge-powered support system:
import OrbitAI

// Configure knowledge base
let supportKB = KnowledgeConfiguration(
    chunkSize: 512,
    retrievalLimit: 5,
    similarityThreshold: 0.75,
    rerankingEnabled: true,
    enableCache: true
)

// Create support agent
let supportAgent = Agent(
    role: "Customer Support Specialist",
    purpose: """
    Provide accurate customer support using the knowledge base.
    Always cite sources and provide helpful, friendly responses.
    """,
    context: """
    Expert support agent with access to:
    - Product documentation
    - Troubleshooting guides
    - FAQ database
    - Return policies

    Guidelines:
    - Always check knowledge base before responding
    - Provide specific references to documentation
    - Escalate if information not in knowledge base
    - Be friendly and empathetic
    """,
    knowledgeSources: [
        "./kb/products/user-manual.pdf",
        "./kb/support/troubleshooting.md",
        "./kb/support/faq.txt",
        "./kb/policies/returns.pdf",
        "./kb/policies/warranty.pdf"
    ],
    knowledgeConfig: supportKB,
    memory: true,  // Remember conversation
    tools: [
        "create_support_ticket",
        "check_order_status",
        "send_email"
    ]
)

// Create tasks
let analyzeQuery = ORTask(
    description: """
    Analyze the customer's question and determine:
    1. What information they need
    2. Which knowledge sources are relevant
    3. If tools are needed (order lookup, ticket creation)
    """,
    expectedOutput: "Analysis of customer needs"
)

let provideAnswer = ORTask(
    description: """
    Using the knowledge base and any tool results:
    1. Answer the customer's question accurately
    2. Cite specific documentation sources
    3. Provide step-by-step instructions if needed
    4. Offer additional relevant information
    """,
    expectedOutput: "Complete answer with citations"
)

let followUp = ORTask(
    description: """
    Based on the answer provided:
    1. Check if question was fully answered
    2. Suggest related resources
    3. Ask if customer needs further assistance
    4. Create support ticket if needed
    """,
    expectedOutput: "Follow-up message"
)

// Create orbit
let supportOrbit = try await Orbit.create(
    name: "Customer Support System",
    agents: [supportAgent],
    tasks: [analyzeQuery, provideAnswer, followUp],
    process: .sequential,
    verbose: true
)

// Handle customer inquiry
func handleCustomerInquiry(_ inquiry: String) async throws -> String {
    let result = try await supportOrbit.run(
        inputs: [
            "customer_inquiry": inquiry,
            "timestamp": Date().description
        ]
    )
    return result.output
}

// Example usage
let response = try await handleCustomerInquiry(
    "How do I reset my password? I've tried the forgot password link but didn't receive an email."
)
print(response)

// Output includes:
// - Steps from user manual
// - Troubleshooting tips from knowledge base
// - Offer to check email settings
// - Create support ticket if issue persists
// Medical knowledge configuration
let medicalKBConfig = KnowledgeConfiguration(
    chunkSize: 768,
    retrievalLimit: 5,
    similarityThreshold: 0.85,  // High precision for medical info
    rerankingEnabled: true,
    enableCache: false,         // Don't cache sensitive data
    extractMetadata: true
)

// Secure medical knowledge base
let secureMedicalKB = SecureKnowledgeBase()

// Medical information agent
let medicalAgent = Agent(
    role: "Medical Information Specialist",
    purpose: "Provide evidence-based medical information",
    context: """
    Medical information specialist with access to:
    - Clinical guidelines
    - Treatment protocols
    - Drug formulary
    - Patient education materials

    IMPORTANT:
    - Only provide information from approved sources
    - Include proper disclaimers
    - Never diagnose or prescribe
    - Refer to healthcare providers when appropriate
    - Maintain HIPAA compliance
    """,
    knowledgeSources: [
        "./medical/guidelines/treatment-protocols.pdf",
        "./medical/guidelines/diagnostic-criteria.pdf",
        "./medical/drugs/formulary.json",
        "./medical/procedures/standard-procedures.md",
        "./medical/education/patient-guides.pdf"
    ],
    knowledgeConfig: medicalKBConfig,
    memory: false,  // No persistent memory (privacy)
    tools: [
        "check_drug_interactions",
        "search_medical_literature"
    ]
)

// Add medical disclaimer
let disclaimerTask = ORTask(
    description: """
    Add appropriate medical disclaimers:
    - Information is for educational purposes only
    - Not a substitute for professional medical advice
    - Consult healthcare provider for medical decisions
    """,
    expectedOutput: "Response with disclaimer"
)

let medicalOrbit = try await Orbit.create(
    name: "Medical Information System",
    agents: [medicalAgent],
    tasks: [disclaimerTask],
    process: .sequential,
    memory: false  // HIPAA compliance
)
Medical Applications: Healthcare applications must comply with regulations (HIPAA, GDPR, etc.). This example is for educational purposes. Consult legal and compliance experts before deploying medical AI systems.
Pro Tip: Start with a small, well-organized knowledge base (5-10 essential documents) and expand based on retrieval gaps. Monitor which queries return poor results and add targeted documents to fill those gaps.