Architecture Guide
This guide explains how Nancy Brain works internally and how its components interact.
Overview
Nancy Brain is designed as a modular system for turning GitHub repositories into AI-searchable knowledge bases. The architecture consists of several key components that work together to provide efficient semantic search capabilities.
System Architecture
┌─────────────────┐    ┌─────────────────┐    ┌─────────────────┐
│   Interfaces    │    │  Core Services  │    │   Data Layer    │
├─────────────────┤    ├─────────────────┤    ├─────────────────┤
│ • CLI Commands  │    │ • RAG Service   │    │ • Text Store    │
│ • MCP Server    │───▶│ • Search Engine │───▶│ • Embeddings    │
│ • HTTP API      │    │ • Registry      │    │ • Metadata      │
│ • Web UI        │    │ • Builder       │    │ • Weights       │
└─────────────────┘    └─────────────────┘    └─────────────────┘
                                │
                                ▼
                       ┌─────────────────┐
                       │    External     │
                       │    Sources      │
                       ├─────────────────┤
                       │ • GitHub Repos  │
                       │ • Local Files   │
                       │ • PDFs          │
                       │ • Web Content   │
                       └─────────────────┘
Core Components
1. RAG Service (rag_core/service.py)
The central orchestrator that coordinates all operations:
- Search Operations: Semantic search across the knowledge base
- Document Retrieval: Fetch specific document passages
- Weight Management: Apply dynamic relevance weights
- Health Monitoring: System status and diagnostics
class RAGService:
    def __init__(self, embeddings_path, config_path, weights_path):
        self.search = SearchEngine(embeddings_path, weights_path)
        self.store = TextStore(embeddings_path)
        self.registry = DocumentRegistry(config_path)
2. Search Engine (rag_core/search.py)
Handles semantic search and relevance scoring:
- Embedding Generation: Convert text to vector representations
- Similarity Search: Find relevant documents using cosine similarity
- Score Adjustment: Apply weights and filters to results
- Caching: Optimize repeated queries
Key Features:
- Multi-model support (sentence-transformers, OpenAI, etc.)
- Dynamic weight application
- Threshold filtering
- Extension-based relevance boosting
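As a rough illustration of similarity search with threshold filtering, here is a minimal pure-Python sketch. The function names, the default threshold, and the three-dimensional vectors are assumptions made for the example; they are not taken from rag_core/search.py, and a real engine would most likely use a vectorized library.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # treat zero vectors as dissimilar to everything
    return dot / (norm_a * norm_b)

def search(query_vec, doc_vecs, threshold=0.5, limit=5):
    """Return (doc_id, score) pairs at or above the threshold, best first."""
    scored = [(doc_id, cosine_similarity(query_vec, vec))
              for doc_id, vec in doc_vecs.items()]
    kept = [(doc_id, score) for doc_id, score in scored if score >= threshold]
    kept.sort(key=lambda pair: pair[1], reverse=True)
    return kept[:limit]

# Hypothetical three-dimensional embeddings, for illustration only.
docs = {
    "docs/intro.md": [1.0, 0.0, 0.0],
    "src/model.py": [0.6, 0.8, 0.0],
    "notes.txt": [0.0, 0.0, 1.0],
}
results = search([1.0, 0.0, 0.0], docs)
```

Here notes.txt is orthogonal to the query and falls below the threshold, so only the other two documents are returned.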
3. Text Store (rag_core/store.py)
Manages document storage and retrieval:
- Document Indexing: Store and index text content
- Line-level Access: Retrieve specific line ranges
- Content Hashing: Track document changes
- Efficient Storage: Optimized file organization
4. Document Registry (rag_core/registry.py)
Tracks metadata and configuration:
- Repository Configuration: Manage source repositories
- Document Metadata: Track file types, URLs, checksums
- Toolkit Classification: Organize by project/domain
- Change Detection: Monitor for updates
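Checksum-based change detection could look roughly like the sketch below. The choice of SHA-256 and the helper names are assumptions for illustration; the guide does not say which hash or data layout the registry uses.

```python
import hashlib

def content_checksum(text):
    """SHA-256 hex digest of a document's UTF-8 bytes."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()

def detect_changes(known_checksums, current_docs):
    """Compare stored checksums with current content.

    Returns (changed, removed): ids that are new or modified,
    and ids that are no longer present.
    """
    changed = sorted(
        doc_id for doc_id, text in current_docs.items()
        if known_checksums.get(doc_id) != content_checksum(text)
    )
    removed = sorted(set(known_checksums) - set(current_docs))
    return changed, removed

# Hypothetical registry state from a previous build and the current files.
previous = {
    "a.py": content_checksum("x = 1\n"),
    "b.md": content_checksum("# Title\n"),
}
current = {"a.py": "x = 2\n", "c.rst": "New file\n"}
changed, removed = detect_changes(previous, current)
```

Only the ids in `changed` would need re-embedding, which is the idea behind rebuilding just the content that has changed.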
5. Knowledge Base Builder (scripts/build_knowledge_base.py)
Processes source repositories into searchable format:
- Repository Cloning: Fetch latest code from GitHub
- Content Extraction: Process various file types
- Embedding Generation: Create vector representations
- Index Building: Construct searchable indexes
Data Flow
1. Build Process
- Clone: Download repositories to local storage
- Extract: Read files and extract text content
- Process: Clean and normalize text
- Embed: Generate vector embeddings
- Index: Build searchable indexes
- Store: Save to knowledge base
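The Process, Embed, and Index steps can be sketched as a small pipeline. Everything here is hypothetical: `toy_embed` is a hashed bag-of-words stand-in rather than a real embedding model, and the Clone, Extract, and Store steps are omitted because they depend on details this guide does not give.

```python
import hashlib
import re

def normalize(text):
    """Collapse runs of whitespace and strip the ends."""
    return re.sub(r"\s+", " ", text).strip()

def toy_embed(text, dims=8):
    """Stand-in embedder: hashed bag-of-words counts. NOT a real model."""
    vec = [0.0] * dims
    for token in text.lower().split():
        bucket = int(hashlib.md5(token.encode("utf-8")).hexdigest(), 16) % dims
        vec[bucket] += 1.0
    return vec

def build_index(raw_docs):
    """Process -> Embed -> Index for documents whose text is already extracted."""
    index = {}
    for doc_id, raw in raw_docs.items():
        text = normalize(raw)
        if not text:
            continue  # skip files with no text content
        index[doc_id] = {"text": text, "embedding": toy_embed(text)}
    return index

index = build_index({
    "README.md": "Nancy   Brain\n\nsearch  docs",
    "empty.txt": "  \n",
})
```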
2. Search Process
- Embed: Convert query to vector representation
- Search: Find similar document vectors
- Score: Calculate relevance scores
- Filter: Apply threshold and toolkit filters
- Rank: Sort by adjusted relevance
- Results: Return formatted results
3. Retrieval Process
- Locate: Find document in text store
- Extract: Get requested line range
- Format: Add metadata and formatting
- Return: Provide structured response
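The retrieval steps above can be sketched as follows. The in-memory `store` dictionary, the 1-indexed inclusive range, and the response fields are assumptions for the example, not the actual TextStore interface.

```python
def retrieve(store, doc_id, start=1, end=None):
    """Return a 1-indexed, inclusive line range from a stored document."""
    if doc_id not in store:
        raise KeyError(f"unknown document: {doc_id}")
    if start < 1:
        raise ValueError("start must be >= 1")
    lines = store[doc_id].splitlines()
    # Clamp the end of the range to the document length.
    end = len(lines) if end is None else min(end, len(lines))
    return {
        "doc_id": doc_id,
        "start": start,
        "end": end,
        "total_lines": len(lines),
        "text": "\n".join(lines[start - 1:end]),
    }

# Hypothetical stored document.
store = {
    "src/model.py": "import math\n\ndef area(r):\n    return math.pi * r * r\n",
}
passage = retrieve(store, "src/model.py", start=3, end=4)
```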
Interface Layers
CLI Interface (nancy_brain/cli.py)
Command-line interface for direct interaction:
- nancy-brain search: Search the knowledge base
- nancy-brain explore: Browse the document tree
- nancy-brain build: Rebuild the knowledge base
MCP Server (connectors/mcp_server/)
Model Context Protocol server for AI integration:
- Provides tools for LLMs to search and retrieve
- Integrates with Claude Desktop and VS Code
- Supports real-time knowledge base access
HTTP API (connectors/http_api/)
REST API for programmatic access:
- RESTful endpoints for all operations
- JSON request/response format
- Rate limiting and error handling
Web UI (nancy_brain/admin_ui.py)
Browser-based interface for management:
- Visual search interface
- Knowledge base statistics
- Configuration management
Configuration System
Repository Configuration (config/repositories.yml)
Defines source repositories:
microlensing_tools:
  - name: MulensModel
    url: https://github.com/rpoleski/MulensModel.git
  - name: pyLIMA
    url: https://github.com/ebachelet/pyLIMA.git
general_tools:
  - name: numpy
    url: https://github.com/numpy/numpy.git
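Once parsed, this configuration is a mapping from toolkit name to a list of repositories. The sketch below assumes the YAML has already been loaded into a plain dictionary (the loading step is omitted) and uses a hypothetical helper, `iter_repositories`, to walk it.

```python
# Dictionary mirroring the parsed repository configuration shown above.
REPOSITORIES = {
    "microlensing_tools": [
        {"name": "MulensModel", "url": "https://github.com/rpoleski/MulensModel.git"},
        {"name": "pyLIMA", "url": "https://github.com/ebachelet/pyLIMA.git"},
    ],
    "general_tools": [
        {"name": "numpy", "url": "https://github.com/numpy/numpy.git"},
    ],
}

def iter_repositories(config):
    """Yield (toolkit, name, url) for every configured repository."""
    for toolkit, repos in config.items():
        for repo in repos:
            yield toolkit, repo["name"], repo["url"]

# Reverse lookup: which toolkit does a repository belong to?
toolkit_of = {name: toolkit for toolkit, name, _ in iter_repositories(REPOSITORIES)}
```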
Weights Configuration (config/weights.yaml)
Controls relevance scoring:
extensions:
  ".py": 1.2
  ".md": 1.0
  ".rst": 0.9
  ".txt": 0.8
toolkits:
  microlensing_tools: 1.5
  general_tools: 1.0
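One plausible way to apply these weights is to multiply the raw similarity score by the matching extension and toolkit factors. The guide does not state how the weights are actually combined, so the multiplication, the default of 1.0 for unlisted entries, and the document ids below are all assumptions.

```python
import os

# Values copied from the weights configuration above.
WEIGHTS = {
    "extensions": {".py": 1.2, ".md": 1.0, ".rst": 0.9, ".txt": 0.8},
    "toolkits": {"microlensing_tools": 1.5, "general_tools": 1.0},
}

def adjusted_score(raw_score, doc_id, toolkit, weights=WEIGHTS):
    """Scale a raw score by extension and toolkit weights (default 1.0)."""
    ext = os.path.splitext(doc_id)[1]
    ext_weight = weights["extensions"].get(ext, 1.0)
    toolkit_weight = weights["toolkits"].get(toolkit, 1.0)
    return raw_score * ext_weight * toolkit_weight

# Hypothetical hits: (doc_id, toolkit, raw similarity score).
hits = [
    ("numpy/README.md", "general_tools", 0.70),
    ("MulensModel/model.py", "microlensing_tools", 0.50),
]
ranked = sorted(hits, key=lambda h: adjusted_score(h[2], h[0], h[1]), reverse=True)
```

Under these assumptions, the Python file from the higher-weighted toolkit outranks the README despite its lower raw score (0.50 × 1.2 × 1.5 = 0.90 versus 0.70).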
Performance Considerations
Memory Management
- Lazy Loading: Load embeddings on demand
- Caching: Cache frequently accessed documents
- Batch Processing: Process multiple queries efficiently
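Lazy loading and caching can both be expressed with the standard library. The sketch below is illustrative only: `LazyEmbeddings` and `fetch_document` are hypothetical names, and the loader and document body stand in for real disk reads.

```python
from functools import cached_property, lru_cache

class LazyEmbeddings:
    """Defers an expensive load until the data is first accessed."""

    def __init__(self, loader):
        self._loader = loader  # zero-argument callable returning the embeddings
        self.load_count = 0    # counts how often the loader actually runs

    @cached_property
    def vectors(self):
        self.load_count += 1
        return self._loader()

@lru_cache(maxsize=128)
def fetch_document(doc_id):
    """Cache recently requested documents; the body stands in for a disk read."""
    return f"contents of {doc_id}"

embeddings = LazyEmbeddings(lambda: {"a.py": [0.1, 0.2]})
first = embeddings.vectors   # triggers the load
second = embeddings.vectors  # served from the cached attribute

fetch_document("a.py")
fetch_document("a.py")       # second call is a cache hit
```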
Storage Optimization
- Compressed Embeddings: Reduce storage requirements
- Incremental Updates: Only rebuild changed content
- Efficient Indexing: Optimized data structures
Query Optimization
- Parallel Processing: Concurrent search operations
- Result Caching: Cache common query results
- Smart Filtering: Early filtering to reduce computation
Extensibility
Custom Embeddings
Add new embedding models:
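A plug-in embedding model might be sketched as follows. The guide does not show the interface a model must satisfy, so `EmbeddingModel`, its `embed` method, and the toy `CharFrequencyModel` are all hypothetical.

```python
from typing import List, Protocol

class EmbeddingModel(Protocol):
    """Hypothetical interface; the real plug-in contract may differ."""

    def embed(self, texts: List[str]) -> List[List[float]]: ...

class CharFrequencyModel:
    """Toy model: one dimension per ASCII letter, normalized by letter count."""

    def embed(self, texts: List[str]) -> List[List[float]]:
        vectors = []
        for text in texts:
            counts = [0.0] * 26
            for ch in text.lower():
                if "a" <= ch <= "z":
                    counts[ord(ch) - ord("a")] += 1.0
            total = sum(counts) or 1.0  # avoid dividing by zero for empty text
            vectors.append([c / total for c in counts])
        return vectors

def embed_corpus(model: EmbeddingModel, texts: List[str]) -> List[List[float]]:
    """Works with any object that provides a compatible embed() method."""
    return model.embed(texts)

vectors = embed_corpus(CharFrequencyModel(), ["abba", ""])
```

Using a structural interface means a new model only needs a compatible `embed` method and does not have to inherit from a base class.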
Custom Processors
Add support for new file types:
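class CustomProcessor:
    def process_file(self, filepath):
        # Extract text from custom format.
        # Placeholder: read the file as UTF-8 text; replace with real parsing.
        with open(filepath, encoding="utf-8") as f:
            text_content = f.read()
        return text_content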
Custom Interfaces
Add new interaction methods:
class CustomInterface:
    def __init__(self, rag_service):
        self.rag_service = rag_service

    def handle_request(self, request):
        # Process custom request format
        return response
Security Considerations
Access Control
- API authentication and authorization
- Resource usage limits
- Input validation and sanitization
Data Protection
- Secure storage of sensitive content
- Encryption of embeddings
- Audit logging of access
Network Security
- HTTPS for API endpoints
- Rate limiting and DDoS protection
- Secure credential management
Monitoring and Debugging
Logging
Comprehensive logging throughout the system:
- Query performance metrics
- Error tracking and debugging
- Usage analytics
Health Checks
Built-in health monitoring:
- System status endpoints
- Performance metrics
- Resource utilization
Debugging Tools
Development and troubleshooting utilities:
- Verbose logging modes
- Query explanation tools
- Performance profiling