Research Workflow with Nancy Brain
Build searchable knowledge bases from scientific repositories, papers, and documentation for academic research.
Quick Setup
# Create research project
nancy-brain init my-research
cd my-research
# Configure repositories in config/repositories.yml
research_tools:
- name: astropy
url: https://github.com/astropy/astropy.git
- name: scipy
url: https://github.com/scipy/scipy.git
# Add papers in config/articles.yml
key_papers:
- name: "Important Paper 2024"
url: "https://arxiv.org/pdf/2401.12345.pdf"
description: "Key methodology paper"
# Build knowledge base
nancy-brain build
nancy-brain build --articles-config config/articles.yml
# Start searching
nancy-brain ui # Web interface
nancy-brain search "your research topic"
Core Workflows
Literature Review
# Seed with foundational papers
nancy-brain search "fundamental concepts your-field"
nancy-brain explore --prefix "key_papers"
# Find related work
nancy-brain search "specific methodology"
nancy-brain search "recent developments"
Code Discovery
# Find implementations
nancy-brain search "algorithm implementation"
nancy-brain explore --prefix "research_tools/astropy"
# Compare approaches
nancy-brain search "performance comparison methods"
nancy-brain search "numerical stability"
Method Development
# Research background
nancy-brain search "machine learning your-domain"
nancy-brain search "current limitations"
# Find gaps and opportunities
nancy-brain search "computational bottlenecks"
nancy-brain search "unsolved problems"
Advanced Features
SQL-Like Queries
Nancy Brain supports direct database queries through the txtai backend:
# From scripts or Python integration
results = embeddings.database.search("SELECT id, text FROM txtai WHERE id LIKE 'papers/%'")
results = embeddings.database.search("SELECT * FROM txtai WHERE id = 'specific_document_id'")
Targeted Searches
# Use prefixes for specific collections
nancy-brain explore --prefix "simulation_tools"
nancy-brain explore --prefix "foundational_papers"
# Limit scope and depth
nancy-brain explore --max-depth 2 --max-entries 20
nancy-brain search "specific query" --limit 5
Cross-Domain Research
# Combine concepts
nancy-brain search "machine learning astronomical surveys"
nancy-brain search "Bayesian methods time series"
nancy-brain search "GPU acceleration scientific computing"
Integration Examples
Jupyter Notebooks
import subprocess
def search_kb(query, limit=5):
result = subprocess.run([
'nancy-brain', 'search', query, '--limit', str(limit)
], capture_output=True, text=True)
return result.stdout
# Use in research
background = search_kb("methodology background")
LaTeX Writing
# Generate context for papers
nancy-brain search "survey methodology" --limit 3 > background.txt
nancy-brain search "implementation details" --limit 5 > methods.txt
Research Documentation
Create research logs tracking your queries and findings:
# Research Log
## 2025-01-15: Background Survey
- nancy-brain search "deep learning astronomy"
- Found 3 relevant implementations
- TensorFlow examples in astropy ecosystem
Performance & Collaboration
Efficient Builds
# Incremental updates
nancy-brain build --config config/core-tools.yml
nancy-brain build --articles-config config/papers.yml
# Monitor size
du -sh knowledge_base/embeddings/
Team Sharing
# Version control configurations
git add config/
git commit -m "Research KB configuration"
# Reproducible builds
git clone shared-config-repo
nancy-brain build
Troubleshooting
# No results? Try broader terms
nancy-brain search "general-topic" # before "very-specific-implementation"
# Check what's indexed
nancy-brain explore --max-entries 10
# Performance issues? Use targeted queries
nancy-brain search "specific-term" --limit 3
nancy-brain explore --max-depth 2
Next Steps
- Expand: Add domain-specific repositories and papers
- Customize: Edit
config/weights.yaml
for file type priorities - Automate: Script regular updates with
--force-update
- Integrate: Use MCP server or HTTP API for deeper tool integration
See MCP Integration, HTTP API, and Advanced Configuration for more details.