v1.0.0
active

repo-intel

io.github.nirholas/repo-intel

Analyze repos of any size: security scanning, code analysis, monorepo support

Lyra Intel

Complete Intelligence Infrastructure Engine for Massive-Scale Codebase Analysis

Python 3.9+ | MIT License | Docker Ready | Kubernetes Ready | Open Source

📚 Full Documentation | Quick Start | Use Cases | API Reference

Analyze codebases 10-100x faster with AI-powered insights, security scanning, and semantic search.

⚑ Active Development

Lyra Intel is under active development, with improvements landing daily. The core platform is production-ready and used in enterprise deployments. Thank you for your contributions! 🙌

Why Lyra Intel?

Most code analysis tools force a choice: automation at the cost of understanding, or manual inspection with no scale.

Lyra Intel is built on a different principle: Give developers and security teams the intelligence they need to make informed decisions at scale.

You get:

  • ✅ Complete visibility - Understand your entire codebase, not just highlighted issues
  • ✅ AI-powered insights - Get context and explanations, not just lists of problems
  • ✅ Security you control - Run locally or in your cloud, with no data leaving your infrastructure
  • ✅ Scale without sacrifice - Analyze 1 million lines or 1 billion lines with the same ease
  • ✅ Open source - Full transparency, no vendor lock-in, customize for your needs

Perfect for teams that want to own their code intelligence.

What is Lyra Intel?

Lyra Intel is a comprehensive, production-ready intelligence platform designed to understand, secure, and improve codebases of any size - from small projects to enterprise monorepos with millions of lines of code.

Unlike traditional linters or SonarQube-style tools, Lyra Intel combines:

  • Deep code analysis (AST parsing, dependency graphs, complexity metrics)
  • AI-powered insights (OpenAI, Anthropic, or local models)
  • Semantic code search (ML-powered search beyond keywords)
  • Security scanning (secrets, OWASP, CVE detection)
  • Knowledge graphs (understand relationships in your code)
  • Forensic analysis (find dead code, document gaps, technical debt)

Why You Need Lyra Intel

For Security Teams:

  • Automatically find hardcoded secrets, SQL injection risks, OWASP vulnerabilities
  • Track security across massive codebases without manual scanning
  • Generate compliance reports (SOC2, HIPAA, PCI-DSS ready)

For Development Teams:

  • Understand unfamiliar codebases in hours, not weeks
  • Find dead code and technical debt before they become problems
  • Make data-driven architectural decisions
  • Detect complex bugs that static analysis misses

For Engineering Leaders:

  • Quantify code quality and technical debt
  • Track metrics across teams and projects
  • Plan migrations and upgrades with confidence
  • Reduce time spent on code reviews

What You Can Do

With 70+ specialized components, Lyra Intel enables:

| Goal | What Lyra Intel Does | Time Saved |
|------|----------------------|------------|
| Secure a legacy codebase | Scan for vulnerabilities, create remediation plan | Weeks → Hours |
| Onboard new developers | Build searchable knowledge base, find examples | Days → Hours |
| Plan a framework upgrade | Analyze impact, generate step-by-step migration plan | Months → Days |
| Understand technical debt | Quantify debt, track trends, prioritize fixes | Ongoing → Automated |
| Review pull requests | AI-powered insights + security checks + complexity analysis | 30 min → 5 min |
| Find security issues | Scan for 50+ vulnerability patterns in real-time | Manual → Automated |

See real-world use cases →

🚀 Features

Lyra Intel includes 70+ specialized components organized by capability:

Core Analysis - Understand Your Code

  • 📁 File Crawler - Parallel directory traversal with streaming for memory efficiency. Process millions of files without memory issues.
  • 📜 Git Collector - Complete commit history, blame analysis, contributor stats. Understand who changed what and when.
  • 🔍 AST Analyzer - Multi-language syntax tree parsing (Python, JS/TS, Go, Rust, Java, C++, C#, Ruby, PHP). Get accurate code structure.
  • 🔗 Dependency Mapper - Build complete dependency graphs with circular detection. Understand your architecture.
  • ⚠️ Pattern Detector - Find code smells, anti-patterns, security issues. Detect problems before they become expensive.
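
The circular-dependency detection mentioned for the Dependency Mapper can be illustrated with a depth-first search over an import graph. This is a minimal sketch under stated assumptions, not Lyra Intel's implementation: the `find_cycle` function and the sample module names are hypothetical.

```python
from typing import Dict, List, Optional

def find_cycle(graph: Dict[str, List[str]]) -> Optional[List[str]]:
    """Return one dependency cycle as a list of nodes, or None if the graph is acyclic."""
    WHITE, GRAY, BLACK = 0, 1, 2  # unvisited, on the current DFS path, fully explored
    color = {node: WHITE for node in graph}
    path: List[str] = []

    def visit(node: str) -> Optional[List[str]]:
        color[node] = GRAY
        path.append(node)
        for dep in graph.get(node, []):
            state = color.get(dep, WHITE)
            if state == GRAY:
                # Back edge: dep is already on the current path, so this is a cycle
                return path[path.index(dep):] + [dep]
            if state == WHITE:
                found = visit(dep)
                if found:
                    return found
        path.pop()
        color[node] = BLACK
        return None

    for node in list(graph):
        if color[node] == WHITE:
            found = visit(node)
            if found:
                return found
    return None

# Hypothetical import graph: module -> modules it imports
deps = {"api": ["auth"], "auth": ["storage"], "storage": ["api"], "cli": ["api"]}
print(find_cycle(deps))
```

The recursive form keeps the sketch short; a very deep graph would call for an explicit stack instead.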

Scalability - From Laptop to Enterprise

  • 🖥️ Local Mode - Single machine analysis for development. No setup needed, runs instantly on your machine.
  • 🌐 Distributed Mode - Multi-worker processing for larger codebases. Scale analysis to 100K+ files efficiently.
  • ☁️ Cloud Massive Mode - Auto-scaling cloud infrastructure (AWS, GCP, Azure). Analyze monorepos with millions of files.

Storage Options - Flexibility for Any Scale

  • SQLite - Local development and small projects. Built-in, no dependencies.
  • PostgreSQL - Production deployments. Reliable, proven, scalable.
  • BigQuery - Massive-scale analytics. Query 1M+ analysis results instantly.
  • Cache Layer - Memory, File, Redis backends with TTL/LRU eviction. Speed up repeated analyses.
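
To make the TTL/LRU eviction of the cache layer concrete, here is a minimal in-memory sketch. It is an illustration only: the `TTLLRUCache` class, its parameters, and the cached values are hypothetical and do not describe the actual Memory, File, or Redis backends.

```python
import time
from collections import OrderedDict
from typing import Any, Callable, Optional

class TTLLRUCache:
    """Tiny in-memory cache: entries expire after `ttl` seconds, and the
    least recently used entry is evicted once `max_size` is exceeded."""

    def __init__(self, max_size: int, ttl: float,
                 clock: Callable[[], float] = time.monotonic):
        self.max_size = max_size
        self.ttl = ttl
        self.clock = clock
        self._data: "OrderedDict[str, tuple]" = OrderedDict()

    def get(self, key: str) -> Optional[Any]:
        item = self._data.get(key)
        if item is None:
            return None
        value, expires_at = item
        if self.clock() >= expires_at:
            del self._data[key]            # TTL eviction
            return None
        self._data.move_to_end(key)        # mark as most recently used
        return value

    def set(self, key: str, value: Any) -> None:
        self._data[key] = (value, self.clock() + self.ttl)
        self._data.move_to_end(key)
        while len(self._data) > self.max_size:
            self._data.popitem(last=False)  # LRU eviction

cache = TTLLRUCache(max_size=2, ttl=60.0)
cache.set("a.py", {"functions": 3})
cache.set("b.py", {"functions": 7})
cache.get("a.py")                       # touch a.py so b.py becomes the LRU entry
cache.set("c.py", {"functions": 1})     # exceeds max_size, so b.py is evicted
print(cache.get("b.py"))
```

The injectable `clock` is a design choice that makes expiry testable without sleeping.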

🔐 Security - Find Vulnerabilities Before They Become Breaches

  • Security Scanner - OWASP Top 10, hardcoded secrets, SQL injection detection. Scan 50+ vulnerability patterns.
  • Vulnerability Database - Track known CVEs and advisories. Stay updated on emerging threats.
  • Custom Rules - Define custom security patterns. Enforce your organization's security standards.
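
As a rough illustration of pattern-based secret detection and custom rules, the sketch below scans text line by line with a small rule table. The rule names, the `scan_text` helper, and the sample input are hypothetical; they are not Lyra Intel's rule set, which the project describes as covering 50+ patterns.

```python
import re
from typing import Iterator, Tuple

# Illustrative patterns only; a real scanner would use a much larger rule set.
SECRET_PATTERNS = {
    "aws_access_key_id": re.compile(r"\bAKIA[0-9A-Z]{16}\b"),
    "private_key_header": re.compile(r"-----BEGIN (?:RSA |EC )?PRIVATE KEY-----"),
    "hardcoded_password": re.compile(r"(?i)\bpassword\s*=\s*['\"][^'\"]{4,}['\"]"),
}

def scan_text(text: str) -> Iterator[Tuple[int, str]]:
    """Yield (line_number, rule_name) for every line that matches a pattern."""
    for lineno, line in enumerate(text.splitlines(), start=1):
        for rule, pattern in SECRET_PATTERNS.items():
            if pattern.search(line):
                yield lineno, rule

sample = (
    'db_host = "localhost"\n'
    'password = "hunter2!"\n'
    'key = "AKIAABCDEFGHIJKLMNOP"\n'
)
print(list(scan_text(sample)))
```

Adding an organization-specific rule in this sketch is one more entry in `SECRET_PATTERNS`.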

🤖 AI Integration - Get Smarter Insights

  • AI Analyzer - Code explanation, bug detection, refactoring suggestions. Understand complex code instantly.
  • Multiple Providers - OpenAI (GPT-4/3.5), Anthropic (Claude), or Local (Ollama/llama.cpp). Choose what fits your workflow.
  • Cost Effective - Local models for free analysis, or cloud models for maximum accuracy.

📊 Visualization & Reports - Communicate Results

  • Graph Generator - Export to D3.js, Mermaid, Graphviz DOT. Visualize dependencies and architecture.
  • Report Generator - Executive, Technical, Security, Architecture reports. Different reports for different audiences.
  • Web Dashboard - Interactive D3.js/Cytoscape visualization. Explore your codebase visually.
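
For a sense of what a graph export involves, the sketch below renders a list of dependency edges as Mermaid and Graphviz DOT text. The function names and the sample edges are hypothetical, and node names are assumed to be simple identifiers (Mermaid needs escaping for names containing dots or slashes).

```python
from typing import Iterable, List, Tuple

Edge = Tuple[str, str]

def to_mermaid(edges: Iterable[Edge]) -> str:
    """Render (source, target) edges as a top-down Mermaid flowchart."""
    lines: List[str] = ["graph TD"]
    for src, dst in edges:
        lines.append(f"    {src} --> {dst}")
    return "\n".join(lines)

def to_dot(edges: Iterable[Edge], name: str = "deps") -> str:
    """Render the same edges as a Graphviz DOT digraph with quoted node names."""
    lines: List[str] = [f"digraph {name} {{"]
    for src, dst in edges:
        lines.append(f'    "{src}" -> "{dst}";')
    lines.append("}")
    return "\n".join(lines)

edges = [("api", "auth"), ("auth", "storage")]
print(to_mermaid(edges))
print(to_dot(edges))
```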

🌐 API & Enterprise Features

  • REST API Server - 15+ endpoints for integration. Build on top of Lyra Intel.
  • Authentication - API Key, JWT, OAuth 2.0 (SSO), LDAP support. Secure access control.
  • RBAC - Role-based access control. Manage permissions across your team.
  • Rate Limiting - Protect your infrastructure. Scale safely.
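
A per-minute request limit of the kind listed above can be sketched with a sliding window per client. This is one common technique, chosen for illustration; the source does not say which algorithm Lyra Intel uses, and the `SlidingWindowLimiter` class and client IDs are hypothetical.

```python
import time
from collections import deque
from typing import Callable, Deque, Dict

class SlidingWindowLimiter:
    """Allow at most `limit_per_minute` calls per client in any 60-second window."""

    def __init__(self, limit_per_minute: int,
                 clock: Callable[[], float] = time.monotonic):
        self.limit = limit_per_minute
        self.clock = clock
        self._hits: Dict[str, Deque[float]] = {}

    def allow(self, client_id: str) -> bool:
        now = self.clock()
        hits = self._hits.setdefault(client_id, deque())
        # Drop timestamps that have fallen out of the 60-second window
        while hits and now - hits[0] >= 60.0:
            hits.popleft()
        if len(hits) >= self.limit:
            return False
        hits.append(now)
        return True

limiter = SlidingWindowLimiter(limit_per_minute=3)
# Four back-to-back calls from one client: the fourth exceeds the limit
print([limiter.allow("client-1") for _ in range(4)])
```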

🔬 Forensic Analysis - Find Hidden Problems

  • Forensic Analyzer - Code↔doc bidirectional mapping. Find documentation gaps automatically.
  • Dead Code Detector - Find unused functions, classes, imports. Clean up your codebase.
  • Complexity Analyzer - Cyclomatic, Cognitive, Halstead metrics. Identify problematic code.
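
Cyclomatic complexity, one of the metrics named above, can be approximated for Python code with the standard `ast` module. The sketch below counts decision points per function; the set of counted node types is a common approximation of McCabe's metric and an assumption here, not necessarily the rule set Lyra Intel applies.

```python
import ast

# Node types that add a decision point (an approximation of McCabe's metric)
_BRANCH_NODES = (ast.If, ast.For, ast.While, ast.ExceptHandler,
                 ast.IfExp, ast.comprehension)

def cyclomatic_complexity(source: str) -> dict:
    """Map each function name in `source` to an approximate cyclomatic complexity."""
    results = {}
    for func in ast.walk(ast.parse(source)):
        if not isinstance(func, (ast.FunctionDef, ast.AsyncFunctionDef)):
            continue
        score = 1  # one path through a function with no branches
        for node in ast.walk(func):
            if isinstance(node, _BRANCH_NODES):
                score += 1
            elif isinstance(node, ast.BoolOp):
                score += len(node.values) - 1  # each extra and/or operand
        results[func.name] = score
    return results

code = '''
def classify(n):
    if n < 0 and n != -1:
        return "negative"
    for d in range(2, n):
        if n % d == 0:
            return "composite"
    return "other"
'''
print(cyclomatic_complexity(code))
```

Note that in this sketch a nested function's branches also count toward its enclosing function.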

📋 More Capabilities

  • Code Generation - AI-powered function/class/API generation with custom templates
  • Diff & Impact Analysis - Understand what changed and why it matters
  • Migration Planning - Plan framework/version upgrades with step-by-step guidance
  • Code Profiling - Detect N+1 queries, blocking I/O, inefficient algorithms
  • Schema Analysis - Database schema analysis from ORM models
  • Documentation Generator - Auto-generate API docs and changelogs
  • Workflow Engine - Define and execute multi-step analysis pipelines

🔍 Auto-Discovery Pipeline (NEW)

  • GitHub Scanner - Automatically discover new MCP crypto tools from GitHub
  • AI Tool Analyzer - Extract tool definitions using AI/pattern matching
  • Security Scanner - Scan discovered tools for vulnerabilities
  • Registry Submitter - Submit approved tools to the Lyra Registry
  • Daily Automation - GitHub Actions workflow for continuous discovery

See Discovery Documentation →

📚 Complete Documentation

Lyra Intel includes comprehensive documentation covering every aspect of the platform:

Core Documentation

  • 📖 FEATURES.md - Detailed feature documentation with code examples for:

    • Semantic Search (ML-powered code search)
    • SSO Integration (OAuth 2.0, SAML 2.0, LDAP)
    • Language Parsers (C++, C#, Ruby, PHP)
    • Plugin System
    • IDE Extensions (VS Code, JetBrains)
    • CI/CD Integrations (GitLab, Bitbucket, GitHub Actions)
    • Export Formats (PDF, SARIF, Excel, CSV)
    • WebSocket Streaming
    • Interactive CLI
    • Web Dashboard
    • Monitoring & Metrics (Prometheus, Grafana)
  • 💻 EXAMPLES.md - Working code examples for:

    • Quick start (60-second analysis)
    • Core analysis workflows
    • Semantic search usage
    • SSO setup and configuration
    • Language-specific parsing
    • Custom plugin development
    • IDE extension installation
    • CI/CD pipeline integration
    • Real-time WebSocket streaming
    • Monitoring setup
    • Complete end-to-end workflows
  • 🏗️ ARCHITECTURE.md - Technical architecture documentation:

    • System overview and design
    • Core component architecture
    • Data flow diagrams
    • Module organization
    • Extension points
    • Deployment architectures (single server, Kubernetes, AWS)
    • Performance & scalability
    • Security architecture
    • Technology stack
  • 🔌 API.md - Complete REST API reference

  • 🚀 DEPLOYMENT.md - Deployment guides (Docker, Kubernetes, AWS)

  • 📜 openapi.yaml - OpenAPI 3.0 specification

Real-World Workflows

  • 💼 USE_CASES.md - Practical workflows and best practices:
    • Securing legacy codebases
    • Pre-commit code quality gates
    • CI/CD security pipelines
    • Code review assistance
    • Monorepo migration planning
    • Technical debt tracking
    • Building team knowledge bases
    • Integration patterns
    • Performance optimization tips

Getting Started Guides

Quick Start (5 Minutes)

Get up and running in just a few commands. No complex setup needed.

# 1. Clone the repository
git clone https://github.com/nirholas/lyra-intel.git
cd lyra-intel

# 2. Install (requires Python 3.9+)
pip install -e .

# 3. Quick scan - see what Lyra Intel finds in 30 seconds
python cli.py scan /path/to/any/code

# 4. Full analysis - comprehensive report
python cli.py analyze /path/to/code --output ./results.json

# 5. View results
cat results.json | jq .  # Pretty print the JSON

# 6. (Optional) Start the web dashboard
python launch_dashboard.py
# Then visit http://localhost:8080

What to Expect

After running scan, you'll see:

✅ Analyzing repository...
📊 Files analyzed: 156
📈 Total functions: 1,247
⚠️  Issues found: 43
🔐 Security findings: 5

Running analyze produces detailed JSON with:

  • Metrics: Line counts, complexity, test coverage
  • Security: Vulnerabilities, secrets detection
  • Dependencies: Import relationships, circular deps
  • Patterns: Code smells, anti-patterns
  • Git history: Commit stats, contributors
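
Because the exact layout of `results.json` is not shown here, a schema-agnostic way to get a first look at it is to summarize its top-level sections. The sketch below does that; the `summarize` helper and the example document (including its key names) are hypothetical, and the real keys may differ.

```python
import json
from typing import Any, Dict

def summarize(results: Dict[str, Any]) -> Dict[str, str]:
    """Describe each top-level section of a results document without assuming its schema."""
    summary = {}
    for key, value in results.items():
        if isinstance(value, (list, dict)):
            summary[key] = f"{type(value).__name__} with {len(value)} entries"
        else:
            summary[key] = repr(value)
    return summary

# Hypothetical document; the real key names in results.json may differ.
example = json.loads(
    '{"metrics": {"files": 156, "functions": 1247},'
    ' "security": [{"rule": "hardcoded-secret"}]}'
)
for section, description in summarize(example).items():
    print(f"{section}: {description}")
```

To try it on real output, load the file produced by the analyze step with `json.load(open("results.json"))` and pass the result to `summarize`.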

See more quick examples →

💼 Common Use Cases

Real teams use Lyra Intel for:

🔒 Security Teams

"I need to scan our 500K LOC codebase for vulnerabilities"

  • Secure a Legacy Codebase - Full audit in 30 min
  • Automatic CI/CD security gates
  • Pre-commit hooks that block insecure code
  • Regular scheduled security scans

👨‍💻 Development Teams

"New developer is joining - how do we onboard them on 200K lines of code?"

  • Build a Team Knowledge Base - Semantic search over your codebase
  • Find similar code patterns
  • Understand architecture through visualization
  • Track technical debt

🏗️ Platform Teams

"We need to upgrade from Node 14 to Node 18 - is it safe?"

  • Plan a Monorepo Migration - Step-by-step migration plan
  • Analyze impact across all packages
  • Identify breaking changes
  • Estimate effort per package

📊 Engineering Leads

"Is our code quality improving or getting worse?"

  • Track Technical Debt - Monthly trend tracking
  • Visualize metrics over time
  • Prioritize what to fix first
  • Show data-driven reports to management

🔍 Code Review

"Reviews are taking too long - 30 min per PR"

See more use cases →

🤖 MCP Integration (Claude & LLMs)

Use Lyra Intel directly from Claude, Claude Code, or any MCP-compatible LLM.

Quick Setup

# Claude Code - one command
npx lyra-intel-mcp

# Claude Desktop - add to config
{
  "mcpServers": {
    "lyra-intel": {
      "command": "npx",
      "args": ["-y", "lyra-intel-mcp"]
    }
  }
}
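
If a Claude Desktop config already lists other MCP servers, the entry above should be merged in rather than pasted over the file. The following is a minimal sketch of that merge on an in-memory dictionary; the `add_lyra_server` helper and the `other-tool` entry are hypothetical, and the location of the config file is left out because it depends on your platform.

```python
import json
from typing import Any, Dict

def add_lyra_server(config: Dict[str, Any]) -> Dict[str, Any]:
    """Return a copy of a Claude Desktop config with the lyra-intel server added."""
    merged = dict(config)
    servers = dict(merged.get("mcpServers", {}))  # copy so the input is not mutated
    servers["lyra-intel"] = {"command": "npx", "args": ["-y", "lyra-intel-mcp"]}
    merged["mcpServers"] = servers
    return merged

# Hypothetical existing config with one unrelated server
existing = {"mcpServers": {"other-tool": {"command": "other"}}}
print(json.dumps(add_lyra_server(existing), indent=2))
```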

Available MCP Tools

| Tool | Description |
|------|-------------|
| analyze-codebase | Comprehensive code analysis with AST, dependencies, metrics |
| search-code | ML-powered semantic code search |
| get-complexity | Cyclomatic, cognitive, and Halstead complexity |
| get-security-issues | Security vulnerabilities, secrets, compliance |
| discovery-scan-github | Find new MCP crypto tools on GitHub |
| discovery-analyze-repo | Extract MCP tool definitions from repos |
| discovery-run-pipeline | Full discovery + analysis + submission |

Example Prompts

"Analyze my project at ~/code/myapp for security issues"
"Search for authentication patterns in the codebase"
"Scan GitHub for new MCP crypto tools from the last 7 days"
"Run the discovery pipeline and submit approved tools"

See full MCP documentation →

🔍 Code Review Assistance

  • AI-Powered Code Review - Automated insights in 30 seconds
  • Security analysis
  • Complexity warnings
  • AI suggestions for improvements

👉 See 7 complete workflows with code examples →

🏛️ Architecture

lyra-intel/
├── src/
│   ├── core/           # Main engine orchestration
│   ├── collectors/     # Data collection (files, git)
│   ├── analyzers/      # Code analysis (AST, dependencies, patterns)
│   ├── storage/        # Database and persistence
│   ├── agents/         # Multi-agent system
│   ├── search/         # Code and semantic search
│   ├── query/          # Natural language queries
│   ├── visualizers/    # Graph generation
│   ├── reports/        # Report generation
│   ├── web/            # Web dashboard
│   ├── api/            # REST API server
│   ├── auth/           # Authentication and authorization
│   ├── plugins/        # Plugin system
│   ├── ai/             # AI integration
│   ├── metrics/        # Metrics collection
│   ├── events/         # Event system
│   ├── notifications/  # Notifications and alerts
│   ├── forensics/      # Forensic analysis
│   ├── cache/          # Caching layer
│   ├── pipeline/       # Streaming pipeline
│   ├── testing/        # Testing infrastructure
│   ├── knowledge/      # Knowledge graph system
│   ├── diff/           # Diff and impact analysis
│   ├── generation/     # Code generation
│   ├── security/       # Security scanning
│   ├── migration/      # Migration planning
│   ├── profiler/       # Performance profiling
│   ├── schema/         # Schema analysis
│   ├── docgen/         # Documentation generation
│   ├── integrations/   # External integrations
│   └── workflow/       # Workflow engine
├── config/             # Configuration files
├── scripts/            # Utility scripts
├── Dockerfile          # Container build
├── docker-compose.yml  # Multi-service deployment
└── cli.py              # Command-line interface

Processing Modes

Local Mode

Best for development and small repositories:

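import asyncio
from src import LyraIntelEngine, EngineConfig, ProcessingMode

async def main():
    # Local mode: analysis runs on this machine with up to 8 workers
    config = EngineConfig(mode=ProcessingMode.LOCAL, max_workers=8)
    engine = LyraIntelEngine(config)
    return await engine.analyze_repository("/path/to/repo")

result = asyncio.run(main())  # analyze_repository is a coroutine, so it needs an event loop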

Distributed Mode

For larger codebases with multiple workers:

config = EngineConfig(
    mode=ProcessingMode.DISTRIBUTED,
    max_workers=50,
)

Cloud Massive Mode

For enterprise-scale analysis:

config = EngineConfig(
    mode=ProcessingMode.CLOUD_MASSIVE,
    cloud_provider="aws",
    cloud_region="us-east-1",
    max_cloud_workers=1000,
)
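
One way to choose between the three modes is by repository size. The sketch below counts files and suggests a mode name; the thresholds (100,000 and 1,000,000 files) are assumptions loosely based on the "100K+ files" and "millions of files" figures in this README, and both helper functions are hypothetical.

```python
import os

def count_files(root: str) -> int:
    """Count regular files under `root`, walking subdirectories."""
    return sum(len(files) for _, _, files in os.walk(root))

def suggest_mode(file_count: int) -> str:
    """Suggest a ProcessingMode name from a file count (thresholds are assumptions)."""
    if file_count >= 1_000_000:
        return "CLOUD_MASSIVE"
    if file_count >= 100_000:
        return "DISTRIBUTED"
    return "LOCAL"

for n in (156, 250_000, 3_000_000):
    print(n, suggest_mode(n))
```

In practice the right cut-over also depends on file sizes and available hardware, so treat the numbers as a starting point.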

Analysis Results

The engine produces comprehensive analysis including:

  • File metrics: Total files, sizes, line counts by extension
  • Code structure: Functions, classes, methods with complexity scores
  • Dependencies: Import/export relationships, circular dependencies
  • Git history: Commits, authors, change frequency
  • Patterns: Code smells, anti-patterns, security issues

Results are stored in SQLite (or your configured backend) and can be exported as JSON.
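
As an illustration of exporting stored results as JSON, the sketch below dumps one SQLite table with Python's standard `sqlite3` and `json` modules. The table name `file_metrics`, its columns, and the sample rows are hypothetical; Lyra Intel's actual schema is not described in this document.

```python
import json
import sqlite3

def export_table_as_json(conn: sqlite3.Connection, table: str) -> str:
    """Dump every row of `table` as a JSON array of objects."""
    # Table names cannot be bound as SQL parameters, so validate before interpolating
    if not table.isidentifier():
        raise ValueError(f"unsafe table name: {table!r}")
    conn.row_factory = sqlite3.Row  # rows become addressable by column name
    rows = conn.execute(f"SELECT * FROM {table}").fetchall()
    return json.dumps([dict(row) for row in rows], indent=2)

# Hypothetical schema and data, held in an in-memory database
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE file_metrics (path TEXT, lines INTEGER)")
conn.executemany(
    "INSERT INTO file_metrics VALUES (?, ?)",
    [("cli.py", 120), ("src/core/engine.py", 480)],
)
print(export_table_as_json(conn, "file_metrics"))
```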

Cloud Support

Lyra Intel is designed to leverage cloud resources efficiently:

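| Provider | Instance Types | Spot Support | Optimization |
|----------|----------------|--------------|--------------|
| AWS | EC2, Lambda, ECS | ✅ Supported | ~70% savings |
| GCP | Compute Engine, Cloud Run | ✅ Supported | ~70% savings |
| Azure | VMs, Functions | ✅ Supported | ~70% savings |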

Auto-scaling and cost optimization features included.

How Lyra Intel Compares

| Feature | Lyra Intel | SonarQube | Snyk | GitHub Advanced Security |
|---------|------------|-----------|------|--------------------------|
| Open Source | ✅ MIT | ❌ Commercial | ❌ Proprietary | ⚠️ Limited |
| Semantic Code Search | ✅ ML-powered | ❌ No | ❌ No | ❌ No |
| AI Integration | ✅ Any provider | ❌ No | ❌ No | ✅ GitHub Copilot only |
| Monorepo Support | ✅ Up to 1M files | ⚠️ Limited | ✅ Good | ✅ Good |
| Self-Hosted | ✅ Full | ⚠️ Enterprise only | ⚠️ Limited | ✅ GitHub-hosted |
| Cost | ✅ Free | 💰💰 | 💰💰 | 💰💰 |
| Knowledge Graph | ✅ Automatic | ❌ No | ❌ No | ❌ No |
| Forensic Analysis | ✅ Dead code, debt | ⚠️ Basic | ❌ No | ⚠️ Basic |
| Migration Planning | ✅ Automated steps | ❌ No | ❌ No | ❌ No |
| Multi-Language | ✅ 10+ languages | ✅ Many | ⚠️ JS/Python focus | ✅ Many |
| Real-time Dashboard | ✅ React UI | ✅ Yes | ✅ Yes | ✅ Yes |

Bottom line: Lyra Intel is best for teams that want deep code understanding + AI insights + full control, all open source.

🛣️ Roadmap

✅ Phase 1: Core Platform (Complete)

  • Complete analysis engine with 70+ components
  • Multi-language parsing (10+ languages)
  • Dependency graphing and pattern detection
  • Git history analysis and forensics
  • Security scanning (50+ patterns)
  • AI integration (OpenAI, Anthropic, Ollama)

✅ Phase 2: Enterprise Features (Complete)

  • REST API with 15+ endpoints
  • Web dashboard with interactive visualizations
  • Knowledge graph and semantic search
  • RBAC, SSO, and authentication
  • Code generation and migration planning
  • IDE plugins (VS Code, JetBrains)

✅ Phase 3: Scale & Performance (Complete)

  • Distributed analysis for 100K+ files
  • Cloud massive mode (AWS/GCP/Azure auto-scaling)
  • Real-time streaming analysis
  • ML-based code review
  • Performance profiling and optimization
  • Schema analysis and workflow engine

🔄 Phase 4: Advanced Features (In Progress)

  • Enhanced ML models for code understanding
  • Custom model fine-tuning
  • Advanced compliance reporting
  • Real-time dashboard improvements
  • Performance benchmarking suite

📅 Future Phases

  • Automated remediation suggestions
  • Integration with more CI/CD platforms
  • Mobile app for dashboard access
  • Advanced visualization options
  • Community plugin marketplace

📈 Metrics & Monitoring

Access metrics at:

  • Prometheus: http://localhost:9090
  • Grafana: http://localhost:3000
  • API Health: http://localhost:8080/api/v1/health

Key metrics:

  • lyra_intel_requests_total - Total API requests
  • lyra_intel_analysis_duration_seconds - Analysis performance
  • lyra_intel_ai_tokens_total - AI usage tracking
  • lyra_intel_cache_hits_total - Cache efficiency
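
The metric names above follow Prometheus naming conventions, so scraped output can be inspected with a few lines of Python. The sketch below sums sample values per metric from Prometheus text exposition output; the sample text, its label names, and its values are made up for illustration, and the parser assumes lines carry no trailing timestamp.

```python
from typing import Dict

def parse_metrics(text: str, prefix: str = "lyra_intel_") -> Dict[str, float]:
    """Sum sample values per metric name from Prometheus text exposition output."""
    totals: Dict[str, float] = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):  # skip blanks and HELP/TYPE lines
            continue
        # The value is the last space-separated token (no timestamp assumed)
        name_and_labels, _, value = line.rpartition(" ")
        name = name_and_labels.split("{", 1)[0]
        if name.startswith(prefix):
            totals[name] = totals.get(name, 0.0) + float(value)
    return totals

# Made-up scrape output; label names and values are illustrative only
sample = """\
# HELP lyra_intel_requests_total Total API requests
# TYPE lyra_intel_requests_total counter
lyra_intel_requests_total{method="GET"} 120
lyra_intel_requests_total{method="POST"} 30
lyra_intel_cache_hits_total 75
process_cpu_seconds_total 1.5
"""
print(parse_metrics(sample))
```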

🤝 Contributing

We welcome contributions! Please see CONTRIBUTING.md for guidelines.

🐛 Troubleshooting

Common issues and solutions:

Database connection failed

docker-compose restart postgres
docker-compose logs postgres

High memory usage

# Reduce workers
export WORKERS=4

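# Increase the memory limit: `docker-compose up` has no --memory flag, so set
# a limit (for example mem_limit: 4g) on the api service in docker-compose.yml,
# then recreate the container
docker-compose up -d --scale api=1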

API rate limit

# Increase rate limits in config
export RATE_LIMIT_PER_MINUTE=1000

See DEPLOYMENT.md for comprehensive troubleshooting.

📊 Project Status

  • ✅ Core analysis engine
  • ✅ Multi-language support (10+ languages)
  • ✅ AI integrations (OpenAI, Anthropic, Ollama)
  • ✅ Security scanning (OWASP, secrets, dependencies)
  • ✅ Export formats (JSON, HTML, PDF, SARIF, CSV, Excel)
  • ✅ IDE plugins (VS Code, JetBrains)
  • ✅ Platform integrations (GitHub, GitLab, Bitbucket)
  • ✅ Cloud deployment (AWS, Kubernetes, Docker)
  • ✅ Real-time streaming (WebSocket)
  • ✅ Web dashboard (React)
  • ✅ Monitoring (Prometheus, Grafana)
  • ✅ Enterprise features (SSO, RBAC, audit logs)

🌟 Show Your Support

If you find Lyra Intel helpful, consider:

  • ⭐ Star this repository - It helps others discover the project
  • 🐛 Report issues - Help us improve by reporting bugs
  • 💡 Share ideas - Suggest features and improvements
  • 🤝 Contribute - See CONTRIBUTING.md for guidelines
  • 📢 Spread the word - Share with your team and community

Every star, contribution, and mention helps grow the community!

🙏 Acknowledgments

Built with amazing open-source tools:

📧 Contact & Support


Made with ❤️ for developers, security teams, and engineering leaders.

📄 License

MIT License - see LICENSE file for details.


Made with ❤️ by nich | Follow me on X.com