Syntra Architecture
Overview
Syntra is the central AI teammate for the Goalixa platform: an intelligent orchestration service that manages incidents, automates DevOps workflows, and provides seamless Kubernetes cluster management through a multi-agent system powered by CrewAI.
What Syntra Solves
As a solo developer building Goalixa, I faced three critical challenges:
- Incident Management Overload - Debugging production issues across multiple microservices
- Context Switching - Constantly moving between development, deployment, and operations
- Operational Complexity - Managing Kubernetes clusters, Git repositories, and service health
Syntra addresses these by acting as an always-available DevOps teammate that:
- Investigates incidents automatically
- Manages Kubernetes operations with AI
- Provides a unified CLI for all operations
- Learns from past incidents to improve future responses
System Architecture
High-Level Architecture
The system consists of seven layers, from the client down to the infrastructure:
Client Layer: Syntra CLI and Admin Panel (syntra.goalixa.com)
API Gateway Layer: FastAPI REST API with Auth Proxy
Orchestration Layer: CrewAI Orchestrator and Agent Manager
Skills Layer: DevOps, Incident, Review, and Planning Skills
Agent Layer: Planner, Evidence Collector, Incident, DevOps, and Reviewer Agents
Tools Layer: Kubernetes Tools, Rule Tools, LLM Tools (Claude), Git Tools
Infrastructure Layer: Kubernetes Cluster, Git Repositories, Incident Memory (ChromaDB)
Skills Layer
What are Skills?
Skills are domain-specific expertise modules that enhance agent capabilities. They provide:
- Domain Knowledge: Patterns, best practices, known issues
- Specialized Tools: Pre-built functions for domain operations
- Documentation: Guides and reference materials
- Incident Patterns: Known failure scenarios and remediation
Available Skills
| Skill | Agent | Capabilities |
|---|---|---|
| DevOps | DevOpsAgent | Deploy, rollback, scale, diagnose pods, ConfigMaps/Secrets |
| Incident | IncidentAgent, EvidenceCollector | Evidence collection, pattern matching, root cause analysis |
| Review | ReviewerAgent | Security checks, code quality, git analysis |
| Planning | PlannerAgent | Task decomposition, estimation, prioritization |
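One way to picture the skill-to-agent mapping in the table is as a small registry. This is only a sketch: the `Skill` dataclass, the `SKILLS` tuple, and the `skills_for` helper are hypothetical names, not Syntra's actual code. Only the rows themselves come from the table above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Skill:
    """One row of the skills table: a name, the agents it serves, its capabilities."""
    name: str
    agents: tuple[str, ...]
    capabilities: tuple[str, ...]


# Rows copied from the table above; the registry structure itself is an assumption.
SKILLS = (
    Skill("DevOps", ("DevOpsAgent",),
          ("deploy", "rollback", "scale", "diagnose pods", "ConfigMaps/Secrets")),
    Skill("Incident", ("IncidentAgent", "EvidenceCollector"),
          ("evidence collection", "pattern matching", "root cause analysis")),
    Skill("Review", ("ReviewerAgent",),
          ("security checks", "code quality", "git analysis")),
    Skill("Planning", ("PlannerAgent",),
          ("task decomposition", "estimation", "prioritization")),
)


def skills_for(agent: str) -> list[str]:
    """Return the names of all skills attached to the given agent."""
    return [skill.name for skill in SKILLS if agent in skill.agents]


print(skills_for("EvidenceCollector"))  # ['Incident']
```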
Skill Structure
Each skill follows a consistent structure:
skills/
├── SKILL.md # Domain knowledge & patterns
├── __init__.py # Skill implementation
├── tools/ # Domain-specific tools
│   ├── deployment_tools.py
│   ├── troubleshooting_tools.py
│   └── config_tools.py
└── docs/ # Reference documentation
    └── incident_patterns.md
Agent System Design
Agent Hierarchy & Responsibilities
The agents work in a pipeline:
- Planner Agent (Intent Recognition) → Analyzes request
- Evidence Collector (Diagnostic Phase) → Collects data
- Incident Agent (Analysis Phase) → Root cause analysis
- DevOps Agent (Execution Phase) → Executes fix
- Reviewer Agent (Validation Phase) → Validates result
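The pipeline above can be sketched as a chain of functions that each enrich a shared context. This is a simplified illustration, not the CrewAI wiring Syntra uses: the stage functions, the context keys, and the placeholder values are all assumptions.

```python
from typing import Callable

# Each stage receives the shared context and returns it with new keys added.
Stage = Callable[[dict], dict]


def planner(ctx: dict) -> dict:
    # Placeholder: a real planner would classify the request.
    return {**ctx, "intent": "diagnose_pod"}


def evidence_collector(ctx: dict) -> dict:
    return {**ctx, "evidence": ["<pod state>", "<logs>", "<events>"]}


def incident_agent(ctx: dict) -> dict:
    return {**ctx, "root_cause": "<diagnosis>"}


def devops_agent(ctx: dict) -> dict:
    return {**ctx, "action": "<proposed fix>"}


def reviewer_agent(ctx: dict) -> dict:
    return {**ctx, "validated": True}


PIPELINE: list[Stage] = [
    planner, evidence_collector, incident_agent, devops_agent, reviewer_agent,
]


def run_pipeline(request: str) -> dict:
    """Pass the request through every stage in order."""
    ctx = {"request": request}
    for stage in PIPELINE:
        ctx = stage(ctx)
    return ctx


result = run_pipeline("diagnose pod auth-service")
print(list(result))
```

Keeping each stage a plain function of the context makes it easy to test one phase in isolation or to stop the chain early after the analysis phase.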
Agent Detailed Specifications
1. Planner Agent - The Incident Commander
Role: Intent Recognition & Workflow Orchestration
Capabilities:
- Natural language request parsing
- Intent classification (diagnose_pod, namespace_overview, collect_evidence, general_help)
- Execution plan generation
- Required information identification
- Agent delegation
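To make the intent classification step concrete, here is a minimal keyword-based sketch. The four intent names come from the list above; the regular expressions and the `classify_intent` function are assumptions for illustration, and the real Planner Agent may well rely on an LLM instead.

```python
import re

# Patterns are illustrative guesses; only the intent names come from the post.
INTENT_PATTERNS = [
    ("diagnose_pod", re.compile(r"\b(diagnose|debug|troubleshoot)\b.*\bpod\b")),
    ("namespace_overview", re.compile(r"\b(overview|summary)\b.*\bnamespace\b")),
    ("collect_evidence", re.compile(r"\b(collect|gather)\b.*\b(evidence|logs|events)\b")),
]


def classify_intent(request: str) -> str:
    """Return the first matching intent, falling back to general_help."""
    text = request.lower()
    for intent, pattern in INTENT_PATTERNS:
        if pattern.search(text):
            return intent
    return "general_help"


print(classify_intent("diagnose pod auth-service"))  # diagnose_pod
print(classify_intent("what can you do?"))           # general_help
```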
2. Evidence Collector Agent - The Detective
Role: Diagnostic Data Gathering
Capabilities:
- Pod state inspection (all containers)
- Log collection with timestamps
- Event filtering and correlation
- Related pod discovery (same deployment, replica sets)
- Kubernetes connectivity verification
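A rough sketch of the data gathering step, assuming `core_v1` is a `kubernetes.client.CoreV1Api` instance from the official Python client. The `collect_evidence` function and the shape of its return value are hypothetical, and the sketch covers regular containers only, not init containers.

```python
def collect_evidence(core_v1, pod_name: str, namespace: str, tail_lines: int = 200) -> dict:
    """Gather pod phase, per-container logs, and events that reference the pod.

    `core_v1` is assumed to be a kubernetes.client.CoreV1Api instance.
    """
    pod = core_v1.read_namespaced_pod(name=pod_name, namespace=namespace)

    # Collect timestamped logs for every regular container in the pod.
    logs = {}
    for container in pod.spec.containers:
        logs[container.name] = core_v1.read_namespaced_pod_log(
            name=pod_name,
            namespace=namespace,
            container=container.name,
            timestamps=True,
            tail_lines=tail_lines,
        )

    # Keep only events whose involved object is this pod.
    events = core_v1.list_namespaced_event(
        namespace=namespace,
        field_selector=f"involvedObject.name={pod_name}",
    )

    return {
        "phase": pod.status.phase,
        "logs": logs,
        "events": [(event.reason, event.message) for event in events.items],
    }
```

In practice the log call can fail for containers that have not started yet, so a real collector would likely catch the client's API exception per container and record the failure as evidence too.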
3. Incident Agent - The Analyst
Role: Root Cause Analysis & Pattern Detection
Dual Approach:
- Rule-Based Detection (First Pass): OOM kills, CrashLoopBackOff, Image pull failures, Probe failures
- LLM Fallback (Unknown Issues): Complex multi-factor failures, Novel error patterns, Cross-service dependencies
Detection Confidence Threshold: 70%
- Below 70%: Escalate to LLM (Claude)
- Above 70%: Provide rule-based diagnosis
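The dual approach can be sketched as a rule pass followed by a threshold check. Everything here is an assumption except the failure categories and the 70% threshold: the rule signals, the confidence values, and the `diagnose` signature are illustrative, and a score of exactly 70% is treated as rule-based because the description above leaves that case open.

```python
from dataclasses import dataclass
from typing import Callable

CONFIDENCE_THRESHOLD = 0.70

# (signal to look for, diagnosis, confidence); confidence values are illustrative.
RULES = [
    ("OOMKilled", "Container exceeded its memory limit", 0.95),
    ("ImagePullBackOff", "Image could not be pulled", 0.90),
    ("CrashLoopBackOff", "Container is crashing repeatedly", 0.85),
    ("probe failed", "Liveness or readiness probe is failing", 0.75),
]


@dataclass
class Diagnosis:
    summary: str
    confidence: float  # best rule score, even when the LLM produced the summary
    source: str        # "rules" or "llm"


def diagnose(evidence: str, llm_fallback: Callable[[str], str]) -> Diagnosis:
    """Try the rules first; escalate to the LLM when confidence is below the threshold."""
    text = evidence.lower()
    best = Diagnosis("No rule matched", 0.0, "rules")
    for signal, summary, confidence in RULES:
        if signal.lower() in text and confidence > best.confidence:
            best = Diagnosis(summary, confidence, "rules")

    if best.confidence < CONFIDENCE_THRESHOLD:
        return Diagnosis(llm_fallback(evidence), best.confidence, "llm")
    return best


def fake_llm(evidence: str) -> str:
    # Stand-in for a call to Claude.
    return "LLM analysis of: " + evidence


print(diagnose("Last State: Terminated, Reason: OOMKilled", fake_llm).source)  # rules
print(diagnose("intermittent timeouts between services", fake_llm).source)     # llm
```

Running the cheap rule pass first keeps common failures fast and deterministic, and reserves LLM calls for the cases the rules cannot explain.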
4. DevOps Agent - The Executor
Role: Kubernetes Operations Execution
Safety Model:
- Default: Read-only access
- Write operations require explicit confirmation
- Destructive operations (delete, scale down) blocked in CLI
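The safety model can be expressed as a small guard that runs before any Kubernetes operation. This is a sketch under assumptions: the verb names, the `OperationBlocked` exception, and the `authorize_operation` function are hypothetical, and destructive operations requested outside the CLI are treated here as ordinary writes that need confirmation.

```python
READ_ONLY = {"get", "list", "describe", "logs"}
DESTRUCTIVE = {"delete", "scale_down"}


class OperationBlocked(Exception):
    """Raised when the safety model refuses an operation."""


def authorize_operation(verb: str, *, confirmed: bool = False, from_cli: bool = True) -> bool:
    """Apply the three safety rules in order; return True if the operation may run."""
    if verb in READ_ONLY:
        return True  # reads are always allowed
    if verb in DESTRUCTIVE and from_cli:
        raise OperationBlocked(f"'{verb}' is destructive and blocked in the CLI")
    if not confirmed:
        raise OperationBlocked(f"'{verb}' is a write and needs explicit confirmation")
    return True


print(authorize_operation("logs"))                      # True
print(authorize_operation("rollback", confirmed=True))  # True
```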
5. Reviewer Agent - The Validator
Role: Change Validation & Quality Assurance
Authentication & Authorization
Dual Authentication System
Syntra implements two authentication mechanisms:
CLI Authentication: API Key-based (X-API-Key: sk_live_xxx)
- Rate limits: 60 req/min, 1000 req/hour
- Key types: sk_live_ (Production), sk_test_ (Read-only), sk_admin_ (All + key management)
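As an illustration of the key prefixes and the two rate limits, here is a minimal in-memory sketch. The post does not say which limiting algorithm Syntra uses, so the sliding window below is an assumption, as are the `key_type` helper and the `SlidingWindowLimiter` class.

```python
import time
from collections import deque

# Prefixes and limits come from the list above; everything else is illustrative.
KEY_TYPES = {"sk_live_": "production", "sk_test_": "read-only", "sk_admin_": "admin"}
LIMITS = ((60, 60.0), (1000, 3600.0))  # (max requests, window in seconds)


def key_type(api_key: str) -> str:
    """Classify a key by its prefix."""
    for prefix, kind in KEY_TYPES.items():
        if api_key.startswith(prefix):
            return kind
    raise ValueError("unrecognized API key prefix")


class SlidingWindowLimiter:
    """Per-key limiter that enforces every (max requests, window) pair."""

    def __init__(self, limits=LIMITS, clock=time.monotonic):
        self.limits = limits
        self.clock = clock
        self.hits = {}  # api_key -> deque of request timestamps

    def allow(self, api_key: str) -> bool:
        now = self.clock()
        window = self.hits.setdefault(api_key, deque())
        # Drop timestamps older than the longest window.
        longest = max(seconds for _, seconds in self.limits)
        while window and now - window[0] >= longest:
            window.popleft()
        # Reject if any limit is already exhausted.
        for max_requests, seconds in self.limits:
            recent = sum(1 for stamp in window if now - stamp < seconds)
            if recent >= max_requests:
                return False
        window.append(now)
        return True


print(key_type("sk_test_abc123"))  # read-only
```

A production service would more likely keep these counters in a shared store so that limits hold across API replicas.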
Web Authentication: JWT-based with Goalixa Auth Service
- Access Token: 15-minute TTL
- Refresh Token: 7-day TTL with rotation
- HTTP-only cookies for web sessions
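Refresh token rotation means every refresh invalidates the old token and issues a new one. The sketch below shows that idea with the two TTLs from the list above. It uses opaque random tokens in an in-memory store rather than real JWTs, and the `RefreshTokenStore` class is hypothetical, not the Goalixa Auth Service's implementation.

```python
import secrets
from datetime import datetime, timedelta, timezone

ACCESS_TTL = timedelta(minutes=15)  # lifetime of an access token
REFRESH_TTL = timedelta(days=7)     # lifetime of a refresh token


class RefreshTokenStore:
    """In-memory store that allows each refresh token to be used exactly once."""

    def __init__(self):
        self._tokens = {}  # token -> (user, expiry)

    def issue(self, user: str, now: datetime) -> str:
        token = secrets.token_urlsafe(32)
        self._tokens[token] = (user, now + REFRESH_TTL)
        return token

    def rotate(self, token: str, now: datetime) -> str:
        """Consume the old token and return a fresh one for the same user."""
        # pop() raises KeyError for unknown or already-used tokens.
        user, expires_at = self._tokens.pop(token)
        if now >= expires_at:
            raise PermissionError("refresh token expired")
        return self.issue(user, now)


store = RefreshTokenStore()
start = datetime(2025, 1, 1, tzinfo=timezone.utc)
first = store.issue("amir", start)
second = store.rotate(first, start + timedelta(days=1))
print(first != second)  # True
```

Because the old token is removed on use, a stolen refresh token that is replayed after the legitimate client has rotated it is rejected.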
CLI Workflow
Syntra CLI Commands
# Authentication
syntra login # Authenticate with Goalixa Auth
syntra logout # Logout and clear credentials
syntra status # Show authentication status
# Direct Commands
syntra ask "diagnose pod auth-service" # Single task execution
syntra health # Check Syntra service health
syntra agents # List available agents
syntra tools # List available tools
# Interactive Mode
syntra # Start interactive REPL
syntra repl --enhanced # Enhanced REPL with history
# Configuration
syntra config set endpoint https://syntra.goalixa.com
syntra config set api-key sk_live_abc123
syntra config list # Show all configuration
Technology Stack
| Component | Technology | Purpose |
|---|---|---|
| API Framework | FastAPI | High-performance REST API |
| AI Orchestration | CrewAI | Multi-agent coordination |
| LLM Integration | LangChain + Claude | Advanced reasoning |
| Kubernetes | Python K8s Client | Cluster operations |
| CLI | Typer + Rich | Beautiful command-line interface |
| Authentication | Goalixa Auth | Centralized auth service |
| Vector DB | ChromaDB (planned) | Incident memory |
| Python Version | 3.11+ | Runtime |
Security Model
Defense in Depth
- Layer 1: Authentication - JWT / API Key Validation
- Layer 2: Authorization - Role-Based Access Control
- Layer 3: Rate Limiting - Per-Key Rate Limits
- Layer 4: Agent Safety - Read-Only Default, Confirmation for Writes
- Layer 5: Audit Logging - All Actions Logged
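The layering can be pictured as an ordered list of checks that stops at the first failure, with every outcome written to an audit log. The layer names follow the list above; the request fields, the individual checks, the logger name, and the `run_layers` function are all assumptions made for illustration.

```python
import logging

audit_log = logging.getLogger("syntra.audit")  # hypothetical logger name

# Each layer is (name, check); checks here are deliberately simplistic stand-ins.
LAYERS = [
    ("authentication", lambda r: bool(r.get("api_key"))),
    ("authorization", lambda r: r.get("action") in r.get("allowed_actions", ())),
    ("rate_limiting", lambda r: r.get("requests_this_minute", 0) < 60),
    ("agent_safety", lambda r: r.get("read_only", True) or r.get("confirmed", False)),
]


def run_layers(request: dict, layers=LAYERS) -> tuple[bool, str]:
    """Run the checks in order; log and return the first layer that denies."""
    for name, check in layers:
        if not check(request):
            audit_log.warning("denied by %s: %s", name, request.get("action"))
            return False, name
    audit_log.info("allowed: %s", request.get("action"))
    return True, "allowed"


request = {"api_key": "sk_test_abc", "action": "logs", "allowed_actions": ["logs"]}
print(run_layers(request))  # (True, 'allowed')
```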
Current Status & Roadmap
Implemented
- PlannerAgent with intent recognition
- EvidenceCollectorAgent with K8s integration
- IncidentAgent with rule-based detection
- FastAPI REST API with authentication
- Kubernetes tools (pod state, logs, events)
- CLI framework (basic)
- Admin panel authentication proxy
- API key management with rate limiting
In Progress
- LLM integration (Claude) for complex incidents
- ChromaDB incident memory
- Complete CLI implementation
- Enhanced admin panel
Planned
- DevOpsAgent full implementation
- ReviewerAgent full implementation
- Git operations tools
- Multi-cluster support
- Advanced analytics dashboard
- Self-healing automation
- Slack/Discord integration
Key Takeaways
- Syntra is a Central AI Teammate - Not just a tool, but an intelligent assistant that learns and collaborates
- Multi-Agent Architecture - Specialized agents handle different aspects of incident response
- Dual Authentication - Supports both API keys (CLI) and JWT (web) for different use cases
- Hybrid AI Approach - Fast rule-based detection + LLM fallback for complex issues
- Safety First - Multiple security layers, read-only defaults, explicit confirmations
- Continuous Learning - Incident memory system improves future responses
Related Posts
- Using Claude for Goalixa - AI development workflow
- ArgoCD Setup: First Step - GitOps infrastructure
- Monitoring Stack Setup - Observability
Connect & Explore
Syntra Repository: https://github.com/goalixa/syntra
Goalixa Project: https://github.com/goalixa
Live Demo: https://syntra.goalixa.com