Syntra Architecture: Deep Dive into AI DevOps Orchestration

Published: 2026-04-03 | Section: AI & Automation | Author: Amirreza Rezaie

Overview

Syntra is the central AI teammate for the Goalixa platform: an orchestration service that manages incidents, automates DevOps workflows, and handles Kubernetes cluster operations through a multi-agent system built on CrewAI.

What Syntra Solves

As a solo developer building Goalixa, I faced three critical challenges:

Incident Management Overload - Debugging production issues across multiple microservices
Context Switching - Constantly moving between development, deployment, and operations
Operational Complexity - Managing Kubernetes clusters, Git repositories, and service health

Syntra addresses these by acting as an always-available DevOps teammate that:

Investigates incidents automatically
Runs Kubernetes operations through AI agents, read-only by default
Provides a unified CLI for all operations
Learns from past incidents to improve future responses (the incident memory system is still in progress)

System Architecture

High-Level Architecture

Agent System Design

Agent Hierarchy & Responsibilities

Agent Detailed Specifications

1. Planner Agent - The Incident Commander

Role: Intent Recognition & Workflow Orchestration

Capabilities:

Natural language request parsing
Intent classification (diagnose_pod, namespace_overview, collect_evidence, general_help)
Execution plan generation
Required information identification
Agent delegation

Input Example:

"Investigate why the auth-service pod is crashing in production"

Output:

{
  "intent": "diagnose_pod",
  "required_params": {
    "namespace": "production",
    "pod_name": "auth-service-*"
  },
  "execution_plan": [
    "EvidenceCollector: Gather pod state, logs, events",
    "IncidentAgent: Analyze patterns and detect root cause",
    "DevOps: Suggest remediation if safe operation detected"
  ]
}
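
The planner output above can be sketched as a small data structure plus a toy classifier. This is only an illustration, not Syntra's implementation: the `Plan` dataclass, `classify_intent`, `build_plan`, and the keyword rules are all hypothetical. Only the four intent names, the field names, and the three plan steps come from the example above, and the real planner presumably relies on an LLM rather than keyword matching.

```python
from dataclasses import dataclass, field

# Intent names are taken from the article; everything else is hypothetical.
INTENTS = ("diagnose_pod", "namespace_overview", "collect_evidence", "general_help")


@dataclass
class Plan:
    intent: str
    required_params: dict = field(default_factory=dict)
    execution_plan: list = field(default_factory=list)


def classify_intent(request: str) -> str:
    """Toy keyword classifier standing in for real intent recognition."""
    text = request.lower()
    if "pod" in text and any(w in text for w in ("crash", "diagnose", "investigate", "why")):
        return "diagnose_pod"
    if "namespace" in text and any(w in text for w in ("overview", "summary", "list")):
        return "namespace_overview"
    if any(w in text for w in ("logs", "events", "evidence")):
        return "collect_evidence"
    return "general_help"


def build_plan(request: str) -> Plan:
    """Map a request to an intent and, for pod diagnosis, the three-step plan."""
    intent = classify_intent(request)
    steps = []
    if intent == "diagnose_pod":
        steps = [
            "EvidenceCollector: Gather pod state, logs, events",
            "IncidentAgent: Analyze patterns and detect root cause",
            "DevOps: Suggest remediation if safe operation detected",
        ]
    return Plan(intent=intent, execution_plan=steps)


plan = build_plan("Investigate why the auth-service pod is crashing in production")
print(plan.intent)
```

Parameter extraction (namespace, pod name) is left out on purpose, since the article does not describe how the planner fills `required_params`.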

2. Evidence Collector Agent - The Detective

Role: Diagnostic Data Gathering

Capabilities:

Pod state inspection (all containers)
Log collection with timestamps
Event filtering and correlation
Related pod discovery (same deployment, replica sets)
Kubernetes connectivity verification

Evidence Package Structure:

{
  "pod_name": "auth-service-7d4f8b9c-x2k4p",
  "namespace": "production",
  "evidence": {
    "pod_state": {
      "phase": "Running",
      "containers": [
        {
          "name": "auth-service",
          "state": "waiting",
          "reason": "CrashLoopBackOff",
          "restart_count": 5
        }
      ]
    },
    "logs": [
      {
        "container": "auth-service",
        "timestamp": "2026-04-03T10:23:45Z",
        "message": "Connection refused: postgres.goalixa.svc:5432"
      }
    ],
    "events": [
      {
        "type": "Warning",
        "reason": "BackOff",
        "message": "Back-off restarting failed container"
      }
    ],
    "related_pods": [
      "auth-service-7d4f8b9c-y5k7m",
      "auth-service-7d4f8b9c-z9j2n"
    ]
  }
}
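
To show how a downstream agent might read this package, here is a minimal sketch that pulls out crash-looping containers and warning events. The helper names `crashing_containers` and `warning_events` and the `min_restarts` parameter are hypothetical. The dictionary is a trimmed copy of the package above, and the code works on plain dicts rather than calling the Kubernetes API.

```python
def crashing_containers(package: dict, min_restarts: int = 1) -> list:
    """Return names of containers waiting in CrashLoopBackOff."""
    containers = package.get("evidence", {}).get("pod_state", {}).get("containers", [])
    return [
        c["name"]
        for c in containers
        if c.get("state") == "waiting"
        and c.get("reason") == "CrashLoopBackOff"
        and c.get("restart_count", 0) >= min_restarts
    ]


def warning_events(package: dict) -> list:
    """Return the messages of all events whose type is Warning."""
    events = package.get("evidence", {}).get("events", [])
    return [e["message"] for e in events if e.get("type") == "Warning"]


# Trimmed copy of the evidence package shown in the article.
package = {
    "pod_name": "auth-service-7d4f8b9c-x2k4p",
    "namespace": "production",
    "evidence": {
        "pod_state": {
            "phase": "Running",
            "containers": [
                {
                    "name": "auth-service",
                    "state": "waiting",
                    "reason": "CrashLoopBackOff",
                    "restart_count": 5,
                }
            ],
        },
        "events": [
            {
                "type": "Warning",
                "reason": "BackOff",
                "message": "Back-off restarting failed container",
            }
        ],
    },
}

print(crashing_containers(package))
print(warning_events(package))
```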

3. Incident Agent - The Analyst

Role: Root Cause Analysis & Pattern Detection

Dual Approach:

1. Rule-Based Detection (First Pass)

OOM (Out of Memory) kills
CrashLoopBackOff patterns
Image pull failures
Probe failures (readiness/liveness)

2. LLM Fallback (Unknown Issues)

Complex multi-factor failures
Novel error patterns
Cross-service dependencies

Detection Confidence Threshold: 70%

Below 70%: Escalate to LLM (Claude)
Above 70%: Provide rule-based diagnosis
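
The routing between the two passes can be sketched as follows. This is an assumption-laden illustration: the `RULES` table, its substrings and confidence values, and the `diagnose` and `fake_llm` names are all hypothetical. The article does not say which side a confidence of exactly 70% falls on, so this sketch treats it as confident enough.

```python
from typing import Callable, Optional

CONFIDENCE_THRESHOLD = 0.70  # threshold stated in the article

# (root-cause type, substring to look for, confidence): illustrative values only.
RULES = [
    ("oom_kill", "OOMKilled", 0.95),
    ("image_pull_failure", "ImagePullBackOff", 0.90),
    ("probe_failure", "probe failed", 0.80),
    ("crash_loop", "CrashLoopBackOff", 0.60),
]


def rule_based_diagnosis(signals: list) -> Optional[dict]:
    """First pass: return the highest-confidence matching rule, if any."""
    best = None
    for cause, needle, confidence in RULES:
        if any(needle.lower() in s.lower() for s in signals):
            if best is None or confidence > best["confidence"]:
                best = {
                    "type": cause,
                    "confidence": confidence,
                    "detection_method": "rule_based",
                }
    return best


def diagnose(signals: list, llm_fallback: Callable[[list], dict]) -> dict:
    """Use the rule-based result when confident, otherwise escalate to the LLM."""
    result = rule_based_diagnosis(signals)
    # Assumption: a confidence of exactly 0.70 is accepted as rule-based.
    if result is not None and result["confidence"] >= CONFIDENCE_THRESHOLD:
        return result
    return llm_fallback(signals)


def fake_llm(signals: list) -> dict:
    """Stand-in for the LLM call; a real one would send the evidence to Claude."""
    return {"type": "unknown", "confidence": None, "detection_method": "llm"}


print(diagnose(["Container terminated: OOMKilled"], fake_llm)["detection_method"])
print(diagnose(["Back-off: CrashLoopBackOff"], fake_llm)["detection_method"])
```

Passing the fallback in as a callable keeps the rule engine testable without network access.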

Output Example:

{
  "root_cause": {
    "type": "database_connection_failure",
    "confidence": 0.85,
    "detection_method": "rule_based",
    "evidence": [
      "Connection refused error in logs",
      "Database pod not found in namespace",
      "Service endpoint missing"
    ]
  },
  "remediation": {
    "immediate": "Check if PostgreSQL deployment exists in production namespace",
    "verification": "kubectl get pods -n production -l app=postgres",
    "prevention": "Add dependency check in deployment pipeline"
  }
}

4. DevOps Agent - The Executor

Role: Kubernetes Operations Execution

Capabilities:

Pod inspection and status checks
Log retrieval and filtering
Service and deployment queries
Safe operations (read-only by default)
Namespace overviews

Safety Model:

Default: Read-only access
Write operations require explicit confirmation
Destructive operations (delete, scale down) blocked in CLI
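
One way to express this safety model is a small gate in front of every operation. The verb sets, the `OpClass` enum, and the `is_allowed` function below are hypothetical. In particular, the article only says destructive operations are blocked in the CLI, so how they behave elsewhere (here: allowed with confirmation) is an assumption.

```python
from enum import Enum


class OpClass(Enum):
    READ = "read"
    WRITE = "write"
    DESTRUCTIVE = "destructive"


# Hypothetical verb sets; the article names only "delete" and "scale down".
READ_VERBS = {"get", "list", "describe", "logs"}
DESTRUCTIVE_VERBS = {"delete", "scale_down"}


def classify(verb: str) -> OpClass:
    """Classify a verb; unknown verbs are conservatively treated as writes."""
    if verb in READ_VERBS:
        return OpClass.READ
    if verb in DESTRUCTIVE_VERBS:
        return OpClass.DESTRUCTIVE
    return OpClass.WRITE


def is_allowed(verb: str, *, confirmed: bool = False, from_cli: bool = True) -> bool:
    """Reads always pass, writes need confirmation, destructive ops are blocked in the CLI."""
    op = classify(verb)
    if op is OpClass.READ:
        return True
    if op is OpClass.DESTRUCTIVE and from_cli:
        return False
    # Assumption: outside the CLI, destructive ops follow the write rule.
    return confirmed


print(is_allowed("get"))
print(is_allowed("apply"))
print(is_allowed("apply", confirmed=True))
print(is_allowed("delete", confirmed=True))
```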

5. Reviewer Agent - The Validator

Role: Change Validation & Quality Assurance

Capabilities:

Execution result validation
Error detection and reporting
Best practices compliance checking
Incident history correlation

Authentication & Authorization

Dual Authentication System

Syntra implements two authentication mechanisms for different use cases:

API Key Authentication (CLI)

Endpoint: POST /api/ask

Headers:

X-API-Key: sk_live_abc123...
Content-Type: application/json
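
A client call with these headers might be built like this, using only the standard library. The request body shape (`{"question": ...}`) and the `build_ask_request` helper are assumptions, since the article does not document the payload. The base URL is the endpoint used in the CLI configuration example, and the key is a placeholder.

```python
import json
import urllib.request


def build_ask_request(base_url: str, api_key: str, question: str) -> urllib.request.Request:
    """Build (but do not send) a POST /api/ask request."""
    # Hypothetical body shape: the article does not specify the JSON schema.
    body = json.dumps({"question": question}).encode("utf-8")
    return urllib.request.Request(
        url=f"{base_url.rstrip('/')}/api/ask",
        data=body,
        method="POST",
        headers={"X-API-Key": api_key, "Content-Type": "application/json"},
    )


req = build_ask_request(
    "https://syntra.goalixa.com",
    "sk_test_example",  # placeholder key, not a real credential
    "diagnose pod auth-service",
)
print(req.get_method(), req.full_url)
# urllib.request.urlopen(req) would send it; omitted so the sketch runs offline.
```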

Rate Limits:

60 requests/minute
1000 requests/hour
Per-key limits
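
Both limits can be enforced together with a per-key sliding window. The sketch below is one possible approach, not necessarily how Syntra does it. The `RateLimiter` class is hypothetical, and the article does not say whether the service uses sliding windows, fixed windows, or token buckets.

```python
import time
from collections import defaultdict, deque

# (max requests, window in seconds), from the limits listed in the article.
LIMITS = ((60, 60.0), (1000, 3600.0))


class RateLimiter:
    """In-memory sliding-window limiter keyed by API key."""

    def __init__(self, limits=LIMITS, clock=time.monotonic):
        self._limits = limits
        self._clock = clock
        self._hits = defaultdict(deque)  # api key -> request timestamps

    def allow(self, api_key: str) -> bool:
        now = self._clock()
        hits = self._hits[api_key]
        # Drop timestamps older than the longest window.
        longest = max(window for _, window in self._limits)
        while hits and now - hits[0] >= longest:
            hits.popleft()
        # Reject if any window is already full.
        for max_requests, window in self._limits:
            recent = sum(1 for t in hits if now - t < window)
            if recent >= max_requests:
                return False
        hits.append(now)
        return True


limiter = RateLimiter()
print(limiter.allow("sk_test_example"))
```

The injectable `clock` makes the limiter easy to test. A multi-replica deployment would need shared storage instead of process memory.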

Key Types:

Key Prefix	Type	Permissions
`sk_live_`	Production	Full access
`sk_test_`	Test	Read-only
`sk_admin_`	Admin	All + key management
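
The prefix table maps naturally onto a lookup. In this sketch the function names are hypothetical, and the reading of "Full access" as "read and write, but no key management" is an assumption based on the admin row listing key management separately.

```python
from typing import Optional

# Prefix -> key type, as listed in the table above.
KEY_PREFIXES = {
    "sk_live_": "production",
    "sk_test_": "test",
    "sk_admin_": "admin",
}


def key_type(api_key: str) -> Optional[str]:
    """Return the key type for a recognized prefix, else None."""
    for prefix, kind in KEY_PREFIXES.items():
        if api_key.startswith(prefix):
            return kind
    return None


def can_write(api_key: str) -> bool:
    # Assumption: test keys are read-only, production and admin keys may write.
    return key_type(api_key) in {"production", "admin"}


def can_manage_keys(api_key: str) -> bool:
    return key_type(api_key) == "admin"


print(key_type("sk_test_example"), can_write("sk_test_example"))
```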

JWT Authentication (Admin Panel)

Token Structure:

Access Token (15-minute TTL):

{
  "user_id": "user_abc123",
  "email": "user@example.com",
  "roles": ["admin", "developer"],
  "permissions": [
    "kubernetes:read",
    "kubernetes:write",
    "incident:create"
  ],
  "exp": 1714876800
}

Refresh Token (7-day TTL):

Stored in HTTP-only cookie
Auto-rotates on refresh
Revoked on logout/password change
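
The access-token checks implied above (15-minute expiry, permission lookup) can be sketched on the decoded claims. Signature verification and refresh-token rotation are deliberately left out, and the helper names are hypothetical. A real service would verify the token with a JWT library before trusting any claim.

```python
import time

ACCESS_TOKEN_TTL = 15 * 60  # seconds, from the 15-minute TTL stated above


def make_claims(user_id, email, roles, permissions, now=None) -> dict:
    """Build an access-token payload that expires ACCESS_TOKEN_TTL after `now`."""
    issued = int(time.time() if now is None else now)
    return {
        "user_id": user_id,
        "email": email,
        "roles": list(roles),
        "permissions": list(permissions),
        "exp": issued + ACCESS_TOKEN_TTL,
    }


def is_expired(claims: dict, now=None) -> bool:
    current = time.time() if now is None else now
    return current >= claims["exp"]


def has_permission(claims: dict, permission: str, now=None) -> bool:
    """True only if the token is still valid and lists the permission."""
    return not is_expired(claims, now) and permission in claims.get("permissions", [])


claims = make_claims(
    "user_abc123",
    "user@example.com",
    ["admin", "developer"],
    ["kubernetes:read", "kubernetes:write", "incident:create"],
)
print(has_permission(claims, "kubernetes:read"))
```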

Data Flow: Incident Investigation

Complete Incident Response Flow

CLI Workflow

Syntra CLI Interface

CLI Commands

# Authentication
syntra login                    # Authenticate with Goalixa Auth
syntra logout                   # Logout and clear credentials
syntra status                   # Show authentication status
 
# Direct Commands
syntra ask "diagnose pod auth-service"    # Single task execution
syntra health                   # Check Syntra service health
syntra agents                   # List available agents
syntra tools                    # List available tools
 
# Interactive Mode
syntra                          # Start interactive REPL
syntra repl --enhanced          # Enhanced REPL with history
 
# Configuration
syntra config set endpoint https://syntra.goalixa.com
syntra config set api-key sk_live_abc123
syntra config list              # Show all configuration

Admin Panel

Admin Panel Features

Technology Stack

Core Technologies

Component	Technology	Purpose
API Framework	FastAPI	High-performance REST API
AI Orchestration	CrewAI	Multi-agent coordination
LLM Integration	LangChain + Claude	Advanced reasoning
Kubernetes	Python K8s Client	Cluster operations
CLI	Typer + Rich	Beautiful command-line interface
Authentication	Goalixa Auth	Centralized auth service
Vector DB	ChromaDB (planned)	Incident memory
Python Version	3.11+	Runtime

Deployment Architecture

Kubernetes Deployment

Security Model

Defense in Depth

Security Features

Authentication
- JWT tokens with short TTL (15 minutes)
- API key rotation support
- HTTP-only cookies for web sessions
Authorization
- Role-based access control (RBAC)
- Granular permissions per operation
- Namespace-level isolation
Rate Limiting
- Per-key limits (60 req/min, 1000 req/hour)
- IP-based fallback limits
- Configurable per user role
Agent Safety
- Read-only by default
- Explicit confirmation for write ops; destructive ops blocked in the CLI
- Operation validation before execution
Audit Trail
- All actions logged with user context
- Incident history tracking
- Change attribution

Incident Memory System

Learning from Past Incidents

Memory Structure (Planned with ChromaDB)

{
  "incident_id": "inc_20260403_001",
  "timestamp": "2026-04-03T10:23:45Z",
  "namespace": "production",
  "affected_resources": ["auth-service-7d4f8b9c-x2k4p"],
  "symptoms": [
    "CrashLoopBackOff",
    "Connection refused to postgres"
  ],
  "root_cause": {
    "type": "database_connection_failure",
    "confidence": 0.85
  },
  "remediation": {
    "action": "deployed missing postgres service",
    "verified": true
  },
  "tags": ["database", "connectivity", "missing-dependency"],
  "embedding": [0.123, 0.456, ...]  // Vector embedding for similarity search
}
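
The similarity search that the `embedding` field is meant for can be illustrated with a brute-force cosine comparison. This is a stand-in for what a vector store such as ChromaDB would do internally. The function names, the three-dimensional toy embeddings, and the incident IDs below are all made up for the example.

```python
import math


def cosine_similarity(a: list, b: list) -> float:
    """Cosine of the angle between two equal-length vectors (0.0 if either is zero)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def most_similar(query: list, incidents: list, top_k: int = 3) -> list:
    """Return the IDs of the top_k incidents closest to the query embedding."""
    scored = [
        (cosine_similarity(query, inc["embedding"]), inc["incident_id"])
        for inc in incidents
    ]
    scored.sort(reverse=True)
    return [incident_id for _, incident_id in scored[:top_k]]


# Toy records: real embeddings would have hundreds of dimensions.
incidents = [
    {"incident_id": "inc_a", "embedding": [0.9, 0.1, 0.0]},
    {"incident_id": "inc_b", "embedding": [0.0, 1.0, 0.0]},
    {"incident_id": "inc_c", "embedding": [0.5, 0.5, 0.0]},
]

print(most_similar([1.0, 0.0, 0.0], incidents, top_k=2))
```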

Current Status & Roadmap

Implementation Status

Implemented ✅

PlannerAgent with intent recognition
EvidenceCollectorAgent with K8s integration
IncidentAgent with rule-based detection
FastAPI REST API with authentication
Kubernetes tools (pod state, logs, events)
CLI framework (basic)
Admin panel authentication proxy
API key management with rate limiting

In Progress 🚧

LLM integration (Claude) for complex incidents
ChromaDB incident memory
Complete CLI implementation
Enhanced admin panel

Planned 📋

DevOpsAgent full implementation
ReviewerAgent full implementation
Git operations tools
Multi-cluster support
Advanced analytics dashboard
Self-healing automation
Slack/Discord integration

Key Takeaways

Syntra is a Central AI Teammate - Not just a tool, but an intelligent assistant that learns and collaborates
Multi-Agent Architecture - Specialized agents handle different aspects of incident response
Dual Authentication - Supports both API keys (CLI) and JWT (web) for different use cases
Hybrid AI Approach - Fast rule-based detection + LLM fallback for complex issues
Safety First - Multiple security layers, read-only defaults, explicit confirmations
Continuous Learning - The incident memory system, still in progress, is designed to improve future responses

Related Posts

Using Claude for Goalixa - AI development workflow
ArgoCD Setup: First Step - GitOps infrastructure
Monitoring Stack Setup - Observability

Connect & Explore

Syntra Repository: https://github.com/goalixa/syntra

Goalixa Project: https://github.com/goalixa

Live Demo: https://syntra.goalixa.com

Tags: #ai #architecture #devops #multi-agent #kubernetes #system-design #crewai #orchestration
