Agent System

Overview

Praetor's agent system consists of 18 specialized AI agents that work in coordination to transform user inputs into comprehensive software specifications. Each agent has a specific domain of expertise and can be orchestrated in various patterns (sequential, parallel, hierarchical, debate).

The agent system is built on the Mastra framework, providing:

  • Persistent memory across conversations
  • Workflow-based orchestration
  • Token budget management
  • Checkpoint-based recovery

All 18 Agents

Conversation & Planning Agents

1. conversation-agent

Purpose: Multi-turn conversational interface for user interactions

Capabilities:

  • Natural language understanding of user intent
  • Context-aware follow-up questions
  • Clarification requests when ambiguous
  • Progress tracking and status updates

Used In: Discovery phase, general user interactions


2. discovery-orchestrator

Purpose: Requirements gathering through natural, adaptive conversation

Capabilities:

  • Natural, adaptive conversation flow (not rigid checklist)
  • Structured data extraction with confidence scoring (0-1 scale)
  • Real-time extraction to 7 JSONB fields (purpose, users, success criteria, constraints, non-functional, domain-specific, domain type)
  • Progress tracking with completion gates
  • Domain detection (service booking, ecommerce, SaaS, website)
  • Dynamic UI suggestions (freeform, choice cards, hybrid)
  • Audit trail for every extraction (source message, confidence, type)

Extraction Types:

  • fact: User stated directly
  • decision: User chose from options
  • inference: Derived from context
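The extraction shape described above could be modeled roughly as follows. Field names and the confidence threshold are illustrative assumptions, not the actual Praetor schema:

```typescript
// Illustrative shape for a single extraction record (names are assumptions).
type ExtractionType = 'fact' | 'decision' | 'inference';

interface Extraction {
  field: string;             // which of the 7 JSONB fields this feeds
  value: unknown;            // the extracted value
  type: ExtractionType;      // how the value was obtained
  confidence: number;        // 0-1 scale
  sourceMessageId: string;   // audit trail: message the value came from
}

// Example gate: auto-accept only high-confidence facts/decisions;
// inferences and low-confidence values need user confirmation.
function needsConfirmation(e: Extraction): boolean {
  return e.type === 'inference' || e.confidence < 0.8;
}
```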

Supporting Services:

  • extraction-service.ts: Handles storing extractions to database, manages discovery context CRUD, generates feature seeds with priority hints
  • post-message-processor.ts: Triggers dynamic pool activation based on extracted data (existing systems, constraints, sensitive data, scale)

Output:

  • Structured requirements_discovery_contexts with complete audit trail
  • Feature seeds with priority hints (critical/high/medium/low) for intelligent feature matching
  • Activated question pools based on technology/domain mentions

Used In: Phase 1 Requirements Discovery (conversational flow)

Distinct From:

  • conversation-agent: General-purpose conversation (not requirements-focused)
  • design-discovery: UI/block selection (not requirements gathering)

3. pattern-selector

Purpose: Selects appropriate execution patterns based on project context

Capabilities:

  • Analyzes project requirements
  • Matches against the 85-pattern catalog
  • Scores pattern relevance
  • Suggests pattern combinations

Used In: Plan generation, specification compilation
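One way relevance scoring like this can work is keyword overlap between a pattern's tags and the project's requirement keywords. The actual scoring method is not documented here; this is an assumed sketch:

```typescript
interface Pattern { name: string; tags: string[]; }

// Score a pattern by the fraction of its tags that appear in the
// project's requirement keywords; higher overlap = more relevant.
function scorePattern(pattern: Pattern, keywords: Set<string>): number {
  const hits = pattern.tags.filter((t) => keywords.has(t)).length;
  return pattern.tags.length === 0 ? 0 : hits / pattern.tags.length;
}

// Rank the catalog by descending relevance.
function rankPatterns(catalog: Pattern[], keywords: Set<string>): Pattern[] {
  return [...catalog].sort(
    (a, b) => scorePattern(b, keywords) - scorePattern(a, keywords)
  );
}
```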


Specification Compilation Agents

4. dod-compiler

Purpose: Compiles Definition of Done specifications

Capabilities:

  • Extracts completion criteria from requirements
  • Generates testable acceptance criteria
  • Creates mechanical verification gates
  • Ensures no "soft" done claims

Output: DoD document with:

  • Feature-level done criteria
  • Integration requirements
  • Quality gates
  • Verification methods

5. test-plan-compiler

Purpose: Generates comprehensive test plans

Capabilities:

  • Creates test cases from requirements
  • Generates edge case scenarios
  • Defines test data requirements
  • Establishes pass/fail criteria

Output: Test plan with:

  • Unit test specifications
  • Integration test scenarios
  • E2E test flows
  • Performance benchmarks

6. task-graph-compiler

Purpose: Builds DAG-based task dependencies

Capabilities:

  • Decomposes features into tasks
  • Identifies task dependencies
  • Detects circular dependencies
  • Calculates critical path

Output: Task graph (DAG) with:

  • Task nodes with estimates
  • Dependency edges
  • Parallel execution groups
  • Critical path highlighting
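Dependency validation for a task graph like this typically reduces to a topological sort: if no complete ordering exists, the graph contains a cycle. A sketch using Kahn's algorithm (task and edge shapes are illustrative):

```typescript
// tasks: node ids; deps: map from task -> tasks it depends on.
// Returns a valid execution order, or null if a cycle exists.
function topoOrder(tasks: string[], deps: Map<string, string[]>): string[] | null {
  // indegree[t] = number of unfinished dependencies of t
  const indegree = new Map(tasks.map((t) => [t, (deps.get(t) ?? []).length]));
  // dependents[d] = tasks that depend on d
  const dependents = new Map<string, string[]>(tasks.map((t) => [t, []]));
  for (const t of tasks) {
    for (const d of deps.get(t) ?? []) dependents.get(d)!.push(t);
  }
  const ready = tasks.filter((t) => indegree.get(t) === 0);
  const order: string[] = [];
  while (ready.length > 0) {
    const t = ready.shift()!;
    order.push(t);
    for (const dep of dependents.get(t)!) {
      indegree.set(dep, indegree.get(dep)! - 1);
      if (indegree.get(dep) === 0) ready.push(dep);
    }
  }
  // If some tasks were never reached, they sit on a cycle.
  return order.length === tasks.length ? order : null;
}
```

Tasks whose indegree reaches zero at the same time form the parallel execution groups mentioned above.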

7. runbook-generator

Purpose: Generates operational runbooks

Capabilities:

  • Creates step-by-step execution guides
  • Documents rollback procedures
  • Defines monitoring checkpoints
  • Specifies escalation paths

Output: Runbook with:

  • Pre-deployment checklist
  • Deployment steps
  • Verification commands
  • Rollback procedures

Detailed Section Compilers (SpecV2)

8. requirements-compiler

Purpose: Compiles detailed functional and non-functional requirements

Capabilities:

  • Extracts requirements from Q&A
  • Categorizes by type (functional, non-functional, constraint)
  • Assigns priority levels
  • Links to source answers

9. design-system-compiler

Purpose: Generates design system specifications

Capabilities:

  • Compiles color palettes
  • Defines typography scales
  • Specifies spacing systems
  • Creates component patterns

10. architecture-compiler

Purpose: Compiles technical architecture specifications

Capabilities:

  • Defines system components
  • Specifies integrations
  • Documents data flow
  • Creates deployment topology

11. implementation-plan-compiler

Purpose: Creates detailed implementation roadmaps

Capabilities:

  • Breaks down into milestones
  • Assigns task sequencing
  • Identifies blockers
  • Estimates complexity

Finish Line & Quality Assurance Agents

12. completion-audit

Purpose: Audits project completion status against DoD

Capabilities:

  • Compares current state to DoD
  • Identifies gaps and missing items
  • Calculates completion percentage
  • Generates remediation list

Critical Feature: Prevents false "done" claims through mechanical verification
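The mechanical check amounts to comparing verified items against the DoD list; a minimal sketch, where the item shape is an assumption:

```typescript
interface DodItem { id: string; verified: boolean; }

// Completion = verified items / total; remediation = unverified ids.
function auditCompletion(items: DodItem[]) {
  const remaining = items.filter((i) => !i.verified).map((i) => i.id);
  const percent = items.length === 0
    ? 0
    : Math.round(((items.length - remaining.length) / items.length) * 100);
  return { percent, remaining, done: remaining.length === 0 };
}
```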


13. failure-classification

Purpose: Classifies and analyzes execution failures

Capabilities:

  • Categorizes failure types
  • Identifies root causes
  • Suggests remediation strategies
  • Tracks failure patterns

Failure Categories:

  • Missing requirement
  • Ambiguous specification
  • Technical constraint
  • External dependency
  • Implementation error
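These categories map naturally to a union type. The keyword rules below are purely illustrative (a real classifier would use an LLM call with richer context):

```typescript
type FailureCategory =
  | 'missing-requirement'
  | 'ambiguous-specification'
  | 'technical-constraint'
  | 'external-dependency'
  | 'implementation-error';

// Naive first-pass classification from an error message.
function classifyFailure(message: string): FailureCategory {
  const m = message.toLowerCase();
  if (m.includes('not specified') || m.includes('no requirement')) return 'missing-requirement';
  if (m.includes('ambiguous') || m.includes('unclear')) return 'ambiguous-specification';
  if (m.includes('timeout') || m.includes('unreachable')) return 'external-dependency';
  if (m.includes('unsupported') || m.includes('limit exceeded')) return 'technical-constraint';
  return 'implementation-error';
}
```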

14. patch-planning

Purpose: Plans minimal patches for failures

Capabilities:

  • Analyzes failure context
  • Identifies minimal fix scope
  • Plans patch sequence
  • Estimates patch effort

15. patch-execution

Purpose: Executes planned patches

Capabilities:

  • Applies patches in sequence
  • Validates patch success
  • Rolls back on failure
  • Updates affected artifacts

Research & Intelligence Agents

16. market-research

Purpose: Conducts market analysis for project context

Capabilities:

  • Identifies competitor products
  • Analyzes market positioning
  • Researches pricing strategies
  • Gathers feature comparisons

Used In: Discovery enhancement, feature prioritization


17. design-intelligence

Purpose: Gathers design insights and inspiration

Capabilities:

  • Researches design patterns
  • Analyzes competitor UX
  • Identifies accessibility requirements
  • Suggests design systems

Used In: Visual discovery phase


18. technical-feasibility

Purpose: Assesses technical viability of features

Capabilities:

  • Evaluates technology choices
  • Identifies technical risks
  • Assesses integration complexity
  • Estimates development effort

Used In: Feature selection, planning phase


Agent Orchestration

Coordination Strategies

Agents can be orchestrated using four primary strategies:

1. Sequential Execution

Agent A ──▶ Agent B ──▶ Agent C ──▶ Result
  • One agent completes before next starts
  • Output of each feeds into next
  • Simplest but slowest

Use Case: Dependent tasks (compile spec → generate tests → create runbook)

2. Parallel Execution

         ┌──▶ Agent A ──┐
Input ───┼──▶ Agent B ──┼──▶ Merge ──▶ Result
         └──▶ Agent C ──┘
  • Multiple agents run simultaneously
  • Results merged at end
  • Fastest for independent tasks

Use Case: Research tasks (market + design + technical feasibility)
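The sequential and parallel strategies reduce to a fold over agents versus a Promise.all with a merge step. A minimal sketch, where the agent signature is an assumption:

```typescript
type Agent<I, O> = (input: I) => Promise<O>;

// Sequential: each agent's output feeds the next.
async function runSequential<T>(agents: Agent<T, T>[], input: T): Promise<T> {
  let current = input;
  for (const agent of agents) current = await agent(current);
  return current;
}

// Parallel: all agents see the same input; results merged at the end.
async function runParallel<I, O, R>(
  agents: Agent<I, O>[],
  input: I,
  merge: (results: O[]) => R
): Promise<R> {
  const results = await Promise.all(agents.map((a) => a(input)));
  return merge(results);
}
```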

3. Hierarchical Execution

                 Supervisor
        ┌────────────┼────────────┐
        ▼            ▼            ▼
    Agent A      Agent B      Agent C
        │            │            │
        └────────────┼────────────┘
                     ▼
                 Supervisor
                     │
                     ▼
                  Result
  • Supervisor agent coordinates workers
  • Workers report back to supervisor
  • Supervisor synthesizes results

Use Case: Complex specification compilation with quality control

4. Debate Execution

    Agent A ◀─────────▶ Agent B
        │      debate      │
        └────────┬─────────┘
                 ▼
              Arbiter
                 │
                 ▼
               Result
  • Multiple agents argue positions
  • Arbiter resolves disagreements
  • Produces higher-quality decisions

Use Case: Ambiguous requirements, architectural decisions

Orchestration Implementation

// From src/core/agents/orchestration/
interface OrchestrationConfig {
  strategy: 'sequential' | 'parallel' | 'hierarchical' | 'debate';
  agents: AgentConfig[];
  timeout: number;
  budgetLimit: TokenBudget;
  checkpointInterval: number;
}

// Execute with strategy
const result = await agentOrchestrator.execute({
  strategy: 'hierarchical',
  agents: [
    { name: 'requirements-compiler', role: 'worker' },
    { name: 'architecture-compiler', role: 'worker' },
    { name: 'completion-audit', role: 'supervisor' }
  ],
  timeout: 300000, // 5 minutes
  budgetLimit: { maxTokens: 100000, maxCost: 5.00 }
});

LLM Model Routing

Cost Optimization Strategy

Praetor uses a tiered routing strategy to minimize costs while maintaining quality:

┌─────────────────────────────────────────────────────────────────┐
│                     MODEL ROUTING TIERS                         │
│                                                                 │
│  ┌─────────┐    ┌─────────┐    ┌─────────┐    ┌─────────┐     │
│  │  CACHE  │───▶│  RULES  │───▶│ TIER-1  │───▶│ TIER-2  │     │
│  │         │    │ ENGINE  │    │  FAST   │    │ POWERFUL│     │
│  │  FREE   │    │  FREE   │    │  CHEAP  │    │EXPENSIVE│     │
│  └─────────┘    └─────────┘    └─────────┘    └─────────┘     │
│       │              │              │              │           │
│   Hit: Use      Match: Use     Simple:       Complex:         │
│   cached        rule-based     GPT-3.5/      GPT-4/           │
│   response      response       Haiku         Claude           │
└─────────────────────────────────────────────────────────────────┘

Tier Definitions

Tier   | Models                 | Cost           | Use Cases
-------|------------------------|----------------|--------------------------------------
Cache  | N/A                    | Free           | Repeated questions, common patterns
Rules  | N/A                    | Free           | Boolean decisions, enum selections
Tier-1 | GPT-3.5, Claude Haiku  | $0.002/1K      | Simple extractions, classifications
Tier-2 | GPT-4, Claude Opus     | $0.03-0.06/1K  | Complex reasoning, generation

Routing Logic

// Simplified routing decision
async function selectModel(task: AgentTask): Promise<ModelConfig> {
  // 1. Check cache
  const cached = await cache.get(task.cacheKey);
  if (cached) return { type: 'cache', result: cached };

  // 2. Check rules engine
  const rule = rules.match(task);
  if (rule) return { type: 'rules', result: rule.apply(task) };

  // 3. Classify complexity
  const complexity = classifyComplexity(task);

  if (complexity === 'simple') {
    return { model: 'gpt-3.5-turbo', tier: 1 };
  }

  return { model: 'gpt-4-turbo', tier: 2 };
}

Cost Savings

This routing strategy achieves 60-80% cost reduction compared to using GPT-4 for all tasks:

  • ~40% of calls hit cache or rules
  • ~35% use Tier-1 models
  • ~25% require Tier-2 models
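Using the tier prices and call-distribution figures above, the blended per-1K-token cost can be checked directly, taking $0.03/1K (all-GPT-4) as the baseline:

```typescript
// Expected cost per 1K tokens under the routing split, vs. an
// all-Tier-2 baseline of $0.03/1K (figures from the tables above).
const blended =
  0.40 * 0 +        // cache/rules hits: free
  0.35 * 0.002 +    // Tier-1 calls at $0.002/1K
  0.25 * 0.03;      // Tier-2 calls at $0.03/1K

// ≈ 0.73, i.e. inside the claimed 60-80% reduction range.
const reduction = 1 - blended / 0.03;
```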

Memory and State Management

Agent Memory Architecture

┌─────────────────────────────────────────────────────────────────┐
│                      AGENT MEMORY                               │
│                                                                 │
│  ┌──────────────────────────────────────────────────────────┐  │
│  │                  PostgreSQL Storage                       │  │
│  │                   (agent_memory table)                    │  │
│  │                                                          │  │
│  │  ┌────────────┐  ┌────────────┐  ┌────────────────────┐ │  │
│  │  │  Session   │  │  Tenant    │  │    Conversation    │ │  │
│  │  │  Context   │  │  Context   │  │      History       │ │  │
│  │  │            │  │            │  │                    │ │  │
│  │  │ - user_id  │  │ - tenant   │  │ - messages[]       │ │  │
│  │  │ - project  │  │ - settings │  │ - summaries[]      │ │  │
│  │  │ - phase    │  │ - patterns │  │ - tokens_used      │ │  │
│  │  └────────────┘  └────────────┘  └────────────────────┘ │  │
│  └──────────────────────────────────────────────────────────┘  │
│                                                                 │
│  ┌──────────────────────────────────────────────────────────┐  │
│  │                   Token Management                        │  │
│  │                                                          │  │
│  │  TokenLimiter: 120,000 tokens max                        │  │
│  │  lastMessages: 100 (rolling window)                      │  │
│  │  semanticRecall: disabled (PostgreSQL-based)             │  │
│  └──────────────────────────────────────────────────────────┘  │
└─────────────────────────────────────────────────────────────────┘

Memory Table Schema

CREATE TABLE agent_memory (
  id UUID PRIMARY KEY,
  tenant_id UUID REFERENCES tenants(id),
  session_id UUID,
  agent_name VARCHAR(100),
  message_role VARCHAR(20), -- 'user', 'assistant', 'system'
  message_content TEXT,
  tokens_used INTEGER,
  created_at TIMESTAMP DEFAULT NOW(),
  metadata JSONB
);

-- Session isolation
CREATE INDEX idx_agent_memory_session ON agent_memory(session_id);
CREATE INDEX idx_agent_memory_tenant ON agent_memory(tenant_id);

Session Management

// Memory configuration from mastra.config.ts
const memoryConfig = {
  storage: new PostgresStore({ connectionString: DATABASE_URL }),
  options: {
    lastMessages: 100,           // Keep last 100 messages
    semanticRecall: false,       // Use PostgreSQL, not vector DB
    tokenLimiter: new TokenLimiter({
      maxTokens: 120000,         // 120K token limit
      strategy: 'summarize'      // Summarize when approaching limit
    })
  }
};

Session Cleanup

Sessions older than 30 minutes are automatically cleaned:

// Background cleanup (runs every 30 minutes)
async function cleanupStaleSessions() {
  const cutoff = new Date(Date.now() - 30 * 60 * 1000);
  await db.query(`
    DELETE FROM agent_memory
    WHERE created_at < $1
    AND session_id NOT IN (
      SELECT DISTINCT session_id FROM active_sessions
    )
  `, [cutoff]);
}

Execution System

Continuous Execution Loop

The agent execution system uses a continuous loop with checkpoints:

┌─────────────────────────────────────────────────────────────────┐
│                   EXECUTION LOOP                                │
│                                                                 │
│  ┌─────────┐                                                   │
│  │  START  │                                                   │
│  └────┬────┘                                                   │
│       │                                                        │
│       ▼                                                        │
│  ┌─────────────────────────────────────────────┐              │
│  │             CHECK BUDGET                     │              │
│  │  tokens_used < budget_limit?                 │              │
│  │  cost_incurred < cost_limit?                 │              │
│  └──────────────────┬──────────────────────────┘              │
│                     │                                          │
│          ┌─────────┴─────────┐                                │
│          │                   │                                │
│       budget OK          budget exceeded                      │
│          │                   │                                │
│          ▼                   ▼                                │
│  ┌─────────────┐     ┌─────────────┐                         │
│  │   EXECUTE   │     │    PAUSE    │                         │
│  │    STEP     │     │   REQUEST   │                         │
│  │             │     │   APPROVAL  │                         │
│  └──────┬──────┘     └─────────────┘                         │
│         │                                                     │
│         ▼                                                     │
│  ┌─────────────────────────────────────────────┐              │
│  │           CREATE CHECKPOINT                  │              │
│  │  - Save current state                        │              │
│  │  - Record progress                           │              │
│  │  - Enable resume                             │              │
│  └──────────────────┬──────────────────────────┘              │
│                     │                                          │
│          ┌─────────┴─────────┐                                │
│          │                   │                                │
│       more steps         all done                             │
│          │                   │                                │
│          │                   ▼                                │
│          │           ┌─────────────┐                          │
│          └──────────▶│   COMPLETE  │                          │
│                      └─────────────┘                          │
└─────────────────────────────────────────────────────────────────┘

Checkpoint System

Checkpoints enable recovery from failures and long-running operations:

interface Checkpoint {
  id: string;
  execution_id: string;
  step_index: number;
  state: {
    completedSteps: string[];
    pendingSteps: string[];
    intermediateResults: Record<string, any>;
    tokensUsed: number;
    costIncurred: number;
  };
  created_at: Date;
  expires_at: Date;
}

// Create checkpoint after each step
async function createCheckpoint(execution: Execution) {
  await db.query(`
    INSERT INTO checkpoint_snapshots (
      execution_id, step_index, state, created_at, expires_at
    ) VALUES ($1, $2, $3, NOW(), NOW() + INTERVAL '24 hours')
  `, [execution.id, execution.currentStep, execution.state]);
}

// Resume from checkpoint
async function resumeFromCheckpoint(executionId: string) {
  const { rows } = await db.query(`
    SELECT * FROM checkpoint_snapshots
    WHERE execution_id = $1
    ORDER BY step_index DESC
    LIMIT 1
  `, [executionId]);

  return continueExecution(rows[0].state);
}

Budget Enforcement

interface BudgetLimit {
  maxTokens: number;
  maxCost: number;
  warningThreshold: number; // e.g., 0.8 = warn at 80%
}

// Budget check before each step
async function checkBudget(execution: Execution): Promise<BudgetStatus> {
  const budget = await getBudgetLimit(execution.project_id);
  const usage = await getUsage(execution.id);

  if (usage.tokens >= budget.maxTokens || usage.cost >= budget.maxCost) {
    return { status: 'exceeded', requiresApproval: true };
  }

  if (usage.tokens >= budget.maxTokens * budget.warningThreshold) {
    return { status: 'warning', message: 'Approaching budget limit' };
  }

  return { status: 'ok' };
}

Handoff Protocol

Agents can hand off to other agents during execution:

interface AgentHandoff {
  from: string;      // Source agent
  to: string;        // Target agent
  reason: string;    // Why handoff needed
  context: {
    task: string;
    progress: any;
    requirements: any;
  };
}

// Handoff example
const handoff: AgentHandoff = {
  from: 'requirements-compiler',
  to: 'architecture-compiler',
  reason: 'Requirements complete, need architecture decisions',
  context: {
    task: 'compile-architecture',
    progress: { requirementsComplete: true },
    requirements: compiledRequirements
  }
};

Mastra Integration

Configuration

// mastra.config.ts
import { Mastra } from '@mastra/core';
import { PostgresStore } from '@mastra/pg';

export const mastra = new Mastra({
  agents: {
    'conversation-agent': conversationAgent,
    'discovery-orchestrator': discoveryOrchestratorAgent,
    'pattern-selector': patternSelectorAgent,
    'dod-compiler': dodCompilerAgent,
    // ... all 18 agents
  },
  workflows: {
    'variant-builder': variantBuilderWorkflow,
    'compile-spec': compileSpecWorkflow,
    'multi-agent-spec': multiAgentSpecWorkflow,
    // ... all 10 workflows
  },
  storage: new PostgresStore({
    connectionString: process.env.DATABASE_URL
  }),
  memory: {
    lastMessages: 100,
    semanticRecall: false,
    tokenLimiter: new TokenLimiter({ maxTokens: 120000 })
  }
});

Workflow Definition Pattern

// Example workflow definition
const compileSpecWorkflow = createWorkflow({
  name: 'compile-spec',
  description: 'Compile full specification from project data',
  steps: [
    createStep({
      name: 'gather-inputs',
      execute: async ({ context }) => {
        // Gather discovery output, plan answers, features
        return { inputs: gatheredInputs };
      }
    }),
    createStep({
      name: 'extract-entities',
      execute: async ({ context, inputs }) => {
        // Extract entities using ontology
        return { entities: extractedEntities };
      }
    }),
    createStep({
      name: 'compile-sections',
      execute: async ({ context, entities }) => {
        // Compile each spec section
        return { sections: compiledSections };
      }
    }),
    createStep({
      name: 'validate-spec',
      execute: async ({ context, sections }) => {
        // Run validation agents
        return { spec: validatedSpec };
      }
    })
  ]
});

Agent Tool Integration

Agents can use tools for external capabilities:

// Research tools available to agents
const researchTools = {
  webSearch: createTool({
    name: 'web-search',
    description: 'Search the web for information',
    execute: async ({ query }) => {
      return await searchProvider.search(query);
    }
  }),
  competitorAnalysis: createTool({
    name: 'competitor-analysis',
    description: 'Analyze competitor products',
    execute: async ({ competitors }) => {
      return await analyzeCompetitors(competitors);
    }
  })
};

// Attach tools to agent
const marketResearchAgent = createAgent({
  name: 'market-research',
  tools: [researchTools.webSearch, researchTools.competitorAnalysis],
  // ...
});

See also: 02-product-architecture.md for system architecture, 04-features-current.md for feature details

Last Updated: February 2025
