Unified Project Graph

The bidirectional, deterministically queryable graph that connects every spec artifact to every code artifact through typed, traversable edges.

The Unified Project Graph (UPG) is Praetor's central competitive advantage. It is a six-layer typed knowledge graph where every meaningful artifact in a software project — from a business requirement down to a deployed infrastructure service — is a node, every meaningful relationship is a typed edge, and the cross-layer connections between intent and implementation are maintained continuously.

Every interesting question about a codebase — "what does this function affect?", "is this spec implemented?", "what breaks if I change this?" — is answered by a deterministic SQL graph traversal. No AI inference. No probabilistic matching. No guessing.


Backing Tables

The entire graph lives in two PostgreSQL tables.

context_artifacts — Nodes

Every node in the graph is a row in context_artifacts (migration 194, extended by migrations 362 and 427).

| Column | Type | Purpose |
| --- | --- | --- |
| id | UUID | Primary key |
| tenant_id | UUID | Row-level security isolation |
| project_id | UUID | Which project owns this node |
| artifact_type | TEXT | The layer and type of the node (e.g., spec_entity, code_function, test_case) |
| artifact_key | TEXT | Stable human-readable identifier for upsert semantics (e.g., spec:entity:User) |
| name | TEXT | Display name |
| content | TEXT | Full content (spec prose, code excerpt, or structured JSON) |
| token_count | INTEGER | Pre-computed token count for context budget management |
| version | INTEGER | Version number; incremented on each update |
| layer | TEXT | Layer discriminator for fast filtering |
| priority | INTEGER | Inclusion priority for context assembly (1–10) |
| tags | TEXT[] | Free-form tags for filtering |
| metadata | JSONB | Type-specific metadata (fields, guards, data flow hints, confidence scores) |
| is_active | BOOLEAN | Soft-delete flag |
| is_stale | BOOLEAN | Set by the cascade system when an upstream artifact changes |
| source_table | TEXT | Provenance: which DB table produced this artifact |
| source_id | TEXT | Provenance: which row in source_table produced this artifact |
| embedding | vector(1536) | Optional pgvector embedding for semantic search |
| expires_at | TIMESTAMPTZ | Optional TTL for ephemeral artifacts |
| created_at | TIMESTAMPTZ | Creation timestamp |
| updated_at | TIMESTAMPTZ | Last-modified timestamp (auto-maintained by trigger) |

A unique index on (project_id, artifact_type, artifact_key) enables upsert-based registration: re-ingesting the same spec element always resolves to the same node instead of creating a duplicate.
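To make the upsert semantics concrete, here is a minimal in-memory sketch. It is not Praetor's implementation: ArtifactStore, its method names, and the counter-based ids are hypothetical, and it assumes the version is bumped on every re-registration. The real store is the PostgreSQL table described above.

```typescript
// Hypothetical in-memory model of the node table, for illustration only.
interface ArtifactInput {
  projectId: string;
  artifactType: string;
  artifactKey: string;
  name: string;
  content: string;
}

interface ArtifactNode extends ArtifactInput {
  id: string;
  version: number;
}

class ArtifactStore {
  private byKey = new Map<string, ArtifactNode>();
  private nextId = 1;

  // Mirrors the unique index on (project_id, artifact_type, artifact_key).
  private static keyOf(a: ArtifactInput): string {
    return JSON.stringify([a.projectId, a.artifactType, a.artifactKey]);
  }

  // Insert on first sight; otherwise update in place and bump the version.
  upsert(input: ArtifactInput): ArtifactNode {
    const key = ArtifactStore.keyOf(input);
    const existing = this.byKey.get(key);
    if (existing) {
      const updated: ArtifactNode = { ...existing, ...input, version: existing.version + 1 };
      this.byKey.set(key, updated);
      return updated;
    }
    const created: ArtifactNode = { ...input, id: `node-${this.nextId++}`, version: 1 };
    this.byKey.set(key, created);
    return created;
  }

  get size(): number {
    return this.byKey.size;
  }
}

const store = new ArtifactStore();
const userSpec: ArtifactInput = {
  projectId: "p1",
  artifactType: "spec_entity",
  artifactKey: "spec:entity:User",
  name: "User",
  content: "first draft",
};
const first = store.upsert(userSpec);
const second = store.upsert({ ...userSpec, content: "revised draft" });
// Both calls resolve to the same node id; only the version and content change.
```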

Row-level security is enforced via app.tenant_id. All reads and writes must pass through withTenant().

context_artifact_dependencies — Edges

Every edge in the graph is a row in context_artifact_dependencies (migration 197, extended by migrations 422 and 427).

| Column | Type | Purpose |
| --- | --- | --- |
| id | UUID | Primary key |
| artifact_id | UUID FK | The dependent node (the one that needs the other to exist first) |
| depends_on_id | UUID FK | The prerequisite node |
| dependency_type | TEXT | The typed relationship label, constrained to the full taxonomy below |
| metadata | JSONB | Edge metadata: confidence score (0.0–1.0), condition labels, diff snapshots |
| created_at | TIMESTAMPTZ | Creation timestamp |

The direction convention: depends_on_id is the node that must exist first, and artifact_id is the dependent. Bridge edges use the same two columns, read in the direction the edge name implies: for an implements edge from a code_function, artifact_id holds the code node and depends_on_id holds the spec node it implements.

The dependency_type column has a CHECK constraint that enumerates every valid edge type. Attempting to write an unrecognized edge type is a database-level error, not a silent no-op.
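An application-side mirror of that constraint could look like the sketch below. The list contains only the edge types named in this document, so the real CHECK constraint may enumerate more. The names EDGE_TYPES, isEdgeType, and makeEdge are hypothetical.

```typescript
// Edge types named in this document; the database constraint is the source of truth.
const EDGE_TYPES = [
  "contains", "has_field", "has_operation", "requires",
  "flow_next", "flow_branch", "data_intends_to_flow", "guards",
  "references", "imports", "extends",
  "controls_flow_to", "data_flows_to", "controls",
  "implements", "exposes", "persists", "satisfies", "tests",
  "validates", "drifts_from", "realizes", "generated_from",
] as const;

type EdgeType = (typeof EDGE_TYPES)[number];

interface Edge {
  artifactId: string;   // the dependent node
  dependsOnId: string;  // the prerequisite node
  dependencyType: EdgeType;
}

// Type guard: narrows an arbitrary string to a known edge type.
function isEdgeType(value: string): value is EdgeType {
  return (EDGE_TYPES as readonly string[]).indexOf(value) !== -1;
}

// Hypothetical helper: fail loudly on an unknown type, as the CHECK constraint does.
function makeEdge(artifactId: string, dependsOnId: string, dependencyType: string): Edge {
  if (!isEdgeType(dependencyType)) {
    throw new Error(`Unknown dependency_type: ${dependencyType}`);
  }
  return { artifactId, dependsOnId, dependencyType };
}
```

Validating before the write gives callers a typed error early, while the database constraint remains the final guarantee.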


The Six Layers

The UPG distinguishes six layers by the artifact_type prefix. The deep-dive below covers each layer's node types and their purpose.

graph TB
    subgraph "Layer 1 — Business Intent (spec_*)"
        L1["spec_project<br/>spec_bounded_context<br/>spec_epic<br/>spec_user_story<br/>spec_acceptance_criterion<br/>spec_constraint<br/>spec_persona"]
    end
    subgraph "Layer 2 — System Structure (spec_*)"
        L2["spec_entity · spec_field<br/>spec_service · spec_operation<br/>spec_endpoint · spec_schema · spec_guard"]
    end
    subgraph "Layer 3 — Behavior (spec_*)"
        L3["spec_flow_step · spec_workflow · spec_workflow_step<br/>spec_ui_page · spec_ui_component<br/>spec_ui_state · spec_ui_layout"]
    end
    subgraph "Layer 4 — Implementation Plan (impl_*)"
        L4["impl_entity · impl_service · impl_operation<br/>impl_endpoint · impl_guard · impl_workflow<br/>impl_ui_page · impl_ui_component"]
    end
    subgraph "Layer 5 — Code (code_*)"
        L5["code_module · code_class · code_function<br/>code_endpoint · code_table · code_schema<br/>code_component · code_route"]
    end
    subgraph "Layer 6 — Verification & Infrastructure (test_* · infra_*)"
        L6["test_suite · test_case · test_assertion · test_fixture<br/>infra_service · infra_dependency · infra_config"]
    end

    L1 -->|"realizes (impl)"| L4
    L2 -->|"realizes (impl)"| L4
    L3 -->|"realizes (impl)"| L4
    L4 -->|"generated_from"| L5
    L5 <-->|"implements / drifts_from"| L2
    L5 -->|"tests"| L6

Layer 1: Business Intent (spec_*)

What the product is supposed to do. Populated by the 6-phase elicitation engine.

| Node Type | What It Represents | Example |
| --- | --- | --- |
| spec_project | The project itself | "ClientHub CRM" |
| spec_bounded_context | A domain boundary | "Customer Management", "Billing" |
| spec_epic | A major feature group | "User Authentication" |
| spec_user_story | A user-facing capability | "As a sales rep, I want to search contacts by company" |
| spec_acceptance_criterion | A testable condition | "Search returns results within 200ms" |
| spec_constraint | A non-functional requirement | "Must comply with GDPR data residency" |
| spec_persona | A user type | "Sales Representative", "Admin" |

Layer 2: System Structure (spec_* continued)

How the system is architecturally organized. Derived from discovery and domain analysis.

| Node Type | What It Represents | Example |
| --- | --- | --- |
| spec_entity | A domain object | "Customer", "Order", "Invoice" |
| spec_field | A property of an entity | "Customer.email (string, required, unique)" |
| spec_service | A logical service boundary | "AuthService", "OrderService" |
| spec_operation | An action a service performs | "AuthService.register(email, password) → User" |
| spec_endpoint | An API surface | "POST /api/auth/register" |
| spec_schema | A data shape / validation | RegisterRequest { email: string, password: string } |
| spec_guard | A security/business rule | "requireAuth", "requireRole('admin')" |

Layer 3: Behavior (spec_* continued)

How the system behaves — journeys, workflows, conditional logic.

| Node Type | What It Represents | Example |
| --- | --- | --- |
| spec_flow_step | A step in a user journey | "User fills in registration form" |
| spec_workflow | An async multi-step process | "Order fulfillment pipeline" |
| spec_workflow_step | A step in a workflow | "Validate payment → Reserve inventory → Ship" |
| spec_ui_page | A page in the application | "/dashboard", "/customers/:id" |
| spec_ui_component | A UI component | "CustomerTable", "OrderForm" |
| spec_ui_state | A UI state condition | "loading", "error", "empty" |
| spec_ui_layout | A layout container | "DashboardLayout with sidebar" |

Layer 4: Implementation Plan (impl_*)

The bridge between "what" (spec) and "how" (code). Contains all decisions about file paths, imports, types, and patterns. Populated by the implementation plan generator.

| Node Type | What It Represents | Example |
| --- | --- | --- |
| impl_entity | Implementation plan for an entity | "customers table at src/db/schema/customers.ts using Drizzle" |
| impl_service | Implementation plan for a service | "AuthService at src/services/auth/auth-service.ts" |
| impl_operation | Implementation plan for an operation | "register() imports hashPassword from @/lib/crypto, validates with Zod" |
| impl_endpoint | Implementation plan for a route | "POST /api/auth/register at src/routes/auth.ts using Hono" |
| impl_guard | Implementation plan for a guard | "requireAuth middleware at src/middleware/auth.ts" |
| impl_workflow | Implementation plan for a workflow | "Inngest function at src/inngest/order-fulfillment.ts" |
| impl_ui_page | Implementation plan for a page | "Next.js page at app/customers/[id]/page.tsx" |
| impl_ui_component | Implementation plan for a component | "CustomerTable component using shadcn DataTable" |

Layer 5: Code (code_*)

What was actually built. Populated by the brownfield CPG pipeline (for existing code) or by the codegen pipeline (for generated code).

| Node Type | What It Represents | Example |
| --- | --- | --- |
| code_module | A source file | "src/services/auth/auth-service.ts" |
| code_class | A class definition | "AuthService class" |
| code_function | A function or method | "register(email: string, password: string)" |
| code_endpoint | An HTTP route handler | "POST /api/auth/register handler" |
| code_table | A database table | "customers table (Drizzle schema)" |
| code_schema | A validation schema | "RegisterRequestSchema (Zod)" |
| code_component | A UI component | "CustomerTable React component" |
| code_route | A frontend route | "/customers/:id Next.js page" |

Layer 6: Verification & Infrastructure (test_*, infra_*)

What proves correctness and what the system runs on.

| Node Type | What It Represents | Example |
| --- | --- | --- |
| test_suite | A test file | "auth-service.test.ts" |
| test_case | A single test | "should hash password before storing" |
| test_assertion | A specific check | "expect(user.password).not.toBe(plaintext)" |
| test_fixture | Test data | "testUser fixture with known credentials" |
| infra_service | A deployed service | "postgres container on Railway" |
| infra_dependency | An external dependency | "Stripe API, OpenRouter" |
| infra_config | Configuration | "DATABASE_URL environment variable" |
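Because spec_* types span Layers 1 to 3, a prefix check alone cannot recover the layer for spec nodes. The sketch below shows one way to map an artifact_type to its layer number using the tables above. The function name layerOf is hypothetical, and the values actually stored in the layer column are not specified in this document.

```typescript
type Layer = 1 | 2 | 3 | 4 | 5 | 6;

// spec_* types need an explicit lookup because the prefix is shared by three layers.
const SPEC_TYPES_BY_LAYER: Array<[Layer, string[]]> = [
  [1, ["spec_project", "spec_bounded_context", "spec_epic", "spec_user_story",
       "spec_acceptance_criterion", "spec_constraint", "spec_persona"]],
  [2, ["spec_entity", "spec_field", "spec_service", "spec_operation",
       "spec_endpoint", "spec_schema", "spec_guard"]],
  [3, ["spec_flow_step", "spec_workflow", "spec_workflow_step", "spec_ui_page",
       "spec_ui_component", "spec_ui_state", "spec_ui_layout"]],
];

const specLayer = new Map<string, Layer>();
for (const [layer, types] of SPEC_TYPES_BY_LAYER) {
  for (const t of types) specLayer.set(t, layer);
}

// Returns the layer for a known artifact_type, or undefined for an unknown one.
function layerOf(artifactType: string): Layer | undefined {
  const spec = specLayer.get(artifactType);
  if (spec !== undefined) return spec;
  if (artifactType.startsWith("impl_")) return 4;
  if (artifactType.startsWith("code_")) return 5;
  if (artifactType.startsWith("test_") || artifactType.startsWith("infra_")) return 6;
  return undefined;
}
```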

The Four Spec Graph Edge Layers

The spec graph has four edge-type layers, directly analogous to the four CPG layers on the code side.

Spec-AST: Structural Containment

What contains what. The hierarchy of the spec.

| Edge Type | From → To | Example |
| --- | --- | --- |
| contains | parent → child | spec_epic → spec_user_story |
| has_field | entity → field | spec_entity "Customer" → spec_field "email" |
| has_operation | service → operation | spec_service "AuthService" → spec_operation "register" |
| requires | dependent → dependency | spec_operation "createOrder" → spec_entity "Customer" |

Spec-Flow: Journey Sequences

The ordered sequence of steps in a user journey. This is the spec-side equivalent of the code EOG — what is supposed to happen, in what order, with what branches.

| Edge Type | From → To | Example |
| --- | --- | --- |
| flow_next | step → next step | "Fill form" → "Submit" → "See confirmation" |
| flow_branch | step → conditional step | "Submit" → "Show error" (condition: validation fails) |

If the code's EOG traversal of a handler doesn't match this Spec-Flow sequence, the difference is computable as a graph diff — not a judgment call.
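A graph diff of this kind reduces to set differences over edges. The sketch below assumes the code's EOG nodes have already been mapped onto spec step identifiers through bridge edges; FlowEdge, GraphDiff, and diffEdges are hypothetical names, not part of Praetor's API.

```typescript
interface FlowEdge {
  from: string;
  to: string;
}

interface GraphDiff {
  missing: FlowEdge[];     // in the spec, absent from the code
  unexpected: FlowEdge[];  // in the code, absent from the spec
}

const edgeKey = (e: FlowEdge): string => JSON.stringify([e.from, e.to]);

// Compare two edge sets and report what each side lacks.
function diffEdges(spec: FlowEdge[], code: FlowEdge[]): GraphDiff {
  const specKeys = new Set(spec.map(edgeKey));
  const codeKeys = new Set(code.map(edgeKey));
  return {
    missing: spec.filter((e) => !codeKeys.has(edgeKey(e))),
    unexpected: code.filter((e) => !specKeys.has(edgeKey(e))),
  };
}

// Spec-Flow from the table above, including the validation-failure branch.
const specFlow: FlowEdge[] = [
  { from: "Fill form", to: "Submit" },
  { from: "Submit", to: "See confirmation" },
  { from: "Submit", to: "Show error" },
];

// A handler whose execution order omits the error branch.
const codeFlow: FlowEdge[] = [
  { from: "Fill form", to: "Submit" },
  { from: "Submit", to: "See confirmation" },
];

const flowDiff = diffEdges(specFlow, codeFlow);
// flowDiff.missing holds the "Submit" → "Show error" edge; flowDiff.unexpected is empty.
```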

Spec-DFG: Intended Data Movement

Where data is supposed to flow between components. Defines intended contracts for the code DFG to satisfy.

| Edge Type | From → To | Example |
| --- | --- | --- |
| data_intends_to_flow | source → destination | spec_entity "Order" → spec_operation "calculateTotal" |

A spec-DFG edge with no matching code-DFG edge is a gap. A code-DFG edge with no spec-DFG counterpart is potential unintended behavior.

Spec-CDG: Business Rules as Guard Conditions

Business constraints as explicit control dependence. These become required CDG ancestors in the generated code.

| Edge Type | From → To | Example |
| --- | --- | --- |
| guards | guard → protected operation | spec_guard "requireAuth" → spec_endpoint "POST /orders" |

These are verifiable requirements: in the generated code, each guarded operation must have a CDG ancestor node that implements the stated condition. If it doesn't, it is a specification violation.
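The ancestor check can be sketched as a walk up the code CDG. This example assumes each spec guard and guarded operation has already been resolved to a code node id through bridge edges. ControlsEdge, GuardRequirement, and findGuardViolations are hypothetical names.

```typescript
// A controls edge from the code CDG: the guard node gates the other node.
interface ControlsEdge {
  guard: string;
  gated: string;
}

// A spec-CDG requirement, already resolved to code node ids.
interface GuardRequirement {
  guardCodeId: string;
  operationCodeId: string;
}

// Collect every transitive CDG ancestor of a node.
function cdgAncestors(node: string, edges: ControlsEdge[]): Set<string> {
  const parents = new Map<string, string[]>();
  for (const e of edges) {
    const list = parents.get(e.gated) ?? [];
    list.push(e.guard);
    parents.set(e.gated, list);
  }
  const seen = new Set<string>();
  const stack = [node];
  while (stack.length > 0) {
    const current = stack.pop() as string;
    for (const p of parents.get(current) ?? []) {
      if (!seen.has(p)) {
        seen.add(p);
        stack.push(p);
      }
    }
  }
  return seen;
}

// A requirement is violated when the guard is not among the operation's ancestors.
function findGuardViolations(reqs: GuardRequirement[], edges: ControlsEdge[]): GuardRequirement[] {
  return reqs.filter((r) => !cdgAncestors(r.operationCodeId, edges).has(r.guardCodeId));
}

const cdg: ControlsEdge[] = [
  { guard: "authMiddleware", gated: "ordersHandler" },
  { guard: "ordersHandler", gated: "createOrder" },
];
const violations = findGuardViolations(
  [
    { guardCodeId: "authMiddleware", operationCodeId: "createOrder" },
    { guardCodeId: "requireAdmin", operationCodeId: "createOrder" },
  ],
  cdg,
);
// Only the requireAdmin requirement is reported: nothing in the CDG gates createOrder with it.
```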


The Four CPG Code Graph Layers

The code-side graph comes from CPG analysis (Fraunhofer CPG library, Tree-sitter, Semgrep) of the actual codebase.

AST: Structural Containment

Static relationships between code elements — what calls what, what imports what, what extends what.

| Edge Type | From → To | Example |
| --- | --- | --- |
| references | function → function | register() calls hashPassword() |
| imports | module → module | auth-service.ts imports crypto.ts |
| extends | class → class | AdminService extends UserService |

EOG: Execution Order Graph

Which code executes after what — control flow paths, branches, loops.

| Edge Type | From → To | Example |
| --- | --- | --- |
| controls_flow_to | function → function | Execution path from handler to service |

EOG paths are compared against Spec-Flow sequences to detect behavioral drift.

DFG: Data Flow Graph

Where values come from and where they go — the actual data movement through the code.

| Edge Type | From → To | Example |
| --- | --- | --- |
| data_flows_to | node → node | Password value flows from request to bcrypt |

DFG paths are compared against spec-DFG edges to verify that the code satisfies its data contracts.

CDG: Control Dependence Graph

What decisions gate what execution — which conditions must hold for a code path to run.

| Edge Type | From → To | Example |
| --- | --- | --- |
| controls | guard → function | Auth middleware gates the route handler |

CDG ancestors are compared against spec-CDG guards to verify that every required business rule is enforced.


Bridge Edges: Full Taxonomy

Bridge edges connect spec intent to code reality. They are what makes the graph "unified." The full taxonomy of bridge edge types:

| Edge Type | From → To | Meaning | How Verified |
| --- | --- | --- | --- |
| implements | code_function → spec_operation | This function implements this operation | Structural match: params, return type, DFG paths |
| exposes | code_endpoint → spec_endpoint | This handler exposes this API endpoint | Route path + method match |
| persists | code_table → spec_entity | This table stores this entity | Column-to-field mapping + type match |
| satisfies | code_function → spec_user_story | This code satisfies this story (coarse) | E2E test passes for this story |
| tests | test_case → spec_acceptance_criterion | This test verifies this criterion | Test file → criterion node link |
| validates | test_case → code_function | This test exercises this function | Import/call analysis |
| drifts_from | code_function → spec_operation | Implementation has diverged from spec | Structural diff detected |
| realizes | impl_* → spec_* | This implementation plan realizes that spec intent | Explicit linkage from impl planner |
| generated_from | code_* → impl_* | This code was generated from this implementation plan | Written by codegen pipeline on success |

Bridge edges carry a confidence field (0.0–1.0) stored in the edge's metadata JSONB. Deterministic matches (exact route path) get 1.0. Inferred matches (function name similarity) get lower confidence. The convergence model uses these scores.


GraphQueryService

The GraphQueryService class at src/services/graph/graph-query-service.ts exposes typed methods over the two backing tables. Every query is deterministic SQL — no AI inference.

const gqs = new GraphQueryService(projectId, tenantId);

// Coverage queries
const gaps = await gqs.findUnimplementedSpec();     // spec nodes with no bridge edges
const coverage = await gqs.getSpecCoverage();       // { covered, total, percent }

// Impact queries
const blast = await gqs.computeBlastRadius(nodeId); // all transitively affected nodes

// Build order
const order = await gqs.computeTopologicalOrder();  // batches of impl nodes in dependency order

// Test coverage
const testCoverage = await gqs.getTestCoverage();   // acceptance criteria + guard coverage

Blast Radius Query (Example)

The blast radius traversal uses a recursive CTE over context_artifact_dependencies:

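WITH RECURSIVE blast AS (
  -- Seed: direct dependents of the changed node.
  -- project_id is a column of context_artifacts in the schema above,
  -- so the project filter joins through the node table.
  SELECT cad.artifact_id, cad.depends_on_id, cad.dependency_type, 1 AS depth
  FROM context_artifact_dependencies cad
  JOIN context_artifacts ca ON ca.id = cad.artifact_id
  WHERE cad.depends_on_id = :changed_node_id
    AND ca.project_id = :project_id

  UNION ALL

  -- Step: dependents of any node already in the blast set, bounded by max_depth.
  SELECT cad.artifact_id, cad.depends_on_id, cad.dependency_type, b.depth + 1
  FROM context_artifact_dependencies cad
  JOIN context_artifacts ca ON ca.id = cad.artifact_id
  JOIN blast b ON cad.depends_on_id = b.artifact_id
  WHERE b.depth < :max_depth
    AND ca.project_id = :project_id
)
SELECT DISTINCT artifact_id, depth, dependency_type FROM blast;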

This returns every node affected by a change, at what depth, and through what type of relationship. Computed in milliseconds. No LLM tokens consumed.
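The same traversal can be sketched in memory as a breadth-first walk over dependents. This is an illustration of the technique rather than the GraphQueryService implementation. The names blastRadius, DependencyEdge, and Affected are hypothetical. Unlike the SQL query, which can return a node once per depth at which it is reached, this sketch reports each node once at its shallowest depth.

```typescript
interface DependencyEdge {
  artifactId: string;   // the dependent node
  dependsOnId: string;  // the prerequisite node
  dependencyType: string;
}

interface Affected {
  artifactId: string;
  depth: number;
  dependencyType: string;
}

function blastRadius(edges: DependencyEdge[], changedNodeId: string, maxDepth: number): Affected[] {
  // Index edges by prerequisite so dependents can be looked up directly.
  const dependents = new Map<string, DependencyEdge[]>();
  for (const e of edges) {
    const list = dependents.get(e.dependsOnId) ?? [];
    list.push(e);
    dependents.set(e.dependsOnId, list);
  }

  const result: Affected[] = [];
  const visited = new Set<string>([changedNodeId]);
  let frontier = [changedNodeId];

  // Expand one level per iteration, stopping at maxDepth or when nothing new is reached.
  for (let depth = 1; depth <= maxDepth && frontier.length > 0; depth++) {
    const next: string[] = [];
    for (const id of frontier) {
      for (const e of dependents.get(id) ?? []) {
        if (visited.has(e.artifactId)) continue;
        visited.add(e.artifactId);
        result.push({ artifactId: e.artifactId, depth, dependencyType: e.dependencyType });
        next.push(e.artifactId);
      }
    }
    frontier = next;
  }
  return result;
}

const sampleEdges: DependencyEdge[] = [
  { artifactId: "impl_service:Auth", dependsOnId: "spec_service:Auth", dependencyType: "realizes" },
  { artifactId: "code_function:register", dependsOnId: "impl_service:Auth", dependencyType: "generated_from" },
  { artifactId: "test_case:hashes-password", dependsOnId: "code_function:register", dependencyType: "validates" },
];
const affected = blastRadius(sampleEdges, "spec_service:Auth", 2);
// With maxDepth 2, the impl node (depth 1) and the code node (depth 2) are returned.
```

The visited set also makes the walk safe on cyclic graphs, where the SQL version relies on the depth bound alone.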


How the Graph Drives Code Generation

The graph determines what gets built and in what order — no planning agent guesswork required.

  1. Topological sort of impl nodes over context_artifact_dependencies produces batches: entities before services before endpoints, dependencies before dependents.

  2. For each node in that topological order, the system reads its full implementation context from the graph: business rules, data flow contracts, guard conditions, upstream interface contracts.

  3. Generated code writes generated_from and implements edges back into the graph. Downstream generators receive the upstream interface contract as structured input — compatibility enforced without inference.

  4. The CEGIS verification loop checks each generated file structurally: does the code's DFG match the spec's Spec-DFG? Are all Spec-CDG guards present in the code's CDG? If not, a StructuralDiff is computed and fed back to the generator (up to 3 attempts).

  5. On success, the bridge edge is written, the spec coverage percentage rises, and the topological cursor advances.
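Step 1 can be illustrated with Kahn's algorithm grouped into batches, where every node in a batch has all of its prerequisites in earlier batches. This is a sketch of the general technique, not the computeTopologicalOrder implementation. The function name topologicalBatches and the node ids are hypothetical.

```typescript
interface Dep {
  artifactId: string;   // the dependent node
  dependsOnId: string;  // the prerequisite node
}

function topologicalBatches(nodes: string[], deps: Dep[]): string[][] {
  const remaining = new Map<string, number>();    // unmet prerequisite count per node
  const dependents = new Map<string, string[]>(); // prerequisite -> nodes waiting on it
  for (const n of nodes) {
    remaining.set(n, 0);
    dependents.set(n, []);
  }
  for (const d of deps) {
    // Ignore edges that point outside the node set being ordered.
    if (!remaining.has(d.artifactId) || !remaining.has(d.dependsOnId)) continue;
    remaining.set(d.artifactId, (remaining.get(d.artifactId) ?? 0) + 1);
    dependents.get(d.dependsOnId)?.push(d.artifactId);
  }

  const batches: string[][] = [];
  let ready = nodes.filter((n) => remaining.get(n) === 0);
  let placed = 0;
  while (ready.length > 0) {
    batches.push(ready);
    placed += ready.length;
    const next: string[] = [];
    for (const n of ready) {
      for (const m of dependents.get(n) ?? []) {
        const left = (remaining.get(m) ?? 0) - 1;
        remaining.set(m, left);
        if (left === 0) next.push(m);
      }
    }
    ready = next;
  }
  // Any node never placed is part of a dependency cycle.
  if (placed !== nodes.length) throw new Error("Dependency cycle detected");
  return batches;
}

const buildOrder = topologicalBatches(
  ["impl_entity:Customer", "impl_service:Auth", "impl_endpoint:register", "impl_guard:requireAuth"],
  [
    { artifactId: "impl_service:Auth", dependsOnId: "impl_entity:Customer" },
    { artifactId: "impl_endpoint:register", dependsOnId: "impl_service:Auth" },
    { artifactId: "impl_endpoint:register", dependsOnId: "impl_guard:requireAuth" },
  ],
);
// Batch 1: the entity and the guard; batch 2: the service; batch 3: the endpoint.
```

Nodes within one batch have no dependencies on each other, so they could be generated in parallel.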


Convergence

Convergence is a computed number, not a feeling:

// Each component is a ratio in [0, 1].
const structural = specNodes.filter(n => hasBridgeEdge(n)).length / specNodes.length;
const behavioral = assessments.filter(a => a.result === 'converged').length / assessments.length;
const foundationHealth = solidFoundations / totalFoundations;

const convergence = {
  structural,
  behavioral,
  foundationHealth,
  // Weighted sum; the weights total 1.0.
  overall: structural * 0.5 + behavioral * 0.3 + foundationHealth * 0.2
};

The recursive convergence model verifies foundations before trusting anything built on top. If a Layer 2 entity node diverges, all nodes built on it get their effective confidence halved. The system reports root divergences ("3 root issues causing 15 failures") rather than flat lists.
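The halving rule and the root-divergence report can be sketched as follows. Two details are assumptions not stated above: confidence is halved once per diverged ancestor, and a root divergence is a diverged node with no diverged ancestors. All names (GraphNode, effectiveConfidence, rootDivergences) are hypothetical.

```typescript
interface GraphNode {
  id: string;
  confidence: number;   // base confidence, 0.0 to 1.0
  diverged: boolean;
  dependsOn: string[];  // prerequisite node ids
}

// Collect every transitive prerequisite of a node.
function ancestorsOf(id: string, byId: Map<string, GraphNode>): string[] {
  const seen = new Set<string>();
  const stack = [...(byId.get(id)?.dependsOn ?? [])];
  while (stack.length > 0) {
    const current = stack.pop() as string;
    if (seen.has(current)) continue;
    seen.add(current);
    stack.push(...(byId.get(current)?.dependsOn ?? []));
  }
  return Array.from(seen);
}

// Assumption: halve the base confidence once for each diverged ancestor.
function effectiveConfidence(node: GraphNode, byId: Map<string, GraphNode>): number {
  let value = node.confidence;
  for (const a of ancestorsOf(node.id, byId)) {
    if (byId.get(a)?.diverged) value *= 0.5;
  }
  return value;
}

// Group downstream nodes under each root divergence instead of listing failures flat.
function rootDivergences(nodes: GraphNode[]): Array<{ rootId: string; affected: string[] }> {
  const byId = new Map<string, GraphNode>();
  for (const n of nodes) byId.set(n.id, n);
  const roots = nodes.filter(
    (n) => n.diverged && !ancestorsOf(n.id, byId).some((a) => byId.get(a)?.diverged === true),
  );
  return roots.map((root) => ({
    rootId: root.id,
    affected: nodes
      .filter((n) => ancestorsOf(n.id, byId).indexOf(root.id) !== -1)
      .map((n) => n.id),
  }));
}

const sampleNodes: GraphNode[] = [
  { id: "entity:Customer", confidence: 1.0, diverged: true, dependsOn: [] },
  { id: "service:Orders", confidence: 0.8, diverged: false, dependsOn: ["entity:Customer"] },
  { id: "endpoint:POST /orders", confidence: 1.0, diverged: true, dependsOn: ["service:Orders"] },
];
const report = rootDivergences(sampleNodes);
// One root (the entity) with two affected downstream nodes; the endpoint is not a root.
```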


Brownfield: Reverse Graph Extraction

For brownfield projects, the direction is reversed. Existing code is analyzed by CPG, producing a code graph. The extraction traversal maps:

  • Every code-DFG edge → candidate spec_entity / spec_operation data contract
  • Every code-CDG guard → candidate spec_guard business rule
  • Every code-EOG path from an HTTP handler → candidate spec_flow_step sequence

AI is used for one thing: naming and describing the extracted nodes in human language. The structural relationships are deterministic.
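The mapping above can be sketched as a lookup from code edge type to candidate spec kind. This is a per-edge simplification: the EOG case in the list works on whole paths from an HTTP handler, not on single edges. The type and function names are hypothetical, and the name field stays null until the AI naming pass fills it in.

```typescript
type CodeEdgeType = "data_flows_to" | "controls" | "controls_flow_to";
type CandidateKind = "data_contract" | "spec_guard" | "spec_flow_step";

interface CodeEdge {
  type: CodeEdgeType;
  from: string;
  to: string;
}

interface SpecCandidate {
  kind: CandidateKind;
  derivedFrom: CodeEdge;  // provenance: the code edge that produced this candidate
  name: string | null;    // filled in later by the naming step
}

// Deterministic structural mapping; no inference is involved.
const CANDIDATE_KIND: Record<CodeEdgeType, CandidateKind> = {
  data_flows_to: "data_contract",    // DFG edge -> candidate data contract
  controls: "spec_guard",            // CDG guard -> candidate business rule
  controls_flow_to: "spec_flow_step" // EOG edge -> candidate journey step
};

function extractCandidates(edges: CodeEdge[]): SpecCandidate[] {
  return edges.map((edge) => ({
    kind: CANDIDATE_KIND[edge.type],
    derivedFrom: edge,
    name: null,
  }));
}

const candidates = extractCandidates([
  { type: "data_flows_to", from: "request.password", to: "hashPassword" },
  { type: "controls", from: "authMiddleware", to: "ordersHandler" },
  { type: "controls_flow_to", from: "registerHandler", to: "AuthService.register" },
]);
```

Keeping derivedFrom on each candidate preserves provenance, so a reviewer can trace every proposed spec node back to the code edge behind it.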

The client reviews the extracted spec, corrects what's wrong, and Praetor has a verified spec graph for a project that had no spec. From that point, the standard greenfield convergence pipeline applies.


Reference

| Document | Location |
| --- | --- |
| Architecture overview | docs/scale/ARCH-UNIFIED-PROJECT-GRAPH.md |
| Six-layer deep dive | docs/scale/architecture/ARCHITECTURE-UNIFIED-PROJECT-GRAPH-DEEP-DIVE.md |
| Bidirectional mapping spec | docs/scale/specs/SPEC-FULLMAP-001-bidirectional-mapping.md |
| Node table migration | src/db/migrations/194_context_artifacts.sql |
| Edge table migration | src/db/migrations/197_context_artifact_dependencies.sql |
| Edge type expansion | src/db/migrations/427_context_artifact_dependency_types_expand.sql |
| GraphQueryService | src/services/graph/graph-query-service.ts |
| Graph layer details | Graph Layers |
