Unified Project Graph
The bidirectional, deterministically queryable graph that connects every spec artifact to every code artifact through typed, traversable edges.
The Unified Project Graph (UPG) is Praetor's central competitive advantage. It is a six-layer typed knowledge graph where every meaningful artifact in a software project — from a business requirement down to a deployed infrastructure service — is a node, every meaningful relationship is a typed edge, and the cross-layer connections between intent and implementation are maintained continuously.
Every interesting question about a codebase — "what does this function affect?", "is this spec implemented?", "what breaks if I change this?" — is answered by a deterministic SQL graph traversal. No AI inference. No probabilistic matching. No guessing.
Backing Tables
The entire graph lives in two PostgreSQL tables.
context_artifacts — Nodes
Every node in the graph is a row in context_artifacts (migration 194, extended by migrations 362 and 427).
| Column | Type | Purpose |
|---|---|---|
| id | UUID | Primary key |
| tenant_id | UUID | Row-level security isolation |
| project_id | UUID | Which project owns this node |
| artifact_type | TEXT | The layer and type of the node (e.g., spec_entity, code_function, test_case) |
| artifact_key | TEXT | Stable human-readable identifier for upsert semantics (e.g., spec:entity:User) |
| name | TEXT | Display name |
| content | TEXT | Full content (spec prose, code excerpt, or structured JSON) |
| token_count | INTEGER | Pre-computed token count for context budget management |
| version | INTEGER | Version number; incremented on each update |
| layer | TEXT | Layer discriminator for fast filtering |
| priority | INTEGER | Inclusion priority for context assembly (1–10) |
| tags | TEXT[] | Free-form tags for filtering |
| metadata | JSONB | Type-specific metadata (fields, guards, data flow hints, confidence scores) |
| is_active | BOOLEAN | Soft-delete flag |
| is_stale | BOOLEAN | Set by the cascade system when an upstream artifact changes |
| source_table | TEXT | Provenance: which DB table produced this artifact |
| source_id | TEXT | Provenance: which row in source_table produced this artifact |
| embedding | vector(1536) | Optional pgvector embedding for semantic search |
| expires_at | TIMESTAMPTZ | Optional TTL for ephemeral artifacts |
| created_at | TIMESTAMPTZ | Creation timestamp |
| updated_at | TIMESTAMPTZ | Last-modified timestamp (auto-maintained by trigger) |
The artifact_key column has a unique index on (project_id, artifact_type, artifact_key), enabling upsert-based registration: re-ingesting the same spec element always resolves to the same node.
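The upsert behaviour can be illustrated with a small in-memory model. This is a sketch, not Praetor's registration code: `registerArtifact`, `ArtifactInput`, and the `::`-joined lookup key are hypothetical names chosen for illustration. The lookup key mirrors the unique index on (project_id, artifact_type, artifact_key), so registering the same key twice updates the existing node and bumps its version instead of creating a second node.

```typescript
// In-memory model of upsert-based registration. All names are hypothetical.
interface ArtifactInput {
  projectId: string;
  artifactType: string;
  artifactKey: string;
  name: string;
  content: string;
}

interface ArtifactNode extends ArtifactInput {
  id: string;
  version: number;
}

// Stands in for the table; keyed by the same triple as the unique index.
const nodesByKey: Record<string, ArtifactNode> = {};
let nextId = 1;

function registerArtifact(input: ArtifactInput): ArtifactNode {
  const key = [input.projectId, input.artifactType, input.artifactKey].join("::");
  const existing = nodesByKey[key];
  if (existing !== undefined) {
    // Same key: update in place and increment the version.
    existing.name = input.name;
    existing.content = input.content;
    existing.version += 1;
    return existing;
  }
  const created: ArtifactNode = { ...input, id: `node-${nextId++}`, version: 1 };
  nodesByKey[key] = created;
  return created;
}

const first = registerArtifact({
  projectId: "p1",
  artifactType: "spec_entity",
  artifactKey: "spec:entity:User",
  name: "User",
  content: "initial spec prose",
});
const second = registerArtifact({
  projectId: "p1",
  artifactType: "spec_entity",
  artifactKey: "spec:entity:User",
  name: "User",
  content: "revised spec prose",
});
console.log(first.id === second.id, second.version); // true 2
```

In PostgreSQL the equivalent would normally be an `INSERT ... ON CONFLICT ... DO UPDATE` against that unique index; the exact statement Praetor uses is not shown on this page.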
Row-level security is enforced via app.tenant_id. All reads and writes must pass through withTenant().
context_artifact_dependencies — Edges
Every edge in the graph is a row in context_artifact_dependencies (migration 197, extended by migrations 422 and 427).
| Column | Type | Purpose |
|---|---|---|
| id | UUID | Primary key |
| artifact_id | UUID FK | The dependent node (the one that needs the other to exist first) |
| depends_on_id | UUID FK | The prerequisite node |
| dependency_type | TEXT | The typed relationship label, constrained to the full taxonomy below |
| metadata | JSONB | Edge metadata: confidence score (0.0–1.0), condition labels, diff snapshots |
| created_at | TIMESTAMPTZ | Creation timestamp |
The direction convention is: depends_on_id is the node that must exist first; artifact_id is the dependent. Bridge edges use the same columns, so the edge name reads from artifact_id to depends_on_id: for an implements edge, artifact_id holds the code node (for example, a code_function) and depends_on_id holds the spec node it implements.
The dependency_type column has a CHECK constraint that enumerates every valid edge type. Attempting to write an unrecognized edge type is a database-level error, not a silent no-op.
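An application-side guard can catch a bad edge type before it reaches the database. The sketch below is an assumption about how such a guard might look, not part of Praetor's code: `KNOWN_EDGE_TYPES`, `isKnownEdgeType`, and `assertEdgeType` are hypothetical names, and the list contains only the edge types named on this page, which may be a subset of what the CHECK constraint actually enumerates.

```typescript
// Edge types named on this page. The real CHECK constraint may list more;
// treat this array as an illustrative assumption, not the migration's contents.
const KNOWN_EDGE_TYPES = [
  "contains", "has_field", "has_operation", "requires",
  "flow_next", "flow_branch", "data_intends_to_flow", "guards",
  "references", "imports", "extends",
  "controls_flow_to", "data_flows_to", "controls",
  "implements", "exposes", "persists", "satisfies", "tests",
  "validates", "drifts_from", "realizes", "generated_from",
] as const;

type EdgeType = (typeof KNOWN_EDGE_TYPES)[number];

// Type guard: narrows an arbitrary string to the EdgeType union.
function isKnownEdgeType(value: string): value is EdgeType {
  return (KNOWN_EDGE_TYPES as readonly string[]).indexOf(value) !== -1;
}

// Fails loudly, mirroring the database's "error, not silent no-op" behaviour.
function assertEdgeType(value: string): EdgeType {
  if (!isKnownEdgeType(value)) {
    throw new Error(`Unrecognized dependency_type: ${value}`);
  }
  return value;
}

console.log(isKnownEdgeType("implements")); // true
console.log(isKnownEdgeType("implemnts"));  // false
```

The database constraint remains the source of truth; a guard like this only moves the failure earlier and gives the compiler a union type to check against.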
The Six Layers
The UPG distinguishes six layers by the artifact_type prefix. The deep-dive below covers each layer's node types and their purpose.
```mermaid
graph TB
    subgraph "Layer 1: Business Intent (spec_*)"
        L1["spec_project<br/>spec_bounded_context<br/>spec_epic<br/>spec_user_story<br/>spec_acceptance_criterion<br/>spec_constraint<br/>spec_persona"]
    end
    subgraph "Layer 2: System Structure (spec_*)"
        L2["spec_entity · spec_field<br/>spec_service · spec_operation<br/>spec_endpoint · spec_schema · spec_guard"]
    end
    subgraph "Layer 3: Behavior (spec_*)"
        L3["spec_flow_step · spec_workflow · spec_workflow_step<br/>spec_ui_page · spec_ui_component<br/>spec_ui_state · spec_ui_layout"]
    end
    subgraph "Layer 4: Implementation Plan (impl_*)"
        L4["impl_entity · impl_service · impl_operation<br/>impl_endpoint · impl_guard · impl_workflow<br/>impl_ui_page · impl_ui_component"]
    end
    subgraph "Layer 5: Code (code_*)"
        L5["code_module · code_class · code_function<br/>code_endpoint · code_table · code_schema<br/>code_component · code_route"]
    end
    subgraph "Layer 6: Verification & Infrastructure (test_* · infra_*)"
        L6["test_suite · test_case · test_assertion · test_fixture<br/>infra_service · infra_dependency · infra_config"]
    end
    L1 -->|"realizes (impl)"| L4
    L2 -->|"realizes (impl)"| L4
    L3 -->|"realizes (impl)"| L4
    L4 -->|"generated_from"| L5
    L5 <-->|"implements / drifts_from"| L2
    L5 -->|"tests"| L6
```

Layer 1: Business Intent (spec_*)
What the product is supposed to do. Populated by the 6-phase elicitation engine.
| Node Type | What It Represents | Example |
|---|---|---|
| spec_project | The project itself | "ClientHub CRM" |
| spec_bounded_context | A domain boundary | "Customer Management", "Billing" |
| spec_epic | A major feature group | "User Authentication" |
| spec_user_story | A user-facing capability | "As a sales rep, I want to search contacts by company" |
| spec_acceptance_criterion | A testable condition | "Search returns results within 200ms" |
| spec_constraint | A non-functional requirement | "Must comply with GDPR data residency" |
| spec_persona | A user type | "Sales Representative", "Admin" |
Layer 2: System Structure (spec_* continued)
How the system is architecturally organized. Derived from discovery and domain analysis.
| Node Type | What It Represents | Example |
|---|---|---|
| spec_entity | A domain object | "Customer", "Order", "Invoice" |
| spec_field | A property of an entity | "Customer.email (string, required, unique)" |
| spec_service | A logical service boundary | "AuthService", "OrderService" |
| spec_operation | An action a service performs | "AuthService.register(email, password) → User" |
| spec_endpoint | An API surface | "POST /api/auth/register" |
| spec_schema | A data shape / validation | RegisterRequest { email: string, password: string } |
| spec_guard | A security/business rule | "requireAuth", "requireRole('admin')" |
Layer 3: Behavior (spec_* continued)
How the system behaves — journeys, workflows, conditional logic.
| Node Type | What It Represents | Example |
|---|---|---|
| spec_flow_step | A step in a user journey | "User fills in registration form" |
| spec_workflow | An async multi-step process | "Order fulfillment pipeline" |
| spec_workflow_step | A step in a workflow | "Validate payment → Reserve inventory → Ship" |
| spec_ui_page | A page in the application | "/dashboard", "/customers/:id" |
| spec_ui_component | A UI component | "CustomerTable", "OrderForm" |
| spec_ui_state | A UI state condition | "loading", "error", "empty" |
| spec_ui_layout | A layout container | "DashboardLayout with sidebar" |
Layer 4: Implementation Plan (impl_*)
The bridge between "what" (spec) and "how" (code). Contains all decisions about file paths, imports, types, and patterns. Populated by the implementation plan generator.
| Node Type | What It Represents | Example |
|---|---|---|
| impl_entity | Implementation plan for an entity | "customers table at src/db/schema/customers.ts using Drizzle" |
| impl_service | Implementation plan for a service | "AuthService at src/services/auth/auth-service.ts" |
| impl_operation | Implementation plan for an operation | "register() imports hashPassword from @/lib/crypto, validates with Zod" |
| impl_endpoint | Implementation plan for a route | "POST /api/auth/register at src/routes/auth.ts using Hono" |
| impl_guard | Implementation plan for a guard | "requireAuth middleware at src/middleware/auth.ts" |
| impl_workflow | Implementation plan for a workflow | "Inngest function at src/inngest/order-fulfillment.ts" |
| impl_ui_page | Implementation plan for a page | "Next.js page at app/customers/[id]/page.tsx" |
| impl_ui_component | Implementation plan for a component | "CustomerTable component using shadcn DataTable" |
Layer 5: Code (code_*)
What was actually built. Populated by the brownfield CPG pipeline (for existing code) or by the codegen pipeline (for generated code).
| Node Type | What It Represents | Example |
|---|---|---|
| code_module | A source file | "src/services/auth/auth-service.ts" |
| code_class | A class definition | "AuthService class" |
| code_function | A function or method | "register(email: string, password: string)" |
| code_endpoint | An HTTP route handler | "POST /api/auth/register handler" |
| code_table | A database table | "customers table (Drizzle schema)" |
| code_schema | A validation schema | "RegisterRequestSchema (Zod)" |
| code_component | A UI component | "CustomerTable React component" |
| code_route | A frontend route | "/customers/:id Next.js page" |
Layer 6: Verification & Infrastructure (test_*, infra_*)
What proves correctness and what the system runs on.
| Node Type | What It Represents | Example |
|---|---|---|
| test_suite | A test file | "auth-service.test.ts" |
| test_case | A single test | "should hash password before storing" |
| test_assertion | A specific check | "expect(user.password).not.toBe(plaintext)" |
| test_fixture | Test data | "testUser fixture with known credentials" |
| infra_service | A deployed service | "postgres container on Railway" |
| infra_dependency | An external dependency | "Stripe API, OpenRouter" |
| infra_config | Configuration | "DATABASE_URL environment variable" |
The Four Spec Graph Edge Layers
The spec graph has four edge-type layers, directly analogous to the four CPG layers on the code side.
Spec-AST: Structural Containment
What contains what. The hierarchy of the spec.
| Edge Type | From → To | Example |
|---|---|---|
| contains | parent → child | spec_epic → spec_user_story |
| has_field | entity → field | spec_entity "Customer" → spec_field "email" |
| has_operation | service → operation | spec_service "AuthService" → spec_operation "register" |
| requires | dependent → dependency | spec_operation "createOrder" → spec_entity "Customer" |
Spec-Flow: Journey Sequences
The ordered sequence of steps in a user journey. This is the spec-side equivalent of the code EOG — what is supposed to happen, in what order, with what branches.
| Edge Type | From → To | Example |
|---|---|---|
| flow_next | step → next step | "Fill form" → "Submit" → "See confirmation" |
| flow_branch | step → conditional step | "Submit" → "Show error" (condition: validation fails) |
If the code's EOG traversal of a handler doesn't match this Spec-Flow sequence, the difference is computable as a graph diff — not a judgment call.
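To make the "graph diff" idea concrete, the sketch below compares one linear Spec-Flow sequence against one linear EOG path. It is a simplification under stated assumptions: real flows have branches, step identifiers are assumed to be unique and already aligned between spec and code, and `diffFlow` and `FlowDiff` are hypothetical names rather than Praetor APIs.

```typescript
interface FlowDiff {
  missing: string[];    // steps the spec requires that the code path never reaches
  unexpected: string[]; // steps on the code path that the spec does not mention
  outOfOrder: boolean;  // shared steps appear in a different relative order
}

function diffFlow(specSteps: string[], codeSteps: string[]): FlowDiff {
  const missing = specSteps.filter(s => codeSteps.indexOf(s) === -1);
  const unexpected = codeSteps.filter(s => specSteps.indexOf(s) === -1);

  // Compare only the steps both sides share, in each side's own order.
  const sharedInSpec = specSteps.filter(s => codeSteps.indexOf(s) !== -1);
  const sharedInCode = codeSteps.filter(s => specSteps.indexOf(s) !== -1);
  const outOfOrder = sharedInSpec.join("\u0000") !== sharedInCode.join("\u0000");

  return { missing, unexpected, outOfOrder };
}

const diff = diffFlow(
  ["fill_form", "submit", "see_confirmation"], // Spec-Flow sequence
  ["fill_form", "log_attempt", "submit"],      // EOG path observed in the handler
);
console.log(diff);
// { missing: ["see_confirmation"], unexpected: ["log_attempt"], outOfOrder: false }
```

Every field of the result is derived from list membership and order, which is what makes the comparison repeatable.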
Spec-DFG: Intended Data Movement
Where data is supposed to flow between components. Defines intended contracts for the code DFG to satisfy.
| Edge Type | From → To | Example |
|---|---|---|
| data_intends_to_flow | source → destination | spec_entity "Order" → spec_operation "calculateTotal" |
A spec-DFG edge with no matching edge in the code DFG is a gap. A code-DFG edge with no matching spec-DFG edge is potential unintended behavior.
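Both checks reduce to a set difference over edges. The sketch below assumes the spec and code edges have already been expressed in the same node vocabulary (for example, by resolving code nodes through bridge edges); `DataEdge`, `edgeKey`, and `compareDataFlow` are hypothetical names used only for illustration.

```typescript
interface DataEdge {
  from: string;
  to: string;
}

const edgeKey = (e: DataEdge): string => `${e.from}->${e.to}`;

function compareDataFlow(specEdges: DataEdge[], codeEdges: DataEdge[]) {
  const specKeys = specEdges.map(edgeKey);
  const codeKeys = codeEdges.map(edgeKey);
  return {
    // Intended flows that the code does not realize.
    gaps: specEdges.filter(e => codeKeys.indexOf(edgeKey(e)) === -1),
    // Actual flows that no spec edge asked for.
    unspecified: codeEdges.filter(e => specKeys.indexOf(edgeKey(e)) === -1),
  };
}

const flowCheck = compareDataFlow(
  [{ from: "Order", to: "calculateTotal" }],
  [
    { from: "Order", to: "calculateTotal" },
    { from: "Order", to: "analyticsLog" },
  ],
);
console.log(flowCheck.gaps.length, flowCheck.unspecified.length); // 0 1
```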
Spec-CDG: Business Rules as Guard Conditions
Business constraints as explicit control dependence. These become required CDG ancestors in the generated code.
| Edge Type | From → To | Example |
|---|---|---|
| guards | guard → protected operation | spec_guard "requireAuth" → spec_endpoint "POST /orders" |
These are verifiable requirements: in the generated code, each guarded operation must have a CDG ancestor node that implements the stated condition. If it doesn't, it is a specification violation.
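The ancestor check can be sketched as a reachability test over the code CDG. This is an illustrative model, not Praetor's verifier: it assumes each spec guard and guarded operation has already been resolved to a code node identifier, and `GuardRequirement`, `CdgParents`, `cdgAncestors`, and `findGuardViolations` are hypothetical names.

```typescript
// One Spec-CDG "guards" edge, with both ends already resolved to code node ids.
interface GuardRequirement {
  guard: string;
  operation: string;
}

// Code CDG: for each code node, the nodes that directly control it.
type CdgParents = Record<string, string[]>;

// Collects every transitive CDG ancestor of a node (cycle-safe).
function cdgAncestors(node: string, parents: CdgParents): string[] {
  const seen: string[] = [];
  const stack = [node];
  while (stack.length > 0) {
    const current = stack.pop() as string;
    for (const parent of parents[current] || []) {
      if (seen.indexOf(parent) === -1) {
        seen.push(parent);
        stack.push(parent);
      }
    }
  }
  return seen;
}

// A requirement is violated when the guard is not among the operation's ancestors.
function findGuardViolations(reqs: GuardRequirement[], parents: CdgParents): GuardRequirement[] {
  return reqs.filter(r => cdgAncestors(r.operation, parents).indexOf(r.guard) === -1);
}

const cdg: CdgParents = {
  createOrderHandler: ["requireAuth"],
  listOrdersHandler: [],
};
const violations = findGuardViolations(
  [
    { guard: "requireAuth", operation: "createOrderHandler" },
    { guard: "requireAuth", operation: "listOrdersHandler" },
  ],
  cdg,
);
console.log(violations.map(v => v.operation)); // ["listOrdersHandler"]
```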
The Four CPG Code Graph Layers
The code-side graph comes from CPG analysis (Fraunhofer CPG library, Tree-sitter, Semgrep) of the actual codebase.
AST: Structural Containment
Static relationships between code elements — what calls what, what imports what, what extends what.
| Edge Type | From → To | Example |
|---|---|---|
| references | function → function | register() calls hashPassword() |
| imports | module → module | auth-service.ts imports crypto.ts |
| extends | class → class | AdminService extends UserService |
EOG: Execution Order Graph
Which code executes after what — control flow paths, branches, loops.
| Edge Type | From → To | Example |
|---|---|---|
| controls_flow_to | function → function | Execution path from handler to service |
EOG paths are compared against Spec-Flow sequences to detect behavioral drift.
DFG: Data Flow Graph
Where values come from and where they go — the actual data movement through the code.
| Edge Type | From → To | Example |
|---|---|---|
| data_flows_to | node → node | Password value flows from request to bcrypt |
DFG paths are compared against spec-DFG edges to verify that the code satisfies its data contracts.
CDG: Control Dependence Graph
What decisions gate what execution — which conditions must hold for a code path to run.
| Edge Type | From → To | Example |
|---|---|---|
| controls | guard → function | Auth middleware gates the route handler |
CDG ancestors are compared against spec-CDG guards to verify that every required business rule is enforced.
Bridge Edges: Full Taxonomy
Bridge edges connect spec intent to code reality. They are what makes the graph "unified." The full taxonomy of bridge edge types:
| Edge Type | From → To | Meaning | How Verified |
|---|---|---|---|
| implements | code_function → spec_operation | This function implements this operation | Structural match: params, return type, DFG paths |
| exposes | code_endpoint → spec_endpoint | This handler exposes this API endpoint | Route path + method match |
| persists | code_table → spec_entity | This table stores this entity | Column-to-field mapping + type match |
| satisfies | code_function → spec_user_story | This code satisfies this story (coarse) | E2E test passes for this story |
| tests | test_case → spec_acceptance_criterion | This test verifies this criterion | Test file → criterion node link |
| validates | test_case → code_function | This test exercises this function | Import/call analysis |
| drifts_from | code_function → spec_operation | Implementation has diverged from spec | Structural diff detected |
| realizes | impl_* → spec_* | This implementation plan realizes that spec intent | Explicit linkage from impl planner |
| generated_from | code_* → impl_* | This code was generated from this implementation plan | Written by codegen pipeline on success |
Bridge edges carry a confidence field (0.0–1.0) stored in the edge's metadata JSONB. Deterministic matches (exact route path) get 1.0. Inferred matches (function name similarity) get lower confidence. The convergence model uses these scores.
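As an illustration of a deterministic match, the sketch below pairs code endpoints with spec endpoints on exact method and path and records a confidence of 1.0 in the edge metadata. The `Endpoint` and `ExposesEdge` shapes and the `matchEndpoints` function are hypothetical; only the edge type name, the column names, and the 1.0 score for exact route matches come from this page.

```typescript
interface Endpoint {
  id: string;
  method: string;
  path: string;
}

interface ExposesEdge {
  artifactId: string;   // code_endpoint node
  dependsOnId: string;  // spec_endpoint node
  dependencyType: "exposes";
  metadata: { confidence: number };
}

// Exact route path + method match: deterministic, so confidence is 1.0.
function matchEndpoints(code: Endpoint[], spec: Endpoint[]): ExposesEdge[] {
  const edges: ExposesEdge[] = [];
  for (const c of code) {
    for (const s of spec) {
      if (c.method.toUpperCase() === s.method.toUpperCase() && c.path === s.path) {
        edges.push({
          artifactId: c.id,
          dependsOnId: s.id,
          dependencyType: "exposes",
          metadata: { confidence: 1.0 },
        });
      }
    }
  }
  return edges;
}

const exposes = matchEndpoints(
  [{ id: "code-1", method: "post", path: "/api/auth/register" }],
  [
    { id: "spec-1", method: "POST", path: "/api/auth/register" },
    { id: "spec-2", method: "POST", path: "/api/auth/login" },
  ],
);
console.log(exposes.length, exposes[0].metadata.confidence); // 1 1
```

Inferred matches would write a lower score into the same metadata field; how those scores are computed is not specified here.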
GraphQueryService
The GraphQueryService class at src/services/graph/graph-query-service.ts exposes typed methods over the two backing tables. Every query is deterministic SQL — no AI inference.
```typescript
const gqs = new GraphQueryService(projectId, tenantId);

// Coverage queries
const gaps = await gqs.findUnimplementedSpec();      // spec nodes with no bridge edges
const coverage = await gqs.getSpecCoverage();        // { covered, total, percent }

// Impact queries
const blast = await gqs.computeBlastRadius(nodeId);  // all transitively affected nodes

// Build order
const order = await gqs.computeTopologicalOrder();   // batches of impl nodes in dependency order

// Test coverage
const testCoverage = await gqs.getTestCoverage();    // acceptance criteria + guard coverage
```

Blast Radius Query (Example)
The blast radius traversal uses a recursive CTE over context_artifact_dependencies:
```sql
WITH RECURSIVE blast AS (
  SELECT artifact_id, depends_on_id, dependency_type, 1 AS depth
  FROM context_artifact_dependencies
  WHERE depends_on_id = :changed_node_id
    AND project_id = :project_id
  UNION ALL
  SELECT cad.artifact_id, cad.depends_on_id, cad.dependency_type, b.depth + 1
  FROM context_artifact_dependencies cad
  JOIN blast b ON cad.depends_on_id = b.artifact_id
  WHERE b.depth < :max_depth
    AND cad.project_id = :project_id
)
SELECT DISTINCT artifact_id, depth, dependency_type FROM blast;
```

This returns every node affected by a change, at what depth, and through what type of relationship. Computed in milliseconds. No LLM tokens consumed.
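For readers who prefer to trace the recursion by hand, the same traversal can be sketched in memory as a depth-bounded breadth-first walk. This is an illustrative equivalent of the recursive CTE, not the implementation behind `computeBlastRadius`; `DependencyEdge`, `Affected`, and `blastRadius` are hypothetical names, and project scoping is assumed to have been applied when the edges were loaded.

```typescript
interface DependencyEdge {
  artifactId: string;
  dependsOnId: string;
  dependencyType: string;
}

interface Affected {
  artifactId: string;
  depth: number;
  dependencyType: string;
}

function blastRadius(edges: DependencyEdge[], changedNodeId: string, maxDepth: number): Affected[] {
  const result: Affected[] = [];
  const seen: Record<string, boolean> = {}; // mirrors SELECT DISTINCT
  let frontier = [changedNodeId];

  // Depth 1 is the anchor query; each later pass is one recursive step.
  for (let depth = 1; depth <= maxDepth && frontier.length > 0; depth++) {
    const next: string[] = [];
    for (const edge of edges) {
      if (frontier.indexOf(edge.dependsOnId) === -1) continue;
      const key = `${edge.artifactId}|${depth}|${edge.dependencyType}`;
      if (seen[key]) continue;
      seen[key] = true;
      result.push({ artifactId: edge.artifactId, depth, dependencyType: edge.dependencyType });
      if (next.indexOf(edge.artifactId) === -1) next.push(edge.artifactId);
    }
    frontier = next;
  }
  return result;
}

const sampleEdges: DependencyEdge[] = [
  { artifactId: "impl_operation:register", dependsOnId: "spec_operation:register", dependencyType: "realizes" },
  { artifactId: "code_function:register", dependsOnId: "impl_operation:register", dependencyType: "generated_from" },
  { artifactId: "test_case:hashes_password", dependsOnId: "code_function:register", dependencyType: "validates" },
];
const affected = blastRadius(sampleEdges, "spec_operation:register", 2);
console.log(affected.map(a => `${a.depth}:${a.artifactId}`));
// ["1:impl_operation:register", "2:code_function:register"]
```

As in the SQL, the depth bound is what stops the walk on cyclic graphs, and a node reachable by paths of different lengths is reported once per depth.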
How the Graph Drives Code Generation
The graph determines what gets built and in what order — no planning agent guesswork required.
- Topological sort of impl nodes over context_artifact_dependencies produces batches: entities before services before endpoints, dependencies before dependents.
- For each spec node in topological order, the system reads its full implementation context from the graph: business rules, data flow contracts, guard conditions, upstream interface contracts.
- Generated code writes generated_from and implements edges back into the graph. Downstream generators receive the upstream interface contract as structured input: compatibility enforced without inference.
- The CEGIS verification loop checks each generated file structurally: does the code's DFG match the spec's Spec-DFG? Are all Spec-CDG guards present in the code's CDG? If not, a StructuralDiff is computed and fed back to the generator (up to 3 attempts).
- On success, the bridge edge is written, the spec coverage percentage rises, and the topological cursor advances.
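The batching in the first step can be sketched with Kahn's algorithm, grouping nodes whose prerequisites are all satisfied into the same batch. This is an assumption about the technique, since the page does not say which algorithm `computeTopologicalOrder` uses; `Dep` and `topologicalBatches` are hypothetical names, and every edge endpoint is assumed to appear in `nodes`.

```typescript
interface Dep {
  artifactId: string;  // the dependent
  dependsOnId: string; // the prerequisite
}

function topologicalBatches(nodes: string[], deps: Dep[]): string[][] {
  // Count unmet prerequisites per node.
  const remaining: Record<string, number> = {};
  for (const n of nodes) remaining[n] = 0;
  for (const d of deps) remaining[d.artifactId] += 1;

  const batches: string[][] = [];
  let done = 0;
  let ready = nodes.filter(n => remaining[n] === 0);

  while (ready.length > 0) {
    batches.push(ready);
    done += ready.length;
    const next: string[] = [];
    for (const d of deps) {
      if (ready.indexOf(d.dependsOnId) === -1) continue;
      remaining[d.artifactId] -= 1;
      if (remaining[d.artifactId] === 0) next.push(d.artifactId);
    }
    ready = next;
  }

  // Nodes left over can only be part of a cycle.
  if (done !== nodes.length) throw new Error("Dependency cycle detected");
  return batches;
}

const batches = topologicalBatches(
  ["impl_entity:Customer", "impl_service:OrderService", "impl_endpoint:POST /orders"],
  [
    { artifactId: "impl_service:OrderService", dependsOnId: "impl_entity:Customer" },
    { artifactId: "impl_endpoint:POST /orders", dependsOnId: "impl_service:OrderService" },
  ],
);
console.log(batches);
// [["impl_entity:Customer"], ["impl_service:OrderService"], ["impl_endpoint:POST /orders"]]
```

Nodes within one batch have no dependencies on each other, so a generator could process them in any order.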
Convergence
Convergence is a computed number, not a feeling:
```typescript
// Each component is a ratio in [0, 1].
const structural = specNodes.filter(n => hasBridgeEdge(n)).length / specNodes.length;
const behavioral = assessments.filter(a => a.result === 'converged').length / assessments.length;
const foundationHealth = solidFoundations / totalFoundations;

const convergence = {
  structural,
  behavioral,
  foundationHealth,
  // Weighted sum (assumes weighted() adds its already-weighted arguments).
  overall: structural * 0.5 + behavioral * 0.3 + foundationHealth * 0.2,
};
```

The recursive convergence model verifies foundations before trusting anything built on top. If a Layer 1 entity node diverges, all nodes built on it get their effective confidence halved. The system reports root divergences ("3 root issues causing 15 failures") rather than flat lists.
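The halving rule and the root-divergence report can be sketched as follows. This is one possible reading, not the convergence model's actual code: it assumes confidence is halved once when any transitive prerequisite has diverged (rather than once per diverged ancestor), and `GraphNode`, `hasDivergedAncestor`, and `assessFoundations` are hypothetical names.

```typescript
interface GraphNode {
  id: string;
  confidence: number;
  diverged: boolean;
  dependsOn: string[]; // ids of the nodes this one is built on
}

// True when any transitive prerequisite of `id` has diverged (cycle-safe).
function hasDivergedAncestor(id: string, nodes: Record<string, GraphNode>, visited: string[] = []): boolean {
  for (const parentId of nodes[id].dependsOn) {
    if (visited.indexOf(parentId) !== -1) continue;
    visited.push(parentId);
    if (nodes[parentId].diverged || hasDivergedAncestor(parentId, nodes, visited)) return true;
  }
  return false;
}

function assessFoundations(nodes: Record<string, GraphNode>) {
  const effective: Record<string, number> = {};
  const roots: string[] = [];
  for (const id of Object.keys(nodes)) {
    const tainted = hasDivergedAncestor(id, nodes);
    // Assumption: halve once if anything underneath has diverged.
    effective[id] = tainted ? nodes[id].confidence / 2 : nodes[id].confidence;
    // A root divergence is a diverged node with no diverged node beneath it.
    if (nodes[id].diverged && !tainted) roots.push(id);
  }
  return { effective, roots };
}

const assessment = assessFoundations({
  entity: { id: "entity", confidence: 1.0, diverged: true, dependsOn: [] },
  service: { id: "service", confidence: 0.8, diverged: false, dependsOn: ["entity"] },
  endpoint: { id: "endpoint", confidence: 0.6, diverged: true, dependsOn: ["service"] },
});
console.log(assessment.effective); // { entity: 1, service: 0.4, endpoint: 0.3 }
console.log(assessment.roots);     // ["entity"]
```

In the example the endpoint has also diverged, but it is not reported as a root because the entity beneath it already explains the failure.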
Brownfield: Reverse Graph Extraction
For brownfield projects, the direction is reversed. Existing code is analyzed by CPG, producing a code graph. The extraction traversal maps:
- Every code-DFG edge → candidate spec_entity/spec_operation data contract
- Every code-CDG guard → candidate spec_guard business rule
- Every code-EOG path from an HTTP handler → candidate spec_flow_step sequence
AI is used for one thing: naming and describing the extracted nodes in human language. The structural relationships are deterministic.
The client reviews the extracted spec, corrects what's wrong, and Praetor has a verified spec graph for a project that had no spec. From that point, the standard greenfield convergence pipeline applies.
Reference
| Document | Location |
|---|---|
| Architecture overview | docs/scale/ARCH-UNIFIED-PROJECT-GRAPH.md |
| Six-layer deep dive | docs/scale/architecture/ARCHITECTURE-UNIFIED-PROJECT-GRAPH-DEEP-DIVE.md |
| Bidirectional mapping spec | docs/scale/specs/SPEC-FULLMAP-001-bidirectional-mapping.md |
| Node table migration | src/db/migrations/194_context_artifacts.sql |
| Edge table migration | src/db/migrations/197_context_artifact_dependencies.sql |
| Edge type expansion | src/db/migrations/427_context_artifact_dependency_types_expand.sql |
| GraphQueryService | src/services/graph/graph-query-service.ts |
| Graph layer details | Graph Layers |