From Aegis to SpecChain: When Governance Meets Reality
I built a Constitutional AI Governance Framework. Thirteen articles. HMAC attestations. Democratic amendment processes. Every validation function returned hardcoded perfection. The governance thinking was real. The code was theater. Here's what survived the extraction.
Nino Chavez
Product Architect at commerce.com
Every validation function returned hardcoded perfection. calculateVersionConsistency() — always 1.0. detectViolations() — empty array, every time. Ten functions in a file called aegis-conductor.ts, each returning static values, none checking anything real.
The framework they belonged to had a constitution. Thirteen articles. A manifesto declaring it “an operating system approach for AI-assisted engineering.” HMAC attestation signatures. A democratic amendment process. 28,000 bytes of governance specification.
Aegis was the most thoroughly documented non-functional software I’ve ever written.
The Ambition
The premise was compelling. AI agents generate code, but nobody governs how they do it. Different agents produce different patterns. Standards drift. Quality is inconsistent. What if you could create a constitutional layer—a set of enforceable principles that every agent operates under?
Aegis was the answer. Or it was supposed to be.
The constitution covered traceability, observability, reproducibility, semantic versioning, change classification with three tiers. The specification described a Constitutional Kernel, a Governance Control Plane, and Execution Units that would enforce patterns in real time. Pre-generation validation. Post-generation compliance checks. Drift detection across agent interactions.
The manifesto had lines like: “A perfect prompt is still a prayer to a probability machine.” I still think that’s true.
Reading the docs, you’d think this was a serious infrastructure project. Multiple versions. Elaborate architecture diagrams. The attest tool had real HMAC crypto code — createHmac('sha256') — that would sign AI-generated artifacts. That part actually worked.
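For the curious, the working core of an attestation step like that is small. A sketch, assuming Node's built-in crypto module; the function names and artifact shape here are illustrative, not the attest tool's actual API:

```typescript
import { createHmac, timingSafeEqual } from "node:crypto";

// Sign an AI-generated artifact's contents so later edits are detectable.
// (Illustrative names; not the Aegis attest tool's real interface.)
function signArtifact(contents: string, secret: string): string {
  return createHmac("sha256", secret).update(contents).digest("hex");
}

function verifyArtifact(contents: string, secret: string, signature: string): boolean {
  const expected = Buffer.from(signArtifact(contents, secret), "hex");
  const actual = Buffer.from(signature, "hex");
  // timingSafeEqual throws on length mismatch, so guard first.
  return expected.length === actual.length && timingSafeEqual(expected, actual);
}
```

Signing is the easy part; the hard part Aegis never reached was deciding what to do when verification fails.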
The Reality
Here’s what aegis-conductor.ts actually did:
```typescript
private async calculateVersionConsistency(
  references: string[]
): Promise<number> {
  return 1.0;
}

private async detectViolations(
  scope: string
): Promise<ConstitutionalViolation[]> {
  return [];
}

private async validateBlueprints(
  files: string[]
): Promise<{ compliance: number; valid: number; invalid: number }> {
  return { compliance: 1.0, valid: 0, invalid: 0 };
}
```
Perfect consistency. Zero violations. Full compliance. Every time. Hardcoded.
Ten functions in the conductor alone returned static values. findFilesWithoutAnnotations() returned an empty array—no files missing annotations because nobody checked. analyzeDrift() returned { patterns: [], severity: 'low' }. No drift detected because drift detection didn’t exist.
The compliance estimator in aegis-hydrate.ts had a comment that said it all: `return 85; // Placeholder - would parse actual output`.
Would. The conditional tense of software that never arrived.
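The frustrating part is that a real version of at least one of these checks would have been small. A sketch of what findFilesWithoutAnnotations could have done, with hypothetical names (the marker string and the path-to-contents map are my assumptions, not Aegis's actual design):

```typescript
// Hypothetical marker that AI-generated files would carry.
const ANNOTATION_MARKER = "@aegis-generated";

// Given path -> contents, return the paths whose contents lack the marker,
// instead of hardcoding an empty array.
function findFilesWithoutAnnotations(files: Map<string, string>): string[] {
  return [...files.entries()]
    .filter(([, contents]) => !contents.includes(ANNOTATION_MARKER))
    .map(([path]) => path);
}
```

Ten lines of real checking, versus one line of theater. The gap was never effort; it was commitment to the harder problem of wiring checks into a real codebase.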
How This Happens
It’s easy to judge this in retrospect. But the failure mode is instructive because it isn’t unique to me.
Aegis was my first serious AI-assisted project. I was learning how to collaborate with agents on code generation. And I made a specific mistake: I let the agent’s fluency with documentation mask its inability to build functional systems.
The AI was excellent at writing constitutions. It could produce elaborate governance architectures with plausible-sounding components. Constitutional Telemetry. Constitutional Immune System. Constitutional Self-Healing. Each concept sounded right. Each had a specification. None had working code behind it.
The documentation grew because documentation was easy to produce. The code stalled because the actual problems—parsing real codebases, detecting real drift, enforcing real constraints—were hard. And the more elaborate the documentation became, the more it felt like progress.
What Was Worth Saving
But here’s the thing. The governance thinking in Aegis was genuinely useful.
Somewhere underneath thirteen constitutional articles and the elaborate enforcement machinery, there were six ideas that I kept reaching for in other projects:
- Scope minimization. Prefer the smallest viable change. Aegis had three tiers for this—MVP-Fix, Surgical Refactor, Systemic Change. I’d been applying that ranking intuitively for months before realizing I’d internalized it from the constitution.
- Behavioral contracts. Assert what happens, not how it happens. “User is redirected to /dashboard after login,” not “code calls router.push.” This principle survived every tool transition I went through.
- Traceability. AI-generated code should be identifiable. Decisions should have documented rationale. This doesn’t require HMAC attestations. It requires commit messages and inline comments.
- Boundary validation. Validate at system edges—user input, external APIs. Trust internal code. Aegis buried this principle under elaborate enforcement layers, but the principle itself is just good engineering.
- Graceful degradation. Fail safely. No stack traces in user-facing errors. Fallback strategies for non-critical failures.
- Observability. Log at operation boundaries. Structured logging. Correlation IDs for cross-service operations.
None of these are novel. Engineers have known them for decades. But having them codified—as principles, not as a constitutional legal framework—turns out to be useful when you’re working with AI agents that need explicit guidance about what matters.
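The behavioral-contract principle in particular translates directly into how tests get written. A sketch with a hypothetical login flow (all names here are illustrative, not from any of these projects):

```typescript
// Hypothetical app surface: all we observe is where the user ends up.
interface Session {
  currentPath: string;
}

function loginAs(user: string): Session {
  // Stand-in implementation. The contract doesn't care whether this calls
  // router.push, sets window.location, or redirects server-side.
  return { currentPath: "/dashboard" };
}

// The behavioral contract: assert the outcome, not the mechanism.
function assertRedirectedToDashboard(session: Session): void {
  if (session.currentPath !== "/dashboard") {
    throw new Error(`expected /dashboard, got ${session.currentPath}`);
  }
}
```

A test written this way survives a router swap, a framework migration, or an agent rewriting the implementation from scratch, which is exactly why the principle outlived every tool it was paired with.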
The Path Between
Aegis wasn’t the only experiment. Between building it and arriving at SpecChain, I worked through several spec-driven tools.
The agent-os concept tried to create a full operating system for AI agents—execution modes, governance enforcement, the works. Similar ambition to Aegis, similar gap between specification and reality.
The docs toolkit succeeded where Aegis failed, partly because it was scoped tightly. Nine skills, each doing one thing. No constitution. No manifesto. Just tools that work.
SpecChain proved that spec-driven development works when the specs are contracts, not aspirations. Write what you want. Assign specialists. Verify against the spec. Persist decisions. Simple machinery. Functional machinery.
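A spec-as-contract can be as small as a list of verifiable outcomes. A minimal sketch of the idea; the types and names are mine, not SpecChain's actual schema:

```typescript
interface Spec {
  feature: string;
  outcomes: string[]; // observable behaviors, written before implementation
}

// Verification is just a diff between what the spec demands and what was
// actually observed when exercising the built feature.
function verifySpec(spec: Spec, observed: Set<string>) {
  const failed = spec.outcomes.filter((o) => !observed.has(o));
  return { passed: spec.outcomes.length - failed.length, failed };
}
```

Nothing about this requires a constitution. The spec is falsifiable, so it governs by being checkable rather than by being declared.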
Each iteration taught the same lesson: the value isn’t in the framework’s ambition. It’s in what you can actually run.
The Extraction
So last week I did something I probably should have done months ago. I went back to Aegis, not to resurrect it, but to extract from it.
The goal was specific: add a governance layer to SpecChain. Not a constitutional operating system. Not enforcement machinery. Just the principles that survived every tool transition, packaged as starter templates that get generated when you set up a new project.
Three files:
| File | What It Does | Lines |
|---|---|---|
| governance/principles.md | The six principles, distilled | 74 |
| governance/claude-md.tmpl | CLAUDE.md starter template | 109 |
| governance/cursorrules.tmpl | .cursorrules starter template | 47 |
Plus a modification to setup.sh that asks during installation: “Generate CLAUDE.md and .cursorrules?” If yes, prompts for project name, description, language, framework. Substitutes values into templates. Writes the files. Done.
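The substitution step is the whole mechanism. A sketch in TypeScript rather than shell, assuming a `{{PLACEHOLDER}}` syntax (the real templates and setup.sh may use something different):

```typescript
// Replace {{KEY}} placeholders with the answers collected during setup.
// Unknown placeholders are left intact rather than silently blanked.
function renderTemplate(tmpl: string, vars: Record<string, string>): string {
  return tmpl.replace(/\{\{(\w+)\}\}/g, (match, key) => vars[key] ?? match);
}
```

Usage looks like `renderTemplate(claudeMdTemplate, { PROJECT_NAME: "my-app", LANGUAGE: "TypeScript" })`, then writing the result to CLAUDE.md. That is the entire enforcement pipeline.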
The entire governance layer is 230 lines. Aegis’s constitution alone was 28,000 bytes.
What the Templates Actually Do
The CLAUDE.md template encodes the consistent pattern I found across sixteen existing CLAUDE.md files in my workspace. Every project had independently converged on the same sections: tech stack table, key directories, exclusions list, commands reference. The template just makes that structure explicit from day one.
The .cursorrules template takes the six governance principles and renders them as rules an AI assistant can follow. Plus file layout, commands, and a deny list—patterns that should never appear in generated code.
Neither template is a framework. They’re starters. You run setup, you get files with your project name filled in and sensible defaults, and then you edit them. The governance principles are reference material, not enforcement machinery.
This is the key difference from Aegis. Aegis tried to enforce governance through validation functions that would catch violations in real time. SpecChain’s governance layer just puts the principles where agents will read them—in CLAUDE.md and .cursorrules, the files that AI assistants load at the start of every session.
No enforcement code. No compliance scores. No drift detection. Just well-placed documentation that shapes behavior through context, not control.
The Uncomfortable Lesson
Aegis failed as software. But it succeeded as a thinking exercise.
The six principles I extracted have been more useful as a Markdown file in a governance directory than they ever were as a constitutional framework with hardcoded compliance functions. Because the principles were never the problem. The delivery mechanism was.
I spent months building enforcement machinery for ideas that only needed to be written down and put in the right place. CLAUDE.md is the enforcement mechanism. Not because it validates anything, but because it’s the first thing every AI agent reads. Put your principles there and they shape every interaction. No conductor required.
The best governance framework for AI agents turned out to be a well-written Markdown file in the right directory.
What This Changes
SpecChain now handles two things: how features get built (specs, tasks, verification) and how projects get governed (principles, templates, standards). The governance layer is lightweight by design. It’s not trying to be Aegis.
What I’m watching:
- Template evolution. As I use these templates across more projects, the default sections will sharpen. Domain terminology tables might get pre-populated based on tech stack detection. The deny list might grow from community feedback.
- Principle pressure-testing. Six principles is a hypothesis. After running real projects under them for a few months, some might prove too abstract. Others might split into more specific guidance.
- The gap between context and control. Putting principles in CLAUDE.md works because current AI agents are responsive to context. If agent architectures change—more autonomy, less context sensitivity—the approach might need to evolve.
I’m not declaring the governance problem solved. I’m declaring that the Aegis approach—elaborate enforcement machinery—was the wrong direction. And that the SpecChain approach—principles as context, not control—is working well enough to ship.
The Arc
It’s worth naming what happened across these projects:
- Aegis — Ambitious governance framework. Thirteen constitutional articles. Zero functional enforcement. The thinking was real. The code was theater.
- Agent-OS — Tried to operationalize governance. Learned that execution modes and enforcement layers add complexity faster than they add value.
- Docs Toolkit — Proved that scoped, single-purpose tools beat comprehensive frameworks. Nine skills. Each one works.
- SpecChain — Proved that spec-driven development works when specs are contracts. Now carries the governance principles forward as templates, not enforcement.
The pattern across all four: ambition contracts, pragmatism expands. Each iteration got smaller in scope and larger in actual utility.
I’d rather have 230 lines of governance templates that get used on every project than 28,000 bytes of constitution backed by validation functions that return hardcoded perfection.
This is the seventh entry in the Agentic Workflows in Practice series. Not demos. Not theory. Real work, documented as it happens.