The problem

AI can write code fast — but on a real team that speed is dangerous without guardrails. Left to its own devices, an assistant skips reviews, merges half-finished work, writes code that ignores the existing architecture, and produces wildly different quality depending on who’s prompting. The hard problem isn’t getting AI to write code; it’s getting a team of AI agents to follow a trustworthy process every single time.

What it does

Odin gives a software team one shared, version-controlled way of working with AI across planning, building, and review. Instead of a single assistant improvising, it runs a coordinated team of specialised agents through a defined workflow, and it mechanically blocks the steps that shouldn’t happen: no merging before review, no skipping QA, no agent writing code outside its role. It installs into each project from one source of truth, so every team works the same way.

The hard part

The interesting engineering is about making an AI team trustworthy, not just capable:

  • The workflow is a state machine that can’t be talked out of its rules. A change moves through Setup → Implementation → Review → QA → Merge, and the gates are enforced by hooks running outside the AI’s control: they block an invalid action (like merging early) before it can happen, and advance only when they detect real evidence that the step is done (tests actually passing, a pull request actually opened).
  • Verify, don’t trust. The review gate doesn’t take the AI’s word that reviews happened — it queries GitHub directly and confirms each specialist genuinely posted their review, so a reviewer that silently failed can’t wave a release through. This is the piece I’m most proud of.
  • Adversarial review from five angles, in parallel. Every change is reviewed at once by separate specialists — architecture, security, infrastructure, QA, and UX — each required to leave concrete findings rather than a rubber-stamp approval.
  • Enforced role boundaries. Only the developer agents may write application code; the coordinators delegate but never touch it themselves. That stops capability creep and keeps, say, frontend patterns from bleeding into backend work.
  • Discipline encoded as artifacts, not vibes. How to write code (a strict signature-first, smallest-case-first test loop), what to read before touching a codebase, and how design decisions get recorded all live as canonical files the agents must follow — so quality doesn’t ride on a model’s mood that day.
  • Built for real scale. A large epic runs its stories in dependency order on an isolated branch, re-syncing with the main line between stories and halting loudly on a conflict instead of quietly diverging.
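
To make the gate idea concrete, here is a minimal sketch of a stage machine that refuses to skip stages and advances only on recorded evidence. It is an illustration, not Odin's actual code: the names `Stage`, `Evidence`, `EXIT_REQUIREMENTS`, `GateError`, and `advance` are hypothetical, and the mapping of evidence to stages is an assumption. Only the stage order and the two evidence examples (tests passing, a pull request opened) come from the description above.

```python
from dataclasses import dataclass
from enum import Enum


class Stage(Enum):
    SETUP = 1
    IMPLEMENTATION = 2
    REVIEW = 3
    QA = 4
    MERGE = 5


@dataclass
class Evidence:
    """Facts collected outside the model, e.g. by a hook script or CI."""
    tests_passed: bool = False
    pull_request_opened: bool = False
    reviews_verified: bool = False
    qa_signed_off: bool = False


# Assumed mapping: what must be true before a change may leave each stage.
EXIT_REQUIREMENTS = {
    Stage.SETUP: [],
    Stage.IMPLEMENTATION: ["tests_passed", "pull_request_opened"],
    Stage.REVIEW: ["reviews_verified"],
    Stage.QA: ["qa_signed_off"],
}


class GateError(Exception):
    """Raised when a requested transition is not allowed."""


def advance(current: Stage, requested: Stage, evidence: Evidence) -> Stage:
    # Only the immediate next stage is reachable; no jumping ahead to Merge.
    if requested.value != current.value + 1:
        raise GateError(f"cannot move from {current.name} to {requested.name}")
    # Every exit requirement for the current stage must be backed by evidence.
    missing = [name for name in EXIT_REQUIREMENTS[current]
               if not getattr(evidence, name)]
    if missing:
        raise GateError(f"{current.name} is not done, missing: {', '.join(missing)}")
    return requested
```

In a hook, a `GateError` would translate into a non-zero exit that blocks the action. The property that matters is that `Evidence` is filled in by code that inspects the repository and CI, never by the agent's own report.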

What makes it distinctive

  • It treats an AI workflow like production infrastructure: gates enforced by code the model can’t override, and review claims verified against the GitHub API rather than simply believed.
  • It encodes engineering discipline as shared artifacts (the gates, the test loop, the doc-grounding rules) so the process itself is the product — portable to every project that installs it.
  • One source of truth, distributed as a submodule + symlink, so improving the workflow once improves it everywhere.
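
The "verified against the GitHub API" check could look roughly like the sketch below. It assumes the reviews come from GitHub's REST endpoint `GET /repos/{owner}/{repo}/pulls/{pull_number}/reviews`, whose entries carry `state` and `body` fields. How Odin's specialists actually identify themselves is not described here, so the convention used (a first line such as `[security]` followed by findings) is purely hypothetical, as are the names `REQUIRED_ROLES`, `fetch_reviews`, and `missing_reviews`.

```python
import json
import urllib.request

# The five specialist angles named above; the tag format is an assumption.
REQUIRED_ROLES = ("architecture", "security", "infrastructure", "qa", "ux")


def fetch_reviews(owner: str, repo: str, pull_number: int, token: str) -> list:
    """Fetch the first page of reviews for a pull request (no pagination)."""
    url = f"https://api.github.com/repos/{owner}/{repo}/pulls/{pull_number}/reviews"
    request = urllib.request.Request(url, headers={
        "Authorization": f"Bearer {token}",
        "Accept": "application/vnd.github+json",
    })
    with urllib.request.urlopen(request) as response:
        return json.load(response)


def missing_reviews(reviews: list, required=REQUIRED_ROLES) -> list:
    """Return the roles with no submitted review that contains findings."""
    tags = {f"[{role}]": role for role in required}
    seen = set()
    for review in reviews:
        # Unsubmitted or dismissed reviews do not count as evidence.
        if review.get("state") in ("PENDING", "DISMISSED"):
            continue
        body = (review.get("body") or "").strip()
        tag, _, findings = body.partition("\n")
        role = tags.get(tag.strip().lower())
        # A bare tag with nothing after it is treated as a rubber stamp.
        if role and findings.strip():
            seen.add(role)
    return [role for role in required if role not in seen]
```

A gate built on this would advance only when `missing_reviews(fetch_reviews(...))` returns an empty list, so a specialist that crashed or posted nothing shows up by name instead of being silently assumed to have approved.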

Outcome

A team whose AI-assisted work is consistent, reviewed, and safe by default: no skipped reviews, no surprise merges, no architectural drift. New projects and new people inherit the whole proven process on day one. In active use across multiple teams. Open source.
