Three components and a recursive enforcement loop ensure that no skill can report completion without proving that its work is actually done.
Three components
A ## Definition of Done section lives inside every skill's file. Each criterion must be specific, verifiable, complete, and atomic. The quality contract is part of the skill itself — not a separate spec that can drift out of sync.
A DoD that exists but is vague is worse than no DoD — it creates false confidence. The auditor checks both presence and quality. A DoD is suspect if it uses vague language ("complete", "adequate"), lacks a clear pass/fail condition, has fewer than 3 criteria, or all criteria check the same narrow thing.
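The "suspect" test can be partly mechanized. The following is a minimal sketch, not the auditor's actual implementation: it assumes criteria are written as Markdown bullets under the `## Definition of Done` heading, and the names `VAGUE_WORDS`, `extract_criteria`, and `suspect_reasons` are hypothetical. It covers only the vague-language and minimum-count checks; whether a criterion has a clear pass/fail condition, or whether all criteria check the same narrow thing, needs judgment that this sketch does not attempt.

```python
import re

# Hypothetical word list: the text names only "complete" and "adequate".
VAGUE_WORDS = {"complete", "adequate"}
MIN_CRITERIA = 3  # from the text: fewer than 3 criteria is suspect


def extract_criteria(skill_text: str) -> list[str]:
    """Return the bullet items under a '## Definition of Done' heading.

    Assumes criteria are Markdown bullets ('- ' or '* '); that layout is
    an assumption of this sketch, not something the source specifies.
    """
    criteria = []
    in_dod = False
    for line in skill_text.splitlines():
        stripped = line.strip()
        if stripped.startswith("## "):
            # Entering or leaving the DoD section.
            in_dod = stripped[3:].strip().lower() == "definition of done"
            continue
        if in_dod and stripped.startswith(("- ", "* ")):
            criteria.append(stripped[2:].strip())
    return criteria


def suspect_reasons(criteria: list[str]) -> list[str]:
    """Return the reasons a DoD looks suspect; an empty list means none found."""
    reasons = []
    if len(criteria) < MIN_CRITERIA:
        reasons.append(f"fewer than {MIN_CRITERIA} criteria")
    for criterion in criteria:
        words = set(re.findall(r"[a-z]+", criterion.lower()))
        vague = sorted(words & VAGUE_WORDS)
        if vague:
            reasons.append(f"vague language in {criterion!r}: {', '.join(vague)}")
    return reasons
```

A missing DoD shows up here as an empty criteria list, which fails the minimum-count check, so presence and quality are handled by the same call.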
audit-workflow-deliverables is the universal quality gate. Every skill calls it as its last step. It reads the calling skill's DoD, checks each criterion against actual state, and returns PASS or FAIL with a gap list to the caller. It never prompts the user unless it cannot resolve the problem itself.
If the DoD is missing or suspect, the auditor calls review-definition-of-done silently before proceeding. The user never sees either case unless the reviewer itself fails.
flowchart TD
A([audit-workflow-deliverables called]) --> B[Locate calling skill SKILL.md]
B --> C{DoD exists and not suspect?}
C -->|No| D[Call review-definition-of-done]
D --> E{DoD fixed successfully?}
E -->|Yes| C
E -->|No| F([Escalate to user])
C -->|Yes| G[Read DoD criteria]
G --> H[Check each criterion against actual state]
H --> I{All criteria met?}
I -->|Yes| J([Report PASS to caller])
I -->|No| K([Report FAIL and gap list to caller])
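The auditor flow above can be sketched in a few lines. This is an illustration under stated assumptions rather than the real skill: `Criterion`, `AuditResult`, `EscalateToUser`, and the three injected callables (`load_dod`, `is_suspect`, `review_dod`) are hypothetical names, and the sketch allows a single silent repair attempt before escalating, where the flowchart simply loops back to the check.

```python
from dataclasses import dataclass, field
from typing import Callable, Optional


class EscalateToUser(Exception):
    """Raised when the DoD cannot be repaired without human input."""


@dataclass
class Criterion:
    description: str
    is_met: Callable[[], bool]  # inspects actual state, e.g. the filesystem


@dataclass
class AuditResult:
    status: str  # "PASS" or "FAIL"
    gaps: list[str] = field(default_factory=list)


def audit_deliverables(
    load_dod: Callable[[], Optional[list[Criterion]]],
    is_suspect: Callable[[list[Criterion]], bool],
    review_dod: Callable[[], bool],
) -> AuditResult:
    """Check the calling skill's DoD and report PASS or FAIL with a gap list."""
    dod = load_dod()
    if dod is None or is_suspect(dod):
        # Silent repair: the user is involved only if the repair fails.
        if not review_dod():
            raise EscalateToUser("DoD is missing or suspect and could not be fixed")
        dod = load_dod()
        if dod is None or is_suspect(dod):
            raise EscalateToUser("DoD is still missing or suspect after review")
    # Every unmet criterion becomes one entry in the gap list.
    gaps = [c.description for c in dod if not c.is_met()]
    return AuditResult("PASS" if not gaps else "FAIL", gaps)
```

Passing the loader, the suspect check, and the reviewer in as callables keeps the sketch independent of how skill files are actually stored and parsed.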
review-definition-of-done is the meta-skill. The auditor calls it silently when a DoD is missing or suspect, and it creates or rewrites the DoD to meet the four quality tests: specific, verifiable, complete, and atomic. It has its own DoD and is held to the same standard it enforces.
This is the self-similar part: the quality system applies to itself. The meta-skill is not exempt — it must satisfy the very standard it exists to enforce.
flowchart TD
A([review-definition-of-done called]) --> B[Read target skill DoD section]
B --> C{DoD exists?}
C -->|No| D[Draft DoD from skill body content]
C -->|Yes| E{DoD meets quality criteria?}
D --> F[Write DoD section to skill file]
E -->|No| G[Rewrite or improve DoD section]
E -->|Yes| H([DoD is good])
F --> H
G --> H
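Both branches of this flow end by writing a DoD section into the skill file. The write step might look like the sketch below, which assumes the skill file is Markdown, that the DoD is a level-2 section, and that criteria are stored as bullets. The function name `upsert_dod_section` is hypothetical, and how the criteria themselves are drafted or improved is outside the sketch.

```python
DOD_HEADING = "## Definition of Done"


def upsert_dod_section(skill_text: str, criteria: list[str]) -> str:
    """Replace the existing DoD section, or append one if none exists.

    Assumes the section runs from its heading to the next level-1 or
    level-2 heading; that layout is an assumption of this sketch.
    """
    section = [DOD_HEADING, ""] + [f"- {c}" for c in criteria]
    out = []
    skipping = False   # True while passing over the old DoD body
    replaced = False
    for line in skill_text.splitlines():
        if line.strip() == DOD_HEADING:
            out.extend(section)
            out.append("")
            skipping = True
            replaced = True
            continue
        if skipping and line.startswith(("# ", "## ")):
            skipping = False  # the next section begins, so keep it
        if not skipping:
            out.append(line)
    if not replaced:
        if out and out[-1].strip():
            out.append("")  # blank line before the new section
        out.extend(section)
    return "\n".join(out).rstrip("\n") + "\n"
```

Rewriting the section in place, rather than keeping the DoD in a separate file, matches the point made earlier that the quality contract lives inside the skill itself.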
The enforcement loop
flowchart TD
A([Any Skill Invoked]) --> B[Execute skill steps]
B --> C[Call audit-workflow-deliverables]
C --> D{DoD exists and not suspect?}
D -->|No| E[Call review-definition-of-done]
E --> F{DoD fixed successfully?}
F -->|Yes| C
F -->|No| J
D -->|Yes| G{All DoD criteria met?}
G -->|Yes| H([Skill Complete])
G -->|No| I{Attempt <= 3?}
I -->|Yes| B
I -->|No| J([Escalate to user])
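The retry behaviour in this loop can be sketched as a small wrapper. It is an illustration only: `run_skill`, `EscalateToUser`, and the two callables are hypothetical names, and the sketch reads "Attempt <= 3" as one initial run plus up to three retries, which is one plausible interpretation of the diagram.

```python
from typing import Callable

MAX_RETRIES = 3  # from the text: retry up to 3 times on audit failure


class EscalateToUser(Exception):
    """Raised when the audit still fails after all retries."""


def run_skill(execute: Callable[[], None], audit: Callable[[], list[str]]) -> int:
    """Execute the skill, audit it, and retry on failure.

    `audit` returns the gap list; an empty list means PASS.
    Returns the number of executions that were needed.
    """
    gaps: list[str] = []
    # One initial run plus up to MAX_RETRIES retries (an assumed reading).
    for attempt in range(1, MAX_RETRIES + 2):
        execute()
        gaps = audit()
        if not gaps:
            return attempt
    raise EscalateToUser(
        f"audit still failing after {MAX_RETRIES} retries: {gaps}"
    )
```

In this sketch the audit call sits inside the wrapper for brevity; in the design described here, each skill makes that call explicitly as the last step of its own procedure.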
Key properties
Sub-skills audit before returning. Orchestrators audit after all sub-skills complete. The quality contract is present at every level of the call stack.
Every skill retries up to 3 times on audit failure before escalating. The threshold is the same across the entire ecosystem, so there is only one number to know.
The auditor does not audit itself. This terminates the recursion. Its quality is maintained by being narrow and simple enough to verify by inspection.
The audit call lives in the skill body, not in framework plumbing. Anyone reading the skill sees the quality commitment explicitly in the procedure.
A vague DoD is worse than no DoD — it creates false confidence. The auditor checks both presence and quality before trusting the criteria.
The meta-skill has its own DoD. The pattern applies to itself. Every component in the ecosystem meets the standard it enforces.
What it prevents
Skills that declare success without verifying their outputs.
Missing files, unconfigured remotes, and empty folders that look fine on the surface.
Definition of Done sections that become vague or untestable over time. Caught by the "not suspect" check before the audit runs.
Orchestrators that forget to call the auditor. Prevented by baking the audit call into every skill and enforcing it at write time.