By the end of this chapter you will be able to close the repository's quality loop with four more patterns (Review, Testing, CI-Doctor, and Refactoring) while keeping humans firmly on the merge decision. This is the second half of the Continuous-X library and the close of Part II.
Everything targets gh aw v0.81.6. The Repo Assistant graduates from tending the issue tracker to helping tend the code.
A repository's quality has a loop: code is proposed (a PR), reviewed, tested, merged, and — when something slips — fixed. Traditional CI automates the deterministic checks in that loop: does it compile, do the tests pass, does the linter approve. But the judgement steps — is this a good change? is this test worth adding? why did CI actually break? — still wait on a human.
Continuous-X patterns fill exactly those judgement gaps. Where Chapter 9 kept the inbox honest, these keep the codebase honest — each one a mini-product owning one link in the quality loop.
The one rule that makes it safe: humans keep the merge
The defining constraint of quality automation is that the agent proposes; a human disposes. A review agent comments, it doesn't approve. A test-improver opens a draft PR, it doesn't push to main. This isn't timidity — it's what lets you run these patterns at all. The agent accelerates the work up to the decision point and stops, leaving the irreversible call to a person. That's the human-in-the-loop principle, and it's why the safe-outputs boundary from Chapter 6 matters most here.
Four patterns, each triggered by a different moment in the quality loop — and each writing through a safe output that stops short of merging.
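| Pattern     | Trigger                  | Writes via                                 |
|-------------|--------------------------|--------------------------------------------|
| Review      | pull_request             | submit-pull-request-review (COMMENT only)  |
| Testing     | schedule                 | create-pull-request (draft)                |
| CI-Doctor   | workflow_run (CI failed) | add-comment / create-issue                 |
| Refactoring | schedule or command      | create-pull-request (draft)                |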
Review: comment, never approve
The Review pattern reads a PR diff and leaves inline feedback. The critical setting is allowed-events: [COMMENT], which “prevents the agent from submitting APPROVE reviews regardless of what the agent attempts to output” — the docs explicitly recommend it as “the default for automated review workflows… without creating a persistent merge-blocking state” (Safe Outputs). Infrastructure enforces the human-keeps-the-merge rule.
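A minimal frontmatter sketch of that setting, assuming the usual gh aw layout of on, permissions, and safe-outputs keys. Only submit-pull-request-review and allowed-events: [COMMENT] come from the text above; the trigger types and permission scopes are illustrative assumptions and should be checked against the v0.81.6 reference.

```yaml
# Sketch only: key names other than allowed-events are assumptions.
on:
  pull_request:
    types: [opened, synchronize]   # assumed: review new PRs and new pushes to them
permissions:
  contents: read                   # assumed: the agent job itself stays read-only
  pull-requests: read
safe-outputs:
  submit-pull-request-review:
    allowed-events: [COMMENT]      # only COMMENT reviews can be submitted
```

The design point is that the restriction lives in the safe-output configuration rather than in the prompt, so a misbehaving model cannot talk its way past it.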
CI-Doctor: react to the failure
CI-Doctor is the elegant use of the workflow_run trigger from Chapter 4 with conclusion filtering: fire only when a named CI workflow finishes with failure, read the logs, and post a diagnosis. Because workflow_run is hardened against cross-repo abuse, this stays safe even on public repos.
The CI-Doctor trigger — wake only on a real CI failure
on:
  workflow_run:
    workflows: ["CI"]
    types: [completed]
    conclusion: [failure]   # only when CI actually broke
    branches: [main]
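The trigger is only half of the workflow: the diagnosis still has to leave the agent through a safe output. A hedged sketch of the write side follows, using the two outputs named in the pattern table (add-comment and create-issue). The permission scopes, title prefix, label, and max value are hypothetical choices for illustration, not settings confirmed by the chapter.

```yaml
# Sketch only: pairs with the workflow_run trigger above.
permissions:
  contents: read
  actions: read                    # assumed: lets the agent read the failed run's logs
safe-outputs:
  create-issue:
    title-prefix: "[ci-doctor] "   # hypothetical prefix
    labels: [ci-failure]           # hypothetical label
  add-comment:
    max: 1                         # assumed cap: one diagnosis comment per run
```

Neither output can change code or merge anything, so the fix itself stays with a human.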
Testing & Refactoring: propose a diff
Both typically run on a schedule (Refactoring can also run on a command), do focused work, and open a draft PR through create-pull-request. Testing adds coverage without touching production code; Refactoring makes a small, behavior-preserving cleanup. Draft PRs keep the human on the merge, exactly as in the Docs pattern.
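A test-improver along these lines might be configured as follows. This is a sketch under stated assumptions: the cron cadence, the npm test command, and the exact tool keys (edit, bash) are illustrative and not taken from the chapter; only create-pull-request with draft: true is named in the text.

```yaml
# Sketch only: verify tool and key names against the gh aw v0.81.6 reference.
on:
  schedule:
    - cron: "0 6 * * 1"      # assumed cadence: Mondays at 06:00 UTC
permissions:
  contents: read
tools:
  edit:                      # assumed tool for writing test files
  bash: ["npm test"]         # assumed: shell scoped to the suite's own command
safe-outputs:
  create-pull-request:
    draft: true              # the PR opens as a draft for a human to review
```

The "tests only" rule is not expressed in this frontmatter; it belongs in the workflow's natural-language instructions, and the reviewer of the draft PR is the backstop if the agent ignores it.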
Quality patterns pay off when they act as a tireless first pass — catching the obvious before a human spends attention, never replacing the human's final say. The line to hold: automate the noticing and the drafting; reserve the deciding.
| Agent may…                           | Human keeps…                      |
|--------------------------------------|-----------------------------------|
| comment on a PR, flag risks          | approve / request changes / merge |
| open a draft test or refactor PR     | review and merge that PR          |
| diagnose a CI failure, file an issue | decide the fix and ship it        |
When not to
Don't let a review agent block merges. Auto REQUEST_CHANGES creates a persistent merge-blocking state from a fallible model. Keep allowed-events: [COMMENT] unless a human explicitly wants gating.
Don't let the test-improver edit production code. Instruct it to add tests only; a PR that “fixes” code to make a test pass is the opposite of what you want.
Don't auto-merge agent PRs. The draft PR is the human's decision point — automating the merge throws away the one safeguard that makes this safe.
Don't run a refactoring agent on a repo without good tests. “Behavior-preserving” is only verifiable if the tests can prove it. Ship Testing before Refactoring.
Two complementary quality agents: one reacts to every PR, the other proactively strengthens the tests. Both compile cleanly, and both stop short of the merge.
The review agent holds read-only PR access and can only emit a COMMENT review — the allowed-events setting makes “never block a merge” an infrastructural guarantee, not a hope. The test-improver gets a scoped shell to run the suite and edit to write tests, but its sole output is a draft PR a human reviews. Both accelerate the work right up to the human's decision, then hand it over.
Quality automation fills the judgement gaps CI can't: is this change good, is this test worth adding, why did CI break.
Four patterns — Review (PR, comment-only), Testing (scheduled draft PR), CI-Doctor (workflow_run on failure), Refactoring (scheduled draft PR).
The unbreakable rule is humans keep the merge — enforced by allowed-events: [COMMENT] and draft: true, not just by convention.
Automate the noticing and drafting; reserve the deciding. Ship Testing before Refactoring, and never auto-merge an agent's PR.
What's next. You now have a shelf of patterns — and you're about to notice how much they repeat. Part III scales from one repo to an org. Chapter 11: Reuse & Memory factors the shared parts into imported components and gives the Repo Assistant memory that persists across runs.