By the end of this chapter you will be able to cap, meter, and gate agentic spend, using per-workflow max-ai-credits budgets and org-wide defaults, and set the policy that keeps a fleet affordable and compliant as it grows.
Everything targets gh aw v0.81.6. We put a hard budget on the Repo Assistant and show how an org enforces the same limits across every repo at once.
CI/CD costs are largely fixed: a build takes roughly the same compute every time. Agentic work is different: each run spends a variable amount of model inference depending on how much the agent reads, reasons, and retries. That variability is what makes agents powerful, and it is also why agentic work needs a budget in a way CI never did. Left unbounded, a looping agent or an over-eager schedule can quietly run up a real bill.
In a single repo, this is a cost knob. Across an org, it becomes a policy surface: which workflows may run, which capabilities they may use, what they may spend, and which model they default to. FinOps, the discipline of managing variable cloud spend, now applies to your agents, and governance means answering these questions once, centrally, rather than repo by repo.
Cost is denominated in AI Credits (AIC), a model-normalized unit so budgets mean the same thing regardless of engine. You control it at two levels.
Per-workflow budgets
max-ai-credits “sets the AWF AI Credits budget used for cost enforcement. It is enabled by default and defaults to 1000 (1k) when omitted” — with steering messages at 80%, 90%, 95%, and 99% of budget (Frontmatter). Its sibling max-daily-ai-credits caps a rolling 24-hour total across recent runs of the same workflow; when exceeded it “warns, creates an issue, skips the agent job” (Frontmatter) — a fail-safe, not a silent overspend.
Three cost dials, plus an expiry, in the frontmatter
max-ai-credits: 200 # per-run budget (default 1000); K/M suffixes ok
max-daily-ai-credits: 2000 # rolling 24h cap across this workflow's runs
timeout-minutes: 10 # wall-clock ceiling (default 20)
on:
  stop-after: "+30d" # stop triggering after a deadline (Ch. 4)
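To make the steering thresholds concrete, here is a minimal Python sketch of how a per-run budget and the 80/90/95/99% marks relate. It is an illustrative model, not gh aw's implementation: the helper names `parse_credits` and `thresholds_crossed` are hypothetical, and treating the M suffix as 1,000,000 is an assumption (the text only confirms that 1k means 1000).

```python
# Illustrative model only; gh aw's real enforcement may differ.
STEERING_PERCENTS = (80, 90, 95, 99)  # steering marks quoted in the text


def parse_credits(value):
    """Parse 200, "200", "1k", or "2M" into an integer credit count.

    Assumption: K = 1,000 (stated in the text) and M = 1,000,000 (assumed),
    both case-insensitive.
    """
    text = str(value).strip().lower()
    multiplier = 1
    if text.endswith("k"):
        multiplier, text = 1_000, text[:-1]
    elif text.endswith("m"):
        multiplier, text = 1_000_000, text[:-1]
    return int(float(text) * multiplier)


def thresholds_crossed(spent, budget):
    """Return the steering percentages already reached at this spend level.

    Integer arithmetic avoids floating-point surprises at the boundaries.
    """
    return [p for p in STEERING_PERCENTS if spent * 100 >= p * budget]


if __name__ == "__main__":
    budget = parse_credits(200)
    for spent in (100, 185, 198):
        print(spent, thresholds_crossed(spent, budget))
```

With a 200-credit budget, the first steering message would arrive at 160 credits, which leaves the agent 40 credits to wrap up before the cap.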
Token efficiency is the other half: a tighter prompt, a narrower toolset, and read-only scopes all reduce credits per run. The cheapest run is the one that reads only what it needs — good security and good FinOps are the same discipline.
Org-wide defaults and policy
Editing every workflow doesn't scale. gh aw env manages GH_AW_DEFAULT_* variables at repository, organization, or enterprise scope from a YAML file (Governance):
defaults.yml — org-wide guardrails, applied without touching workflows
Values percolate with a clear precedence: “workflow frontmatter value… repository variable… organization variable… enterprise variable… built-in compiler fallback” (Governance). Beyond numbers, policy variables (GH_AW_POLICY_*) “enforce capability gates… without recompiling any workflow” — for instance GH_AW_POLICY_ALLOW_CREATE_PULL_REQUEST=false makes the safe-outputs server refuse to start for any workflow that tries to open PRs, org-wide.
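The precedence order quoted above can be read as a first-match lookup across layers. The following Python sketch models that reading under stated assumptions: the function `resolve` and its dictionary inputs are hypothetical, each layer is keyed by the plain setting name rather than a real GH_AW_DEFAULT_* variable name, and only the built-in fallback of 1000 for max-ai-credits comes from the text.

```python
# Illustrative model of the precedence chain; not gh aw's compiler code.
BUILT_IN_FALLBACK = {"max-ai-credits": 1000}  # default stated in the text


def resolve(key, frontmatter, repo_vars, org_vars, enterprise_vars):
    """Return (value, source) for the most specific layer that sets `key`.

    Order follows the quoted precedence: frontmatter, repository,
    organization, enterprise, then the built-in compiler fallback.
    """
    layers = (
        ("frontmatter", frontmatter),
        ("repository", repo_vars),
        ("organization", org_vars),
        ("enterprise", enterprise_vars),
    )
    for source, layer in layers:
        if layer.get(key) is not None:
            return layer[key], source
    return BUILT_IN_FALLBACK[key], "built-in"


if __name__ == "__main__":
    org = {"max-ai-credits": 500}
    # A workflow that sets its own budget wins over the org default.
    print(resolve("max-ai-credits", {"max-ai-credits": 200}, {}, org, {}))
    # A workflow that sets nothing inherits the org default.
    print(resolve("max-ai-credits", {}, {}, org, {}))
```

The practical consequence is that an org default only affects workflows that leave the setting unset; anything pinned in frontmatter keeps its own value.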
Budgets trade off cost certainty against task completion: too tight and useful runs get cut off; too loose and a bad run overspends. Tune to the shape of the work.
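One way to tune a budget to the shape of the work is to start from what past runs actually spent. The sketch below is a hypothetical heuristic, not something gh aw provides: `suggest_budget`, the 90th-percentile choice, the 25% headroom, and the sample credit figures are all assumptions made for illustration.

```python
import math


def suggest_budget(past_run_credits, percentile=90, headroom=1.25):
    """Suggest a per-run budget from historical per-run credit usage.

    Takes the nearest-rank percentile of past runs and adds headroom, so
    typical runs finish while a runaway run still hits the cap.
    """
    if not past_run_credits:
        raise ValueError("need at least one past run")
    ordered = sorted(past_run_credits)
    # Nearest-rank percentile using integer ceiling division (1-based rank).
    rank = max((percentile * len(ordered) + 99) // 100, 1)
    baseline = ordered[rank - 1]
    return math.ceil(baseline * headroom)


if __name__ == "__main__":
    # Hypothetical history: nine ordinary runs and one looping outlier.
    history = [40, 55, 60, 62, 70, 75, 80, 90, 120, 400]
    print(suggest_budget(history))
```

Here the outlier at 400 credits does not drag the suggestion upward, which matches the chapter's advice: a run that far above the rest is a cause to investigate, not a reason to raise the cap.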
Don't disable budgets (max-ai-credits: -1) to “unblock” a workflow. A run hitting its cap is usually a looping or over-scoped agent — fix the cause; the budget did its job.
Don't set org defaults so tight that every repo overrides them. If exceptions become the norm, the baseline is wrong. The goal is most repos aligned, few exceptions.
Don't rely on budgets for security. A budget limits spend, not blast radius — that's still safe outputs, the firewall, and policy gates (Chapters 6–8). Cost controls and security controls are complementary.
Don't govern by editing workflows. At scale, prefer gh aw env defaults and GH_AW_POLICY_* gates — central, reviewable, and applied without recompiling every repo.
Here's the Repo Assistant with a real budget — capped three ways and set to expire — the version an org would be happy to run at scale.
examples/ch13/repo-assistant-budgeted.md — cost controls in frontmatter
on:
  issues: { types: [opened] }
  schedule: daily
  workflow_dispatch:
  stop-after: "+30d" # stop triggering after a month
permissions: { contents: read, issues: read }
engine: copilot
network: { allowed: [defaults, github] }
max-ai-credits: 200 # tight per-run budget (default is 1000)
max-daily-ai-credits: 2000 # rolling 24h cap across runs
timeout-minutes: 10 # wall-clock ceiling
safe-outputs:
  add-comment: { max: 1 }
  add-labels:
    allowed: [bug, enhancement, question, documentation]
    max: 1
Four independent cost brakes: a per-run credit budget of 200, a rolling daily cap of 2000, a wall-clock ceiling, and a calendar expiry. If a single run misbehaves, max-ai-credits stops it; if the whole day runs hot, max-daily-ai-credits warns, files an issue, and skips further agent runs. None of these require a human watching a dashboard.
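The rolling daily cap can be pictured as a sum over a sliding 24-hour window. This Python sketch is a simplified model rather than gh aw's actual accounting: the function names, the `(finished_at, credits)` tuple shape, and the choice to skip only when the total strictly exceeds the cap are assumptions.

```python
from datetime import datetime, timedelta, timezone


def daily_spend(runs, now, window=timedelta(hours=24)):
    """Sum the credits of runs that finished inside the rolling window.

    `runs` is an iterable of (finished_at, credits) pairs for one workflow.
    """
    cutoff = now - window
    return sum(credits for finished_at, credits in runs if finished_at > cutoff)


def should_skip_agent(runs, now, daily_cap):
    """True when recent spend exceeds the cap (strict comparison is assumed)."""
    return daily_spend(runs, now) > daily_cap


if __name__ == "__main__":
    now = datetime(2025, 1, 2, 12, 0, tzinfo=timezone.utc)
    runs = [
        (now - timedelta(hours=30), 1500),  # outside the window, ignored
        (now - timedelta(hours=20), 900),
        (now - timedelta(hours=1), 1200),
    ]
    print(daily_spend(runs, now), should_skip_agent(runs, now, daily_cap=2000))
```

Because the window rolls, an expensive morning stops counting against the workflow 24 hours later without anyone resetting a counter.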
Now make it org-compliant without editing this file at all. An admin sets baseline defaults and a capability policy once:
Org-wide governance — applied to every repo, no workflow edits
# defaults.yml, applied at org scope
gh aw env update defaults.yml --scope org --org my-org --dry-run
gh aw env update defaults.yml --scope org --org my-org
# forbid a capability fleet-wide, no recompile needed
gh variable set GH_AW_POLICY_ALLOW_CREATE_PULL_REQUEST --org my-org --body "false"
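The capability gate set above can be thought of as a startup check: if a workflow requests a capability that policy forbids, the safe-outputs server refuses to start. The Python sketch below models that behavior under assumptions. Only GH_AW_POLICY_ALLOW_CREATE_PULL_REQUEST appears in the text; deriving other variable names from the capability name, treating an unset variable as allowed, and the functions `policy_allows` and `start_safe_outputs` are all hypothetical.

```python
import os


def policy_allows(capability, env=None):
    """Check a capability against a GH_AW_POLICY_ALLOW_* style variable.

    Assumption: the variable name is derived from the capability name, and
    a capability is allowed unless its variable is explicitly "false".
    """
    env = os.environ if env is None else env
    name = "GH_AW_POLICY_ALLOW_" + capability.upper().replace("-", "_")
    return env.get(name, "true").strip().lower() != "false"


def start_safe_outputs(requested, env=None):
    """Refuse to start if any requested capability is forbidden by policy."""
    blocked = [c for c in requested if not policy_allows(c, env)]
    if blocked:
        raise PermissionError("blocked by policy: " + ", ".join(blocked))
    return list(requested)


if __name__ == "__main__":
    org_policy = {"GH_AW_POLICY_ALLOW_CREATE_PULL_REQUEST": "false"}
    print(start_safe_outputs(["add-comment", "add-labels"], org_policy))
    try:
        start_safe_outputs(["create-pull-request"], org_policy)
    except PermissionError as err:
        print(err)
```

Failing at startup rather than at the moment of the write means a forbidden workflow never spends credits on work it would not be allowed to publish.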
You can now keep an agent fleet affordable and compliant:
Agentic work has a variable cost and, at scale, a policy surface — FinOps and governance now apply to your agents.
Per-workflow: max-ai-credits (default 1000, steering at 80/90/95/99%), max-daily-ai-credits (fail-safe daily cap), timeout-minutes, and stop-after.
Org-wide: gh aw env sets GH_AW_DEFAULT_* defaults that percolate (frontmatter → repo → org → enterprise → built-in fallback), and GH_AW_POLICY_* variables gate capabilities without recompiling.
Tune budgets to the work, roll out defaults in layers, and remember budgets limit spend, not blast radius.
What's next. You can now build, secure, operate, and govern agentic workflows. The final chapter zooms all the way out: Chapter 14: Fleets & Adoption takes the Repo Assistant from one repo to a multi-repo fleet and lays out the enterprise adoption playbook.