CI is where agents scale from "neat assistant" to "automated teammate". This page covers the concrete patterns: non-interactive invocation, PR review bots, and the guardrails that keep the bills manageable.
Most agents have a -p / --prompt / --non-interactive flag that runs a single prompt and exits with the result on stdout:
# Claude Code
claude -p "Summarize changes in src/ since last week" --output-format json
# Codex CLI
codex exec "add a return type to getUserById in src/db/users.ts"
# Aider
aider --message "Fix the type error in src/pipeline.ts and commit" --yes
These compose with shell pipelines: feed the output to jq, grep for a pass/fail string, and fail the job on error.
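As a minimal sketch of the pass/fail gating described above: assuming the prompt tells the agent to end its reply with a line reading `VERDICT: PASS` or `VERDICT: FAIL` (a hypothetical convention, not something these CLIs emit on their own), a small function can turn that line into an exit code. The function name `verdict_gate` is likewise illustrative.

```shell
#!/usr/bin/env bash
# Hypothetical convention: the prompt asks the agent to finish with a line
# that reads exactly "VERDICT: PASS" or "VERDICT: FAIL".

# Reads agent output on stdin; returns 0 only when a PASS verdict line is present.
verdict_gate() {
  local output
  output=$(cat)
  # Echo the full output to stderr so it still shows up in the job log.
  printf '%s\n' "$output" >&2
  # -x matches whole lines only, so "VERDICT: PASS" inside a sentence does not count.
  grep -qx 'VERDICT: PASS' <<<"$output"
}

# Usage in a CI step (agent call shown as a comment):
#   claude -p "Check src/ for leftover TODOs. End with VERDICT: PASS or VERDICT: FAIL" | verdict_gate
```

A missing verdict line counts as a failure, which is usually the safer default for a CI gate.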
This GitHub Actions workflow triggers on pull_request, runs an agent to review the diff, and posts the result as a PR comment.
# .github/workflows/ai-review.yml
name: AI Review
on:
  pull_request:
    branches: [main]
permissions:
  contents: read
  pull-requests: write
jobs:
  review:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
        with:
          fetch-depth: 0  # full history so origin/main is available for the diff
      - name: Install Claude Code
        run: npm install -g @anthropic-ai/claude-code
      - name: Run review
        env:
          ANTHROPIC_API_KEY: ${{ secrets.ANTHROPIC_API_KEY }}
        run: |
          DIFF=$(git diff origin/main...HEAD)
          claude -p "Review this diff for bugs, security issues, missing tests. Be specific and concise: $DIFF" \
            --allowed-tools "" \
            --output-format text > review.md
      - name: Post comment
        uses: marocchino/sticky-pull-request-comment@v2
        with:
          path: review.md
Variants:
--allowed-tools "" to keep it read-only
Nightly jobs that keep the codebase tidy:
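# .github/workflows/nightly-tidy.yml
name: Nightly tidy
on:
  schedule: [{ cron: '0 3 * * *' }]
  workflow_dispatch:
permissions:
  contents: write
  pull-requests: write
jobs:
  tidy:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - name: Install Claude Code
        run: npm install -g @anthropic-ai/claude-code
      - name: Dependency patch bumps
        env:
          ANTHROPIC_API_KEY: ${{ secrets.ANTHROPIC_API_KEY }}
        run: |
          claude -p "Check package.json for patch-version bumps safe to apply. Update pnpm-lock.yaml. If tests pass, commit with message 'chore: patch bumps'. If not, abort." \
            --permission-mode auto \
            --allowed-tools "Read,Edit,Bash(pnpm *)"
      - name: Push and open PR
        env:
          GH_TOKEN: ${{ github.token }}  # gh needs a token in CI
        run: |
          # Branch name is illustrative; the commit must be pushed before a PR can be opened
          git switch -c chore/nightly-patch-bumps
          git push -u origin chore/nightly-patch-bumps
          gh pr create --title "Nightly patch bumps" --body "Automated"
Use this pattern for: patch bumps, typo fixes, lint cleanup, doc freshness checks.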
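On nights when the agent aborts and commits nothing, the PR step has nothing to open. One possible guard, sketched under the assumption that the job starts from a checkout of the default branch so `origin/main` is a valid base ref (the helper name `commits_ahead` is hypothetical):

```shell
#!/usr/bin/env bash
# Prints how many commits HEAD has that the given base ref does not.
commits_ahead() {
  git rev-list --count "$1..HEAD"
}

# Usage in the PR step; origin/main as the base is an assumption:
#   if [ "$(commits_ahead origin/main)" -gt 0 ]; then
#     gh pr create --title "Nightly patch bumps" --body "Automated"
#   else
#     echo "No changes tonight; skipping PR."
#   fi
```

Skipping the PR step keeps a quiet night from turning into a red nightly job.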
The canonical multi-stage pipeline in CI — what this very project runs:
Cron (6 AM UTC)
↓
Ingest (RSS / HN / GitHub / Reddit / ArXiv)
↓
Filter (keywords → embeddings → LLM)
↓
Categorize + Summarize (Anthropic Batch API)
↓
Deliver (Telegram / Slack / RSS / Email)
↓
ISR revalidate (Vercel webhook)
Runs on GitHub Actions, not Vercel: pipelines that run longer than 60s exceed Vercel function limits. See Orchestration Patterns and architecture notes.
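The stages in the diagram run strictly in order, so an early failure should stop everything downstream. A minimal sketch of such a fail-fast runner follows, assuming each stage is exposed as a single shell command. The function name and the script paths in the usage comment are hypothetical, not this project's actual entry points.

```shell
#!/usr/bin/env bash
# Runs each stage command in order; stops at the first failure and
# returns that stage's exit code so the CI job fails.
run_pipeline() {
  local stage status
  for stage in "$@"; do
    echo "stage: $stage" >&2
    # Run the stage in a child shell so quoting inside the string is honored.
    bash -c "$stage" || {
      status=$?
      echo "stage failed ($status): $stage" >&2
      return "$status"
    }
  done
}

# Usage (hypothetical script names):
#   run_pipeline \
#     "node scripts/ingest.mjs" \
#     "node scripts/filter.mjs" \
#     "node scripts/summarize.mjs" \
#     "node scripts/deliver.mjs"
```

Stopping at the first failed stage also avoids paying for LLM calls on data that never got ingested or filtered.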
CI can accidentally burn money. Two safety nets:
# Check the reported cost after the run; fail the job if it is over budget
MAX_COST_USD=2.00
claude -p "..." --output-format json > response.json
COST=$(jq -r '.total_cost_usd' response.json)
python3 -c "import sys; sys.exit(1 if float('$COST') > $MAX_COST_USD else 0)"
Anthropic Console, OpenAI Dashboard, and most providers let you set a monthly spend cap that throttles or denies requests past the limit. Set these before going to production; they are the backstop.
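The per-run cost check can also be written without Python. The sketch below assumes the cost has already been extracted from the JSON response as a string. The function name `within_budget` and its exit-code convention are illustrative choices, not part of any CLI.

```shell
#!/usr/bin/env bash
# Returns 0 when the cost is within budget, 1 when it is over, and 2 when
# the cost is empty or contains characters that cannot appear in a number
# (for example the literal "null" that jq prints for a missing field).
within_budget() {
  local cost=$1 max=$2
  case $cost in
    ''|*[!0-9.eE+-]*)
      echo "unparseable cost: '$cost'" >&2
      return 2
      ;;
  esac
  # awk handles the floating-point comparison that plain [ ] cannot.
  awk -v c="$cost" -v m="$max" 'BEGIN { exit ((c + 0 > m + 0) ? 1 : 0) }'
}

# Usage:
#   COST=$(jq -r '.total_cost_usd' response.json)
#   within_budget "$COST" 2.00 || exit 1
```

Treating an unparseable cost as a failure is a deliberate choice here: a job that cannot report its spend should not count as within budget.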
# Runs daily, alerts if MTD spend is tracking too high
- name: Budget check
  run: node scripts/check-llm-budget.mjs --mtd-limit 50
Pass secrets via env: in the workflow YAML; GitHub masks these automatically in logs.
Wrap agent calls in a timeout: timeout 10m claude -p "...".
--yes / --bypass modes in CI: necessary (no interactive approval available) but risky. Always pair them with tight --allowed-tools or a sandbox runner.
Cap concurrent jobs with jobs.<name>.strategy.max-parallel.
The --allowed-tools flag maps to .claude/settings.json rules.
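The timeout tip above can be wrapped so that a hung agent shows up in the log as a timeout rather than as a generic failure. This is a small sketch assuming GNU coreutils `timeout`, which exits with status 124 when the limit is hit. The wrapper name `run_with_limit` is hypothetical.

```shell
#!/usr/bin/env bash
# Runs a command under GNU coreutils `timeout` and reports a timeout
# distinctly; any other exit code is passed through unchanged.
run_with_limit() {
  local limit=$1 status=0
  shift
  timeout "$limit" "$@" || status=$?
  if [ "$status" -eq 124 ]; then
    echo "timed out after $limit: $*" >&2
  fi
  return "$status"
}

# Usage:
#   run_with_limit 10m claude -p "..." --output-format json > response.json
```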