Prompt engineering for coding agents is different from general prompting. The agent has access to your files, can run commands, and maintains a conversation across multiple steps.
"Optimize the getArticles() query in src/db/queries.ts — it's doing N+1 queries. Use a
single JOIN instead."
"The auth middleware in src/middleware.ts doesn't handle expired tokens. When a token
expires, users see a 500 error instead of being redirected to login."
Let the agent decide HOW to fix it. You describe WHAT's wrong and WHY it matters.
Good constraints:
src/api/ files"
"Follow the same pattern as src/services/ingestion/rss.ts: parallel fetch with Promise.allSettled, error collection, typed return."
Force the model to return structured data:
const tool = {
  name: 'classify_article',
  // JSON Schema for the tool input; the model's arguments must match it.
  input_schema: {
    type: 'object',
    properties: {
      category: { type: 'string', enum: ['bug', 'feature', 'refactor'] },
      priority: { type: 'integer', minimum: 1, maximum: 5 },
    },
    required: ['category', 'priority'],
  },
};
This gives a parse failure rate below 0.2%, versus 8-15% with raw JSON prompting.
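To make the structured-output idea concrete, here is a minimal sketch of how the tool definition might be wired into a request and how the result might be read back. It assumes the Anthropic Messages API shapes (`tools`, `tool_choice`, and `tool_use` content blocks); the helpers `buildRequest` and `extractClassification`, the `description` text, and the `max_tokens` value are hypothetical and not from the original. No network call is made, so the model name is left to the caller.

```typescript
type Category = 'bug' | 'feature' | 'refactor';

interface Classification {
  category: Category;
  priority: number;
}

// Same schema as above; the description string is an illustrative addition.
const classifyTool = {
  name: 'classify_article',
  description: 'Record the category and priority of an article.',
  input_schema: {
    type: 'object',
    properties: {
      category: { type: 'string', enum: ['bug', 'feature', 'refactor'] },
      priority: { type: 'integer', minimum: 1, maximum: 5 },
    },
    required: ['category', 'priority'],
  },
};

// Hypothetical helper: builds a request body. The caller supplies the model name.
function buildRequest(model: string, articleText: string) {
  return {
    model,
    max_tokens: 256, // arbitrary value for the sketch
    tools: [classifyTool],
    // Forcing this tool asks for a tool_use block instead of free text.
    tool_choice: { type: 'tool', name: classifyTool.name },
    messages: [{ role: 'user', content: articleText }],
  };
}

// Minimal view of a response content block, enough for this sketch.
interface ContentBlock {
  type: string;
  name?: string;
  input?: unknown;
}

// Hypothetical helper: pulls the tool arguments out and re-checks them,
// since a schema guides the model but does not replace validation.
function extractClassification(content: ContentBlock[]): Classification {
  const block = content.filter(
    (b) => b.type === 'tool_use' && b.name === classifyTool.name,
  )[0];
  if (!block) throw new Error('No classify_article tool call in response');

  const input = (block.input || {}) as Record<string, unknown>;
  const category = input.category;
  const priority = input.priority;
  const allowed = classifyTool.input_schema.properties.category.enum;

  if (typeof category !== 'string' || allowed.indexOf(category) === -1) {
    throw new Error('Invalid category');
  }
  if (
    typeof priority !== 'number' ||
    Math.floor(priority) !== priority ||
    priority < 1 ||
    priority > 5
  ) {
    throw new Error('Invalid priority');
  }
  return { category: category as Category, priority };
}
```

Re-validating the arguments is a design choice for this sketch: it keeps the rare malformed reply from reaching downstream code.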
Include 2-3 examples in the system prompt to calibrate the model's output:
Example 1:
Input: "Minor typo fix in README"
Output: { "category": "bug", "priority": 1 }
Example 2:
Input: "Add OAuth2 authentication"
Output: { "category": "feature", "priority": 4 }
Force reasoning before scoring to prevent lazy clustering:
"First explain your reasoning for each score dimension, then provide the numerical scores."
This prevents the model from defaulting to 7/10 for everything.
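One way to enforce reasoning-first scoring in code is to append the instruction to every scoring prompt and reject replies that skip the reasoning. The sketch below assumes a reply format of free-text reasoning followed by a single JSON object on the final line. That format and the helper names `withReasoningFirst` and `parseScoredReply` are illustrative assumptions, not part of the original.

```typescript
// The first sentence is the instruction quoted above; the second sentence
// (scores as JSON on the final line) is an assumption made for this sketch.
const REASON_FIRST =
  'First explain your reasoning for each score dimension, then provide the numerical scores. ' +
  'Put the scores on the final line as a single JSON object.';

// Hypothetical helper: appends the reasoning-first instruction to a task prompt.
function withReasoningFirst(taskPrompt: string): string {
  return taskPrompt + '\n\n' + REASON_FIRST;
}

interface ScoredReply {
  reasoning: string;
  scores: Record<string, number>;
}

// Hypothetical helper: splits a reply into reasoning and scores, and throws
// if the model returned scores with no reasoning in front of them.
function parseScoredReply(reply: string): ScoredReply {
  const lines = reply.trim().split('\n');
  const lastLine = lines[lines.length - 1];
  const reasoning = lines.slice(0, -1).join('\n').trim();

  const parsed: unknown = JSON.parse(lastLine); // throws if the final line is not JSON
  if (typeof parsed !== 'object' || parsed === null || Array.isArray(parsed)) {
    throw new Error('Final line must be a JSON object of scores');
  }
  if (reasoning.length === 0) {
    throw new Error('Scores arrived without reasoning');
  }

  const scores: Record<string, number> = {};
  const raw = parsed as Record<string, unknown>;
  for (const key of Object.keys(raw)) {
    const value = raw[key];
    if (typeof value !== 'number') {
      throw new Error('Score for ' + key + ' is not a number');
    }
    scores[key] = value;
  }
  return { reasoning, scores };
}
```

Rejecting a bare score line gives the caller a reason to retry rather than silently accepting an unreasoned number.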
Claude is trained to pay attention to XML-like tags in prompts. Use them to separate instructions, examples, and context so the model does not confuse them:
<instructions>
Classify each article into one category and rate it 1–10 on novelty.
</instructions>
<context>
Tier-1 topics: Claude Code, MCP, Cursor, Agent SDK.
</context>
<examples>
<example>
<input>Anthropic ships Claude 5 Opus</input>
<output>{ "category": "New Models & Releases", "novelty": 9 }</output>
</example>
</examples>
<article>
{{ article text }}
</article>
Tags help with three things: (1) the model treats each section with appropriate weight, (2) you can cache the stable blocks (<instructions>, <context>, <examples>) while only the <article> changes per call, (3) future-you can find the block to edit without reading a wall of prose.
Tag names are arbitrary — pick names that read naturally. Anthropic's prompt-engineering guide uses <instructions>, <example>, <context>, <formatting>, <thinking> as conventions.
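The tagged layout can be assembled programmatically so the stable blocks stay byte-identical between calls. The following is a rough sketch under that assumption. The helpers `tag`, `buildStablePrefix`, and `buildPrompt` are hypothetical names, and the sketch only builds strings, leaving any actual caching configuration to the API client.

```typescript
interface Example {
  input: string;
  output: string;
}

// Wraps a body in an XML-like tag; tag names are arbitrary labels.
function tag(name: string, body: string): string {
  return `<${name}>\n${body}\n</${name}>`;
}

// Builds the part that does not change per call: instructions, context, examples.
function buildStablePrefix(
  instructions: string,
  context: string,
  examples: Example[],
): string {
  const exampleBlocks = examples
    .map((ex) => tag('example', `<input>${ex.input}</input>\n<output>${ex.output}</output>`))
    .join('\n');
  return [
    tag('instructions', instructions),
    tag('context', context),
    tag('examples', exampleBlocks),
  ].join('\n');
}

// Appends the only per-call block, so the prefix is reused unchanged.
function buildPrompt(stablePrefix: string, articleText: string): string {
  return stablePrefix + '\n' + tag('article', articleText);
}

// Usage: build the prefix once, then vary only the article.
const prefix = buildStablePrefix(
  'Classify each article into one category and rate it 1-10 on novelty.',
  'Tier-1 topics: Claude Code, MCP, Cursor, Agent SDK.',
  [
    {
      input: 'Anthropic ships Claude 5 Opus',
      output: '{ "category": "New Models & Releases", "novelty": 9 }',
    },
  ],
);
const promptA = buildPrompt(prefix, 'First article text');
const promptB = buildPrompt(prefix, 'Second article text');
```

Because `promptA` and `promptB` share the same leading string, only the trailing article block differs between calls.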