Study maps 2,430 Claude Code tool picks across 20 categories
A systematic survey by Edwin Ong and Alex Vikati of amplifying.ai ran 2,430 open-ended prompts through Claude Code across 3 models, 4 project types, and 20 tool categories, and found that the agent's choices are converging on a de facto default stack.
Tool vendors and developers should audit whether their preferred libraries appear in Claude Code's default stack. Because the agent installs packages and commits code autonomously, its training-data biases now directly influence which packages ship in new projects.
- Study issued 2,430 open-ended prompts to Claude Code CLI v2.1.39 across 3 models (Sonnet 4.5, Opus 4.5, Opus 4.6), 4 project types, and 20 tool categories
- Custom/DIY builds accounted for 252 of 2,073 primary picks (12%), making it the single most common 'recommendation'
- GitHub Actions captured 94% of CI/CD picks, shadcn/ui 90% of UI component picks, and Stripe 91% of payment picks
Researchers Edwin Ong and Alex Vikati at amplifying.ai conducted a systematic benchmark of Claude Code's tool selection behavior. They issued 2,430 open-ended prompts, such as "what should I use?" and "add user authentication", against four greenfield repositories: a Next.js 14/TypeScript SaaS app (TaskFlow), a FastAPI/Python 3.11 API (DataPipeline), a Vite/React 18 SPA (InvoiceTracker), and a Node.js/TypeScript CLI (deployctl). Three models were tested (Sonnet 4.5, Opus 4.5, and Opus 4.6), with three independent runs per model-repo combination and a full `git checkout . && git clean -fd` between every prompt to ensure a clean state. An LLM-based extraction pipeline (Claude Code subagents) identified the primary tool pick from each response. It succeeded on 85.3% of responses, yielding 2,073 usable picks.
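The reset step described above can be sketched as a small harness. This is a minimal illustration, not the authors' tooling: only the two git commands come from the study description, while `reset_repo`, `run_prompts`, and the `ask_agent` callback are hypothetical names standing in for however the prompts were actually sent to Claude Code.

```python
import subprocess
from pathlib import Path
from typing import Callable, Iterable, Sequence

# The two reset commands quoted in the study description.
RESET_COMMANDS: Sequence[Sequence[str]] = (
    ("git", "checkout", "."),  # discard edits to tracked files
    ("git", "clean", "-fd"),   # delete untracked files and directories
)


def reset_repo(repo: Path, run: Callable[..., object] = subprocess.run) -> None:
    """Restore the working tree of `repo` to its last committed state."""
    for cmd in RESET_COMMANDS:
        run(list(cmd), cwd=repo, check=True)


def run_prompts(
    repo: Path,
    prompts: Iterable[str],
    ask_agent: Callable[[Path, str], str],
    run: Callable[..., object] = subprocess.run,
) -> list[str]:
    """Send each prompt to the agent, resetting the repo before every one.

    `ask_agent` is a hypothetical callback (an assumption of this sketch),
    not a real Claude Code API.
    """
    responses = []
    for prompt in prompts:
        reset_repo(repo, run)  # no prompt sees files left by an earlier one
        responses.append(ask_agent(repo, prompt))
    return responses
```

Passing the command runner as a parameter keeps the sketch testable without a real repository. In practice the default `subprocess.run` would execute the git commands inside the checkout.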
The most striking finding is that Claude Code builds rather than buys: custom/DIY implementations accounted for 252 of 2,073 primary picks (12%), making custom/DIY the single most common "recommendation" across all 20 categories. Where the agent does reach for third-party tools, it converges sharply on a consistent default stack: Vercel, PostgreSQL, Stripe, Tailwind CSS, shadcn/ui, pnpm, GitHub Actions, Sentry, Resend, and Zustand, with stack-specific picks like Drizzle (JS) or SQLModel (Python) for ORMs, NextAuth.js for Next.js auth, and Vitest (JS) or pytest (Python) for testing. Several categories show near-monopoly behavior: GitHub Actions at 94% for CI/CD, Stripe at 91% for payments, and shadcn/ui at 90% for UI components. Notably, Redux and Express received zero primary picks across the entire study.
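The headline percentages can be checked directly from the counts reported in the study. The short sketch below uses only those three figures; the `share` helper and the variable names are illustrative.

```python
# Counts as reported in the study summary.
TOTAL_PROMPTS = 2430
USABLE_PICKS = 2073
CUSTOM_DIY_PICKS = 252


def share(part: int, whole: int) -> float:
    """Return part/whole as a percentage."""
    return 100.0 * part / whole


extraction_rate = share(USABLE_PICKS, TOTAL_PROMPTS)
diy_share = share(CUSTOM_DIY_PICKS, USABLE_PICKS)

print(f"extraction rate: {extraction_rate:.1f}%")   # extraction rate: 85.3%
print(f"custom/DIY share: {diy_share:.1f}%")        # custom/DIY share: 12.2%
```

Note that the custom/DIY share is computed over the 2,073 usable picks rather than over all 2,430 prompts, which is why it comes out at roughly 12%.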
The research also examined consistency dimensions: all three models agreed on the top tool in 18 of 20 categories within the same ecosystem, with only Caching and Real-time showing genuine cross-ecosystem disagreement. Prompt phrasing had limited impact, with 76% average stability across 5 phrasings of the same category. Project context mattered more: the same category yielded different picks across repos (e.g., Vercel for Next.js vs. Railway for Python). The authors frame Claude Code as a new distribution channel where a model's training data may shape market share more than marketing budgets, making agent tool-selection behavior a form of competitive intelligence for vendors and developers alike.
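The summary does not say how phrasing stability was computed. One plausible reading, sketched here purely as an assumption, is the share of phrasings whose pick matches the most common pick for that category, averaged over categories. The function names and the placeholder data are invented for illustration and are not results from the study.

```python
from collections import Counter


def phrasing_stability(picks: list[str]) -> float:
    """Share of phrasings whose pick matches the most common pick.

    Assumed definition; the study may define stability differently.
    """
    if not picks:
        raise ValueError("need at least one pick")
    _, top_count = Counter(picks).most_common(1)[0]
    return top_count / len(picks)


def average_stability(picks_by_category: dict[str, list[str]]) -> float:
    """Mean per-category stability across all categories."""
    scores = [phrasing_stability(p) for p in picks_by_category.values()]
    return sum(scores) / len(scores)


# Placeholder data (not from the study): five phrasings per category.
example = {
    "category_x": ["tool_a"] * 5,                  # every phrasing agrees
    "category_y": ["tool_a"] * 3 + ["tool_b"] * 2,  # three of five agree
}
print(f"{average_stability(example):.0%}")  # 80%
```

Under this definition, a 76% average would mean that roughly four of the five phrasings typically land on the same tool.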