API Ingest brings deterministic search to OpenAPI docs via MCP
Developer mohidbutt built `api-ingest`, a Python tool that parses OpenAPI/Swagger specs into indexed chunks and a holistic manifest, exposing them to AI agents via an MCP server for precise, deterministic API doc search.
Developers building agents that call third-party APIs can use `api-ingest` to replace imprecise semantic doc search with structured, deterministic OpenAPI-spec lookups over MCP, potentially reducing hallucinated arguments and bad requests.
mohidbutt's `api-ingest` project targets a specific and well-documented frustration: AI agents consistently struggle with API documentation. They invent arguments, misread required types, and omit fields, which leads to cascading bad requests. Even when agents fall back to web search or tools like Context7, the results are often fuzzy and imprecise, burning large amounts of tokens without resolving the underlying issue.
The tool works by converting local or community-provided API spec files (OpenAPI, Swagger, JSON, YAML, RAML, and similar formats) into two structured outputs: a high-level "manifest" that gives the agent a holistic understanding of the API, and a set of indexed markdown chunks organized by endpoints, tags, and schemas. These chunks are made accessible to agents through an MCP server, enabling what the author calls "agentic search" — structured, deterministic lookup rather than fuzzy semantic retrieval.
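To make the manifest-plus-chunks idea concrete, here is a minimal sketch of how such an index could be built from an OpenAPI document. It is not the author's implementation: the `SPEC` contents, the `build_index` and `lookup` names, and the chunk layout are all hypothetical, and the sketch leaves out schema chunks, `$ref` resolution, YAML loading, and the MCP server itself.

```python
import json

# Hypothetical miniature spec for illustration; a real one would be loaded from a file.
SPEC = {
    "openapi": "3.0.0",
    "info": {"title": "Paper API", "version": "1.0"},
    "paths": {
        "/paper/{paper_id}": {
            "get": {
                "tags": ["papers"],
                "summary": "Fetch one paper",
                "parameters": [
                    {"name": "paper_id", "in": "path", "required": True,
                     "schema": {"type": "string"}},
                    {"name": "fields", "in": "query", "required": False,
                     "schema": {"type": "string"}},
                ],
            }
        },
        "/paper/search": {
            "get": {
                "tags": ["papers", "search"],
                "summary": "Search papers",
                "parameters": [
                    {"name": "query", "in": "query", "required": True,
                     "schema": {"type": "string"}},
                ],
            }
        },
    },
}

HTTP_METHODS = {"get", "put", "post", "delete", "patch", "head", "options", "trace"}


def build_index(spec):
    """Return (manifest, chunks): an API overview plus one markdown chunk per operation."""
    chunks = {}
    by_tag = {}
    for path, item in spec.get("paths", {}).items():
        for method, op in item.items():
            if method not in HTTP_METHODS:
                continue  # skip non-operation keys such as path-level "parameters"
            key = f"{method.upper()} {path}"
            lines = [f"# {key}", op.get("summary", ""), "", "## Parameters"]
            for p in op.get("parameters", []):
                need = "required" if p.get("required") else "optional"
                ptype = p.get("schema", {}).get("type", "any")
                lines.append(f"- `{p.get('name')}` ({p.get('in')}, {ptype}, {need})")
            chunks[key] = "\n".join(lines)
            for tag in op.get("tags", ["untagged"]):
                by_tag.setdefault(tag, []).append(key)
    manifest = {
        "title": spec["info"]["title"],
        "version": spec["info"]["version"],
        "endpoints": sorted(chunks),
        "tags": {t: sorted(keys) for t, keys in sorted(by_tag.items())},
    }
    return manifest, chunks


def lookup(chunks, method, path):
    """Exact-key lookup: the same query always returns the same chunk, or None."""
    return chunks.get(f"{method.upper()} {path}")


manifest, chunks = build_index(SPEC)
print(json.dumps(manifest, indent=2))
print(lookup(chunks, "get", "/paper/search"))
```

The point of the exact-key lookup is that an agent first reads the manifest to learn which endpoints exist, then requests a chunk by its precise method and path. A miss returns nothing rather than a loosely related passage, which is the contrast with fuzzy semantic retrieval that the article describes.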
The author acknowledges limitations, particularly when individual endpoint chunks are themselves too large to fit fully into context. They also note that many AI-native companies already provide agent-optimized formats like `llms.txt`, but argue the gap remains significant for APIs that lack such affordances — citing the Semantic Scholar Graph API as a concrete example. The project is open source and the author is exploring adding quantitative evals to validate the approach.
Key facts
- mohidbutt built `api-ingest`, a Python script that converts API spec files into agent-accessible indexed chunks and a holistic manifest.
- The tool supports OpenAPI, Swagger, JSON, YAML, RAML, and similar spec formats.
- Agents access the parsed docs via an MCP server for deterministic, structured search.
- The author argues semantic search tools like Context7 lack the precision needed for correct API calls.
- Specs are split into two artifacts: a "manifest" overview and markdown chunks indexed by endpoints, tags, and schemas.
- A known limitation is that individual endpoint chunks can be too large to fully load into context.
- The author is considering adding evals to quantitatively benchmark the approach.