Smart Semantic Scholar MCP server handles rate limits for AI research agents
u/spideryzarc released an open-source MCP server for Semantic Scholar that uses a discovery-first workflow, SQLite caching, and async rate limiting to help AI agents conduct academic literature reviews without hitting API limits or wasting context tokens.
Score breakdown
The server addresses two concrete pain points for AI research agents — hitting Semantic Scholar's strict rate limits and exhausting context windows — by combining a discovery-first retrieval strategy with local caching and resilient concurrency controls.
u/spideryzarc released `smart-semantic-scholar-mcp`, an open-source MCP server that connects AI agents to the Semantic Scholar academic database while managing the API's strict rate limits and minimizing context token consumption. The server's core design philosophy is a discovery-first workflow: agents are guided to first retrieve lightweight metadata such as titles, citation counts, and paper IDs, and only subsequently fetch full abstracts or TLDRs for papers of interest. All query results are stored in a persistent local SQLite cache running in WAL mode, which automatically merges new metadata attributes into existing records to avoid redundant API calls.
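The interplay between the discovery step and the merging cache can be illustrated with a minimal sketch. It is not the project's actual code: the table layout and the helper names `open_cache`, `merge_paper`, and `missing_fields` are hypothetical, and the field names (`title`, `citationCount`, `abstract`, `tldr`) are only modeled on Semantic Scholar's paper fields.

```python
import json
import sqlite3
from typing import Dict, List


def open_cache(path: str) -> sqlite3.Connection:
    """Open (or create) a cache database and request WAL journaling."""
    conn = sqlite3.connect(path)
    # WAL lets readers proceed while a writer is active. In-memory
    # databases ignore this request and keep their own journal mode.
    conn.execute("PRAGMA journal_mode=WAL")
    conn.execute(
        "CREATE TABLE IF NOT EXISTS papers ("
        "paper_id TEXT PRIMARY KEY, metadata TEXT NOT NULL)"
    )
    return conn


def _load(conn: sqlite3.Connection, paper_id: str) -> Dict:
    """Return the cached metadata for a paper, or an empty dict."""
    row = conn.execute(
        "SELECT metadata FROM papers WHERE paper_id = ?", (paper_id,)
    ).fetchone()
    return json.loads(row[0]) if row else {}


def merge_paper(conn: sqlite3.Connection, paper_id: str, new_fields: Dict) -> Dict:
    """Merge newly fetched fields into the cached record and store it."""
    merged = _load(conn, paper_id)
    # Keep existing values when the new response carries None for a field.
    merged.update({k: v for k, v in new_fields.items() if v is not None})
    conn.execute(
        "INSERT OR REPLACE INTO papers (paper_id, metadata) VALUES (?, ?)",
        (paper_id, json.dumps(merged)),
    )
    conn.commit()
    return merged


def missing_fields(conn: sqlite3.Connection, paper_id: str, wanted: List[str]) -> List[str]:
    """List the wanted fields that still need to be requested from the API."""
    cached = _load(conn, paper_id)
    return [name for name in wanted if name not in cached]


if __name__ == "__main__":
    conn = open_cache(":memory:")  # use a file path for a persistent cache
    # Discovery step: store only lightweight metadata.
    merge_paper(conn, "paper-1", {"title": "Example", "citationCount": 12})
    # Detail step: ask only for what the cache does not hold yet.
    print(missing_fields(conn, "paper-1", ["title", "abstract", "tldr"]))
    merge_paper(conn, "paper-1", {"abstract": "An example abstract."})
    print(missing_fields(conn, "paper-1", ["title", "abstract"]))
```

In this sketch the detail step only asks the API for fields the cache lacks, which is one way a merged record could avoid redundant calls.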
On the reliability side, the server uses async rate limiters and semaphores with exponential backoff to stay within Semantic Scholar's API constraints. Its smart PDF downloader automatically retrieves Open Access PDFs from sources including arXiv, bioRxiv, and OpenReview; if a paper is paywalled or otherwise protected, the downloader halts immediately and returns manual download links rather than consuming agent time on a dead end. A BibTeX sync feature rounds out the toolset: it generates citations and syncs existing `.bib` files, looking entries up by DOI or falling back to title search.
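As a rough illustration of how a rate limiter, a semaphore, and exponential backoff can be combined, here is a minimal `asyncio` sketch. It is an assumption-laden example, not the server's implementation: `MinIntervalLimiter`, `RateLimitError`, and `call_with_backoff` are hypothetical names, and the request is a stand-in coroutine rather than a real HTTP call.

```python
import asyncio
import random
import time
from typing import Awaitable, Callable, TypeVar

T = TypeVar("T")


class RateLimitError(Exception):
    """Hypothetical error a request coroutine raises on an HTTP 429 reply."""


class MinIntervalLimiter:
    """Spaces calls so that at most one starts every `interval` seconds."""

    def __init__(self, interval: float) -> None:
        self._interval = interval
        self._lock = asyncio.Lock()
        self._next_slot = 0.0

    async def wait(self) -> None:
        async with self._lock:
            now = time.monotonic()
            delay = self._next_slot - now
            if delay > 0:
                await asyncio.sleep(delay)
            # Reserve the following slot for the next caller.
            self._next_slot = max(now, self._next_slot) + self._interval


async def call_with_backoff(
    request: Callable[[], Awaitable[T]],
    limiter: MinIntervalLimiter,
    semaphore: asyncio.Semaphore,
    max_retries: int = 5,
    base_delay: float = 1.0,
) -> T:
    """Run `request`, retrying rate-limited attempts with growing delays."""
    for attempt in range(max_retries + 1):
        async with semaphore:  # cap the number of in-flight requests
            await limiter.wait()  # respect the minimum spacing
            try:
                return await request()
            except RateLimitError:
                if attempt == max_retries:
                    raise
        # Back off outside the semaphore so other tasks can proceed.
        delay = base_delay * (2 ** attempt) + random.uniform(0, base_delay)
        await asyncio.sleep(delay)
    raise RuntimeError("unreachable")


async def _demo() -> None:
    attempts = 0

    async def flaky_request() -> str:
        # Stand-in for an API call: rate-limited twice, then succeeds.
        nonlocal attempts
        attempts += 1
        if attempts < 3:
            raise RateLimitError()
        return "ok"

    limiter = MinIntervalLimiter(interval=0.01)
    semaphore = asyncio.Semaphore(2)
    result = await call_with_backoff(
        flaky_request, limiter, semaphore, base_delay=0.01
    )
    print(result, "after", attempts, "attempts")


if __name__ == "__main__":
    asyncio.run(_demo())
```

Sleeping outside the semaphore is a deliberate choice in this sketch: a task that is waiting to retry does not hold a concurrency slot that another request could use.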
Key facts
- Built by u/spideryzarc and published as open source on GitHub at spideryzarc/smart-semantic-scholar-mcp
- Uses a discovery-first workflow: agents fetch lightweight metadata (titles, citation counts, IDs) before retrieving full abstracts or TLDRs
- Persistent SQLite cache in WAL mode stores all queries locally and merges new metadata into existing records
- Async rate limiters and semaphores with exponential backoff handle Semantic Scholar's strict API limits
- Smart PDF downloader retrieves Open Access PDFs from arXiv, bioRxiv, OpenReview, and similar sources
- If a paper is paywalled, the downloader halts and provides manual download links instead of wasting agent time
- BibTeX sync feature generates citations and syncs existing `.bib` files using DOIs or title search fallbacks
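The "download if open access, otherwise halt with manual links" behavior can be sketched as a pure decision function. This is a hypothetical illustration, not the project's downloader: `PdfPlan` and `plan_pdf_download` are invented names, and the input record shape (`openAccessPdf`, `externalIds`, `url`) is an assumption modeled on Semantic Scholar paper records.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class PdfPlan:
    """Either a direct download URL or links for the user to follow manually."""

    download_url: Optional[str] = None
    manual_links: List[str] = field(default_factory=list)


def plan_pdf_download(paper: Dict) -> PdfPlan:
    """Decide whether a PDF can be fetched directly or needs manual retrieval."""
    open_access = paper.get("openAccessPdf") or {}
    if open_access.get("url"):
        return PdfPlan(download_url=open_access["url"])

    external_ids = paper.get("externalIds") or {}
    if external_ids.get("ArXiv"):
        # arXiv serves PDFs at a predictable address for each identifier.
        return PdfPlan(download_url=f"https://arxiv.org/pdf/{external_ids['ArXiv']}")

    # No open copy known: stop here and hand back links instead of retrying.
    links: List[str] = []
    if external_ids.get("DOI"):
        links.append(f"https://doi.org/{external_ids['DOI']}")
    if paper.get("url"):
        links.append(paper["url"])
    return PdfPlan(manual_links=links)


if __name__ == "__main__":
    paywalled = {"externalIds": {"DOI": "10.1000/example"}, "openAccessPdf": None}
    print(plan_pdf_download(paywalled))
    preprint = {"externalIds": {"ArXiv": "1234.56789"}}
    print(plan_pdf_download(preprint))
```

Returning a plan instead of attempting the download keeps the halting decision cheap: the agent learns immediately that a paper needs manual retrieval.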
Summary and scoring are generated automatically from the original article. We always link back to the publisher and never republish images or paywalled content. Last processed Jun 9, 2026 · 17:05 UTC.