Helium MCP server scores 3.2M+ news articles for AI authorship across 31 dimensions
Developer connerlambden built Helium, a free MCP server that scores over 3.2M articles across 5,000+ sources on `ai_authorship_probability` and 30 other framing dimensions, queryable in plain English from Claude or Cursor.
Practitioners building AI-news workflows or fact-checking pipelines can now query a pre-scored, 31-dimension corpus of millions of articles in plain English via Claude or Cursor — without writing scrapers, classifiers, or SQL.
- Helium MCP server scores 3.2M+ articles from 5,000+ sources across 31 framing dimensions.
- The primary metric is `ai_authorship_probability`; other dimensions include `credibility`, `sensationalism`, `overconfidence`, `opinion_vs_fact`, `oversimplification`, `begging_the_question`, `scapegoating`, and `covering_responses`.
- Accessible via `npx mcp-remote https://heliumtrades.com/mcp`: free, no signup, no API key required.
Connerlambden's post introduces Helium, a free remote MCP server that continuously scores a corpus of 3.2M+ articles from 5,000+ news sources across 31 framing dimensions. The flagship metric is `ai_authorship_probability` — an explicit model estimate that an article was LLM-generated — but it is designed to be triangulated against dimensions like `credibility` (sourcing density, named-source ratio, evidence-citation patterns), `sensationalism` (headline-vs-body amplification, superlative density), `overconfidence` (hedge-language vs declarative-certainty ratio), `opinion_vs_fact`, `oversimplification`, `begging_the_question`, `scapegoating`, and `covering_responses`, among 22 others. The server is accessible via `npx mcp-remote https://heliumtrades.com/mcp` and requires no signup or API key.
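For clients that read a JSON `mcpServers` block (the shape Claude Desktop and Cursor use at the time of writing), the command above could be registered roughly as follows. This is a minimal sketch: the post gives only the `npx` command, so the entry name `helium` and the helper `build_mcp_config` are assumptions made for illustration.

```python
import json

# The server URL and the mcp-remote launcher come from the post.
HELIUM_URL = "https://heliumtrades.com/mcp"


def build_mcp_config(name: str = "helium", url: str = HELIUM_URL) -> dict:
    """Return an MCP client config that launches mcp-remote through npx.

    `name` is an arbitrary label chosen for this sketch, not something
    the post specifies.
    """
    return {
        "mcpServers": {
            name: {
                "command": "npx",
                "args": ["mcp-remote", url],
            }
        }
    }


config = build_mcp_config()
# Print the JSON so it can be pasted into the client's MCP config file.
print(json.dumps(config, indent=2))
```

Where the file lives depends on the client, so check the client's own documentation before pasting the output.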
The post argues that single-number detectors fail in two distinct ways. First, human editors can smooth an LLM draft enough to push it below a binary detector's threshold, yet framing artifacts survive the edit, so multi-dimensional scoring is needed to catch human-edited AI drafts. Second, single-axis tools often falsely flag academic-style human prose as AI-generated, whereas sourcing-density and citation-evidence scores can distinguish a grad student from an LLM. A real query run in Claude surfaced a cohort of articles with high `ai_authorship_probability`, high `credibility`, and moderate `sensationalism`, which the post identifies as likely human-edited AI drafts that a single-axis detector would miss entirely.
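The cohort described above amounts to a filter over three dimensions at once. The sketch below illustrates that idea on made-up records. It assumes scores are normalized to the range 0 to 1, which the post does not state, and the thresholds, the `ArticleScore` type, and the example URLs are all hypothetical rather than anything Helium defines.

```python
from dataclasses import dataclass


@dataclass
class ArticleScore:
    url: str
    ai_authorship_probability: float
    credibility: float
    sensationalism: float


# Hypothetical cutoffs, assuming every score lies in [0, 1].
AI_HIGH = 0.8
CREDIBILITY_HIGH = 0.7
SENSATIONALISM_MODERATE = (0.3, 0.6)


def likely_edited_ai_draft(a: ArticleScore) -> bool:
    """True for high AI-authorship, high credibility, moderate sensationalism."""
    low, high = SENSATIONALISM_MODERATE
    return (
        a.ai_authorship_probability >= AI_HIGH
        and a.credibility >= CREDIBILITY_HIGH
        and low <= a.sensationalism <= high
    )


# Made-up records for illustration only; not real Helium output.
articles = [
    ArticleScore("https://example.com/edited-draft", 0.91, 0.78, 0.45),
    ArticleScore("https://example.com/raw-llm-spam", 0.95, 0.20, 0.85),
    ArticleScore("https://example.com/human-report", 0.10, 0.82, 0.30),
]

cohort = [a.url for a in articles if likely_edited_ai_draft(a)]
print(cohort)
```

A detector that reports only `ai_authorship_probability` would put the first two records in the same bucket; the extra dimensions are what separate them.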
Connerlambden outlines several practitioner use cases: newsroom standards editors running daily cron jobs to flag high-AI-authorship freelance submissions, fact-checkers triaging viral claims by pulling a source's recent score drift, journalism instructors assigning trend-line analysis across a publication's decade of output, and AI-safety researchers using the full 31-dimension scored corpus as a dataset for studying LLM content spread. A sample query tracking `ai_authorship_probability` for tech-news sources over the past year revealed that several mid-tier aggregator sources showed a meaningful upward shift in AI authorship while their `credibility` score drifted down.
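The drift pattern in that sample query (AI authorship trending up while credibility trends down) can be checked with a simple per-source trend line. The following sketch fits a least-squares slope to monthly averages. The source names, the numbers, and the `DRIFT_THRESHOLD` cutoff are invented for illustration, and the post does not say how Helium itself computes drift.

```python
def slope(values: list[float]) -> float:
    """Least-squares slope of values against their index (0, 1, 2, ...)."""
    n = len(values)
    if n < 2:
        raise ValueError("need at least two points to fit a trend")
    mean_x = (n - 1) / 2
    mean_y = sum(values) / n
    num = sum((i - mean_x) * (v - mean_y) for i, v in enumerate(values))
    den = sum((i - mean_x) ** 2 for i in range(n))
    return num / den


# Hypothetical monthly mean scores per source; not real Helium data.
monthly = {
    "aggregator-a.example": {
        "ai_authorship_probability": [0.20, 0.25, 0.35, 0.45],
        "credibility": [0.70, 0.66, 0.60, 0.55],
    },
    "wire-b.example": {
        "ai_authorship_probability": [0.10, 0.11, 0.10, 0.11],
        "credibility": [0.80, 0.81, 0.80, 0.82],
    },
}

# Assumed cutoff: change in score per month that counts as meaningful drift.
DRIFT_THRESHOLD = 0.02

flagged = [
    source
    for source, series in monthly.items()
    if slope(series["ai_authorship_probability"]) > DRIFT_THRESHOLD
    and slope(series["credibility"]) < -DRIFT_THRESHOLD
]
print(flagged)
```

A fitted slope is less sensitive to a single noisy month than a first-versus-last comparison, which is why it is used here.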