Add Semantic Scholar article enrichment, citation graph, and recommendations#223

Merged
imaurer merged 5 commits into main from T016-biomcp
Mar 15, 2026

@imaurer imaurer commented Mar 15, 2026

Summary

  • Adds optional Semantic Scholar integration gated on S2_API_KEY environment variable
  • When key is present, get article output includes a Semantic Scholar section with TLDR, influential citation count, reference count, and open-access PDF metadata
  • Three new article subcommands: citations, references, and recommendations — all require S2_API_KEY and exit with a clear error when absent
  • Multi-seed recommendations supported: article recommendations <id1> <id2> --negative <id3>
  • Rate limiting at 1 req/sec with batch lookup (up to 500 IDs) for bulk cross-referencing
  • Graceful degradation: get article works fully without the key; S2 section is simply omitted
  • Docs updated: docs/reference/data-sources.md, docs/getting-started/api-keys.md, and RUN.md explain the optional key and what it unlocks
  • scripts/contract-smoke.sh probes S2 when key is present; skips cleanly when absent
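The batch lookup limit in the summary can be illustrated with a small chunking helper. This is a minimal sketch, not the code in this PR: `MAX_BATCH_IDS` and `batch_ids` are hypothetical names, and only the 500-ID ceiling comes from the description above.

```rust
/// Maximum number of IDs per batch request, taken from the PR summary.
/// The constant name is hypothetical.
const MAX_BATCH_IDS: usize = 500;

/// Split a list of paper IDs into batches of at most `MAX_BATCH_IDS`,
/// preserving order. Each batch would map to one rate-limited request.
fn batch_ids(ids: &[String]) -> Vec<Vec<String>> {
    ids.chunks(MAX_BATCH_IDS)
        .map(|chunk| chunk.to_vec())
        .collect()
}
```

With 1,201 IDs this yields three batches (500, 500, and 201), so bulk cross-referencing would cost three requests rather than 1,201.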

Changes

New source client: src/sources/semantic_scholar.rs — typed API client for citations, references, recommendations, and paper detail endpoints with 1 req/sec rate limit.
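A 1 req/sec limit like the one described for the source client could be sketched as a minimum-interval limiter. This is a blocking, standard-library-only illustration with hypothetical names (`RateLimiter`, `delay_at`, `acquire`); the actual client in `src/sources/semantic_scholar.rs` may be async and structured differently.

```rust
use std::thread;
use std::time::{Duration, Instant};

/// Minimum-interval rate limiter (hypothetical; not the PR's implementation).
struct RateLimiter {
    min_interval: Duration,
    last_request: Option<Instant>,
}

impl RateLimiter {
    fn new(min_interval: Duration) -> Self {
        RateLimiter { min_interval, last_request: None }
    }

    /// How long a caller arriving at `now` must wait before sending.
    fn delay_at(&self, now: Instant) -> Duration {
        match self.last_request {
            Some(last) => self
                .min_interval
                .saturating_sub(now.saturating_duration_since(last)),
            None => Duration::ZERO,
        }
    }

    /// Block until the interval has elapsed, then record the send time.
    fn acquire(&mut self) {
        let wait = self.delay_at(Instant::now());
        if !wait.is_zero() {
            thread::sleep(wait);
        }
        self.last_request = Some(Instant::now());
    }
}
```

For the limit stated above, a client would construct `RateLimiter::new(Duration::from_secs(1))` and call `acquire()` before every request.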

Article enrichment: in src/entities/article.rs, get article enriches the result with a semantic_scholar block when S2_API_KEY is set; silently skips otherwise.

New CLI commands: src/cli/mod.rs adds article citations, article references, and article recommendations as explicit S2-only navigation commands.

Rendering: src/render/markdown.rs — citation/reference/recommendation tables with context snippets, intents, and influential flag.

Docs and smoke: RUN.md, docs/reference/data-sources.md, docs/getting-started/api-keys.md, scripts/contract-smoke.sh, and spec coverage for all new S2 paths.

Behavior without S2_API_KEY

$ biomcp get article 22663011          # works, no S2 section
$ biomcp article citations 22663011    # exits 1: "API key required: set S2_API_KEY"
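
The two behaviors above (optional enrichment versus a hard requirement) can be sketched as two thin wrappers around the same key lookup. The names `S2Error`, `key_from`, `s2_api_key`, and `require_key` are hypothetical, and treating a blank value as absent is an assumption; only the variable name and the error text come from this PR.

```rust
use std::env;
use std::fmt;

/// Hypothetical error type for S2-only commands.
#[derive(Debug, PartialEq)]
enum S2Error {
    ApiKeyRequired,
}

impl fmt::Display for S2Error {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        match self {
            S2Error::ApiKeyRequired => write!(f, "API key required: set S2_API_KEY"),
        }
    }
}

/// Normalize a raw value: a blank key counts as absent (assumption).
fn key_from(value: Option<String>) -> Option<String> {
    value
        .map(|v| v.trim().to_string())
        .filter(|v| !v.is_empty())
}

/// Optional path: `get article` would skip the S2 section on `None`.
fn s2_api_key() -> Option<String> {
    key_from(env::var("S2_API_KEY").ok())
}

/// Required path: citations, references, and recommendations would fail here.
fn require_key(key: Option<String>) -> Result<String, S2Error> {
    key.ok_or(S2Error::ApiKeyRequired)
}
```

An S2-only command would call `require_key(s2_api_key())`, print the error, and exit with status 1 on failure, while the enrichment path would simply branch on `s2_api_key()`.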

Test plan

  • make check (clippy + fmt) passes
  • cargo test — 647 unit tests pass
  • make test-contracts — contract tests pass
  • spec/06-article.md spec — 16/16 pass including new S2 tests
  • Manual: get article <pmid> tldr shows Semantic Scholar section with key present
  • Manual: article citations <pmid> --limit 3 returns citation table with contexts
  • Manual: article references <pmid> --limit 3 returns reference table
  • Manual: article recommendations <pmid> --limit 3 returns recommendations
  • Manual: all three S2 commands exit 1 cleanly without key

imaurer added 5 commits March 15, 2026 11:07
S2 API uses `citedPaper.*` field prefix in the /references endpoint, but
the implementation used the wrong `referencedPaper.*` prefix, causing HTTP
400 errors on all article references requests. Fixed both the field
constants string and the serde rename on the struct field.
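
The prefix mix-up described in this commit message can be made harder to repeat by deriving the `fields` query value from the endpoint kind. This is a sketch with hypothetical names (`Edge`, `paper_prefix`, `fields_param`); the `citedPaper` prefix comes from the commit message, while `citingPaper` for the citations endpoint is an assumption based on the public Semantic Scholar Graph API.

```rust
/// Which edge endpoint a request targets (hypothetical type).
#[derive(Clone, Copy)]
enum Edge {
    Citations,
    References,
}

impl Edge {
    /// Name of the nested paper object for this endpoint.
    fn paper_prefix(self) -> &'static str {
        match self {
            Edge::Citations => "citingPaper", // assumption, see lead-in
            Edge::References => "citedPaper", // per the commit message
        }
    }
}

/// Build a comma-separated `fields` value: edge-level fields stay bare,
/// paper-level fields get the endpoint-specific prefix.
fn fields_param(edge: Edge, edge_fields: &[&str], paper_fields: &[&str]) -> String {
    let prefix = edge.paper_prefix();
    edge_fields
        .iter()
        .map(|f| f.to_string())
        .chain(paper_fields.iter().map(|f| format!("{}.{}", prefix, f)))
        .collect::<Vec<_>>()
        .join(",")
}
```

Deriving the prefix in one place means the field list and any matching deserialization rename can be checked against a single source of truth.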

This heading has repeated live-network timeout history in GitHub Actions
(60s budget). Per the existing Makefile smoke-lane policy, move it to
the deselect list so provider latency does not block unrelated PRs.
@imaurer imaurer merged commit 6e3e201 into main Mar 15, 2026
5 checks passed
@imaurer imaurer deleted the T016-biomcp branch March 23, 2026 23:43