openalex_search_entities
Active
Tool of @cyanheads/openalex-mcp-server
Search, filter, sort, or retrieve by ID. Covers all OpenAlex entity types (works, authors, sources, institutions, topics, keywords, publishers, funders). Pass `id` to retrieve a single entity. Otherwise, use `query` and/or `filters` for discovery. Supports keyword search with boolean operators, exact phrase matching, and AI semantic search. Use openalex_resolve_name to resolve names to IDs before filtering. Searches and ID lookups return a curated set of fields by default; pass `select` to override with specific fields, or `["*"]` for the full record.
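As a rough illustration of the two call styles described above (single-entity retrieval via `id` versus discovery via `query` and/or `filters`), the sketch below builds two argument payloads as plain Python dicts. The parameter names come from the parameters schema; the concrete query text and filter values are illustrative assumptions, and how the payload is actually sent depends on the MCP client in use.

```python
import json

# Hypothetical argument payloads for openalex_search_entities.
# Parameter names follow the schema; the values are made up for illustration.

# 1. Single-entity lookup: `id` is set, so search/filter/sort params are
#    ignored, but `select` still narrows the returned fields.
lookup_args = {
    "entity_type": "works",
    "id": "10.1038/nature12373",  # DOI example taken from the schema text
    "select": ["doi", "authorships", "primary_topic"],
}

# 2. Discovery: keyword query plus filters. Filter values are always strings.
search_args = {
    "entity_type": "works",
    "query": '"climate change" AND adaptation',  # quoted phrase + boolean AND
    "filters": {
        "publication_year": "2020-2024",  # range
        "type": "article|review",         # OR within a field (illustrative values)
        "cited_by_count": ">100",         # comparison (illustrative key)
    },
    "per_page": 25,
}

print(json.dumps(lookup_args, indent=2))
print(json.dumps(search_args, indent=2))
```

Omitting `select` in either payload would return the curated default fields for the entity type; passing `["*"]` would request the full record.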
Parameters schema
{
"type": "object",
"$schema": "http://json-schema.org/draft-07/schema#",
"required": [
"entity_type"
],
"properties": {
"id": {
"type": "string",
"description": "Retrieve a single entity by ID. Supports: OpenAlex ID (\"W2741809807\"), DOI (\"10.1038/nature12373\"), ORCID (\"0000-0002-1825-0097\"), ROR (\"https://ror.org/00hx57361\"), PMID (\"12345678\"), PMCID (\"PMC1234567\"), ISSN (\"1234-5678\"). When provided, other search/filter/sort params are ignored — but `select` still applies: the curated per-entity-type default is returned unless you pass `select` (use `[\"*\"]` for the complete record). Use openalex_resolve_name to find the ID if unknown."
},
"seed": {
"type": "string",
"description": "Deterministic seed for `sample`. Same seed + same filters = same results — pass when reproducibility matters. Has no effect (and is rejected) without `sample`."
},
"sort": {
"type": "string",
"description": "Sort field. Prefix with \"-\" for descending. Common: \"cited_by_count\", \"-publication_date\", \"-relevance_score\" (default when query present). Note: when combined with a keyword query, an explicit sort overrides relevance ranking entirely — top results may be highly cited but only tangentially on-topic. Use \"-relevance_score\" or omit sort to keep the most relevant results first. \"-relevance_score\" requires an active search via \"query\" or a \"filter:search\" filter — passing it without one will fail."
},
"query": {
"type": "string",
"description": "Text search query. Supports boolean operators (AND, OR, NOT), quoted phrases (\"exact match\"), wildcards (machin*), fuzzy matching (machin~1), and proximity (\"climate change\"~5). Omit for filter-only queries."
},
"cursor": {
"type": "string",
"description": "Pagination cursor from a previous response. Pass to get the next page."
},
"sample": {
"type": "integer",
"maximum": 100,
"minimum": 1,
"description": "Return a random sample of this many entities matching the filters (1-100). Single page only — pagination via `cursor` is not supported with sampling. Overrides `per_page`. Useful for unbiased exploration: spot-checking filter correctness, stratified review prompts, or generating exploration sets without bias toward most-cited."
},
"select": {
"type": "array",
"items": {
"type": "string"
},
"description": "OpenAlex top-level field names to return. Always returned: `id`, `display_name` — additional fields you list are appended. A curated default per entity type applies to both searches and single-entity (`id`) lookups; pass field names to override it, or `[\"*\"]` to retrieve the complete record (every field). Invalid field names produce an error identifying the rejected field. Example: [\"doi\", \"authorships\", \"primary_topic\"]."
},
"filters": {
"type": "object",
"description": "Filter criteria as field:value pairs. AND across fields (multiple keys). OR within field: pipe-separate (\"us|gb\"). NOT: prefix \"!\" (\"!us\"). Range: \"2020-2024\". Comparison: \">100\", \"<50\". AND within same field: \"+\"-separate. Use OpenAlex IDs (not names) for entity filters — resolve names first. Common keys: `openalex` (filter by entity ID, e.g. {\"openalex\": \"W123|W456\"}), `cites` (works citing a given work), `publication_year` (range \"2020-2024\"), `authorships.author.id`, `type`, `is_oa`.",
"propertyNames": {
"type": "string"
},
"additionalProperties": {
"type": "string"
}
},
"per_page": {
"type": "integer",
"default": 25,
"maximum": 100,
"minimum": 1,
"description": "Results per page (1-100). Default 25. Semantic search caps at 50 — when search_mode=\"semantic\", set per_page ≤ 50 (also subject to a 1 req/sec rate limit upstream)."
},
"entity_type": {
"enum": [
"works",
"authors",
"sources",
"institutions",
"topics",
"keywords",
"publishers",
"funders"
],
"type": "string",
"description": "Type of scholarly entity to search."
},
"search_mode": {
"enum": [
"keyword",
"exact",
"semantic"
],
"type": "string",
"default": "keyword",
"description": "Search strategy. \"keyword\": stemmed full-text (default). \"exact\": no stemming, matches individual words (use quoted phrases for multi-word exact match). \"semantic\": AI embedding similarity (max 50 results, 1 req/sec)."
}
}
}
Parent server
@cyanheads/openalex-mcp-server
https://github.com/cyanheads/openalex-mcp-server
2/7 registries
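Several constraints in the parameter descriptions span more than one field (`seed` needs `sample`, sampling excludes `cursor`, semantic search caps `per_page` at 50, and "-relevance_score" needs an active search). A minimal client-side pre-flight check for discovery calls might look like the sketch below. The function name `check_search_args` and its messages are hypothetical, it does not cover `id` lookups, and the server's own validation may differ.

```python
ENTITY_TYPES = {
    "works", "authors", "sources", "institutions",
    "topics", "keywords", "publishers", "funders",
}


def check_search_args(args: dict) -> list[str]:
    """Collect constraint violations for a discovery (non-`id`) call."""
    problems = []
    filters = args.get("filters", {})

    if args.get("entity_type") not in ENTITY_TYPES:
        problems.append("entity_type is required and must be a known type")

    # `seed` only has meaning together with `sample`.
    if "seed" in args and "sample" not in args:
        problems.append("seed requires sample")

    # Sampling returns a single page, so it cannot be combined with a cursor.
    if "sample" in args and "cursor" in args:
        problems.append("sample cannot be combined with cursor")

    # Both integer parameters share the same 1-100 bounds in the schema.
    for name in ("sample", "per_page"):
        if name in args and not 1 <= args[name] <= 100:
            problems.append(f"{name} must be between 1 and 100")

    # Semantic search caps results at 50 per request.
    if args.get("search_mode") == "semantic" and args.get("per_page", 25) > 50:
        problems.append("semantic search allows per_page of at most 50")

    # Assumption: any filter key containing "search" counts as an active search.
    has_search = bool(args.get("query")) or any("search" in k for k in filters)
    if args.get("sort") == "-relevance_score" and not has_search:
        problems.append("-relevance_score needs query or a search filter")

    return problems


ok = {"entity_type": "works", "query": "graph neural networks",
      "sample": 20, "seed": "42"}
bad = {"entity_type": "works", "seed": "42", "sort": "-relevance_score"}

print(check_search_args(ok))
print(check_search_args(bad))
```

Returning a list of problems rather than raising on the first one lets a caller report every violated constraint at once before making a request.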