openalex_search_entities

Active

Tool of @cyanheads/openalex-mcp-server (declared in 0.7.2)

Search, filter, sort, or retrieve by ID. Covers all OpenAlex entity types (works, authors, sources, institutions, topics, keywords, publishers, funders). Pass `id` to retrieve a single entity. Otherwise, use `query` and/or `filters` for discovery. Supports keyword search with boolean operators, exact phrase matching, and AI semantic search. Use openalex_resolve_name to resolve names to IDs before filtering. Searches and ID lookups return a curated set of fields by default; pass `select` to override with specific fields, or `["*"]` for the full record.
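
The two call shapes described above (a single-entity lookup via `id`, and discovery via `query` and/or `filters`) might look like this as argument payloads. This is a minimal sketch in Python: the helper names `lookup_args` and `search_args` are hypothetical and not part of the server, the filter values are illustrative, and how the payload is actually sent depends on your MCP client.

```python
import json


def lookup_args(entity_type, entity_id, select=None):
    """Build arguments for a single-entity lookup (hypothetical helper).

    Search, filter, and sort parameters are left out because the tool
    description says they are ignored when `id` is provided.
    """
    args = {"entity_type": entity_type, "id": entity_id}
    if select is not None:
        args["select"] = list(select)
    return args


def search_args(entity_type, query=None, filters=None, **extra):
    """Build arguments for a discovery search (hypothetical helper)."""
    args = {"entity_type": entity_type}
    if query is not None:
        args["query"] = query
    if filters:
        # The schema declares every filter value as a string.
        args["filters"] = {key: str(value) for key, value in filters.items()}
    args.update(extra)
    return args


# Full record for one work, using the example OpenAlex ID from the schema.
full_record = lookup_args("works", "W2741809807", select=["*"])

# Discovery: boolean keyword query plus two filters (AND across fields).
recent_oa = search_args(
    "works",
    query='"climate change" AND adaptation',
    filters={"publication_year": "2020-2024", "is_oa": "true"},
    per_page=10,
)

print(json.dumps(full_record, indent=2))
print(json.dumps(recent_oa, indent=2))
```

Keeping the two shapes in separate helpers makes it harder to mix `id` with search parameters that would be silently ignored.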

Parameters schema

{
  "type": "object",
  "$schema": "http://json-schema.org/draft-07/schema#",
  "required": [
    "entity_type"
  ],
  "properties": {
    "id": {
      "type": "string",
      "description": "Retrieve a single entity by ID. Supports: OpenAlex ID (\"W2741809807\"), DOI (\"10.1038/nature12373\"), ORCID (\"0000-0002-1825-0097\"), ROR (\"https://ror.org/00hx57361\"), PMID (\"12345678\"), PMCID (\"PMC1234567\"), ISSN (\"1234-5678\"). When provided, other search/filter/sort params are ignored — but `select` still applies: the curated per-entity-type default is returned unless you pass `select` (use `[\"*\"]` for the complete record). Use openalex_resolve_name to find the ID if unknown."
    },
    "seed": {
      "type": "string",
      "description": "Deterministic seed for `sample`. Same seed + same filters = same results — pass when reproducibility matters. Has no effect (and is rejected) without `sample`."
    },
    "sort": {
      "type": "string",
      "description": "Sort field. Prefix with \"-\" for descending. Common: \"cited_by_count\", \"-publication_date\", \"-relevance_score\" (default when query present). Note: when combined with a keyword query, an explicit sort overrides relevance ranking entirely — top results may be highly cited but only tangentially on-topic. Use \"-relevance_score\" or omit sort to keep the most relevant results first. \"-relevance_score\" requires an active search via \"query\" or a \"filter:search\" filter — passing it without one will fail."
    },
    "query": {
      "type": "string",
      "description": "Text search query. Supports boolean operators (AND, OR, NOT), quoted phrases (\"exact match\"), wildcards (machin*), fuzzy matching (machin~1), and proximity (\"climate change\"~5). Omit for filter-only queries."
    },
    "cursor": {
      "type": "string",
      "description": "Pagination cursor from a previous response. Pass to get the next page."
    },
    "sample": {
      "type": "integer",
      "maximum": 100,
      "minimum": 1,
      "description": "Return a random sample of this many entities matching the filters (1-100). Single page only — pagination via `cursor` is not supported with sampling. Overrides `per_page`. Useful for unbiased exploration: spot-checking filter correctness, stratified review prompts, or generating exploration sets without bias toward most-cited."
    },
    "select": {
      "type": "array",
      "items": {
        "type": "string"
      },
      "description": "OpenAlex top-level field names to return. Always returned: `id`, `display_name` — additional fields you list are appended. A curated default per entity type applies to both searches and single-entity (`id`) lookups; pass field names to override it, or `[\"*\"]` to retrieve the complete record (every field). Invalid field names produce an error identifying the rejected field. Example: [\"doi\", \"authorships\", \"primary_topic\"]."
    },
    "filters": {
      "type": "object",
      "description": "Filter criteria as field:value pairs. AND across fields (multiple keys). OR within field: pipe-separate (\"us|gb\"). NOT: prefix \"!\" (\"!us\"). Range: \"2020-2024\". Comparison: \">100\", \"<50\". AND within same field: \"+\"-separate. Use OpenAlex IDs (not names) for entity filters — resolve names first. Common keys: `openalex` (filter by entity ID, e.g. {\"openalex\": \"W123|W456\"}), `cites` (works citing a given work), `publication_year` (range \"2020-2024\"), `authorships.author.id`, `type`, `is_oa`.",
      "propertyNames": {
        "type": "string"
      },
      "additionalProperties": {
        "type": "string"
      }
    },
    "per_page": {
      "type": "integer",
      "default": 25,
      "maximum": 100,
      "minimum": 1,
      "description": "Results per page (1-100). Default 25. Semantic search caps at 50 — when search_mode=\"semantic\", set per_page ≤ 50 (also subject to a 1 req/sec rate limit upstream)."
    },
    "entity_type": {
      "enum": [
        "works",
        "authors",
        "sources",
        "institutions",
        "topics",
        "keywords",
        "publishers",
        "funders"
      ],
      "type": "string",
      "description": "Type of scholarly entity to search."
    },
    "search_mode": {
      "enum": [
        "keyword",
        "exact",
        "semantic"
      ],
      "type": "string",
      "default": "keyword",
      "description": "Search strategy. \"keyword\": stemmed full-text (default). \"exact\": no stemming, matches individual words (use quoted phrases for multi-word exact match). \"semantic\": AI embedding similarity (max 50 results, 1 req/sec)."
    }
  }
}
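
Several constraints in the schema above live only in the description strings (for example, `seed` requires `sample`, and semantic search caps `per_page` at 50), so a JSON Schema validator alone would not catch them. The following is a rough client-side pre-check, assuming those descriptions are accurate. `check_args` is a hypothetical helper, not the server's own validation, and it may be stricter or looser than what the server actually enforces.

```python
ENTITY_TYPES = {
    "works", "authors", "sources", "institutions",
    "topics", "keywords", "publishers", "funders",
}
SEARCH_MODES = {"keyword", "exact", "semantic"}


def _is_int_in_range(value, low, high):
    # bool is a subclass of int in Python, so exclude it explicitly.
    return isinstance(value, int) and not isinstance(value, bool) and low <= value <= high


def check_args(args):
    """Return a list of human-readable problems; an empty list means no issue was found."""
    problems = []

    if args.get("entity_type") not in ENTITY_TYPES:
        problems.append("entity_type is required and must be one of the enum values")

    per_page = args.get("per_page", 25)  # schema default is 25
    if not _is_int_in_range(per_page, 1, 100):
        problems.append("per_page must be an integer from 1 to 100")

    if "sample" in args and not _is_int_in_range(args["sample"], 1, 100):
        problems.append("sample must be an integer from 1 to 100")

    if "seed" in args and "sample" not in args:
        problems.append("seed is rejected without sample")

    if "sample" in args and "cursor" in args:
        problems.append("cursor pagination is not supported with sample")

    mode = args.get("search_mode", "keyword")  # schema default is "keyword"
    if mode not in SEARCH_MODES:
        problems.append("search_mode must be keyword, exact, or semantic")
    elif mode == "semantic" and _is_int_in_range(per_page, 51, 100):
        problems.append("semantic search caps per_page at 50")

    sort = args.get("sort", "")
    if isinstance(sort, str) and sort.lstrip("-") == "relevance_score":
        # Assumption: any filter key containing "search" counts as an active search.
        has_search = "query" in args or any(
            "search" in key for key in args.get("filters", {})
        )
        if not has_search:
            problems.append("relevance_score sort requires a query or a search filter")

    return problems


print(check_args({"entity_type": "works", "seed": "abc", "sort": "-relevance_score"}))
```

Returning a list of problems rather than raising on the first one lets a caller report every issue in a single pass before spending a request against the rate-limited upstream API.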

What this tool wraps · 1 endpoint


Parent server

@cyanheads/openalex-mcp-server

https://github.com/cyanheads/openalex-mcp-server

2/7 registries