fillin_answer

Active

declared in 1.27.0

Synthesized post-cutoff answer with inline citations. Use this when your model is small / cheap / weaker at tool-result synthesis (Llama, Gemini Flash, Mistral, Nemotron, Qwen). Fillin runs a server-side LLM pass over the retrieved post-cutoff documents and returns a 150-250 word answer with [title](url) citations already embedded, so you can quote it directly. Premium models (Opus, Sonnet, GPT-4o) usually get better results from `fillin_query` and synthesizing themselves, but this tool works for any caller. Costs more than `fillin_query` because of the synthesis pass.

Returns: A dict with:

- answer: the synthesized paragraph (str | None)
- citations: list of {title, url} extracted from the answer
- corpus_match: "strong" | "weak" | "none" (quality of retrieval)
- top_score: float, the top reranked similarity score
- model: the synthesizer model used (e.g. claude-haiku-4-5)
- reason: set when answer is None (e.g. "no_relevant_docs")
- results: raw post-cutoff documents (same shape as `fillin_query`)
- cutoff, query, gap_days: echoes for context
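The return fields described above suggest a simple way for a caller to decide whether the answer can be quoted. The following is a minimal sketch, not part of Fillin itself: the helper name `render_answer` and the policy of prefixing a caution when `corpus_match` is "weak" are assumptions made for illustration. Only the field names (`answer`, `reason`, `corpus_match`) come from the description.

```python
from typing import Any, Mapping


def render_answer(response: Mapping[str, Any]) -> str:
    """Turn a fillin_answer result dict into text a caller could show.

    Hypothetical helper: the fallback wording and the "weak" prefix are
    illustrative choices, not behavior defined by the tool.
    """
    answer = response.get("answer")
    if answer is None:
        # Per the description, "reason" is set when "answer" is None.
        reason = response.get("reason", "unknown")
        return f"No answer available (reason: {reason})"
    if response.get("corpus_match") == "weak":
        # Assumed policy: flag weakly grounded answers instead of hiding them.
        return "[weak retrieval] " + answer
    return answer
```

A caller that prefers to be stricter could fall back to the raw `results` list whenever `corpus_match` is not "strong".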

Parameters schema

{
  "type": "object",
  "title": "fillin_answerArguments",
  "required": [
    "query",
    "cutoff"
  ],
  "properties": {
    "k": {
      "type": "integer",
      "title": "K",
      "default": 5,
      "maximum": 20,
      "minimum": 1,
      "description": "Number of documents to ground the answer in (1-20)."
    },
    "query": {
      "type": "string",
      "title": "Query",
      "maxLength": 512,
      "minLength": 1,
      "description": "Natural-language question, max 512 chars."
    },
    "cutoff": {
      "type": "string",
      "title": "Cutoff",
      "pattern": "^\\d{4}-\\d{2}-\\d{2}(T.*)?$",
      "description": "Training cutoff as ISO-8601 date (e.g. 2026-01-01)."
    }
  }
}
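The constraints in the schema above can be checked on the client side before a call is made. This is a rough sketch using only the Python standard library; the function name `build_fillin_answer_args` is hypothetical, and the checks simply mirror the `minLength`, `maxLength`, `pattern`, `minimum`, `maximum`, and `default` values shown in the schema.

```python
import re
from typing import Any, Dict

# Same shape as the schema's "pattern"; fullmatch() supplies the anchors.
_CUTOFF_RE = re.compile(r"\d{4}-\d{2}-\d{2}(T.*)?")


def build_fillin_answer_args(query: str, cutoff: str, k: int = 5) -> Dict[str, Any]:
    """Validate and assemble arguments for fillin_answer (hypothetical helper)."""
    if not isinstance(query, str) or not 1 <= len(query) <= 512:
        raise ValueError("query must be a string of 1-512 characters")
    if not isinstance(cutoff, str) or _CUTOFF_RE.fullmatch(cutoff) is None:
        raise ValueError("cutoff must look like YYYY-MM-DD, optionally followed by T...")
    # bool is a subclass of int in Python, so reject it explicitly.
    if isinstance(k, bool) or not isinstance(k, int) or not 1 <= k <= 20:
        raise ValueError("k must be an integer between 1 and 20")
    return {"query": query, "cutoff": cutoff, "k": k}
```

Note that the pattern only checks the shape of the date, so a value such as 2026-13-40 would pass here just as it would pass the schema.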

What this tool wraps · 1 endpoint

min confidence: 0.70 0.50

Data resource: Mistral AI API
Mistral AI · ai
name mention: 0.60

Parent server

Fillin

1/7 registries
