fillin_answer
Active
Tool of Fillin
Synthesized post-cutoff answer with inline citations. Use this when your model is small, cheap, or weaker at tool-result synthesis (Llama, Gemini Flash, Mistral, Nemotron, Qwen). Fillin runs a server-side LLM pass over the retrieved post-cutoff documents and returns a 150-250 word answer with [title](url) citations already embedded, so you can quote it directly. Premium models (Opus, Sonnet, GPT-4o) usually get better results from calling `fillin_query` and synthesizing the answer themselves, but this tool works for any caller. It costs more than `fillin_query` because of the synthesis pass.

Returns: a dict with:
- answer: the synthesized paragraph (str | None)
- citations: list of {title, url} extracted from the answer
- corpus_match: "strong" | "weak" | "none" (quality of retrieval)
- top_score: float, the top reranked similarity score
- model: the synthesizer model used (e.g. claude-haiku-4-5)
- reason: set when answer is None (e.g. "no_relevant_docs")
- results: raw post-cutoff documents (same shape as fillin_query)
- cutoff, query, gap_days: echoes for context
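To make the return shape concrete, here is a minimal Python sketch of how a caller might branch on the fields listed above. The function name `summarize_fillin_answer` and the policy of flagging anything other than a "strong" corpus match are assumptions for illustration and are not part of Fillin. Only the field names (`answer`, `reason`, `corpus_match`) come from the description.

```python
from typing import Any, Optional


def summarize_fillin_answer(response: dict[str, Any]) -> str:
    """Turn a fillin_answer result dict into one string for the caller.

    Hypothetical helper: the field names follow the tool description,
    but the handling policy below is an assumption.
    """
    answer: Optional[str] = response.get("answer")
    if answer is None:
        # `reason` is set when no answer was synthesized,
        # e.g. "no_relevant_docs".
        return f"No answer ({response.get('reason', 'unknown')})"

    match = response.get("corpus_match", "none")
    if match != "strong":
        # Assumed policy: pass the answer through, but label weak retrieval.
        return f"[low-confidence retrieval: {match}] {answer}"

    # Strong match: the answer already embeds [title](url) citations.
    return answer
```

A caller that wants to be stricter could also inspect `top_score` or fall back to the raw `results` when `answer` is None. The description does not state any threshold for that, so none is assumed here.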
Parameters schema
{
  "type": "object",
  "title": "fillin_answerArguments",
  "required": [
    "query",
    "cutoff"
  ],
  "properties": {
    "k": {
      "type": "integer",
      "title": "K",
      "default": 5,
      "maximum": 20,
      "minimum": 1,
      "description": "Number of documents to ground the answer in (1-20)."
    },
    "query": {
      "type": "string",
      "title": "Query",
      "maxLength": 512,
      "minLength": 1,
      "description": "Natural-language question, max 512 chars."
    },
    "cutoff": {
      "type": "string",
      "title": "Cutoff",
      "pattern": "^\\d{4}-\\d{2}-\\d{2}(T.*)?$",
      "description": "Training cutoff as ISO-8601 date (e.g. 2026-01-01)."
    }
  }
}
Parent server
Fillin
1/7 registries
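
As a rough illustration of the parameters schema for `fillin_answer`, the following Python sketch checks arguments locally before a call is made. The helper name `build_fillin_answer_args` is hypothetical. The limits (query length 1-512, `k` between 1 and 20 with default 5, and the `cutoff` pattern) are copied from the schema. How the arguments are then sent to the server is not covered here.

```python
import re

# Pattern copied from the schema's "cutoff" property.
CUTOFF_PATTERN = re.compile(r"^\d{4}-\d{2}-\d{2}(T.*)?$")


def build_fillin_answer_args(query: str, cutoff: str, k: int = 5) -> dict:
    """Validate arguments against the schema limits and return them as a dict.

    Hypothetical helper: it only mirrors the schema constraints locally.
    """
    if not 1 <= len(query) <= 512:
        raise ValueError("query must be 1-512 characters")

    # fullmatch keeps a trailing newline from slipping past the `$` anchor.
    if not CUTOFF_PATTERN.fullmatch(cutoff):
        raise ValueError("cutoff must look like YYYY-MM-DD, optionally followed by T...")

    # bool is a subclass of int in Python, so reject it explicitly.
    if isinstance(k, bool) or not isinstance(k, int) or not 1 <= k <= 20:
        raise ValueError("k must be an integer from 1 to 20")

    return {"query": query, "cutoff": cutoff, "k": k}
```

Note that the pattern only checks the shape of the date, so a value such as `2026-13-45` would pass this check. Whether the server rejects such values is not stated in the listing.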