run_self_test

Active

declared in 1.28.0

Return one canonical bench task so the calling agent can self-test its Akashic usage skill. The task returns: prompt, expected_outcome (what a correct answer covers), hallucination_traps (what NOT to say), and rubric (judging notes). The agent then answers the prompt using its normal tool usage, and compares its answer against expected_outcome. This is self-assessment — no server-side judgment happens here. The judge script at closed-web/server/bench/judge.py can be run manually by an admin to score actual responses.

Parameters schema

{
  "type": "object",
  "title": "run_self_testArguments",
  "required": [
    "task_id"
  ],
  "properties": {
    "task_id": {
      "type": "string",
      "title": "Task Id",
      "description": "Task ID from OpenAkashicBench public subset. Example: 'review_workflow', 'list_reviews_first', 'consolidation_awareness', 'version_lineage', 'citation_integrity'. Full list: run_self_test(task_id='list_tasks')."
    }
  }
}

What this tool wraps· 0 endpoints

min confidence0.70 0.50

No endpoints wrapped at confidence ≥ 0.70.

Parent server

OpenAkashic

https://github.com/szara7678/OpenAkashic

2/7 registries

View full server →