run_eval_contract

Active

Tool of IA-QA — 130+ QA & Dev Tools for AI Agents

declared in 1.0.0

Parse a .ia-eval.yaml LLM test suite, call the specified model for each scenario, run all configured scorers, and return a structured JSON report with per-scenario Pass/Fail verdicts and a Markdown summary. Use list_local_tests to discover available test files.

Parameters schema

{
  "type": "object",
  "properties": {
    "api_keys": {
      "type": "object",
      "properties": {
        "hf": {
          "type": "string"
        },
        "groq": {
          "type": "string"
        },
        "google": {
          "type": "string"
        },
        "openai": {
          "type": "string"
        },
        "anthropic": {
          "type": "string"
        }
      },
      "description": "API keys to use for LLM generation (all optional — falls back to server env vars)"
    },
    "overrides": {
      "type": "object",
      "properties": {
        "model": {
          "type": "string"
        },
        "provider": {
          "type": "string"
        },
        "temperature": {
          "type": "number"
        },
        "system_prompt": {
          "type": "string"
        }
      },
      "description": "Override contract defaults"
    },
    "contract_path": {
      "type": "string",
      "description": "Absolute or relative path to a .ia-eval.yaml file (required unless inline_contract is provided)"
    },
    "inline_contract": {
      "type": "object",
      "description": "Raw contract object (alternative to contract_path)",
      "additionalProperties": true
    }
  }
}
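
The schema above can be exercised with a small client-side helper. The following is a minimal sketch, not the server's own code: the function names `build_run_eval_arguments` and `build_tool_call`, the example path `tests/example.ia-eval.yaml`, and the strict rejection of unknown keys are all assumptions made for illustration (the schema itself does not forbid extra keys). The request envelope follows the general MCP `tools/call` JSON-RPC shape; how it is transported to the server depends on the client in use.

```python
import json
from typing import Any, Optional

# Keys taken from the parameters schema above.
ALLOWED_OVERRIDES = {"model", "provider", "temperature", "system_prompt"}
ALLOWED_API_KEYS = {"hf", "groq", "google", "openai", "anthropic"}


def build_run_eval_arguments(
    contract_path: Optional[str] = None,
    inline_contract: Optional[dict] = None,
    overrides: Optional[dict] = None,
    api_keys: Optional[dict] = None,
) -> dict:
    """Assemble an arguments object for run_eval_contract (hypothetical helper)."""
    # The schema says contract_path is required unless inline_contract is given.
    if contract_path is None and inline_contract is None:
        raise ValueError("provide contract_path or inline_contract")

    args: dict = {}
    if contract_path is not None:
        args["contract_path"] = contract_path
    if inline_contract is not None:
        args["inline_contract"] = inline_contract

    # Assumption: reject unknown keys to catch typos early. This is stricter
    # than the published schema, which does not set additionalProperties here.
    if overrides:
        unknown = set(overrides) - ALLOWED_OVERRIDES
        if unknown:
            raise ValueError(f"unknown override keys: {sorted(unknown)}")
        args["overrides"] = dict(overrides)
    if api_keys:
        unknown = set(api_keys) - ALLOWED_API_KEYS
        if unknown:
            raise ValueError(f"unknown api_keys entries: {sorted(unknown)}")
        args["api_keys"] = dict(api_keys)
    return args


def build_tool_call(arguments: dict, request_id: int = 1) -> dict:
    """Wrap the arguments in an MCP-style JSON-RPC tools/call request."""
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": "run_eval_contract", "arguments": arguments},
    }


if __name__ == "__main__":
    # Hypothetical contract path; api_keys omitted so the server would fall
    # back to its own environment variables, as the schema describes.
    arguments = build_run_eval_arguments(
        contract_path="tests/example.ia-eval.yaml",
        overrides={"temperature": 0.0},
    )
    print(json.dumps(build_tool_call(arguments), indent=2))
```

Leaving `api_keys` out keeps secrets off the wire when the server already has them configured; pass it only when a per-call key is needed.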

What this tool wraps: 0 endpoints

No endpoints wrapped at confidence ≥ 0.50.

Parent server

IA-QA — 130+ QA & Dev Tools for AI Agents

https://github.com/jcjamet/ia-qa

1/7 registries