PRSM

generate_eval_yaml

Active

Tool of IA-QA — 130+ QA & Dev Tools for AI Agents

declared in 1.0.0

Generate a complete .ia-eval.yaml evaluation contract from a plain-language description of what your LLM should do. Generation runs server-side on Groq llama-3.3-70b, so the caller needs no API key. The returned YAML is ready to run with the LLM Test Runner (run_eval_contract). Evaluators (cosine_similarity, contains_check, hallucination_check, etc.) are chosen to suit the task type.

Parameters schema

{
  "type": "object",
  "required": [
    "description"
  ],
  "properties": {
    "task_type": {
      "enum": [
        "rag",
        "summarization",
        "classification",
        "safety",
        "customer_support",
        "code_gen"
      ],
      "type": "string",
      "description": "Optional task type hint to guide evaluator selection."
    },
    "description": {
      "type": "string",
      "description": "Plain-language description of what the LLM under test should do. Be specific: describe inputs, expected behaviour, and constraints."
    },
    "system_prompt": {
      "type": "string",
      "description": "Optional system prompt of the LLM under test. Helps generate more accurate test cases."
    },
    "scenario_count": {
      "enum": [
        3,
        5,
        8
      ],
      "type": "number",
      "description": "Number of scenarios to generate (default: 5). Covers happy path + edge cases + adversarial."
    }
  }
}
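
The schema above can be checked on the client before the tool is called. The following is a minimal sketch in Python, assuming the caller assembles the arguments as a plain dict; the helper name build_arguments is hypothetical and is not part of IA-QA, and how the arguments are actually sent (for example through an MCP client) is left out because the listing does not describe it. Only the field names, enum values, and the required "description" field are taken from the schema.

```python
import json

# Allowed values copied from the parameters schema above.
TASK_TYPES = {
    "rag", "summarization", "classification",
    "safety", "customer_support", "code_gen",
}
SCENARIO_COUNTS = {3, 5, 8}


def build_arguments(description, task_type=None, system_prompt=None,
                    scenario_count=None):
    """Hypothetical helper: return an arguments dict matching the schema.

    Raises ValueError when a value falls outside what the schema allows.
    Optional fields are omitted entirely when not supplied.
    """
    if not isinstance(description, str):
        raise ValueError("description is required and must be a string")
    args = {"description": description}

    if task_type is not None:
        if not isinstance(task_type, str) or task_type not in TASK_TYPES:
            raise ValueError(f"task_type must be one of {sorted(TASK_TYPES)}")
        args["task_type"] = task_type

    if system_prompt is not None:
        if not isinstance(system_prompt, str):
            raise ValueError("system_prompt must be a string")
        args["system_prompt"] = system_prompt

    if scenario_count is not None:
        # bool is a subclass of int in Python, so reject it explicitly.
        if isinstance(scenario_count, bool) or scenario_count not in SCENARIO_COUNTS:
            raise ValueError(f"scenario_count must be one of {sorted(SCENARIO_COUNTS)}")
        args["scenario_count"] = scenario_count

    return args


if __name__ == "__main__":
    # Illustrative values only; the description text is made up for the example.
    example = build_arguments(
        "Answer billing questions using only the provided knowledge base; "
        "refuse requests for other customers' data.",
        task_type="customer_support",
        scenario_count=3,
    )
    print(json.dumps(example, indent=2))
```

Leaving scenario_count out of the dict lets the server apply its documented default of 5 rather than hard-coding that default on the client.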

What this tool wraps · 1 endpoint


Parent server

IA-QA — 130+ QA & Dev Tools for AI Agents

https://github.com/jcjamet/ia-qa

1/7 registries