test_skill

Active

Tool of IA-QA — 130+ QA & Dev Tools for AI Agents

declared in 1.0.0

Validate a SKILL.md definition (Cursor / GitHub Copilot / Windsurf) by auto-generating trigger-positive and trigger-negative scenarios, running each through the model with the skill injected as a system prompt, and scoring trigger accuracy and step adherence. Returns a PASS/FIX/BLOCK verdict with a per-scenario breakdown. Uses Groq llama-3.3-70b-versatile by default (server key, no api_key needed). Pass both api_key and model to use your own provider.
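To make the trigger-accuracy idea concrete, here is a minimal sketch. It assumes each scenario records whether the skill was expected to trigger and whether it actually did, and it treats accuracy as the fraction of scenarios where the two agree. The `Scenario` type and `trigger_accuracy` function are hypothetical names for illustration only. The tool's actual scoring formula and its PASS/FIX/BLOCK thresholds are not documented here, so none are shown.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Scenario:
    """One generated test case (hypothetical shape, not the tool's real output)."""
    prompt: str
    should_trigger: bool  # True for trigger-positive, False for trigger-negative
    did_trigger: bool     # what the model actually did with the skill injected


def trigger_accuracy(scenarios: List[Scenario]) -> float:
    """Fraction of scenarios where observed triggering matched the expectation."""
    if not scenarios:
        raise ValueError("at least one scenario is required")
    correct = sum(1 for s in scenarios if s.should_trigger == s.did_trigger)
    return correct / len(scenarios)


# Illustrative data: two trigger-positive and two trigger-negative scenarios.
example = [
    Scenario("Write tests for this module", should_trigger=True, did_trigger=True),
    Scenario("Add unit tests to utils.py", should_trigger=True, did_trigger=False),
    Scenario("What is the capital of France?", should_trigger=False, did_trigger=False),
    Scenario("Rename this variable", should_trigger=False, did_trigger=False),
]
```

With this data, one trigger-positive scenario is a miss, so three of the four scenarios count as correct.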

Parameters schema

{
  "type": "object",
  "required": [
    "skill_md"
  ],
  "properties": {
    "model": {
      "type": "string",
      "description": "LLM model ID to use for both scenario generation and testing (e.g. gpt-4o-mini, claude-3-5-haiku-20241022). Defaults to llama-3.3-70b-versatile (Groq, server key)."
    },
    "api_key": {
      "type": "string",
      "description": "API key for the chosen model provider. Not required when using the default Groq model."
    },
    "skill_md": {
      "type": "string",
      "description": "Full content of the SKILL.md file to test. Must include a name, a \"Use when:\" trigger description, and at least one step."
    },
    "scenario_count": {
      "enum": [
        4,
        6,
        8,
        10
      ],
      "type": "number",
      "description": "Number of test scenarios to generate: half trigger-positive, half trigger-negative. Default: 6."
    }
  }
}
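The schema above can be exercised with a small argument builder. This is a sketch under stated assumptions: `build_test_skill_arguments` and the sample SKILL.md text are hypothetical, and how the arguments are actually sent to the server (MCP client, HTTP, etc.) is not covered by this page, so the code only builds and checks the argument object locally.

```python
import json
from typing import Any, Dict, Optional

# Allowed values taken from the "scenario_count" enum in the schema.
ALLOWED_SCENARIO_COUNTS = (4, 6, 8, 10)


def build_test_skill_arguments(
    skill_md: str,
    scenario_count: Optional[int] = None,
    model: Optional[str] = None,
    api_key: Optional[str] = None,
) -> Dict[str, Any]:
    """Build an arguments object matching the schema (hypothetical helper)."""
    # "skill_md" is the only required property.
    if not isinstance(skill_md, str) or not skill_md.strip():
        raise ValueError("skill_md is required and must be a non-empty string")
    args: Dict[str, Any] = {"skill_md": skill_md}
    if scenario_count is not None:
        if scenario_count not in ALLOWED_SCENARIO_COUNTS:
            raise ValueError(f"scenario_count must be one of {ALLOWED_SCENARIO_COUNTS}")
        args["scenario_count"] = scenario_count
    # Optional overrides; omitted keys fall back to the server defaults.
    if model is not None:
        args["model"] = model
    if api_key is not None:
        args["api_key"] = api_key
    return args


# Hypothetical SKILL.md content with a name, a "Use when:" trigger, and one step.
SAMPLE_SKILL_MD = """# write-unit-tests
Use when: the user asks for unit tests for existing code.
1. Identify the public functions and write one test per behavior.
"""

arguments = build_test_skill_arguments(SAMPLE_SKILL_MD, scenario_count=4)
print(json.dumps(arguments, indent=2))
```

Leaving out `model` and `api_key` keeps the call on the default Groq model, which the description says needs no key.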

What this tool wraps · 1 endpoint

min confidence: 0.70 0.50

Data resource: Groq API
Groq · ai
name mention: 0.65

Parent server

IA-QA — 130+ QA & Dev Tools for AI Agents

https://github.com/jcjamet/ia-qa

1/7 registries
