similarity_score

Active

Tool of IA-QA — 130+ QA & Dev Tools for AI Agents

declared in 1.0.0

Computes text similarity between a reference text and a hypothesis text using several metrics: cosine similarity (bag-of-words and TF-IDF), Jaccard, ROUGE-1, ROUGE-2, ROUGE-L, and BLEU. No API key is needed. Suited to LLM evaluation (expected vs. actual output), RAG quality checks, and NLG benchmarking. Supports batch mode.
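To make two of the listed metrics concrete, here is a minimal local sketch of Jaccard and ROUGE-L F1 in Python. It is not the tool's implementation: the lowercase whitespace tokenizer, the unweighted F1 formula, and the function names tokenize, jaccard, lcs_length and rouge_l_f1 are all assumptions made for illustration, and the tool's own scores may differ.

```python
def tokenize(text):
    # Assumption: lowercase + whitespace split. The tool's tokenizer is not documented here.
    return text.lower().split()


def jaccard(reference, hypothesis):
    # Jaccard similarity: shared unique tokens divided by all unique tokens.
    ref, hyp = set(tokenize(reference)), set(tokenize(hypothesis))
    if not ref and not hyp:
        return 1.0  # convention chosen for this sketch: two empty texts are identical
    return len(ref & hyp) / len(ref | hyp)


def lcs_length(a, b):
    # Length of the longest common subsequence, using a rolling DP row.
    prev = [0] * (len(b) + 1)
    for x in a:
        curr = [0]
        for j, y in enumerate(b, start=1):
            curr.append(prev[j - 1] + 1 if x == y else max(prev[j], curr[j - 1]))
        prev = curr
    return prev[-1]


def rouge_l_f1(reference, hypothesis):
    # ROUGE-L: precision and recall derived from the LCS of the token sequences.
    ref, hyp = tokenize(reference), tokenize(hypothesis)
    lcs = lcs_length(ref, hyp)
    if lcs == 0:
        return 0.0
    precision = lcs / len(hyp)
    recall = lcs / len(ref)
    # Assumption: plain harmonic mean (beta = 1).
    return 2 * precision * recall / (precision + recall)


if __name__ == "__main__":
    ref = "the cat sat on the mat"
    hyp = "the cat lay on the mat"
    print("jaccard:", round(jaccard(ref, hyp), 4))
    print("rougeL F1:", round(rouge_l_f1(ref, hyp), 4))
```

For the pair above, the two texts share 4 of 6 unique tokens (Jaccard 4/6) and have a longest common subsequence of 5 tokens out of 6 on each side (ROUGE-L F1 5/6).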

Parameters schema

{
  "type": "object",
  "properties": {
    "batch": {
      "type": "array",
      "items": {
        "type": "object",
        "required": [
          "reference",
          "hypothesis"
        ],
        "properties": {
          "reference": {
            "type": "string"
          },
          "hypothesis": {
            "type": "string"
          }
        }
      },
      "description": "Batch mode: array of {reference, hypothesis} pairs."
    },
    "metrics": {
      "type": "array",
      "items": {
        "type": "string"
      },
      "description": "Metrics to compute (default: all). Options: \"cosine_bow\", \"cosine_tfidf\", \"jaccard\", \"rouge1\", \"rouge2\", \"rougeL\", \"bleu\""
    },
    "reference": {
      "type": "string",
      "description": "Reference / expected text (ground truth)"
    },
    "threshold": {
      "type": "number",
      "description": "Optional pass/fail threshold (0-1). Applies to ROUGE-L F1 score."
    },
    "hypothesis": {
      "type": "string",
      "description": "Hypothesis / actual text (LLM output)"
    }
  }
}
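The schema above can be exercised with a small helper that assembles an arguments object for either single-pair or batch mode. This is a hedged sketch: build_arguments is a hypothetical helper, not part of IA-QA, and the rule that a call carries either reference plus hypothesis or a batch is an assumption, since the schema declares no top-level required fields. How the arguments are sent to the server is out of scope here.

```python
import json

# Metric names as listed in the schema's "metrics" description.
ALLOWED_METRICS = {
    "cosine_bow", "cosine_tfidf", "jaccard",
    "rouge1", "rouge2", "rougeL", "bleu",
}


def build_arguments(reference=None, hypothesis=None, batch=None,
                    metrics=None, threshold=None):
    """Hypothetical helper: build an arguments dict matching the schema."""
    args = {}
    if batch is not None:
        # Each batch item must carry both required string fields.
        for pair in batch:
            if not isinstance(pair.get("reference"), str) or \
               not isinstance(pair.get("hypothesis"), str):
                raise ValueError("each batch item needs string 'reference' and 'hypothesis'")
        args["batch"] = list(batch)
    elif reference is not None and hypothesis is not None:
        args["reference"] = reference
        args["hypothesis"] = hypothesis
    else:
        # Assumption: one of the two modes must be chosen.
        raise ValueError("provide reference and hypothesis, or a batch")

    if metrics is not None:
        unknown = set(metrics) - ALLOWED_METRICS
        if unknown:
            raise ValueError(f"unknown metrics: {sorted(unknown)}")
        args["metrics"] = list(metrics)

    if threshold is not None:
        # The schema describes the threshold as a 0-1 value applied to ROUGE-L F1.
        if not 0 <= threshold <= 1:
            raise ValueError("threshold must be between 0 and 1")
        args["threshold"] = threshold
    return args


if __name__ == "__main__":
    single = build_arguments(
        reference="Paris is the capital of France.",
        hypothesis="The capital of France is Paris.",
        metrics=["rougeL", "jaccard"],
        threshold=0.7,
    )
    print(json.dumps(single, indent=2))
```

Omitting metrics leaves the tool to compute all of them, which the schema states is the default.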

What this tool wraps: 0 endpoints

No endpoints wrapped at confidence ≥ 0.50.

Parent server

IA-QA — 130+ QA & Dev Tools for AI Agents

https://github.com/jcjamet/ia-qa

1/7 registries
