io.github.JcJamet/ia-qa-toolbox
130+ QA & dev tools for AI agents: prompt injection, RAG testing, VLM eval, guardrails. Free.
biocontext: Not listed
glama: Not listed
mcp.so: Not listed
nerq: Not listed
pulsemcp: Not listed
smithery: Not listed
Generate an A/B test report comparing two prompts or model configurations. Accepts two arrays of scores and returns a statistical comparison: mean, median, standard deviation, winner, and improvement percentage.
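The comparison described above can be sketched in a few lines of Python. This is an illustration only, not the tool's actual implementation: the function name `ab_report`, the output keys, the use of the sample standard deviation, the "higher mean wins" rule, and measuring improvement as B relative to A are all assumptions.

```python
from statistics import mean, median, stdev


def ab_report(scores_a, scores_b):
    """Compare two score arrays (hypothetical helper; higher is assumed better)."""

    def summarize(scores):
        return {
            "mean": mean(scores),
            "median": median(scores),
            # Sample standard deviation; undefined for a single score, so use 0.0.
            "std": stdev(scores) if len(scores) > 1 else 0.0,
        }

    a, b = summarize(scores_a), summarize(scores_b)

    # Assumption: the variant with the higher mean score wins.
    if a["mean"] == b["mean"]:
        winner = "tie"
    else:
        winner = "A" if a["mean"] > b["mean"] else "B"

    # Assumption: improvement is the relative change of B over A, in percent.
    base = a["mean"]
    improvement = (b["mean"] - base) / abs(base) * 100 if base != 0 else None

    return {"a": a, "b": b, "winner": winner, "improvement_pct": improvement}


report = ab_report([60, 70, 80], [66, 77, 88])
print(report["winner"], report["improvement_pct"])
```

A comparison of means alone says nothing about significance; with small score arrays, a difference of a few percent may well be noise.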
Detect potential bugs and code smells from a git diff or two code versions. Returns a list of issues with severity levels and test suggestions.
Semantically analyze N already-produced model outputs for the SAME task (the MCP counterpart to the LLM Sandbox). Without a reference: computes consensus, i.e. pairwise cosine agreement, the most representative output, and the outlier. With a `reference` (ground truth): also ranks every output by closeness (token cosine + ROUGE-L composite) and names the closest. Deterministic, with no LLM and no API key, so it can be gated in CI. You supply the outputs (2 or more). For a 2-way head-to-head with structural JSON diff, use compare_responses instead.
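To make the consensus and reference-ranking ideas concrete, here is a minimal sketch. It is not the tool's code: it assumes bag-of-words token counts for the cosine measure, an F1-style ROUGE-L based on the longest common subsequence, and an equal-weight composite. All function names and the `weight` parameter are hypothetical.

```python
import math
import re
from collections import Counter


def tokens(text):
    # Assumption: lowercase word tokens; the real tokenizer is not documented.
    return re.findall(r"\w+", text.lower())


def token_cosine(a, b):
    """Cosine similarity between bag-of-words count vectors."""
    ca, cb = Counter(tokens(a)), Counter(tokens(b))
    dot = sum(ca[t] * cb[t] for t in ca.keys() & cb.keys())
    norm = math.sqrt(sum(v * v for v in ca.values())) * math.sqrt(
        sum(v * v for v in cb.values())
    )
    return dot / norm if norm else 0.0


def rouge_l(candidate, reference):
    """ROUGE-L as the F1 of the longest common subsequence of tokens."""
    c, r = tokens(candidate), tokens(reference)
    if not c or not r:
        return 0.0
    prev = [0] * (len(r) + 1)  # LCS dynamic programme, one row at a time
    for tok in c:
        cur = [0]
        for j, ref_tok in enumerate(r, 1):
            cur.append(prev[j - 1] + 1 if tok == ref_tok else max(prev[j], cur[j - 1]))
        prev = cur
    lcs = prev[-1]
    if lcs == 0:
        return 0.0
    precision, recall = lcs / len(c), lcs / len(r)
    return 2 * precision * recall / (precision + recall)


def consensus(outputs):
    """Mean pairwise agreement per output, plus representative and outlier indices."""
    n = len(outputs)
    if n < 2:
        raise ValueError("need at least two outputs")
    agreement = []
    for i in range(n):
        sims = [token_cosine(outputs[i], outputs[j]) for j in range(n) if j != i]
        agreement.append(sum(sims) / len(sims))
    return {
        "agreement": agreement,
        "representative": max(range(n), key=agreement.__getitem__),
        "outlier": min(range(n), key=agreement.__getitem__),
    }


def rank_by_reference(outputs, reference, weight=0.5):
    """Indices of outputs, closest to the reference first (assumed equal weighting)."""
    scored = [
        (weight * token_cosine(o, reference) + (1 - weight) * rouge_l(o, reference), i)
        for i, o in enumerate(outputs)
    ]
    return [i for _, i in sorted(scored, key=lambda s: -s[0])]


outputs = [
    "the cat sat on the mat",
    "the cat sat on a mat",
    "stock prices fell sharply today",
]
print(consensus(outputs)["outlier"])
print(rank_by_reference(outputs, "the cat sat on the mat"))
```

Because every step is plain counting and arithmetic, the same inputs always yield the same scores, which is what makes a check of this kind usable as a CI gate.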
No maintainers met the attribution threshold for this repository.
Qdrant