content-moderation
Active
Tool of The Stall
Classify text (and optional image URLs) for harmful content — hate speech, harassment, self-harm, sexual content, violence, and illicit instructions. Returns flagged status, risk level (NONE/LOW/MEDIUM/HIGH), flagged categories, per-category confidence scores, and an optional AI-generated safe rewrite.
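As a rough illustration of how a caller might act on the returned fields, the sketch below decides whether to block content based on the flagged status and risk level. The listing does not give the response field names, so `flagged` and `risk_level` are hypothetical keys chosen for this example, and the ordering NONE < LOW < MEDIUM < HIGH is an assumption inferred from the order in which the levels are listed.

```python
from typing import Any, Dict

# Assumed severity ordering, lowest to highest (inferred, not documented).
RISK_ORDER = ["NONE", "LOW", "MEDIUM", "HIGH"]


def should_block(result: Dict[str, Any], threshold: str = "MEDIUM") -> bool:
    """Return True if a moderation result is flagged at or above `threshold`.

    `result` is a hypothetical response dict; the keys "flagged" and
    "risk_level" are assumptions, not confirmed field names.
    """
    if not result.get("flagged", False):
        return False
    level = result.get("risk_level", "NONE")
    return RISK_ORDER.index(level) >= RISK_ORDER.index(threshold)


# Example with a made-up response shape.
example = {"flagged": True, "risk_level": "HIGH"}
print(should_block(example))
```

Comparing against a threshold rather than a single level lets the caller tighten or relax the policy without changing the rest of the handling code.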
Parameters schema
{
  "type": "object",
  "$schema": "http://json-schema.org/draft-07/schema#",
  "properties": {
    "text": {
      "type": "string",
      "description": "Text content to moderate (required unless image_url provided)."
    },
    "rewrite": {
      "type": "boolean",
      "description": "If true and content is flagged, return an AI-generated safe rewrite. Adds ~1s latency."
    },
    "image_url": {
      "type": "string",
      "description": "Optional public image URL to moderate alongside text."
    }
  },
  "additionalProperties": false
}
No endpoints wrapped at confidence ≥ 0.70.
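The schema sets `additionalProperties` to false and has no `required` list, so the rule that `text` is needed unless `image_url` is given lives only in the description. A minimal sketch of a client-side helper that builds a conforming arguments object and enforces that rule might look like the following. The helper name `build_moderation_args` is hypothetical, and how the arguments are actually sent to the tool is outside the scope of this sketch.

```python
from typing import Any, Dict, Optional


def build_moderation_args(
    text: Optional[str] = None,
    image_url: Optional[str] = None,
    rewrite: Optional[bool] = None,
) -> Dict[str, Any]:
    """Build an arguments dict using only the three schema properties."""
    # The descriptions say text is required unless image_url is provided.
    if text is None and image_url is None:
        raise ValueError("provide text, image_url, or both")

    args: Dict[str, Any] = {}
    if text is not None:
        if not isinstance(text, str):
            raise TypeError("text must be a string")
        args["text"] = text
    if image_url is not None:
        if not isinstance(image_url, str):
            raise TypeError("image_url must be a string")
        args["image_url"] = image_url
    if rewrite is not None:
        if not isinstance(rewrite, bool):
            raise TypeError("rewrite must be a boolean")
        args["rewrite"] = rewrite
    # No other keys are ever added, matching additionalProperties: false.
    return args


print(build_moderation_args(text="some user comment", rewrite=True))
```

Omitting unset keys rather than sending nulls keeps the payload valid, since the schema types each property as a plain string or boolean.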
Parent server
The Stall
1/7 registries