vision-analyze

Active

declared in 4.82.0

Analyzes the image at a given URL using GPT-4o-mini vision. Returns a structured analysis determined by the mode: describe (full description), ocr (text extraction), chart (data/trend extraction), ui (interface analysis), identify (object/subject ID), or qa (answer a specific question about the image). The input must be a publicly accessible image URL (JPEG, PNG, GIF, or WebP). Cost: $0.050 per call.
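Since each call is billed, it may be worth checking an image's response headers before submitting its URL. The following is a minimal sketch of such a pre-flight check, not part of the tool itself: the function name `looks_analyzable` is hypothetical, the 20MB limit is taken from the `url` parameter description in the schema, and the interpretation of 1MB as 1024 * 1024 bytes is an assumption.

```python
from typing import Optional

# Content types named in the tool description (JPEG, PNG, GIF, WebP).
ALLOWED_TYPES = {"image/jpeg", "image/png", "image/gif", "image/webp"}

# 20MB limit from the url parameter description.
# Assumption: 1MB is counted as 1024 * 1024 bytes.
MAX_BYTES = 20 * 1024 * 1024


def looks_analyzable(content_type: Optional[str],
                     content_length: Optional[int]) -> bool:
    """Hypothetical pre-flight check on an image's HTTP response headers."""
    if not content_type:
        return False
    # Drop parameters such as "; charset=binary" and normalize case.
    media_type = content_type.split(";", 1)[0].strip().lower()
    if media_type not in ALLOWED_TYPES:
        return False
    # A missing Content-Length is treated as unknown rather than as a failure.
    return content_length is None or content_length <= MAX_BYTES
```

The header values would typically come from a HEAD request to the image URL; how the tool itself verifies content type and size is not documented here.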

Parameters schema

{
  "type": "object",
  "$schema": "http://json-schema.org/draft-07/schema#",
  "properties": {
    "url": {
      "type": "string",
      "description": "Publicly accessible URL of the image to analyze. Must return image/jpeg, image/png, image/gif, or image/webp content-type. Max file size: 20MB."
    },
    "mode": {
      "type": "string",
      "description": "Analysis mode: describe (full scene description), ocr (text extraction), chart (data/chart analysis), ui (UI screenshot analysis), identify (object/subject identification), qa (answer a specific question about the image — requires the 'question' parameter)."
    },
    "detail": {
      "type": "string",
      "description": "OpenAI vision detail level. 'auto' (default): model decides based on image size. 'low': faster, cheaper, less detail (best for simple images). 'high': slower, more detail (best for charts, dense text, complex scenes)."
    },
    "question": {
      "type": "string",
      "description": "For mode=qa only: the specific question to answer about the image. E.g., 'What is the total revenue shown in Q3?' or 'What does the error message say?'"
    }
  },
  "additionalProperties": false
}
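The schema above types every parameter as a plain string and declares no `required` list or `enum`, so the constraints stated in the descriptions (allowed modes, allowed detail levels, `question` needed for `mode=qa`) are not enforced by the schema itself. The sketch below shows one way a caller might check them before invoking the tool. The helper name `build_arguments` is hypothetical, and rejecting `question` outside of `qa` mode is an assumption based on the "For mode=qa only" wording.

```python
import json
from typing import Dict, Optional

# Values taken from the parameter descriptions in the schema.
MODES = {"describe", "ocr", "chart", "ui", "identify", "qa"}
DETAILS = {"auto", "low", "high"}


def build_arguments(url: str, mode: str, detail: str = "auto",
                    question: Optional[str] = None) -> Dict[str, str]:
    """Hypothetical helper: assemble and sanity-check a vision-analyze payload."""
    if mode not in MODES:
        raise ValueError(f"unknown mode: {mode!r}")
    if detail not in DETAILS:
        raise ValueError(f"unknown detail level: {detail!r}")
    if mode == "qa" and not question:
        raise ValueError("mode 'qa' requires the 'question' parameter")
    # Assumption: 'question' is rejected for every mode other than qa.
    if mode != "qa" and question is not None:
        raise ValueError("'question' is only valid with mode 'qa'")

    arguments = {"url": url, "mode": mode, "detail": detail}
    if question is not None:
        arguments["question"] = question
    return arguments


if __name__ == "__main__":
    # example.com is a placeholder URL, not a real image.
    payload = build_arguments(
        "https://example.com/q3-report.png",
        mode="qa",
        detail="high",
        question="What is the total revenue shown in Q3?",
    )
    print(json.dumps(payload, indent=2))
```

Because `additionalProperties` is `false`, the helper emits only the four declared keys. How the payload is then sent to the parent server depends on the client in use and is not covered here.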

What this tool wraps · 0 endpoints

No endpoints wrapped at confidence ≥ 0.70.

Parent server

The Stall

1/7 registries
