framefetch_extract
Active
Tool of io.github.MarvinRey7879/framefetch
Extract structured data from ONE public social-video URL (YouTube incl. Shorts, TikTok, Instagram Reels, Pinterest, Reddit).
Purpose: turn a video link into metadata (title, author, duration, date), insights (views/likes/comments), a transcript (captions, or Whisper when there are none; works on TikTok/Reddit too), parametrically-sampled video frames, and/or the on-screen text burned into those frames (OCR: captions, price tags, signage, lower-thirds).
When to use: you have a video URL and need its text, stats, frames, or on-screen text for analysis, summarization, or grounding a model.
When NOT to use: non-video pages, private/login-walled content, or bulk crawling (one URL per call).
Returns: one JSON object containing only the requested fields plus a `cost` block (micro-USD). Frames come back as time-limited signed image URLs; text_overlay returns one entry per frame with the OCR text, per-line confidence, and bounding boxes.
Cost/latency: metadata is sub-cent and fast; transcript is billed per audio-minute, frames per frame, and text_overlay per frame on top of that (all three also incur bandwidth for frames). Request only the fields you need and downscale frames via `width` to control cost.
Billing: a free tier covers light use; agents can also pay per call with x402 (USDC) with no account.
Example: { "url": "https://www.youtube.com/watch?v=...", "fields": ["metadata","transcript"], "frames": { "mode": "fps", "fps": 1, "width": 480 } }
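As a rough illustration of the example above, the arguments object can be assembled programmatically before it is handed to whatever client invokes the tool. This is a minimal sketch: `build_arguments` is a hypothetical helper, not part of framefetch, and only the field names and the frames keys are taken from the description and schema.

```python
import json

# Field names as listed in the tool's schema; the helper itself is hypothetical.
ALLOWED_FIELDS = {"metadata", "insights", "transcript", "frames", "text_overlay"}


def build_arguments(url, fields=("metadata",), frames=None):
    """Assemble the arguments object for one framefetch_extract call."""
    fields = list(fields)
    unknown = set(fields) - ALLOWED_FIELDS
    if unknown:
        raise ValueError(f"unknown fields: {sorted(unknown)}")
    args = {"url": url, "fields": fields}
    if frames is not None:
        # Copy so later edits to the caller's dict do not leak into the request.
        args["frames"] = dict(frames)
    return args


# Metadata and transcript, plus one frame per second downscaled to 480 px wide.
# The URL is the elided placeholder from the example and must be replaced.
args = build_arguments(
    "https://www.youtube.com/watch?v=...",
    fields=["metadata", "transcript", "frames"],
    frames={"mode": "fps", "fps": 1, "width": 480},
)
print(json.dumps(args, indent=2))
```

Requesting only the fields that are needed, as here, follows the cost guidance in the description; how the object is actually sent depends on the MCP client in use.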
Parameters schema
{
  "type": "object",
  "required": [
    "url"
  ],
  "properties": {
    "url": {
      "type": "string",
      "description": "Public video URL from a supported platform."
    },
    "fields": {
      "type": "array",
      "items": {
        "enum": [
          "metadata",
          "insights",
          "transcript",
          "frames",
          "text_overlay"
        ],
        "type": "string"
      },
      "description": "Which data to return. Default [\"metadata\"]. Request the minimum you need. \"text_overlay\" (on-screen text via OCR) requires \"frames\" to also be requested, and is auto-skipped (with a warning, not an error) above 200 estimated frames — OCR runs a subprocess per frame, so keep the frames spec narrow (fps/range) when you want it."
    },
    "frames": {
      "type": "object",
      "required": [
        "mode"
      ],
      "properties": {
        "n": {
          "type": "number",
          "description": "every_n: take every Nth source frame"
        },
        "to": {
          "type": "number",
          "description": "range: end second"
        },
        "fps": {
          "type": "number",
          "description": "fps/range: frames per second (max 60)"
        },
        "from": {
          "type": "number",
          "description": "range: start second"
        },
        "mode": {
          "enum": [
            "all",
            "every_n",
            "fps",
            "range"
          ],
          "type": "string"
        },
        "width": {
          "type": "number",
          "description": "downscale width in px (16–7680)"
        },
        "format": {
          "enum": [
            "jpg",
            "png",
            "webp"
          ],
          "type": "string"
        }
      },
      "description": "Frame-sampling spec; required when \"frames\" or \"text_overlay\" is requested."
    },
    "verbosity": {
      "enum": [
        "concise",
        "detailed"
      ],
      "type": "string"
    }
  }
}
Parent server
io.github.MarvinRey7879/framefetch
https://github.com/MarvinRey7879/framefetch
1/7 registries
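The constraints stated in the parameters schema above (required `url`, the `fields` enum, `text_overlay` needing `frames`, `fps` capped at 60, `width` between 16 and 7680, and the 200-frame OCR cutoff) can be checked client-side before a call is made. The sketch below is illustrative only: `validate_arguments`, `estimate_frames`, and `ocr_would_be_skipped` are hypothetical helpers, and the frame estimate (fps multiplied by the sampled duration) is an assumption, since the listing does not say how the server computes its estimate.

```python
import math

FIELDS = {"metadata", "insights", "transcript", "frames", "text_overlay"}
MODES = {"all", "every_n", "fps", "range"}
OCR_FRAME_LIMIT = 200  # text_overlay is auto-skipped above this many estimated frames


def validate_arguments(args):
    """Return a list of problems; an empty list means these checks passed."""
    problems = []
    if not isinstance(args.get("url"), str) or not args["url"]:
        problems.append("url is required and must be a non-empty string")
    fields = args.get("fields", ["metadata"])  # documented default
    for name in fields:
        if name not in FIELDS:
            problems.append(f"unknown field: {name}")
    if "text_overlay" in fields and "frames" not in fields:
        problems.append('"text_overlay" requires "frames" to also be requested')
    spec = args.get("frames")
    if spec is None:
        if "frames" in fields or "text_overlay" in fields:
            problems.append("a frames spec is required for frames/text_overlay")
        return problems
    if spec.get("mode") not in MODES:
        problems.append("frames.mode must be one of all, every_n, fps, range")
    if "fps" in spec and spec["fps"] > 60:
        problems.append("frames.fps must not exceed 60")
    if "width" in spec and not 16 <= spec["width"] <= 7680:
        problems.append("frames.width must be between 16 and 7680")
    return problems


def estimate_frames(spec, duration_s):
    """Assumed estimate for fps and range modes; None when it cannot be derived."""
    fps = spec.get("fps")
    if fps is None:
        return None
    if spec.get("mode") == "fps":
        return math.ceil(fps * duration_s)
    if spec.get("mode") == "range":
        start = spec.get("from", 0)
        end = min(spec.get("to", duration_s), duration_s)
        return max(0, math.ceil(fps * (end - start)))
    return None


def ocr_would_be_skipped(spec, duration_s):
    """True when the assumed estimate exceeds the documented 200-frame cutoff."""
    estimate = estimate_frames(spec, duration_s)
    return estimate is not None and estimate > OCR_FRAME_LIMIT
```

Under this assumed estimate, a ten-minute video sampled at 1 fps would exceed the cutoff, while a `range` spec covering a short window at the same rate would stay under it, which matches the schema's advice to keep the frames spec narrow when OCR is wanted.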