API documentation
Create YouTube transcriptions, poll job status, and optionally return timestamped transcript segments. Captions are used first, with audio transcription as the fallback.
Quick start
Create a transcription request with a YouTube URL. New work is queued; cached work can complete immediately.
- 1. Set your API key
export CLIPSCRIPT_API_KEY="csk_live_..."
- 2. Create a transcription
Most new videos return 202 with { id, status: "queued" }. A cache hit can return 200 immediately.
- 3. Poll by ID
Call GET /api/v1/transcriptions/:id until status is complete or failed.
curl -X POST "https://clipscript.uk/api/v1/transcriptions" \
  -H "Authorization: Bearer $CLIPSCRIPT_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "url": "https://www.youtube.com/watch?v=jNQXAC9IVRw",
    "language": "en",
    "timestamps": true
  }'
Authentication
All /api/v1 endpoints require a live API key. Send it as a bearer token.
Authorization: Bearer csk_live_...
API responses are JSON except for the media endpoint, which streams video bytes on success.
Endpoints
POST /api/v1/transcriptions
Create a transcription request. Returns 202 for a queued job, or 200 if the exact video, language, and timestamp mode are already cached.
GET /api/v1/transcriptions/:id
Fetch the current status. Use this endpoint to retrieve the final transcript and, when requested, segments.
POST /api/v1/media/youtube
Download a single YouTube video file. This is separate from transcription and returns a streamed file, not JSON, on success.
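The create and status endpoints are typically used together: create once, then poll until the job reaches a terminal status. The Python sketch below shows one way to write that loop. The helper name wait_for_transcription, the 2-second interval, and the 10-minute timeout are illustrative assumptions, not values defined by the API. The HTTP call is passed in as a function so the loop does not depend on any particular HTTP library.

```python
import time
from typing import Callable

# Per the docs, polling stops when status is "complete" or "failed".
TERMINAL_STATUSES = {"complete", "failed"}


def wait_for_transcription(
    job_id: str,
    fetch_status: Callable[[str], dict],
    interval_s: float = 2.0,    # assumed polling interval, not specified by the API
    timeout_s: float = 600.0,   # assumed client-side timeout
    sleep: Callable[[float], None] = time.sleep,
) -> dict:
    """Poll the status endpoint until the job is complete or failed.

    fetch_status is a caller-supplied function that performs
    GET /api/v1/transcriptions/:id and returns the decoded JSON body.
    """
    deadline = time.monotonic() + timeout_s
    while True:
        job = fetch_status(job_id)
        if job.get("status") in TERMINAL_STATUSES:
            return job
        if time.monotonic() >= deadline:
            raise TimeoutError(
                f"transcription {job_id} is still {job.get('status')!r}"
            )
        sleep(interval_s)
```

A caller would check the returned status: a complete job carries transcript (and segments when requested), while a failed job carries error.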
Create a transcription
POST /api/v1/transcriptions accepts a JSON body.
url (string, required): Any standard YouTube watch, embed, or youtu.be URL. Playlists are not accepted as a batch; the API extracts a single video ID.
language (string, optional): Requested output language. Defaults to ar. Common values include ar, en, fr, de, zh, ja, ko, ru, hi, th, and he. Responses include the normalized transcript language as language.
timestamps (boolean, optional): Defaults to false. When true, completed responses include segments. When omitted or false, responses include plain transcript only.
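The body fields above map directly onto a small request builder. The following is a minimal Python sketch using only the standard library; the function name build_create_request is hypothetical, and the base URL is taken from the quick start example. The language field is omitted from the body when not given, so the server default applies.

```python
import json
import urllib.request
from typing import Optional

API_BASE = "https://clipscript.uk/api/v1"  # base URL from the quick start example


def build_create_request(
    url: str,
    api_key: str,
    language: Optional[str] = None,
    timestamps: bool = False,
) -> urllib.request.Request:
    """Build (but do not send) a POST /api/v1/transcriptions request."""
    body = {"url": url, "timestamps": timestamps}
    if language is not None:
        # Leave the field out entirely to get the server-side default language.
        body["language"] = language
    return urllib.request.Request(
        f"{API_BASE}/transcriptions",
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )
```

Passing the result to urllib.request.urlopen sends the request. Both 202 (queued) and 200 (cached) are success responses, so the caller should inspect status in the JSON body rather than rely on the HTTP code alone.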
{
  "url": "https://www.youtube.com/watch?v=jNQXAC9IVRw",
  "language": "en",
  "timestamps": false
}
Responses
Create can return a queued job or an immediate cached result. Polling returns the final transcript, segments, or failure.
{
  "id": "cm4abc1230000qz8x9k1p2l3m",
  "status": "queued",
  "language": "en"
}
{
  "id": "cm4abc1230000qz8x9k1p2l3m",
  "status": "complete",
  "language": "en",
  "transcript": "Full transcript text...",
  "cached": true
}
{
  "id": "cm4abc1230000qz8x9k1p2l3m",
  "status": "running",
  "language": "en",
  "timestamps": true
}
{
  "id": "cm4abc1230000qz8x9k1p2l3m",
  "status": "complete",
  "language": "en",
  "timestamps": false,
  "transcript": "Full transcript text..."
}
{
  "id": "cm4abc1230000qz8x9k1p2l3m",
  "status": "complete",
  "language": "en",
  "timestamps": true,
  "transcript": "First line of transcript. Second line...",
  "segments": [
    {
      "startMs": 1000,
      "endMs": 3200,
      "text": "First line of transcript."
    },
    {
      "startMs": 3300,
      "endMs": 6100,
      "text": "Second line..."
    }
  ]
}
{
  "id": "cm4abc1230000qz8x9k1p2l3m",
  "status": "failed",
  "language": "en",
  "timestamps": true,
  "error": "No captions available for the requested language"
}
Timestamp behavior
Timestamps are opt-in. Omit the field for a plain transcript.
default (timestamps=false): Segments are not included by default. If you omit the field, completed responses contain only transcript.
segment mode (timestamps=true): Each segment uses millisecond offsets from the beginning of the video: startMs, endMs, and text.
granularity (segment): For caption-based transcripts, segments come from YouTube VTT/SRT cues. For audio fallback, segments come from the transcription model. These are not word-level timestamps.
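Because segments carry millisecond offsets, they convert easily into subtitle formats. The sketch below turns a segments array into SRT text. The function names are illustrative and not part of the API; the only fields assumed are startMs, endMs, and text as documented above.

```python
def ms_to_srt_time(ms: int) -> str:
    """Format a millisecond offset as an SRT timestamp (HH:MM:SS,mmm)."""
    hours, rem = divmod(ms, 3_600_000)
    minutes, rem = divmod(rem, 60_000)
    seconds, millis = divmod(rem, 1000)
    return f"{hours:02d}:{minutes:02d}:{seconds:02d},{millis:03d}"


def segments_to_srt(segments: list) -> str:
    """Render API segments as numbered SRT cues separated by blank lines."""
    blocks = []
    for index, seg in enumerate(segments, start=1):
        start = ms_to_srt_time(seg["startMs"])
        end = ms_to_srt_time(seg["endMs"])
        blocks.append(f"{index}\n{start} --> {end}\n{seg['text']}\n")
    return "\n".join(blocks)
```

For the first example segment (startMs 1000, endMs 3200), the cue timing line is 00:00:01,000 --> 00:00:03,200. Since segments are cue-level rather than word-level, the output is suited to subtitles but not to word highlighting.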
Caching and billing
cache key (video + language + mode): Cache is scoped by YouTube video ID, normalized language, and timestamp mode. A plain transcript cache hit does not satisfy a later request with "timestamps": true.
lifetime (about 90 days): Completed transcripts are cached for about 90 days. Cache hits return faster and count as a video request, but they do not consume transcription minutes.
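A client that wants to avoid submitting duplicate requests can mirror the cache scoping locally. The sketch below is a client-side approximation only: the server's actual video ID extraction and language normalization are not documented here, so the URL handling covers just the watch, embed, and youtu.be forms, and lowercasing the language is an assumption. The names extract_video_id and cache_key are hypothetical.

```python
from typing import Optional, Tuple
from urllib.parse import parse_qs, urlparse

# Hostnames treated as YouTube by this sketch (an assumption, not an API rule).
YOUTUBE_HOSTS = {"youtube.com", "www.youtube.com", "m.youtube.com"}


def extract_video_id(url: str) -> Optional[str]:
    """Return the video ID from a watch, embed, or youtu.be URL, else None."""
    parsed = urlparse(url)
    host = (parsed.hostname or "").lower()
    if host == "youtu.be":
        return parsed.path.lstrip("/").split("/")[0] or None
    if host in YOUTUBE_HOSTS:
        if parsed.path == "/watch":
            # Extra parameters such as list= are ignored: one video per request.
            return parse_qs(parsed.query).get("v", [None])[0]
        if parsed.path.startswith("/embed/"):
            return parsed.path.split("/")[2] or None
    return None


def cache_key(url: str, language: str, timestamps: bool = False) -> Tuple[str, str, bool]:
    """Approximate the documented cache scope: video ID, language, timestamp mode."""
    video_id = extract_video_id(url)
    if video_id is None:
        raise ValueError(f"not a recognized YouTube video URL: {url!r}")
    # Lowercasing stands in for the server's normalization, which may differ.
    return (video_id, language.lower(), bool(timestamps))
```

Two requests for the same video and language produce different keys when only one sets timestamps to true, which matches the rule that a plain transcript cache hit does not satisfy a timestamped request.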
Media download
POST /api/v1/media/youtube downloads one YouTube video into a temporary directory, streams it, then deletes the temporary files.
url (string, required): YouTube video URL.
height_cap (number, optional): Maximum video height from 144 to 2160. Defaults to 720.
Successful responses set Content-Type based on the file extension and Content-Disposition to an attachment filename. Videos longer than the server limit return 422 video_too_long.
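Since the media endpoint streams bytes rather than returning JSON, a client should copy the response to disk in chunks instead of reading it into memory. The sketch below assumes the endpoint accepts a JSON body like the transcription endpoint does, which this page does not state explicitly. The names build_media_request and save_stream and the 64 KiB chunk size are illustrative.

```python
import json
import urllib.request
from typing import BinaryIO

MEDIA_URL = "https://clipscript.uk/api/v1/media/youtube"


def build_media_request(url: str, api_key: str, height_cap: int = 720) -> urllib.request.Request:
    """Build (but do not send) a media download request."""
    if not 144 <= height_cap <= 2160:
        raise ValueError("height_cap must be between 144 and 2160")
    # Assumption: parameters are sent as a JSON body.
    body = json.dumps({"url": url, "height_cap": height_cap}).encode("utf-8")
    return urllib.request.Request(
        MEDIA_URL,
        data=body,
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def save_stream(response: BinaryIO, dest_path: str, chunk_size: int = 64 * 1024) -> int:
    """Copy a streamed response to dest_path in chunks; return bytes written."""
    written = 0
    with open(dest_path, "wb") as out:
        while True:
            chunk = response.read(chunk_size)
            if not chunk:
                break
            out.write(chunk)
            written += len(chunk)
    return written
```

In use, the object returned by urllib.request.urlopen can be passed straight to save_stream. Error responses such as 422 video_too_long surface as urllib.error.HTTPError and carry a JSON body rather than video bytes.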
Response headers
Authenticated API responses include current rate and quota state.
X-RateLimit-Limit: 10
X-RateLimit-Remaining: 9
X-Quota-Videos-Remaining: 39
X-Quota-Minutes-Remaining: 238
Rate limits are per minute. Quota headers include used and reserved work, so queued jobs are reflected before they finish.
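These headers can be read on every response to track remaining capacity without extra API calls. The following sketch parses the four headers shown above into a small record. The UsageState type and parse_usage_headers name are hypothetical, and the values are assumed to be integers as in the example. Missing or malformed headers become None.

```python
from dataclasses import dataclass
from typing import Mapping, Optional


@dataclass
class UsageState:
    rate_limit: Optional[int]
    rate_remaining: Optional[int]
    videos_remaining: Optional[int]
    minutes_remaining: Optional[int]


def _to_int(value: Optional[str]) -> Optional[int]:
    """Convert a header value to int, or None if absent or not an integer."""
    try:
        return int(value)
    except (TypeError, ValueError):
        return None


def parse_usage_headers(headers: Mapping[str, str]) -> UsageState:
    """Extract rate and quota state from response headers."""
    # HTTP header names are case-insensitive, so compare in lowercase.
    lowered = {name.lower(): value for name, value in headers.items()}
    return UsageState(
        rate_limit=_to_int(lowered.get("x-ratelimit-limit")),
        rate_remaining=_to_int(lowered.get("x-ratelimit-remaining")),
        videos_remaining=_to_int(lowered.get("x-quota-videos-remaining")),
        minutes_remaining=_to_int(lowered.get("x-quota-minutes-remaining")),
    )
```

A client could, for example, pause before submitting new work when rate_remaining reaches 0, or warn the user when videos_remaining is low.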
Errors and status codes
401 unauthorized (auth): Missing, malformed, revoked, or non-live API key.
402 quota_exceeded (quota): Plan quota is exhausted for the current billing window.
403 forbidden (access): The transcription belongs to another user, or YouTube blocks the download.
404 not_found (missing): Unknown transcription ID or permanently unavailable video.
422 invalid_url (validation): Invalid YouTube URL or video is too long for your plan.
429 rate_limited (rate limit): Too many requests per minute for your plan.
429 concurrency_limit_reached (concurrency): Too many in-flight transcriptions. Wait or upgrade.
500 internal_error (server): Unexpected server error.
{
"error": "rate_limited",
"message": "Request rate exceeded (10/min on the free plan). Retry after 60s or upgrade for a higher rpm.",
"limit": 10,
"used": 11,
"retryAfter": 60
}
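A client can use the error code and retryAfter field to decide whether a failed request is worth retrying. The sketch below is one possible policy, not one prescribed by the API: it retries the two 429 codes and server errors, and treats everything else as final. Whether concurrency_limit_reached responses include retryAfter is not stated on this page, so a fallback delay (an arbitrary 5 seconds here) is used when the field is absent.

```python
import json
from typing import Optional

# Error codes from the table above that suggest waiting and trying again.
RETRYABLE_429 = {"rate_limited", "concurrency_limit_reached"}


def retry_delay(status_code: int, body: bytes, default_delay_s: float = 5.0) -> Optional[float]:
    """Return seconds to wait before retrying, or None if a retry will not help."""
    try:
        payload = json.loads(body)
    except ValueError:
        payload = {}
    if not isinstance(payload, dict):
        payload = {}

    if status_code == 429 and payload.get("error") in RETRYABLE_429:
        retry_after = payload.get("retryAfter")
        if isinstance(retry_after, (int, float)) and not isinstance(retry_after, bool):
            return float(retry_after)
        return default_delay_s  # assumed fallback when retryAfter is missing
    if status_code >= 500:
        return default_delay_s  # assumption: server errors may be transient
    # 401, 402, 403, 404, and 422 need a changed request, key, or plan.
    return None
```

With the rate_limited example above, retry_delay(429, body) returns 60.0. A 401 or 422 returns None, signalling that the caller should fix the request instead of retrying.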