Paste a video URL or upload a file. Dokitscript transcribes it, translates it, and generates a natural AI voice MP3 you can download — powered by ElevenLabs.
TikTok · Instagram · YouTube · Facebook · X · LinkedIn
Last updated June 2026
Try video to speech free →
How do I turn a video into audio in another language?
Paste the video URL (or upload a file) into Dokitscript, wait for the transcript, use the AI Translation feature to translate the text into your target language, then click Listen. Dokitscript uses ElevenLabs' eleven_multilingual_v2 model to generate a natural AI voice and produce a downloadable 128 kbps MP3 file. Transcription runs on OpenAI Whisper and supports 90+ languages; audio generation is available in approximately 29 languages and requires the Starter plan or higher.
How it works
No software to install. Works entirely in your browser.
Paste a TikTok, Instagram, YouTube, Facebook, X, or LinkedIn video URL — or upload an audio or video file up to 50 MB.
Dokitscript transcribes the video in 90+ languages. The spoken language is auto-detected, or you can select it manually.
Use the AI Translation feature to translate the transcript into French, Spanish, Japanese, German, or any of the supported languages.
ElevenLabs generates a natural AI voice reading the translated text. Download the result as a 128 kbps MP3 file.
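The four steps above amount to a three-stage pipeline: transcribe, translate, synthesize. The sketch below is only an illustration of that ordering, not Dokitscript's implementation. `video_to_speech`, `SpeechResult`, and the three stage functions are hypothetical names, and each stage is passed in by the caller so that no real Whisper, Claude, or ElevenLabs call is assumed.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class SpeechResult:
    transcript: str    # text produced by the speech-to-text stage
    translation: str   # transcript rendered in the target language
    mp3: bytes         # synthesized audio of the translated text


def video_to_speech(
    source: str,
    target_language: str,
    transcribe: Callable[[str], str],
    translate: Callable[[str, str], str],
    synthesize: Callable[[str], bytes],
) -> SpeechResult:
    """Run the stages in order; every stage is a caller-supplied function (hypothetical)."""
    transcript = transcribe(source)                       # URL or file -> text
    translation = translate(transcript, target_language)  # text -> translated text
    mp3 = synthesize(translation)                         # translated text -> audio bytes
    return SpeechResult(transcript, translation, mp3)
```

Passing the stages in as functions keeps the sketch runnable with simple stand-ins, for example `video_to_speech("clip.mp4", "fr", lambda s: "hello", lambda t, lang: "bonjour", lambda t: b"...")`.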
Features
Everything from URL to MP3, in one tool.
Audio is generated with ElevenLabs' eleven_multilingual_v2 model — one of the most natural-sounding multilingual AI voices available today.
OpenAI Whisper handles the speech-to-text step. It auto-detects the spoken language and supports over 90 languages for transcription.
The translation step runs on Claude AI and produces natural-sounding translated text before it is converted to speech.
The audio output is a standard MP3 file you can download and use in podcasts, video editors, language learning apps, or accessibility tools.
Paste a URL from TikTok, Instagram Reels, YouTube Shorts, YouTube, Facebook, X (Twitter) or LinkedIn. File upload also works for local recordings.
You always get the full written transcript and the translated text alongside the MP3. Export as TXT or SRT at any time.
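For readers unfamiliar with the SRT export mentioned above: an SRT file is a sequence of numbered cues, each with a `HH:MM:SS,mmm --> HH:MM:SS,mmm` time range followed by the cue text and a blank line. The sketch below shows how timed transcript segments could be written in that layout. The `(start, end, text)` segment shape and the function names are assumptions for illustration, not Dokitscript's export code.

```python
from typing import Iterable, Tuple


def srt_timestamp(seconds: float) -> str:
    """Format a time offset in seconds as an SRT timestamp (HH:MM:SS,mmm)."""
    total_ms = round(seconds * 1000)
    hours, rem = divmod(total_ms, 3_600_000)
    minutes, rem = divmod(rem, 60_000)
    secs, ms = divmod(rem, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{ms:03d}"


def to_srt(segments: Iterable[Tuple[float, float, str]]) -> str:
    """Build SRT text from (start_seconds, end_seconds, text) tuples (assumed shape)."""
    blocks = []
    for index, (start, end, text) in enumerate(segments, start=1):
        blocks.append(
            f"{index}\n{srt_timestamp(start)} --> {srt_timestamp(end)}\n{text.strip()}\n"
        )
    # Cues are separated by a blank line.
    return "\n".join(blocks)
```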
Languages
Transcription and audio generation cover different language sets — here is the honest breakdown.
Dokitscript can transcribe speech in over 90 languages, including English, French, Spanish, Arabic, Chinese, Hindi, Japanese, Korean, Portuguese, German, Italian, and many more. The spoken language is detected automatically.
The MP3 voice output is powered by ElevenLabs and currently supports approximately 29 languages.
Note: transcription supports 90+ languages; audio generation supports ~29. If your target language is not in the audio list, you will still get the translated text transcript.
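The fallback described in the note can be expressed as a small check: the transcript and translated text are always returned, and the MP3 is added only when the target language is in the audio set. This is a minimal sketch. The language codes below are placeholders chosen for illustration and are not Dokitscript's actual audio list, and `available_outputs` is a hypothetical name.

```python
# Placeholder subset for illustration only; the real audio list has about 29 entries.
EXAMPLE_AUDIO_LANGUAGES = frozenset({"en", "fr", "es"})


def available_outputs(target_language, audio_languages=EXAMPLE_AUDIO_LANGUAGES):
    """Return the outputs a user would receive for a given target language."""
    outputs = ["transcript", "translation"]  # always available
    if target_language.lower() in audio_languages:
        outputs.append("mp3")                # only for audio-supported languages
    return outputs
```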
Use cases
Anywhere spoken content needs to reach an audience in another language.
Turn a TikTok or Instagram Reel into a voiceover in another language. Great for creators who want to reach international audiences without re-recording.
Transcribe a video in a foreign language, translate it, and listen to the MP3 to train your ear. Useful for students and self-learners working with real content.
Convert a written article or transcript into an audio file for users with visual impairments, or for commuters who prefer to listen rather than read.
Translate an episode into a second language and generate a voiceover track. Add it as a bonus episode for your international audience.
Convert recorded lessons or company training videos into audio files in multiple languages for teams across different countries.
Get an AI-voiced MP3 as a scratch track for video projects before hiring a voice actor, saving time in early production stages.
Plans
Transcription and translation are available on every plan. Audio generation requires Starter or higher.
| Plan | Price | Transcriptions | Max video length | Audio generation (MP3) |
|---|---|---|---|---|
| Free | $0 | 5 / month | 3 minutes | Not available |
| Starter | $4.99 / mo | 200 / month | 8 minutes | 6 min / month |
| Pro | $14.99 / mo | Unlimited | 45 minutes | 60 min / month |
| Business | $79.99 / mo | Unlimited | 5 hours | 240 min / month |
Audio minutes are counted per generated MP3. Unused minutes do not roll over. See full pricing →
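To see how the two limits in the table interact, the sketch below encodes the per-plan video length cap and monthly audio allowance and checks a request against both. The numbers come from the table above. The function name, the plan keys, and the assumption that usage is metered in plain minutes with no rounding are illustrative, not confirmed details of Dokitscript's billing.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Plan:
    name: str
    max_video_minutes: int        # longest video the plan accepts
    audio_minutes_per_month: int  # 0 means audio generation is not available


# Values copied from the plans table; Business is 5 hours = 300 minutes.
PLANS = {
    "free": Plan("Free", 3, 0),
    "starter": Plan("Starter", 8, 6),
    "pro": Plan("Pro", 45, 60),
    "business": Plan("Business", 300, 240),
}


def can_generate_audio(plan_key, video_minutes, audio_minutes_used, audio_minutes_needed):
    """Check a request against the plan's length cap and remaining audio minutes.

    Assumes usage is tracked in minutes without rounding (an assumption).
    """
    plan = PLANS[plan_key]
    if plan.audio_minutes_per_month == 0:
        return False  # plan has no audio generation
    if video_minutes > plan.max_video_minutes:
        return False  # video exceeds the plan's maximum length
    remaining = plan.audio_minutes_per_month - audio_minutes_used
    return audio_minutes_needed <= remaining
```

Under these assumptions, `can_generate_audio("starter", 8, 0, 8)` is false: an 8-minute video fits Starter's length cap, but voicing all 8 minutes would exceed the 6 audio minutes included per month.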
Free to start. Audio generation from $4.99/month. No software needed.
Get started free →