🌐 Translate Video to Audio

Translate Any Video and Get the Audio in Your Language

Paste a video URL or upload a file. Dokitscript transcribes it, translates it into your target language, and generates a natural AI voice MP3 you can download — powered by ElevenLabs.

TikTok · Instagram · YouTube · Facebook · X · LinkedIn

Last updated June 2026

Try video translation free →
No sign-up for your first transcription  ·  Audio download requires Starter plan

How do I translate a video and get the audio in another language?

Paste the video URL (or upload a file) into Dokitscript, wait for the transcript, use the AI Translation feature to select your target language, then click Listen. Dokitscript uses ElevenLabs' eleven_multilingual_v2 model to generate a natural AI voice reading the translated text and produces a downloadable 128 kbps MP3. Transcription runs on OpenAI Whisper and auto-detects the source language (90+ languages supported). Translated audio is available in approximately 29 languages and requires the Starter plan or higher.

How do I translate a video to audio in 4 steps?

No software to install. Works entirely in your browser.

1. Paste a URL or upload a file

Paste a TikTok, Instagram, YouTube, Facebook, X, or LinkedIn video URL — or upload a local audio or video file up to 50 MB.

2. Transcription with OpenAI Whisper

Dokitscript transcribes the video in 90+ languages. The source language is detected automatically — no need to specify it.

3. Translate into your target language

Click AI Translation and choose your target language — French, Spanish, Japanese, German, Arabic, and more. The full transcript is translated in seconds.

4. Click Listen and download the translated MP3

ElevenLabs reads the translated text in a natural AI voice. Download the result as a 128 kbps MP3 file in your chosen language.
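
The four steps above can be sketched as a small pipeline. This is a minimal illustration, not Dokitscript's actual code: the names `translate_video_to_audio`, `PipelineResult`, and the three stage parameters are hypothetical. The stages are passed in as callables so the sketch runs offline; in a real service they would wrap Whisper, the translation model, and ElevenLabs.

```python
from dataclasses import dataclass
from typing import Callable, Tuple


@dataclass
class PipelineResult:
    source_language: str  # detected by the transcription stage
    transcript: str       # text in the source language
    translation: str      # text in the target language
    audio: bytes          # synthesized speech (MP3 bytes in the real service)


def translate_video_to_audio(
    media: bytes,
    target_language: str,
    transcribe: Callable[[bytes], Tuple[str, str]],
    translate: Callable[[str, str], str],
    synthesize: Callable[[str, str], bytes],
) -> PipelineResult:
    """Run transcribe -> translate -> synthesize in order (hypothetical sketch)."""
    source_language, transcript = transcribe(media)
    translation = translate(transcript, target_language)
    audio = synthesize(translation, target_language)
    return PipelineResult(source_language, transcript, translation, audio)


# Placeholder stages so the sketch runs without any network access.
def fake_transcribe(media: bytes) -> Tuple[str, str]:
    return "en", media.decode("utf-8")


def fake_translate(text: str, target_language: str) -> str:
    return f"[{target_language}] {text}"


def fake_synthesize(text: str, language: str) -> bytes:
    return text.encode("utf-8")


result = translate_video_to_audio(
    b"hello world", "fr", fake_transcribe, fake_translate, fake_synthesize
)
```

Keeping each stage behind a plain function boundary means the written transcript and translation are available even when the audio stage is skipped, which matches the way the text output is offered separately from the MP3.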

What does the translation-to-audio pipeline include?

Everything from foreign-language video to translated MP3, in one tool.

🌐

Auto-detect the source language

No need to know what language the video is in. OpenAI Whisper identifies it automatically and transcribes it accurately across 90+ languages.

🔤

AI translation built in

The translation step is powered by Claude AI and produces natural, fluent translated text before it is converted to spoken audio.

🎙️

Natural AI voice via ElevenLabs

The translated text is read by ElevenLabs' eleven_multilingual_v2 model — one of the most natural-sounding multilingual AI voices available.

⬇️

Downloadable translated MP3 at 128 kbps

The output is a standard MP3 file in your target language, ready to play on any device or import into a video editor, podcast tool, or learning app.
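
For planning storage or uploads, the file size of a 128 kbps MP3 can be estimated from its duration. The sketch below assumes a constant bitrate and ignores metadata tags, neither of which is stated by Dokitscript; the function name `estimated_mp3_bytes` is hypothetical.

```python
def estimated_mp3_bytes(duration_seconds: float, bitrate_kbps: int = 128) -> int:
    """Approximate size of a constant-bitrate MP3, ignoring metadata tags.

    Assumption: 1 kbps = 1000 bits per second, the usual audio convention.
    """
    bits = duration_seconds * bitrate_kbps * 1000
    return int(bits // 8)


one_minute = estimated_mp3_bytes(60)        # 960_000 bytes, about 0.96 MB
six_minutes = estimated_mp3_bytes(6 * 60)   # 5_760_000 bytes, about 5.8 MB
```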

🔗

All major platforms supported

Paste a URL from TikTok, Instagram Reels, YouTube Shorts, YouTube, Facebook, X (Twitter) or LinkedIn. File upload also works for local recordings.

📝

Translated text transcript included

You always get the full written transcript in both the source and target language alongside the MP3. Export as TXT or SRT any time.
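
To show what an SRT export contains, here is a minimal sketch that formats timed transcript segments in the standard SubRip layout (index, time range, text, blank line). The segment structure `(start_seconds, end_seconds, text)` and the function names are assumptions for illustration; the page does not describe how Dokitscript stores segments internally.

```python
from typing import List, Tuple


def srt_timestamp(seconds: float) -> str:
    """Format seconds as HH:MM:SS,mmm, the timestamp style used by SRT."""
    total_ms = int(round(seconds * 1000))
    hours, rest = divmod(total_ms, 3_600_000)
    minutes, rest = divmod(rest, 60_000)
    secs, ms = divmod(rest, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{ms:03d}"


def to_srt(segments: List[Tuple[float, float, str]]) -> str:
    """Join (start, end, text) segments into one SRT document."""
    blocks = []
    for index, (start, end, text) in enumerate(segments, start=1):
        time_range = f"{srt_timestamp(start)} --> {srt_timestamp(end)}"
        blocks.append(f"{index}\n{time_range}\n{text}\n")
    return "\n".join(blocks)


srt_text = to_srt([(0.0, 2.5, "Bonjour à tous."), (2.5, 5.0, "Bienvenue.")])
```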

Which languages are available for translated audio?

Transcription and audio generation cover different language sets — here is the honest breakdown.

Transcription (source language) — 90+ languages via OpenAI Whisper

Dokitscript can transcribe speech in over 90 languages, including English, French, Spanish, Arabic, Chinese, Hindi, Japanese, Korean, Portuguese, German, Italian, Turkish, Russian, and many more. The source language is detected automatically.

Translated audio output — ~29 languages via ElevenLabs

The translated MP3 voice output is powered by ElevenLabs and currently supports approximately 29 target languages:

English, French, Spanish, German, Italian, Portuguese, Polish, Turkish, Russian, Dutch, Czech, Arabic, Chinese, Japanese, Korean, Hindi, Indonesian, Filipino, Swedish, Bulgarian, Romanian, Greek, Finnish, Croatian, Slovak, Danish, Tamil, Ukrainian

Note: transcription supports 90+ source languages; translated audio output supports ~29 target languages. If your target language is not in the audio list, you will still get the fully translated text transcript.
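
The fallback described in the note above can be expressed as a simple lookup. This is a sketch only: the set is copied from the language names listed on this page, while `AUDIO_LANGUAGES` and `available_outputs` are hypothetical names, and plan restrictions are deliberately left out.

```python
# Target languages listed on this page for translated audio output.
AUDIO_LANGUAGES = frozenset({
    "English", "French", "Spanish", "German", "Italian", "Portuguese",
    "Polish", "Turkish", "Russian", "Dutch", "Czech", "Arabic", "Chinese",
    "Japanese", "Korean", "Hindi", "Indonesian", "Filipino", "Swedish",
    "Bulgarian", "Romanian", "Greek", "Finnish", "Croatian", "Slovak",
    "Danish", "Tamil", "Ukrainian",
})


def available_outputs(target_language: str) -> list:
    """Return the outputs a target language can produce (hypothetical helper).

    Translated text is always available; the MP3 is added only when the
    language appears in the audio list.
    """
    outputs = ["translated_text"]
    if target_language.strip().title() in AUDIO_LANGUAGES:
        outputs.append("translated_mp3")
    return outputs


french = available_outputs("french")
swahili = available_outputs("Swahili")
```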

Who needs to translate video into audio?

Anywhere a foreign-language video needs to be heard in a different language.

Understanding foreign-language videos

Found a TikTok or YouTube video in a language you don't speak? Translate it and listen to the audio in your language instead of reading subtitles.

Language learning with real content

Translate a video into your target language and listen to it as an MP3. Training your ear with real-world content is a useful complement to textbook exercises.

Reaching international audiences

Turn a video you created in English into an audio version in French, Spanish, or Japanese. Share the MP3 as a companion track for international followers.

Multilingual podcast episodes

Translate an episode transcript and generate a voiceover in the target language. Publish it as a bonus episode for your audience in another country.

Corporate training across regions

Translate recorded training videos into the local language of each team. Distribute the audio files without re-recording sessions from scratch.

Accessible content for all

Convert a foreign-language video into an audio file in the listener's own language — useful for accessibility tools, commutes, and low-bandwidth environments.

What this feature does not do: It does not replace or dub the audio inside the original video file, synchronize the generated voice to on-screen lips (lip-sync dubbing), clone the original speaker's voice, or offer a choice between multiple AI voices. The output is a standalone translated MP3 audio file — a voiceover in your language, not a dubbed video.

How many minutes of translated audio do I get?

Transcription and AI translation are available on every plan. Translated audio download requires Starter or higher.

Plan     | Price       | Transcriptions | Max video length | Translated audio (MP3)
Free     | $0          | 5 / month      | 3 minutes        | Not available
Starter  | $4.99 / mo  | 200 / month    | 8 minutes        | 6 min / month
Pro      | $14.99 / mo | Unlimited      | 45 minutes       | 60 min / month
Business | $79.99 / mo | Unlimited      | 5 hours          | 240 min / month

Audio minutes are counted per generated MP3. Unused minutes do not roll over. See full pricing →
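
The plan limits in the table can be turned into a quick quota check. The sketch below is illustrative: the `Plan` type and `can_generate_audio` are hypothetical names, the figures are taken from the table, and it assumes fractional minutes count exactly as used, since the page does not say how partial minutes are rounded.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Plan:
    name: str
    transcriptions_per_month: Optional[int]  # None means unlimited
    max_video_minutes: int
    audio_minutes_per_month: int


# Values copied from the pricing table on this page.
PLANS = {
    "Free": Plan("Free", 5, 3, 0),
    "Starter": Plan("Starter", 200, 8, 6),
    "Pro": Plan("Pro", None, 45, 60),
    "Business": Plan("Business", None, 300, 240),  # 5 hours = 300 minutes
}


def can_generate_audio(plan_name: str, used_minutes: float,
                       requested_minutes: float) -> bool:
    """True if the requested MP3 fits in the remaining monthly audio allowance.

    Assumption: minutes are counted exactly, without rounding up.
    """
    plan = PLANS[plan_name]
    return used_minutes + requested_minutes <= plan.audio_minutes_per_month


free_ok = can_generate_audio("Free", 0, 1)
starter_ok = can_generate_audio("Starter", 4, 2)
starter_over = can_generate_audio("Starter", 4, 2.5)
```

Because unused minutes do not roll over, `used_minutes` would reset to zero at the start of each billing month.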

Translate Video to Audio — Common Questions

How do I translate a video and get the audio in another language?
Paste the video URL (or upload a file) into Dokitscript. Once the transcript is ready, use the AI Translation feature to translate it into your target language, then click Listen. Dokitscript generates a natural AI voice MP3 via ElevenLabs that you can download. The full pipeline (transcribe, translate, generate audio) takes a couple of minutes.

Which platforms and file types are supported?
You can paste URLs from TikTok, Instagram Reels, YouTube (including Shorts), Facebook, X (Twitter), and LinkedIn. You can also upload local audio or video files (MP3, WAV, M4A, MP4, or WebM, up to 50 MB).

Which languages are supported?
Transcription supports 90+ languages via OpenAI Whisper. The translated MP3 audio output is powered by ElevenLabs eleven_multilingual_v2 and supports approximately 29 target languages: English, French, Spanish, German, Italian, Portuguese, Polish, Turkish, Russian, Dutch, Czech, Arabic, Chinese, Japanese, Korean, Hindi, Indonesian, Filipino, Swedish, Bulgarian, Romanian, Greek, Finnish, Croatian, Slovak, Danish, Tamil, and Ukrainian.

What format is the translated audio?
The translated audio is exported as a 128 kbps MP3 file, suitable for playback, language learning, podcast drafts, or as a voiceover scratch track.

Does Dokitscript dub the original video?
No. The feature produces a standalone translated MP3 audio file read by a natural AI voice. It does not replace the audio inside the original video file, does not synchronize lips in the video (lip-sync dubbing), and does not clone the original speaker's voice. The output is a voiceover audio file, not a dubbed video.

Which plan do I need for translated audio?
Transcription and AI text translation are available on the Free plan. The MP3 audio generation step requires Starter or higher. Starter includes 6 minutes of audio per month ($4.99/mo), Pro includes 60 minutes ($14.99/mo), and Business includes 240 minutes ($79.99/mo).

Is the source language detected automatically?
Yes. OpenAI Whisper auto-detects the spoken language in the video, so you do not need to specify it. Once the transcript appears, you choose the target language for translation and audio generation.

Can I try video translation for free?
Yes. The Free plan includes 5 transcriptions per month and up to 3 AI translations per month at no cost. You can verify the translated text before upgrading to Starter ($4.99/month) to download the MP3 audio.

Translate any video into audio today

Free to start. Translated audio download from $4.99/month. No software needed.

Get started free →