alxlion/voxtral-live-translation: Experimentation with Voxtral-Mini-4B-Realtime-2602 and the DeepL API for live translation

Real-time speech transcription and translation powered by Mistral AI (Voxtral) and DeepL.

Speak into your microphone to get a live transcript along with its translation, across 11 languages.

Supported Languages

French, English, Chinese, Spanish, Portuguese, Russian, German, Japanese, Korean, Italian, Dutch

Prerequisites

Node.js 22+
A Mistral AI API key
A DeepL API key (Free plan works)

Setup

# Install dependencies
npm install

# Copy and fill in your API keys
cp .env.example .env

# Build the project
npm run build

# Start the server
node server.mjs

The app will be available at http://localhost:4003.

API keys can also be entered directly in the browser via the settings modal, in which case they are stored in the browser's localStorage.

Docker

docker build -t voxtral-live-translation .
docker run -p 4003:4003 --env-file .env voxtral-live-translation

How It Works

Microphone capture - Audio is captured via the Web Audio API and streamed as PCM to the server
Real-time transcription - The server forwards audio to Mistral's Voxtral realtime WebSocket API for live speech-to-text
Translation - Transcribed segments are sent to DeepL for translation into the selected target language
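
To make the microphone capture step more concrete, here is a minimal sketch of converting Web Audio samples to PCM before streaming. The Web Audio API delivers samples as 32-bit floats in the range -1 to 1, so they are typically converted to 16-bit signed integers first. The function name `floatTo16BitPcm` is hypothetical, and the 16-bit format is an assumption: this README does not state the exact encoding or sample rate the server expects.

```typescript
// Hypothetical helper, not taken from the repository.
// Converts Web Audio float samples (nominally -1..1) to 16-bit signed PCM.
function floatTo16BitPcm(samples: Float32Array): Int16Array {
  return Int16Array.from(samples, (value) => {
    // Clamp first: values can exceed the nominal range after gain or mixing.
    const clamped = Math.max(-1, Math.min(1, value));
    // Negative samples scale to -32768, positive samples to 32767.
    return Math.round(clamped < 0 ? clamped * 0x8000 : clamped * 0x7fff);
  });
}

// Example: a short buffer with in-range and out-of-range samples.
const pcm = floatTo16BitPcm(new Float32Array([0, 1, -1, 2, -2, 0.5]));
console.log(Array.from(pcm).join(","));
```

The resulting `pcm.buffer` is an `ArrayBuffer` that could be sent as a binary WebSocket message. Typed arrays use the platform's byte order, which is little-endian on virtually all current hardware.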

Tech Stack

Astro with SSR (Node.js adapter)
Tailwind CSS + DaisyUI
Mistral AI SDK (Voxtral realtime transcription)
DeepL Node SDK (text translation)
WebSocket (via ws) for real-time audio streaming