Real-time speech transcription and translation powered by Mistral AI (Voxtral) and DeepL.
Speak into your microphone and get live transcription with instant translation across 11 languages.
Supported Languages
French, English, Chinese, Spanish, Portuguese, Russian, German, Japanese, Korean, Italian, Dutch
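As an illustration of how the languages listed above might be passed to DeepL, here is a minimal sketch of a name-to-code lookup. The names `DEEPL_TARGET_CODES` and `toDeepLTarget` are hypothetical and not taken from this project. The choice of `EN-US` and `PT-PT` is also an assumption: DeepL's documentation asks for a regional variant for English and Portuguese targets, and the app may pick different ones.

```typescript
// Hypothetical lookup from the display names above to DeepL target language codes.
// Assumption: the regional variants (EN-US, PT-PT) are illustrative choices only.
const DEEPL_TARGET_CODES: Record<string, string> = {
  French: "FR",
  English: "EN-US", // DeepL asks for EN-GB or EN-US when English is the target
  Chinese: "ZH",
  Spanish: "ES",
  Portuguese: "PT-PT", // likewise PT-PT or PT-BR for Portuguese targets
  Russian: "RU",
  German: "DE",
  Japanese: "JA",
  Korean: "KO",
  Italian: "IT",
  Dutch: "NL",
};

// Resolve a display name to a DeepL target code, failing loudly on unknown input.
function toDeepLTarget(language: string): string {
  const code = DEEPL_TARGET_CODES[language];
  if (code === undefined) {
    throw new Error(`Unsupported language: ${language}`);
  }
  return code;
}
```

Keeping the mapping in one table means adding a language is a one-line change, and an unsupported selection fails before any API call is made.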
Prerequisites
- Node.js 22+
- A Mistral AI API key
- A DeepL API key (Free plan works)
Setup
# Install dependencies
npm install
# Copy and fill in your API keys
cp .env.example .env
# Build the project
npm run build
# Start the server
node server.mjs
The app will be available at http://localhost:4003.
API keys can also be entered directly in the browser via the settings modal (stored in localStorage).
Docker
docker build -t voxtral-live-translation .
docker run -p 4003:4003 --env-file .env voxtral-live-translation
How It Works
- Microphone capture - Audio is captured via the Web Audio API and streamed as PCM to the server
- Real-time transcription - The server forwards audio to Mistral's Voxtral realtime WebSocket API for live speech-to-text
- Translation - Transcribed segments are sent to DeepL for translation into the selected target language
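The first step above can be sketched in TypeScript. The Web Audio API delivers samples as 32-bit floats nominally in the range [-1, 1], so they typically need converting before being streamed as PCM. This is a minimal sketch, assuming the target format is 16-bit signed PCM (the description above only says "PCM"). The function name `floatTo16BitPCM` is hypothetical and not taken from this project.

```typescript
/**
 * Convert Web Audio samples (Float32, nominally in [-1, 1]) to 16-bit signed PCM.
 * Assumption: 16-bit PCM is the wire format; the project may use a different one.
 */
function floatTo16BitPCM(input: Float32Array): Int16Array {
  const output = new Int16Array(input.length);
  for (let i = 0; i < input.length; i++) {
    // Clamp out-of-range samples so they saturate instead of wrapping around.
    const s = Math.max(-1, Math.min(1, input[i]));
    // Negative samples scale by 32768, positive by 32767, to cover the full int16 range.
    output[i] = Math.round(s < 0 ? s * 0x8000 : s * 0x7fff);
  }
  return output;
}
```

In a browser, the `buffer` of the returned `Int16Array` could then be sent as a binary frame over a WebSocket connection to the server.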
Tech Stack
- Astro with SSR (Node.js adapter)
- Tailwind CSS + DaisyUI
- Mistral AI SDK (Voxtral realtime transcription)
- DeepL Node SDK (text translation)
- WebSocket (via ws) for real-time audio streaming