Thoughts move fast. Your transcription should keep up.
Voxtral Transcribe streams text into your notes as you speak. Add structure by voice (headings, bullets, to-dos), or grab the keyboard mid-dictation — the mic waits for you and resumes when you stop typing. Edits happen as you go, not after.
Dictate directly into Markdown using Mistral's Voxtral speech-to-text. Insert headings, lists, to-dos, and other elements by voice, correct text inline as you go, and choose between real-time streaming and batch tap-to-send. Supports transcription in 13+ languages.
Get going in under a minute
- Install and paste your Mistral API key
- Press Ctrl+Space (desktop) or tap the mic icon (mobile)
- Start talking — say "heading 2", "new bullet", "for the correction: ..." as you go
Why Voxtral?
Voxtral is purpose-built for transcription, not retrofitted from a general audio model. Three things that matter for dictation:
- Low word error rate on hard audio — handles background noise, accents, and technical jargon well, including on continuous speech
- Streaming-first — designed for low-latency partial results, which is what makes "text appears as you speak" feel real-time instead of stuttery
- Multilingual by design — 13+ languages with consistent quality, not English-first with the rest bolted on
If you're choosing between speech-to-text models for dictation specifically (rather than, say, post-hoc transcription of meeting recordings), this is a strong fit.
Features
- Real-time streaming (desktop) — text appears as you speak
- Batch mode with tap-to-send (desktop + mobile) — send audio chunks while you keep talking
- Voice commands — insert headings, bullet points, to-do items, numbered lists, and more by voice
- 13 languages — voice commands automatically adapt to the selected language; English always works as fallback (Dutch, English, French, German, Spanish, Portuguese, Italian, Russian, Chinese, Hindi, Arabic, Japanese, Korean)
- Voice command help panel — shows available commands and trigger phrases for the active language
- Auto-correction — spelling, capitalization, and punctuation are automatically corrected after recording
- Inline correction instructions — say "for the correction: ..." and the corrector will follow your instructions
- Self-correction recognition — "no not X but Y" is handled automatically
- Mishearing correction — common speech recognition errors are fixed automatically per language
- Microphone selection — choose which microphone to use
- Auto-pause on focus loss — configurable behavior when switching apps on mobile
- Configurable Enter-to-send — optionally use Enter as tap-to-send when the mic is live (batch mode)
- Typing cooldown — adjustable delay before the mic resumes after typing
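To make the typing cooldown concrete, here is a minimal sketch of how such a gate could work. It is not the plugin's actual code: the class name `TypingCooldown`, its methods, and the injectable clock are all hypothetical. Only the 800 ms default comes from the settings documented in this README.

```typescript
// Hypothetical helper illustrating a typing cooldown; not taken from the plugin source.
class TypingCooldown {
  private lastKeystroke: number = Number.NEGATIVE_INFINITY;
  private readonly cooldownMs: number;
  private readonly now: () => number;

  // cooldownMs defaults to 800, the documented default; `now` is injectable for testing.
  constructor(cooldownMs: number = 800, now: () => number = () => Date.now()) {
    this.cooldownMs = cooldownMs;
    this.now = now;
  }

  /** Record a keystroke made while the mic is live. */
  noteKeystroke(): void {
    this.lastKeystroke = this.now();
  }

  /** True while the mic should stay paused because the user typed recently. */
  isMicPaused(): boolean {
    return this.now() - this.lastKeystroke < this.cooldownMs;
  }
}
```

Each keystroke restarts the window, so the mic would resume only after a full cooldown period without typing.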
Requirements
Installation
- Open Settings > Community plugins > Browse
- Search for "Voxtral Transcribe"
- Click Install, then Enable
- Go to Settings > Voxtral Transcribe and enter your Mistral API key
Manual installation
- Download main.js, manifest.json, and styles.css from the latest release
- Create a folder .obsidian/plugins/voxtral-transcribe/ in your vault
- Copy the three files into that folder
- Restart Obsidian and enable the plugin in Settings > Community plugins
Usage
Desktop (real-time mode)
- Open a note
- Click the microphone icon in the ribbon, or press Ctrl+Space
- Start speaking — text appears live in your note
- Click the microphone again or say "stop recording" to stop
- Auto-correction runs automatically if enabled
Mobile (batch mode)
On mobile, only batch mode is available (real-time streaming requires Node.js).
- Open a note
- Tap the microphone icon to start recording
- Tap the send icon in the view header to transcribe the current audio chunk — the recording keeps going
- In batch mode on desktop, you can instead press Enter to send a chunk while the mic is live and you are not typing (requires the Enter = tap-to-send setting)
- Keep talking and tap/press send again for the next chunk
- Tap the microphone to stop — the last chunk is processed automatically
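The tap-to-send flow above can be pictured as a buffer that keeps filling while recording and is emptied on each send. The sketch below is illustrative only: `ChunkBuffer` and its methods are hypothetical names, and representing audio as `Float32Array` frames is an assumption, since this README does not say how the plugin captures audio.

```typescript
// Hypothetical sketch of tap-to-send buffering; names and audio format are assumptions.
class ChunkBuffer {
  private frames: Float32Array[] = [];

  /** Append one frame of captured samples; recording never stops for a send. */
  push(frame: Float32Array): void {
    this.frames.push(frame);
  }

  /** Return everything captured since the last send and start a fresh chunk. */
  takeChunk(): Float32Array {
    const total = this.frames.reduce((n, f) => n + f.length, 0);
    const chunk = new Float32Array(total);
    let offset = 0;
    for (const f of this.frames) {
      chunk.set(f, offset); // copy this frame after the previous ones
      offset += f.length;
    }
    this.frames = [];
    return chunk;
  }
}
```

Tapping send would call `takeChunk()` and pass the result on for transcription. Stopping the recording would call it once more for the final chunk.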
Voice commands
Voice commands are recognized at the end of a sentence. Commands automatically adapt to the language selected in settings — the table below shows examples in English, but equivalent phrases are available in all 13 supported languages. Open the Voice Commands help panel (ribbon icon or command palette) to see the exact phrases for your active language.
| Command | Example (English) | Result |
| --- | --- | --- |
| New paragraph | "new paragraph" | Double line break |
| New line | "new line" | Single line break |
| Heading 1–3 | "heading 1" / "heading 2" / "heading 3" | # / ## / ### |
| Bullet point | "bullet point" | - |
| To-do item | "new todo" | - [ ] |
| Numbered item | "numbered item" | 1. (auto-increments) |
| Delete last paragraph | "delete last paragraph" | Removes last paragraph |
| Delete last line | "delete last line" | Removes last sentence |
| Undo | "undo" | Undo last action |
| Stop recording | "stop recording" | Stops the recording |
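As a rough illustration of end-of-sentence command recognition, the sketch below checks whether a transcribed sentence ends with one of the English trigger phrases from the table and, if so, splits it into the remaining text and the Markdown to insert. It is a simplified assumption of how matching might work, not the plugin's implementation. The function name `splitTrailingCommand`, the punctuation handling, and the newline placed before block prefixes are all hypothetical.

```typescript
// English trigger phrases from the table, mapped to the Markdown they produce.
// The leading "\n" before block prefixes is an assumption about line handling.
const COMMANDS: Array<[string, string]> = [
  ["new paragraph", "\n\n"],
  ["new line", "\n"],
  ["heading 1", "\n# "],
  ["heading 2", "\n## "],
  ["heading 3", "\n### "],
  ["bullet point", "\n- "],
  ["new todo", "\n- [ ] "],
];

// Hypothetical helper: returns the text before the command plus the Markdown
// to insert, or null when the sentence does not end with a known command.
function splitTrailingCommand(sentence: string): { text: string; insert: string } | null {
  // Drop trailing punctuation that a transcriber may append to the sentence.
  const trimmed = sentence.trim().replace(/[.!?,]+$/, "");
  const lower = trimmed.toLowerCase();
  for (const [phrase, insert] of COMMANDS) {
    const start = lower.length - phrase.length;
    // Require a word boundary so "renew line" does not trigger "new line".
    const atBoundary = start === 0 || /\s/.test(lower.charAt(start - 1));
    if (start >= 0 && lower.endsWith(phrase) && atBoundary) {
      return { text: trimmed.slice(0, start).trim(), insert };
    }
  }
  return null;
}
```

For example, `splitTrailingCommand("Meeting notes heading 2.")` yields the text `"Meeting notes"` together with the `## ` prefix. A sentence with no trailing command returns `null` and would be inserted unchanged.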
Text correction
- Correct selection: Select text > Command palette > "Correct selected text"
- Correct entire note: Command palette > "Correct entire note"
Focus loss behavior
When switching apps on mobile, you can configure what happens to an active recording:
- Pause immediately (default) — pauses and resumes when you return
- Pause after delay — keeps recording for a configurable time (10s–5min), then pauses
- Keep recording — continues recording in the background
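The three options can be read as a small decision rule. The sketch below is one hedged way to express it. The type `FocusLossMode`, the function `shouldPause`, and the clamping of the delay to the documented 10 s to 5 min range are assumptions for illustration, not the plugin's code.

```typescript
// Hypothetical names for the three documented behaviors.
type FocusLossMode = "pause-immediately" | "pause-after-delay" | "keep-recording";

const MIN_DELAY_MS = 10 * 1000;     // 10 s, lower bound of the documented range
const MAX_DELAY_MS = 5 * 60 * 1000; // 5 min, upper bound of the documented range

// Decide whether an active recording should be paused, given how long
// the app has been out of focus.
function shouldPause(mode: FocusLossMode, msSinceFocusLost: number, delayMs: number): boolean {
  switch (mode) {
    case "pause-immediately":
      return true;
    case "pause-after-delay": {
      // Clamp the configured delay into the documented range (an assumption).
      const delay = Math.min(Math.max(delayMs, MIN_DELAY_MS), MAX_DELAY_MS);
      return msSinceFocusLost >= delay;
    }
    case "keep-recording":
      return false;
  }
}
```

With a 30-second delay, `shouldPause("pause-after-delay", 5000, 30000)` is `false` and becomes `true` once 30 seconds have passed.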
Settings
| Setting | Description |
| --- | --- |
| Mistral API key | Your API key from console.mistral.ai |
| Microphone | Which microphone to use |
| Mode | Realtime (desktop only) or Batch |
| Enter = tap-to-send | Use Enter to send audio chunks when mic is live (batch mode, default: on) |
| Typing cooldown | Delay before mic resumes after typing (default: 800 ms) |
| On focus loss | Pause immediately / after delay / keep recording |
| Language | Language for transcription and voice commands (13 languages, default: Nederlands) |
| Auto-correct | Enable/disable automatic correction |
| Streaming delay | Latency vs accuracy tradeoff for realtime mode |
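For readers who think in types, a subset of these settings could be modelled as below. This is a sketch, not the plugin's real settings object: the field names, the language code `"nl"` for Nederlands, and the `effectiveMode` helper are hypothetical. Only the defaults stated in the table are included, and the fallback reflects the rule that realtime mode is desktop only.

```typescript
type Mode = "realtime" | "batch";

// Hypothetical shape covering the settings whose defaults are documented.
interface DictationSettings {
  enterToSend: boolean;
  typingCooldownMs: number;
  language: string;
}

// Defaults as stated in the settings table.
const DOCUMENTED_DEFAULTS: DictationSettings = {
  enterToSend: true,     // "default: on"
  typingCooldownMs: 800, // "default: 800 ms"
  language: "nl",        // "default: Nederlands"; the code "nl" is an assumption
};

// Realtime is desktop only, so a mobile device always ends up in batch mode.
function effectiveMode(requested: Mode, isMobile: boolean): Mode {
  return isMobile ? "batch" : requested;
}
```

A vault synced between desktop and phone could then keep `realtime` as the stored preference while the phone runs in batch mode.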
Development
npm install
npm run dev # watch mode
npm run build # production build
License
GPL-3.0 — Copyright (c) 2026 Max Kloosterman