Local Dictation

Dictate notes with Whisper or Cohere Transcribe; clean up with a local Ollama model. Private, on-device speech-to-text for Obsidian.

Add to Obsidian

Overview
Scorecard
Updates10

On-device dictation plugin for Obsidian. Talk directly into your notes with fast accurate transcription from top models running on your CPU or GPU.

Transcription runs entirely on your device. No accounts or cloud required. A fast rust sidecar handles the inference and all models can be downloaded directly in the settings.

Features

Top models, run locally. Whisper or Cohere Transcribe, on CPU or GPU. Cohere Transcribe currently tops the Open ASR Leaderboard.
Live dictation. Moonshine streaming models show provisional words within about a second and revise them in place until each utterance finalizes.
Speaker labels. Optional on-device diarization tags who's talking — for interviews, meetings, and calls. Nothing is stored; voiceprints live in memory for the session only.
System audio. Transcribe your computer's output — meetings, calls, videos — not just your mic. Available on Windows and Linux.
Timestamps. Optionally stamp phrases with elapsed or wall-clock time — handy for meetings and interviews.
LLM presets. Clean up, summarize, pull out action items, or reshape a transcript with built-in or custom presets, run through a local model (Ollama) or OpenRouter.
Auto routing. Keep cleanup fully local, or have only oversized transcripts route automatically to OpenRouter for a bigger model and larger context window.
Runs on your hardware. Metal on macOS, CUDA on recent NVIDIA GPUs, CPU everywhere else.

🚀 Getting started

Install Local Dictation from Obsidian's Community Plugins. A setup wizard downloads the engine and a starter model on first launch.

Then click the microphone in the ribbon, or bind a hotkey to Local Dictation: Toggle dictation, and start talking. Text lands at your cursor.

Platform support

CPU works everywhere with no extra setup. Hardware acceleration is available for faster transcription — use Metal (macOS, automatic) or CUDA on a recent NVIDIA GPU (RTX 20-series / GTX 16-series or newer, with a current driver). See the CUDA setup guide to enable it.

macOS and Windows are the primary tested targets. On Linux, the plugin is used daily on Fedora 44 (native and Flatpak); other distributions should work but aren't routinely verified. If something breaks on yours, open an issue.

Moonshine live dictation is English-only. Streaming models do not apply speaker labels, and long speech is split at a 30-second utterance cap. Tiny is intended for lower-end CPUs, Small is the recommended balance, and Medium trades additional compute and memory for accuracy. Batch models keep their existing behavior.

🔒 Privacy

Your audio never leaves your device — transcription is always local. The sidecar and models download once from GitHub Releases, and model files live outside your vault.

Local LLMs are limited in capability, so you can route your transcribed text to OpenRouter for frontier models and much larger context windows. Restrict it to ZDR endpoints and approved providers in OpenRouter to match your own privacy standards. Remote LLM features turn off with a single toggle, and a second toggle disables all LLM features in the plugin.

Contributing

A TypeScript plugin paired with a rust sidecar for inference. See CONTRIBUTING.md for the architecture, setup, and workflow.

License

Local Dictation is MIT-licensed — see LICENSE.

The models bundled in the sidecar are openly licensed too: Silero VAD (MIT) for voice activity detection, plus the diarization models WeSpeaker (CC-BY-4.0) and pyannote segmentation (MIT). See Full attributions THIRD_PARTY_NOTICES.md.

HealthExcellent

ReviewSatisfactory

About

Run private, GPU-accelerated dictation inside Obsidian using Whisper or Cohere Transcribe with Silero VAD for accurate speech boundary detection on macOS, Linux, and Windows. Process transcriptions locally with an optional Ollama LLM, manage models with one click, and keep everything offline for privacy.

Audio AI Writing

Details

Current version

2026.7.1

Last updated

3 hours ago

Created

3 months ago

Updates

10 releases

Downloads

415

Compatible with

Obsidian 1.11.5+

Platforms

Desktop only

License

MIT

Author

Alexander Brittainbrittain9

brittain9

Features

Top models, run locally. Whisper or Cohere Transcribe, on CPU or GPU. Cohere Transcribe currently tops the Open ASR Leaderboard.

Live dictation. Moonshine streaming models show provisional words within about a second and revise them in place until each utterance finalizes.

Speaker labels. Optional on-device diarization tags who's talking — for interviews, meetings, and calls. Nothing is stored; voiceprints live in memory for the session only.

System audio. Transcribe your computer's output — meetings, calls, videos — not just your mic. Available on Windows and Linux.

Timestamps. Optionally stamp phrases with elapsed or wall-clock time — handy for meetings and interviews.

LLM presets. Clean up, summarize, pull out action items, or reshape a transcript with built-in or custom presets, run through a local model (Ollama) or OpenRouter.

Auto routing. Keep cleanup fully local, or have only oversized transcripts route automatically to OpenRouter for a bigger model and larger context window.

Runs on your hardware. Metal on macOS, CUDA on recent NVIDIA GPUs, CPU everywhere else.

Platform support

🔒 Privacy

Your audio never leaves your device — transcription is always local. The sidecar and models download once from GitHub Releases, and model files live outside your vault.

Local Dictation

Features

🚀 Getting started

Platform support

🔒 Privacy

Contributing

License

Local Dictation

Features

🚀 Getting started

Platform support

🔒 Privacy

Contributing

License

Related plugins

Text Generator

Smart Composer

Local GPT

ChatGPT MD

BMO Chatbot

GPT-3 Notes

Ollama

Smart Connections

Claudian

Copilot