Local Dictation

By Alexander Brittain · 415 downloads

Dictate notes with Whisper or Cohere Transcribe; clean up with a local Ollama model. Private, on-device speech-to-text for Obsidian.

On-device dictation plugin for Obsidian. Talk directly into your notes with fast, accurate transcription from top models running on your CPU or GPU.

Transcription runs entirely on your device. No accounts or cloud required. A fast Rust sidecar handles the inference, and all models can be downloaded directly from the plugin settings.

Features

  • Top models, run locally. Whisper or Cohere Transcribe, on CPU or GPU. Cohere Transcribe currently tops the Open ASR Leaderboard.
  • Live dictation. Moonshine streaming models show provisional words within about a second and revise them in place until each utterance finalizes.
  • Speaker labels. Optional on-device diarization tags who's talking — for interviews, meetings, and calls. Nothing is stored; voiceprints live in memory for the session only.
  • System audio. Transcribe your computer's output — meetings, calls, videos — not just your mic. Available on Windows and Linux.
  • Timestamps. Optionally stamp phrases with elapsed or wall-clock time — handy for meetings and interviews.
  • LLM presets. Clean up, summarize, pull out action items, or reshape a transcript with built-in or custom presets, run through a local model (Ollama) or OpenRouter.
  • Auto routing. Keep cleanup fully local, or have only oversized transcripts route automatically to OpenRouter for a bigger model and larger context window.
  • Runs on your hardware. Metal on macOS, CUDA on recent NVIDIA GPUs, CPU everywhere else.
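
The auto-routing feature described above can be pictured as a simple size check. The following is a minimal sketch, not the plugin's actual code: the names `RoutingSettings`, `estimateTokens`, and `chooseRoute`, the settings fields, and the four-characters-per-token heuristic are all assumptions made for illustration.

```typescript
// Hypothetical types and names; the plugin's real settings may differ.
type Route = "local" | "openrouter";

interface RoutingSettings {
  remoteEnabled: boolean;       // assumed toggle that allows OpenRouter at all
  localContextTokens: number;   // assumed context window of the local model
  reservedOutputTokens: number; // headroom kept for the model's reply
}

// Rough heuristic (an assumption): about 4 characters per token.
function estimateTokens(text: string): number {
  return Math.ceil(text.length / 4);
}

// Stay local unless the transcript is too large for the local model
// AND remote routing is enabled.
function chooseRoute(transcript: string, s: RoutingSettings): Route {
  const budget = s.localContextTokens - s.reservedOutputTokens;
  const oversized = estimateTokens(transcript) > budget;
  return oversized && s.remoteEnabled ? "openrouter" : "local";
}
```

With this shape, a short transcript always stays on the local model, and disabling the remote toggle keeps even oversized transcripts local.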

🚀 Getting started

Install Local Dictation from Obsidian's Community Plugins. A setup wizard downloads the engine and a starter model on first launch.

Then click the microphone in the ribbon, or bind a hotkey to Local Dictation: Toggle dictation, and start talking. Text lands at your cursor.

Platform support

CPU works everywhere with no extra setup. Hardware acceleration is available for faster transcription — use Metal (macOS, automatic) or CUDA on a recent NVIDIA GPU (RTX 20-series / GTX 16-series or newer, with a current driver). See the CUDA setup guide to enable it.
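
As an illustration of the platform rules above, here is a hedged sketch of how a backend could be chosen. `HostInfo`, `pickBackend`, and both boolean fields are hypothetical; how the plugin actually detects a supported GPU or driver is not described here.

```typescript
// Hypothetical sketch; not the plugin's actual detection logic.
type Backend = "metal" | "cuda" | "cpu";

interface HostInfo {
  platform: string;               // e.g. Node's process.platform: "darwin", "win32", "linux"
  hasSupportedNvidiaGpu: boolean; // assumed result of a separate GPU and driver probe
  cudaEnabled: boolean;           // assumed: the user completed the CUDA setup
}

function pickBackend(host: HostInfo): Backend {
  // Metal is described as automatic on macOS.
  if (host.platform === "darwin") return "metal";
  // CUDA needs both a supported GPU and explicit setup.
  if (host.hasSupportedNvidiaGpu && host.cudaEnabled) return "cuda";
  // CPU works everywhere with no extra setup.
  return "cpu";
}
```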

macOS and Windows are the primary tested targets. On Linux, the plugin is used daily on Fedora 44 (native and Flatpak); other distributions should work but aren't routinely verified. If something breaks on yours, open an issue.

Moonshine live dictation is English-only. Streaming models do not apply speaker labels, and long speech is split at a 30-second utterance cap. Among the streaming models, Tiny is intended for lower-end CPUs, Small is the recommended balance, and Medium trades additional compute and memory for accuracy. Batch models keep their existing behavior.
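
To make the 30-second utterance cap concrete, the sketch below splits one long speech span into pieces no longer than the cap. It assumes fixed boundaries for simplicity; the plugin may well split differently (for example at a pause near the cap). `Span` and `splitAtCap` are hypothetical names.

```typescript
// Hypothetical sketch of a fixed-boundary split; times are in seconds.
interface Span {
  startSec: number;
  endSec: number;
}

function splitAtCap(span: Span, capSec: number = 30): Span[] {
  const pieces: Span[] = [];
  for (let t = span.startSec; t < span.endSec; t += capSec) {
    // The last piece is clipped to the end of the original span.
    pieces.push({ startSec: t, endSec: Math.min(t + capSec, span.endSec) });
  }
  return pieces;
}
```

Under these assumptions, 75 seconds of continuous speech would finalize as three utterances covering 0 to 30, 30 to 60, and 60 to 75 seconds.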

🔒 Privacy

Your audio never leaves your device — transcription is always local. The sidecar and models download once from GitHub Releases, and model files live outside your vault.

Local LLMs are limited in capability, so you can route your transcribed text to OpenRouter for frontier models and much larger context windows. Restrict it to ZDR endpoints and approved providers in OpenRouter to match your own privacy standards. Remote LLM features turn off with a single toggle, and a second toggle disables all LLM features in the plugin.
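
The two toggles can be read as a small gate over which LLM providers are allowed. This is a sketch under assumed names (`LlmToggles`, `allowedProviders`, and the field names are hypothetical), not the plugin's real settings schema.

```typescript
// Hypothetical sketch of the two privacy toggles.
type Provider = "ollama" | "openrouter";

interface LlmToggles {
  llmEnabled: boolean;       // assumed master switch for all LLM features
  remoteLlmEnabled: boolean; // assumed switch for remote (OpenRouter) features
}

function allowedProviders(t: LlmToggles): Provider[] {
  // Master switch off: no LLM processing of any kind.
  if (!t.llmEnabled) return [];
  // Remote switch off: only the local Ollama model remains.
  return t.remoteLlmEnabled ? ["ollama", "openrouter"] : ["ollama"];
}
```

Transcription is unaffected by either toggle in this model, since it is always local.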

Contributing

Local Dictation is a TypeScript plugin paired with a Rust sidecar for inference. See CONTRIBUTING.md for the architecture, setup, and workflow.

License

Local Dictation is MIT-licensed — see LICENSE.

The models bundled in the sidecar are openly licensed too: Silero VAD (MIT) for voice activity detection, plus the diarization models WeSpeaker (CC-BY-4.0) and pyannote segmentation (MIT). See THIRD_PARTY_NOTICES.md for full attributions.

Health: Excellent
Review: Satisfactory
About
Run private, GPU-accelerated dictation inside Obsidian using Whisper or Cohere Transcribe with Silero VAD for accurate speech boundary detection on macOS, Linux, and Windows. Process transcriptions locally with an optional Ollama LLM, manage models with one click, and keep everything offline for privacy.
Audio, AI, Writing
Details
Current version: 2026.7.1
Last updated: 3 hours ago
Created: 3 months ago
Updates: 10 releases
Downloads: 415
Compatible with: Obsidian 1.11.5+
Platforms: Desktop only
License: MIT
Author: Alexander Brittain (GitHub: brittain9)