GemmaNotes

Sarath Chandra · 62 downloads

Push-to-talk voice notes transcribed on-device with Gemma 4.

Push-to-talk voice notes for Obsidian, transcribed entirely on-device with Google's Gemma 4 (E2B / E4B) running in-process via transformers.js. No server, no API key — fully offline after a one-time model download.

🚀 Install here: https://community.obsidian.md/plugins/gemmanotes

Built with Love and Antigravity CLI, Gemini 3.5 Flash

      ▄▀▀▄        Antigravity CLI 1.0.2
     ▀▀▀▀▀▀       xxxxxxxxxxxxxxxxxxxxx
    ▀▀▀▀▀▀▀▀      Gemini 3.5 Flash (High)
   ▄▀▀    ▀▀▄     ~/gemmanotes
  ▄▀▀      ▀▀▄

How it works

  1. Toggle recording with the ribbon mic button or the Toggle voice recording command (bind a hotkey to it).
  2. On stop, a placeholder is pinned into the current note.
  3. Audio is decoded to 16 kHz mono, split into ≤25 s chunks, and transcribed by Gemma 4. Transcriptions run through a FIFO queue, so you can start the next recording while one is still processing.
  4. The placeholder is swapped for the transcribed text.
  5. A status-bar hint offers a one-click rewrite into tidy prose.
  6. Network use: needed only to download and cache the model files from Hugging Face.
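
The chunking and queueing in step 3 can be sketched roughly as follows. This is a minimal illustration, not the code in the plugin: `splitIntoChunks` and `FifoQueue` are hypothetical names, and the sketch assumes the audio has already been decoded to a 16 kHz mono `Float32Array`.

```typescript
// Illustrative sketch only: these names are hypothetical, not taken from the plugin source.
const SAMPLE_RATE = 16000;     // 16 kHz mono, as described in step 3
const MAX_CHUNK_SECONDS = 25;  // upper bound per chunk, as described in step 3

/** Split mono PCM samples into consecutive windows of at most 25 s. */
function splitIntoChunks(samples: Float32Array): Float32Array[] {
  const maxLen = SAMPLE_RATE * MAX_CHUNK_SECONDS; // 400,000 samples
  const chunks: Float32Array[] = [];
  for (let start = 0; start < samples.length; start += maxLen) {
    // subarray returns a view, so no audio data is copied
    chunks.push(samples.subarray(start, Math.min(start + maxLen, samples.length)));
  }
  return chunks;
}

/** Minimal FIFO job queue: jobs run one at a time, in submission order. */
class FifoQueue {
  private tail: Promise<unknown> = Promise.resolve();

  enqueue<T>(job: () => Promise<T>): Promise<T> {
    // Each job starts only after the previous one has settled.
    const result = this.tail.then(job);
    // Swallow rejections on the internal chain so one failed job
    // does not block the jobs queued behind it.
    this.tail = result.catch(() => undefined);
    return result;
  }
}
```

With these assumptions, a 60 s recording yields three chunks of 25 s, 25 s, and 10 s, and a second recording can be enqueued while the first is still being transcribed.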

Local Setup

Git clone:

git clone https://github.com/sarath/gemmanotes
cd gemmanotes

Development:

npm install
npm run build      # production bundle -> main.js
npm run dev        # watch mode

Install into a vault for local testing:

Mac

git clone https://github.com/sarath/gemmanotes
cd gemmanotes
./install.sh /path/to/your/vault <feature-branch>

This fetches origin, checks out origin/<feature-branch> cleanly, runs npm run build, and copies main.js, manifest.json, and styles.css into <vault>/.obsidian/plugins/gemmanotes/. Then enable the plugin and use Settings → GemmaNotes → Download to fetch the model (~3.2 GB for E2B, ~5 GB for E4B).

Local Testing and Development

If you are developing inside Google Cloud Shell and want to deploy, test, and debug the plugin directly in a local Obsidian instance:

1. Tunnel Remote Debugging Port

Start your local Obsidian instance with remote debugging enabled (e.g., obsidian --remote-debugging-port=9222). Then, run the following command to tunnel port 9222 from your local machine to your Cloud Shell workspace:

gcloud cloud-shell ssh --ssh-flag="-R 9222:localhost:9222"

This allows scripts running in the Cloud Shell environment to connect to the Chrome DevTools protocol on your local Obsidian instance.
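
One way to check the tunnel from the Cloud Shell side is to query the DevTools HTTP endpoint `/json/version`, which Chromium-based apps serve on the remote debugging port. The sketch below is an assumption about how such a check could look, not part of the plugin's scripts: `versionUrl`, `parseVersion`, and `checkTunnel` are hypothetical helpers, and it assumes a runtime with a global `fetch` (for example Node 18+).

```typescript
// Hypothetical helpers for illustration; only the /json/version endpoint
// itself belongs to the DevTools HTTP interface.
interface DevToolsVersion {
  Browser: string;
  webSocketDebuggerUrl: string;
}

/** Build the URL of the DevTools version endpoint. */
function versionUrl(port: number, host = "localhost"): string {
  return `http://${host}:${port}/json/version`;
}

/** Parse the endpoint's JSON body and check the field a debugger client needs. */
function parseVersion(body: string): DevToolsVersion {
  const data = JSON.parse(body) as Partial<DevToolsVersion>;
  if (typeof data.webSocketDebuggerUrl !== "string") {
    throw new Error("response has no webSocketDebuggerUrl; is this a DevTools endpoint?");
  }
  return {
    Browser: String(data.Browser ?? "unknown"),
    webSocketDebuggerUrl: data.webSocketDebuggerUrl,
  };
}

/** Fetch and validate the version info through the tunnel (port 9222 by default). */
async function checkTunnel(port = 9222): Promise<DevToolsVersion> {
  const res = await fetch(versionUrl(port));
  if (!res.ok) throw new Error(`DevTools endpoint returned HTTP ${res.status}`);
  return parseVersion(await res.text());
}
```

If `checkTunnel()` resolves, the returned `webSocketDebuggerUrl` is the address a DevTools protocol client would connect to; a connection error usually means Obsidian was not started with the debugging flag or the tunnel is not up.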

2. Build, Deploy, and Test

  1. Compile the production bundle:
    npm run build
    
  2. Deploy the files (main.js, manifest.json, styles.css) to the connected Obsidian instance and automatically reload the plugin:
    npm run local:deploy
    
  3. Test model loading and verify initialization:
    npm run local:test-load
    

Known v1 limitations

  • Fixed-window chunking — long recordings are cut on a 25 s boundary, which can split a word. Silence-aware chunking is the planned next step.
  • transformers.js Gemma 4 audio API — the processor/model call in src/transcriber.ts follows the documented image-text-to-text pattern; the exact audio surface should be confirmed against the model card sample code.
  • Undo granularity — placeholder→text and rewrite swaps are written via the vault API; they are not always a single editor-undo step.
  • Model cache — stored in the browser Cache API. Persistent and offline-safe, but not yet relocated to the plugin data dir.
  • Desktop only — the model size and WebGPU requirement rule out mobile.
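
To illustrate the silence-aware chunking mentioned in the first limitation, here is a rough sketch of one possible approach: look back from the 25 s hard limit and cut at the quietest short frame instead of at the exact boundary. This is an assumption about how it might be done, not the plugin's planned design; `findQuietCut` and its parameters are hypothetical.

```typescript
// Hypothetical sketch: choose a cut point near the hard limit where the
// signal energy is lowest, so a chunk is less likely to end mid-word.

/**
 * Return the sample index at which to end the current chunk.
 * @param samples    mono PCM samples
 * @param hardLimit  latest allowed cut, in samples (e.g. 25 s * 16000)
 * @param searchBack how far before hardLimit to search, in samples
 * @param frame      frame length used to measure energy, in samples
 */
function findQuietCut(
  samples: Float32Array,
  hardLimit: number,
  searchBack: number,
  frame: number,
): number {
  // Everything fits in one chunk: no cut needed.
  if (samples.length <= hardLimit) return samples.length;

  // Never search from index 0, so the chunk is never empty.
  const from = Math.max(frame, hardLimit - searchBack);
  let bestStart = hardLimit; // fall back to the fixed boundary
  let bestEnergy = Infinity;

  for (let start = from; start + frame <= hardLimit; start += frame) {
    let energy = 0;
    for (let i = start; i < start + frame; i++) {
      energy += samples[i] * samples[i];
    }
    // "<=" prefers the latest of equally quiet frames, keeping chunks long.
    if (energy <= bestEnergy) {
      bestEnergy = energy;
      bestStart = start;
    }
  }
  return bestStart;
}
```

For example, `findQuietCut(samples, 25 * 16000, 2 * 16000, 320)` would search the last 2 s before the limit in 20 ms frames. A real implementation would likely also need an energy threshold, so that it only deviates from the fixed boundary when a pause is actually present.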
97%
Health: Excellent
Review: Passed
About
Record push-to-talk voice notes and transcribe them locally with Google's Gemma 4 via transformers.js. Pin a placeholder in the current note and replace it with the transcript when ready; queue transcriptions so you can record again while processing. Work fully offline after a one-time model download—no server or API key required.
Audio, AI
Details
Current version: 0.1.3
Last updated: Last week
Created: 2 weeks ago
Updates: 6 releases
Downloads: 62
Compatible with: Obsidian 1.5.0+
Platforms: Desktop only
License: MIT
Author: Sarath Chandra (GitHub: sarath)