Seshat VTT

Transcribe audio files with multiple STT providers and insert transcripts beneath audio links in your notes.


Transcribe audio files referenced in Obsidian notes with one of several providers, and insert the transcript text directly under each audio reference.

Supported providers

OpenAI (/v1/audio/transcriptions)
Google Gemini (/v1beta/models/{model}:generateContent)
Groq (/openai/v1/audio/transcriptions)
Deepgram (/v1/listen)
AssemblyAI (/v2/upload + /v2/transcript)
Rev AI (/speechtotext/v1/jobs)
Speechmatics (/v2/jobs)
OpenAI-compatible custom endpoint ({base}/audio/transcriptions)
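
As a rough illustration of how the endpoint paths listed above could be organized, here is a minimal TypeScript sketch. The paths are copied from the list; the identifiers (ProviderId, ENDPOINT_PATHS, geminiPath, customEndpointUrl) and the model name in the usage note are hypothetical and are not taken from the plugin's source.

```typescript
// Hypothetical provider identifiers; only the paths come from the list above.
type ProviderId =
  | "openai" | "gemini" | "groq" | "deepgram"
  | "assemblyai" | "revai" | "speechmatics";

const GEMINI_PATH_TEMPLATE = "/v1beta/models/{model}:generateContent";

const ENDPOINT_PATHS: Record<ProviderId, readonly string[]> = {
  openai: ["/v1/audio/transcriptions"],
  gemini: [GEMINI_PATH_TEMPLATE],
  groq: ["/openai/v1/audio/transcriptions"],
  deepgram: ["/v1/listen"],
  assemblyai: ["/v2/upload", "/v2/transcript"], // upload first, then request a transcript
  revai: ["/speechtotext/v1/jobs"],
  speechmatics: ["/v2/jobs"],
};

// Fill the {model} placeholder in the Gemini path.
function geminiPath(model: string): string {
  return GEMINI_PATH_TEMPLATE.replace("{model}", encodeURIComponent(model));
}

// Build {base}/audio/transcriptions for an OpenAI-compatible endpoint,
// tolerating trailing slashes on the configured base URL.
function customEndpointUrl(base: string): string {
  return base.replace(/\/+$/, "") + "/audio/transcriptions";
}
```

For example, customEndpointUrl("http://localhost:8000/v1/") yields "http://localhost:8000/v1/audio/transcriptions", and geminiPath("example-model") yields "/v1beta/models/example-model:generateContent".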

Usage

Open plugin settings and select your active provider.
Configure only the fields shown for that provider.
Open the markdown note you want to process.
Click the ribbon audio icon (Transcribe audio in current note).
The plugin scans only the currently open markdown note, transcribes audio references in that note, and inserts transcripts below each matching audio reference.

If the current note has no supported audio references, no transcripts are added.
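
The scan-and-insert behavior described above can be sketched as follows. This is not the plugin's code: it assumes audio references are Obsidian embeds such as ![[memo.mp3]], one per line, and the extension list, the pattern, and the function names are all hypothetical.

```typescript
// Assumed audio extensions; the plugin's actual list is not documented here.
const AUDIO_EXTENSIONS = ["mp3", "wav", "m4a", "ogg", "webm", "flac"];

// Matches an embed such as ![[recording.mp3]] or ![[recording.mp3|alias]].
const EMBED_PATTERN = /!\[\[([^\]|#]+)(?:[|#][^\]]*)?\]\]/;

// Return the embedded file name if the line holds a supported audio embed.
function audioTarget(line: string): string | null {
  const match = EMBED_PATTERN.exec(line);
  if (!match) return null;
  const target = (match[1] ?? "").trim();
  const ext = (target.split(".").pop() ?? "").toLowerCase();
  return AUDIO_EXTENSIONS.indexOf(ext) !== -1 ? target : null;
}

// Insert each transcript on the line directly below its audio reference.
// Lines without a supported audio embed are copied through unchanged.
function insertTranscripts(
  note: string,
  transcripts: Record<string, string>,
): string {
  const out: string[] = [];
  for (const line of note.split("\n")) {
    out.push(line);
    const target = audioTarget(line);
    const text = target === null ? undefined : transcripts[target];
    if (text !== undefined) out.push(text);
  }
  return out.join("\n");
}
```

A note with no supported audio embeds passes through unchanged, which mirrors the "no transcripts are added" case.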

Community Submission Disclosures

This plugin sends audio content (and optional prompt/language hints) to the configured third-party transcription provider over the network.
Using this plugin typically requires provider accounts and API keys, and may incur provider charges.
The plugin itself does not include telemetry, ads, or a self-update mechanism.
Provider API keys are stored in the plugin's local Obsidian data file (data.json) on your device. Do not commit that file to Git.

Notes

Request/poll timing is fixed to reasonable defaults in code and is no longer exposed in settings.
Use the Default language and Prompt settings as hints for providers that support them.
Settings now show only options for the currently selected provider.
Dynamic model dropdowns auto-refresh for OpenAI, Gemini, Groq, and OpenAI-compatible providers.
The ribbon action processes only the currently open markdown note.
Repeated references to the same audio file in a run reuse the same transcript API result to avoid duplicate charges.
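
One way to get the per-run reuse described in the last note is to memoize requests by audio path. The sketch below is an assumption about how that could look, not the plugin's implementation; Transcriber and withRunCache are hypothetical names.

```typescript
// Hypothetical signature for a function that sends one file to a provider.
type Transcriber = (audioPath: string) => Promise<string>;

// Wrap a transcriber so each audio path triggers at most one provider
// request per run. Create a new wrapper for each run to reset the cache.
function withRunCache(transcribe: Transcriber): Transcriber {
  const inFlight = new Map<string, Promise<string>>();
  return (audioPath: string) => {
    let pending = inFlight.get(audioPath);
    if (pending === undefined) {
      pending = transcribe(audioPath);
      inFlight.set(audioPath, pending);
    }
    return pending;
  };
}
```

Caching the promise rather than the resolved text means a second reference that is reached while the first request is still running shares that request. A failed request also stays cached for the rest of the run in this sketch.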

Release

GitHub Actions release workflow: .github/workflows/release.yml
Required release assets: main.js, manifest.json, and styles.css
Tag name must exactly match manifest.json.version (no v prefix)
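
The tag rule above can be checked before pushing a release tag. This is a minimal sketch with a hypothetical helper name (tagMatchesManifest); it only compares strings and says nothing about how the actual workflow validates tags.

```typescript
// Return true only when the tag equals manifest.json's "version" exactly.
// A tag with a "v" prefix therefore fails the check.
function tagMatchesManifest(tag: string, manifestJson: string): boolean {
  const manifest = JSON.parse(manifestJson) as { version?: unknown };
  return typeof manifest.version === "string" && tag === manifest.version;
}
```

With a manifest containing "version": "0.1.0", the tag 0.1.0 passes and v0.1.0 does not.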


About

Transcribe audio files in the currently open note using one of several providers (OpenAI, Google Gemini, Groq, Deepgram, AssemblyAI, Rev AI, Speechmatics, or an OpenAI-compatible endpoint). The plugin inserts transcript text directly beneath each audio reference and reuses results for repeated references to avoid duplicate charges. Only the active markdown note is processed.

AI, Attachments, Integrations

Details

Current version

0.1.0

Last updated

3 months ago

Created

3 months ago

Updates

1 release

Downloads

Compatible with

Obsidian 1.5.0+

Platforms

Desktop, Mobile

License

MIT

Author

thematthiasleitner

github.com/thematthiasleitner




Related plugins

Agent Client

Smart Composer

Local GPT

Image auto upload

Whisper

Nexus AI Chat Importer

Snipd Official

BMO Chatbot

Local REST API & MCP Server

Copilot