
A plugin for Obsidian that extracts text from images using OCR powered by AI image recognition.
It is a simple plugin for accurate, reliable recognition of both printed text and handwriting in images.
AI vision models are generally far more effective at text extraction than traditional tools such as Tesseract.
Visit the Plugin Wiki for detailed documentation.
[!TIP] The Google Gemini 2.5 Flash free tier (no credit card required) has a rate limit of 250 RPD (requests per day). Flash-Lite allows up to 1,000 RPD.
For most users, Gemini is the recommended model family, as it is fast, highly accurate, and free to use.
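As a rough illustration of what these daily limits mean for batch work, the sketch below estimates how many days a batch of images would take under a requests-per-day cap. It assumes one API request per image, which is an assumption made for illustration and not a statement about how the plugin issues requests; `daysForBatch` is a hypothetical helper, not part of the plugin.

```typescript
// Hypothetical helper: estimate the days needed to process a batch under a
// requests-per-day (RPD) cap. ASSUMPTION: one request per image.
function daysForBatch(imageCount: number, requestsPerDay: number): number {
  if (!Number.isInteger(imageCount) || imageCount < 0) {
    throw new RangeError("imageCount must be a non-negative integer");
  }
  if (!Number.isInteger(requestsPerDay) || requestsPerDay <= 0) {
    throw new RangeError("requestsPerDay must be a positive integer");
  }
  return Math.ceil(imageCount / requestsPerDay);
}

// Limits quoted in the tip above.
const FLASH_RPD = 250;
const FLASH_LITE_RPD = 1000;

console.log(daysForBatch(600, FLASH_RPD));      // 3
console.log(daysForBatch(600, FLASH_LITE_RPD)); // 1
```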
- gpt-4o
- gpt-4o-mini
- gpt-4.1
- gpt-4.1-mini
- gpt-4.1-nano
- gemini-2.5-flash
- gemini-2.5-flash-lite-preview-06-17
- gemini-2.5-pro
- Local models such as llava, llava:13b, or bakllava, running entirely on your machine
- Bring-your-own endpoint support for any service that follows the OpenAI-compatible Chat Completions API
  - Allows integration with other compatible services
  - Specify the full endpoint URL, model ID, and API key (if required)
[!NOTE] Custom providers are untested. Successful use depends on the service's compatibility with the OpenAI API. You must enter the correct endpoint address and model ID and, where applicable, a valid API key.
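To make the three custom-provider settings concrete, here is a minimal sketch of how an endpoint URL, model ID, and optional API key could be combined into an OpenAI-style Chat Completions request carrying one image. The request field names follow the OpenAI Chat Completions format; `buildOcrRequest`, the prompt text, the URL, and the model ID are all hypothetical placeholders, and this is not the plugin's actual implementation.

```typescript
// Shape of a Chat Completions request with a single image attached.
interface OcrRequest {
  url: string;
  headers: Record<string, string>;
  body: {
    model: string;
    messages: Array<{
      role: "user";
      content: Array<
        | { type: "text"; text: string }
        | { type: "image_url"; image_url: { url: string } }
      >;
    }>;
  };
}

// Hypothetical helper: assemble the request without sending it.
function buildOcrRequest(
  endpointUrl: string, // full endpoint URL, as entered by the user
  modelId: string,
  imageBase64: string,
  mimeType: string,
  apiKey?: string // optional: some services need no key
): OcrRequest {
  const headers: Record<string, string> = { "Content-Type": "application/json" };
  if (apiKey) {
    headers["Authorization"] = `Bearer ${apiKey}`;
  }
  return {
    url: endpointUrl,
    headers,
    body: {
      model: modelId,
      messages: [
        {
          role: "user",
          content: [
            { type: "text", text: "Extract all text from this image." },
            // The image travels inline as a base64 data URL.
            { type: "image_url", image_url: { url: `data:${mimeType};base64,${imageBase64}` } },
          ],
        },
      ],
    },
  };
}

const req = buildOcrRequest(
  "http://localhost:8080/v1/chat/completions", // placeholder URL
  "example-model",                             // placeholder model ID
  "aGVsbG8=",                                  // placeholder base64 payload
  "image/png"
);
console.log(JSON.stringify(req.body, null, 2));
```

A compatible service would accept this as a POST with the body serialized as JSON and return the extracted text in `choices[0].message.content`. A service that rejects this shape is unlikely to work as a custom provider.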
- {{placeholder}} support
- {{image.image}} to embed the source image in the extracted output header/footer

[!NOTE] Support for {{placeholder}} options is still being tested. Unexpected behavior may occur.
Refer to the Wiki for available placeholders. Please report any placeholder issues or suggestions on GitHub.
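To illustrate the general idea behind {{placeholder}} substitution, the sketch below replaces each `{{name}}` token in a template with a value from a lookup table and leaves unknown tokens untouched. `renderTemplate`, the `unknown.key` token, and the value given to `image.image` are hypothetical; the plugin's actual placeholder names and expansion rules are the ones documented in the Wiki.

```typescript
// Hypothetical sketch of {{placeholder}} substitution. Unknown placeholders
// are left as-is so that typos stay visible in the output.
function renderTemplate(template: string, values: Record<string, string>): string {
  return template.replace(/\{\{\s*([\w.]+)\s*\}\}/g, (match: string, key: string) => {
    // Only use keys the caller actually supplied (ignore inherited properties).
    const value = Object.prototype.hasOwnProperty.call(values, key) ? values[key] : undefined;
    return value ?? match;
  });
}

// ASSUMPTION: the embed string is an illustrative value, not what the
// plugin necessarily produces for {{image.image}}.
const renderedHeader = renderTemplate("Source: {{image.image}} ({{unknown.key}})", {
  "image.image": "![[scan.png]]",
});
console.log(renderedHeader); // Source: ![[scan.png]] ({{unknown.key}})
```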
[!NOTE] This option is not yet available.
If you have the BRAT plugin installed, you can install this plugin using the BRAT plugin manager:
1. Click Add beta plugin.
2. Enter https://github.com/rootiest/obsidian-ai-image-ocr in the Repository URL field.
3. Check the Enable after installing the plugin checkbox to enable the plugin immediately after installation.
4. Click Add plugin.

Alternatively, clone this repository into your vault's plugins directory:
git clone https://github.com/rootiest/obsidian-ai-image-ocr.git \
.obsidian/plugins/obsidian-ai-image-ocr
Or download the plugin archive and extract it into your plugins directory.
- Provider (OpenAI, Gemini, Ollama, etc.)
- Model (gpt-4o, llava:13b, etc.)

Several additional optional configuration options are available for customizing the output behavior.
{{placeholder}} options are detailed in the wiki.
- Open the command palette (Ctrl+P) and search for "Extract text from image".
- Open the command palette (Ctrl+P) and search for "Extract text from image folder".

[!TIP] See the Token Limits Wiki for tips on making efficient use of tokens when extracting from batches of images.
[!TIP] You can select an image embed in your note to use it as the source and replace it with the extracted text.
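The effect of replacing an embed with its extracted text can be sketched as a plain string operation. The example below looks up an Obsidian-style `![[name]]` embed by file name, whereas the plugin works from the current selection, so this is only an approximation of the behavior described above; `replaceEmbed` and `escapeRegExp` are hypothetical helpers.

```typescript
// Escape characters that have special meaning in a regular expression.
function escapeRegExp(s: string): string {
  return s.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
}

// Hypothetical helper: replace the first ![[imageName]] embed (with an
// optional |size suffix) by the extracted text.
function replaceEmbed(note: string, imageName: string, extracted: string): string {
  const pattern = new RegExp(`!\\[\\[${escapeRegExp(imageName)}(\\|[^\\]]*)?\\]\\]`);
  // A replacer function keeps "$" sequences in the extracted text literal.
  return note.replace(pattern, () => extracted);
}

const note = "Meeting notes\n![[whiteboard.png|400]]\nEnd";
console.log(replaceEmbed(note, "whiteboard.png", "Action items: ship v2"));
```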
[!NOTE] When using OpenAI, you must use a user or service account key (not an sk-proj key).
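The note above concerns OpenAI key types; project-scoped keys carry an `sk-proj` prefix. A minimal sketch of a pre-flight check that flags such keys before any request is sent follows. `looksLikeProjectKey` is a hypothetical helper, and the prefix test is only a heuristic: it says nothing about whether a key is otherwise valid.

```typescript
// Hypothetical heuristic: flag keys that begin with the "sk-proj" prefix.
function looksLikeProjectKey(apiKey: string): boolean {
  return apiKey.trim().startsWith("sk-proj");
}

// Placeholder key values for illustration only.
for (const key of ["sk-proj-abc123", "sk-svcacct-abc123"]) {
  console.log(key, looksLikeProjectKey(key) ? "project key: not accepted" : "ok");
}
```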
The following features are under consideration for future releases of the plugin:
- created/modified placeholders for images.
- {{placeholder}} options.

[!NOTE] These goals are exploratory and may evolve based on user feedback and API capabilities. Have a suggestion? Open an issue or discussion on GitHub!
The AI Image OCR Plugin does not collect or store any personal data, images, or extracted text. A proxy server may be used in specific cases to retrieve external images securely. Basic proxy request metadata may be temporarily logged for debugging, but is automatically removed within 7 days.
For full details, see the Privacy & Anonymity Wiki.
Built with ❤️ for Obsidian. Inspired by the limitations of traditional OCR.