OpenWhispr

Open-source AI voice dictation that runs locally or with your own API keys. Advertised as 3x faster than typing, with smart text cleanup and compatibility with any app.

At a Glance:

OpenWhispr is an open-source, privacy-first voice-to-text dictation and AI meeting transcription tool that works fully offline with local models like Whisper and NVIDIA Parakeet, or with cloud providers, and offers a public API and MCP server for programmatic access.

Overview:

OpenWhispr is a cross-platform desktop application for voice dictation, AI-assisted actions, and meeting transcription. It turns spoken words into text at the cursor position through a global hotkey, supports interaction with various AI models for reasoning, and can automatically transcribe meetings from calls on platforms like Zoom, Teams, and FaceTime with local speaker diarization. A core design choice is its dual-mode operation: users can keep all audio and processing on-device using local speech-to-text and LLM engines, or use cloud services for increased speed. The application also includes a note-taking system with organizational features, semantic search, and sync capabilities.

Key Decision Points:

Fully local or cloud processing: Users can choose to run transcription, AI reasoning, and speaker diarization entirely on-device using engines like whisper.cpp and llama.cpp, or process data via cloud providers, with no enforced telemetry or data collection.
Desktop-centric with global hotkey: The tool operates as a system-wide desktop application (Electron) for macOS, Windows, and Linux, activated by a global hotkey to dictate into any active application and automatically paste the resulting text.
Programmatic access via API and MCP: Beyond the desktop UI, notes and transcriptions can be managed through a public API, and an MCP server allows external AI assistants to connect and interact with the application's data.
Integrated meeting transcription: The software can auto-detect supported calling apps and perform live transcription with on-device speaker diarization, including voice fingerprinting and Google Calendar integration for context.
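
The local-or-cloud choice described above can be pictured as a single setting that routes each transcription request. The following is a minimal sketch under stated assumptions, not OpenWhispr's actual code: `Mode`, `Settings`, `pick_backend`, and the `example-provider` name are all hypothetical and exist only for illustration.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class Mode(Enum):
    LOCAL = "local"  # audio and text stay on the device
    CLOUD = "cloud"  # audio is sent to a provider using the user's API key


@dataclass(frozen=True)
class Settings:
    # Hypothetical settings object; field names are illustrative only.
    mode: Mode
    local_engine: str = "whisper.cpp"
    cloud_provider: str = "example-provider"
    api_key: Optional[str] = None


def pick_backend(settings: Settings) -> str:
    """Return a label for the backend that would handle transcription."""
    if settings.mode is Mode.LOCAL:
        return f"local:{settings.local_engine}"
    if not settings.api_key:
        # Fail loudly rather than silently switching backends.
        raise ValueError("cloud mode requires an API key")
    return f"cloud:{settings.cloud_provider}"
```

Keeping the decision in one function means a local-only user can see at a single point that no code path reaches a cloud provider unless the mode is changed explicitly.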

Core Features:

Global voice dictation: Triggers speech-to-text from anywhere on the system with a hotkey, transcribing audio and pasting it into the active field.
AI agent with multi-model support: Allows conversation with cloud models and providers such as GPT-5, Claude, Gemini, and Groq, or with local models, acting as a named voice assistant.
Auto-detecting meeting transcription: Automatically identifies calls from Zoom, Teams, and FaceTime to provide live transcriptions with speaker labeling.
Local speaker diarization: Performs on-device speaker identification using voice fingerprint recognition, making speaker labels consistent across different meetings without cloud dependency.
Note-taking with AI actions: Supports creating notes, organizing them into folders, finding them with semantic search, and syncing them to the cloud, with the ability to run AI-driven actions on note content.
MCP server: Exposes a Model Context Protocol server to allow other AI systems to manage notes and transcriptions programmatically.
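
One common way to keep speaker labels consistent across meetings is to compare each new voice embedding against stored fingerprints. The sketch below illustrates that general technique with cosine similarity. It is an assumption about how such matching could work, not a description of OpenWhispr's implementation, and `label_speaker`, the `known` dictionary, and the 0.8 threshold are all hypothetical.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def label_speaker(embedding, known, threshold=0.8):
    """Return the label of the closest known voice, or register a new one.

    `known` maps a label to a stored fingerprint vector and is updated
    in place when no stored fingerprint reaches the threshold.
    """
    best_label, best_score = None, threshold
    for label, fingerprint in known.items():
        score = cosine_similarity(embedding, fingerprint)
        if score >= best_score:
            best_label, best_score = label, score
    if best_label is None:
        best_label = f"Speaker {len(known) + 1}"
        known[best_label] = list(embedding)
    return best_label
```

Because the fingerprint store is a plain local dictionary here, the matching step needs no network access, which is the property the on-device diarization feature depends on.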

Use Cases:

Developers and writers needing distraction-free input can use the global hotkey to rapidly convert speech into text within any code editor, terminal, or writing application.
Individuals handling sensitive conversations can leverage the local-only pipeline for meeting transcription and dictation, ensuring their recordings and text never leave the device.
AI tooling developers can use the public API and MCP server to integrate OpenWhispr's transcription and note data into custom automation or personal AI assistant workflows.
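
For readers wiring an assistant to the MCP server, it may help to see what a client message looks like. The Model Context Protocol exchanges JSON-RPC 2.0 messages, and `tools/list` and `tools/call` are standard MCP methods. The sketch below only builds the request payloads. The tool name `search_notes` and its `query` argument are hypothetical, since the tools OpenWhispr actually exposes are not listed here.

```python
import json
from itertools import count

_ids = count(1)  # simple per-process request id generator


def mcp_request(method, params=None):
    """Serialize a JSON-RPC 2.0 request as used by MCP clients."""
    message = {"jsonrpc": "2.0", "id": next(_ids), "method": method}
    if params is not None:
        message["params"] = params
    return json.dumps(message)


# Ask the server which tools it offers (standard MCP method).
list_tools = mcp_request("tools/list")

# Call one tool by name; "search_notes" is a made-up example name.
call_tool = mcp_request(
    "tools/call",
    {"name": "search_notes", "arguments": {"query": "standup"}},
)
```

In practice a client would first send `tools/list` and use the names returned by the server rather than hard-coding one.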

Open-Source Alternative Value:

OpenWhispr is positioned as an open-source alternative to Wispr Flow and Granola, providing a core workflow of voice-to-text dictation and meeting transcription without subscription fees or telemetry. Its value lies in the ability to operate entirely offline, using local AI models like Whisper and Parakeet for transcription and llama.cpp for reasoning, so the full feature set, from dictation to speaker-labeled meeting notes, can be used with no data leaving the machine. The public API and MCP server also allow it to be integrated into broader, user-controlled toolchains, avoiding the integration limitations of closed-source, cloud-only services.

Related Tools:

OpenClaw (379,772)
Jan (43,119)
Khoj (35,229)

Project Statistics:

Stars: 2,781
Forks: 386
License: MIT

Metadata:

Alternative to: Wispr Flow
Categoria: AI Personal Assistants