Overview:
Jarvis is an open-source, local-capable voice dictation application. It allows users to hold a key, speak, and have their speech converted into clean, punctuated text that can be inserted into any application. It serves as a free alternative to subscription-based dictation software, designed for anyone who wants efficient, private voice-to-text functionality without monthly fees or telemetry. The project is built for macOS (including Apple Silicon and Intel), with an iOS version in TestFlight.
Core Features:
Voice-to-text dictation: Hold the Fn key, speak, and release to have speech transcribed and formatted as clean text with proper punctuation.
Automatic filler word removal: The application automatically removes filler words like "um" and "like" from the output.
AI Post-processing: Utilizes local or cloud AI models to fix grammar, rephrase text, generate bullet points, or create new text based on voice input.
Fully offline mode: Supports local transcription via Whisper or NVIDIA Parakeet (via Sherpa-ONNX) and local LLMs for AI processing via Ollama.
Customizable Prompt Engineering: Every aspect of how the AI assistant behaves is configurable, allowing users to tailor output for specific tasks like email formatting or dictation style.
Zero telemetry: The application includes no tracking or analytics.
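To make the filler word removal and customizable prompt features above more concrete, here is a minimal Python sketch of how such a cleanup step and a user-editable prompt template could work. It is not taken from the Jarvis source: the names FILLER_WORDS, remove_fillers, and build_prompt, the filler list, and the {transcript} placeholder are all assumptions made for illustration.

```python
import re

# Hypothetical filler list; the list Jarvis actually uses is not documented here.
FILLER_WORDS = ("um", "uh", "like")


def remove_fillers(text, fillers=FILLER_WORDS):
    """Drop standalone filler words plus the commas that set them off.

    Naive by design: it cannot tell the filler "like" from the verb "like",
    and it does not re-capitalize a sentence that started with a filler.
    """
    words = "|".join(re.escape(w) for w in fillers)
    # Optional leading comma and spaces, the filler as a whole word,
    # then an optional trailing comma.
    pattern = r",?\s*\b(?:" + words + r")\b,?"
    cleaned = re.sub(pattern, "", text, flags=re.IGNORECASE)
    # Collapse any runs of whitespace left behind.
    return re.sub(r"\s{2,}", " ", cleaned).strip()


def build_prompt(template, transcript):
    """Fill a user-supplied template (assumed to contain {transcript})."""
    return template.format(transcript=transcript)


if __name__ == "__main__":
    raw = "Um, I think we should, like, ship it today."
    clean = remove_fillers(raw)
    print(clean)
    print(build_prompt("Rewrite this dictation as a short email:\n{transcript}", clean))
```

A regex pass like this is only a rough approximation; a tool that routes text through an AI model can instead let the model decide which words are true fillers, which is where a configurable prompt template becomes useful.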
Use Cases:
Individuals seeking a free dictation tool: Anyone looking to replace paid voice dictation services can use Jarvis without a subscription.
Privacy-conscious users: Those who prefer to keep their speech data local can use the fully offline mode with local Whisper and Ollama models.
Developers and writers: Users who need to quickly transcribe spoken thoughts, generate formatted text, or create bullet points without manual typing.
Mac users: The application is built for macOS and works system-wide, allowing text to be inserted into any application.
Why It Matters:
Jarvis is a fully open-source (MIT licensed) and free alternative to venture-backed dictation apps. It offers both local and cloud processing options, giving users control over their data and speed preferences. The ability to run entirely offline with local transcription and LLMs, combined with zero telemetry and customizable prompt engineering, makes it a privacy-focused and highly configurable tool for personal voice-to-text needs.




