Jarvis

Transform your computer interaction with natural voice commands. Dictate 4x faster than typing, control any app hands-free, and boost productivity instantly.

At a Glance:

Jarvis AI Assistant is a fully open-source, local-first voice dictation and assistant app that lets users speak anywhere and get clean, formatted text instantly using local Whisper, NVIDIA Parakeet, or cloud providers, with full offline capability and zero telemetry.

Overview:

Jarvis AI Assistant is a desktop voice dictation and assistant tool built for users who want fast, private text input anywhere on their system. Users hold a single key, speak, and get a transcript with filler words removed, grammar corrected, and optional reformatting into bullet points or generated text. The application supports fully local processing through OpenAI Whisper, NVIDIA Parakeet via Sherpa-ONNX, and local LLMs through Ollama, so it can run entirely offline. Users who prioritize speed can instead choose cloud transcription from Deepgram paired with Gemini for AI post-processing. Every assistant behavior can be customized through user-defined prompts. It is currently available on macOS (Apple Silicon and Intel) and iOS via TestFlight.
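To make the cleanup step concrete, here is a minimal Python sketch of filler word removal on a raw transcript. It is an illustration only, not the project's implementation: the `FILLERS` list, the `clean_transcript` name, and the regex approach are all assumptions. "like" is left out on purpose, because a plain pattern match cannot tell the filler from the ordinary word.

```python
import re

# Hypothetical filler list (an assumption); the app's real list and
# matching rules are not documented in this listing.
FILLERS = ("um", "uh", "er", "you know")

# Match a whole filler word, an optional trailing comma, and following spaces.
_FILLER_RE = re.compile(
    r"\b(?:" + "|".join(re.escape(f) for f in FILLERS) + r")\b,?\s*",
    flags=re.IGNORECASE,
)


def clean_transcript(raw: str) -> str:
    """Drop filler words, tidy spacing, and capitalize the first letter."""
    text = _FILLER_RE.sub("", raw)
    text = re.sub(r"\s+", " ", text).strip()      # collapse runs of whitespace
    text = re.sub(r"\s+([,.!?])", r"\1", text)    # no space before punctuation
    return text[:1].upper() + text[1:]


if __name__ == "__main__":
    print(clean_transcript("um, so I think, uh, we should ship it"))
```

A rule-based pass like this is fast enough to run before any LLM post-processing, which is one plausible reason to keep the two steps separate.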

Key Decision Points:

Local-first, offline-capable: Supports fully offline transcription using local Whisper models (tiny/base/small) or NVIDIA Parakeet and local LLMs via Ollama, requiring no internet connection once configured.
Hybrid cloud option for speed: Users prioritizing speed can use Deepgram for transcription and Gemini for AI post-processing, with Deepgram requests defaulting to a no-data-logging parameter.
Prompt-level customization: All assistant behaviors, including email formatting, dictation cleanup, and response generation, are customizable through user-defined prompts.
Cross-platform availability: Currently supports macOS (Apple Silicon and Intel) and iOS through TestFlight; a Windows version is planned but not yet available.
Performance depends on hardware: Local AI post-processing adds 1–3 seconds of latency on standard M1/M2 chips, while the higher-end Max and Ultra variants of the M1/M2/M3 provide near-instant results; disabling post-processing yields instant transcription.

Core Features:

Push-to-talk dictation: Hold Fn to record speech and release to insert clean, punctuated text into the active application.
Automatic filler word removal: Strips "um", "like", and other filler words from transcribed speech without manual editing.
AI text post-processing: Applies grammar fixes, rephrasing, bullet-point formatting, or text generation to transcribed speech when enabled.
Local transcription engines: Supports OpenAI Whisper in multiple sizes and NVIDIA Parakeet via Sherpa-ONNX for local, offline transcription.
Ollama local LLM integration: Connects to locally running LLMs through Ollama, auto-detecting available models for fully private AI processing.
Hands-free mode: Toggle continuous listening with a double-tap of the Fn key, eliminating the need to hold the key for each interaction.
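The two Fn gestures above (hold to dictate, double-tap to toggle hands-free mode) can be told apart with a small timing state machine. The Python sketch below is one way to do it under stated assumptions: the thresholds, the `FnGestureTracker` class, and the returned action names are hypothetical and not taken from the project.

```python
from dataclasses import dataclass
from typing import Optional

# Assumed thresholds; the app's real timing values are not documented here.
TAP_MAX_SECONDS = 0.25         # a press shorter than this counts as a tap
DOUBLE_TAP_GAP_SECONDS = 0.40  # max pause between two taps of a double-tap


@dataclass
class FnGestureTracker:
    """Classifies Fn key presses from key-down and key-up timestamps."""

    hands_free: bool = False
    _down_at: Optional[float] = None
    _last_tap_up_at: Optional[float] = None

    def key_down(self, now: float) -> None:
        self._down_at = now

    def key_up(self, now: float) -> str:
        """Return 'dictate', 'toggle_hands_free', 'tap', or 'ignored'."""
        if self._down_at is None:
            return "ignored"  # key-up without a matching key-down
        down_at, self._down_at = self._down_at, None

        if now - down_at >= TAP_MAX_SECONDS:
            # Long press: push-to-talk ends, insert the transcript.
            self._last_tap_up_at = None
            return "dictate"

        previous_tap = self._last_tap_up_at
        if previous_tap is not None and down_at - previous_tap <= DOUBLE_TAP_GAP_SECONDS:
            # Second quick tap: flip continuous listening on or off.
            self._last_tap_up_at = None
            self.hands_free = not self.hands_free
            return "toggle_hands_free"

        # First quick tap: remember it and wait for a possible second one.
        self._last_tap_up_at = now
        return "tap"
```

Timestamps are passed in rather than read from a clock, which keeps the logic easy to test and independent of how the key events are captured.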

Use Cases:

Developers and writers who need fast, private voice dictation in any text field across their macOS desktop without relying on cloud services.
Users with capable Apple Silicon hardware who want completely offline voice-to-text with AI-powered cleanup and reformatting.
Individuals who want customizable assistant behaviors, such as generating emails or bullet-point summaries from dictated notes, using locally running models.

Open-Source Alternative Value:

As an MIT-licensed project, Jarvis AI Assistant offers a fully functional voice dictation tool that can operate entirely on local hardware using its built-in local Whisper and NVIDIA Parakeet support. Its integration with Ollama extends this local-first approach to AI text processing, giving users an option to avoid cloud transcription services completely. The prompt engineering layer allows developers and advanced users to customize every aspect of text output, from formatting rules to assistant behaviors. The project's explicit commitment to no telemetry means no user data leaves the machine unless cloud providers are deliberately configured.
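For readers curious what local post-processing through Ollama might look like, here is a hedged Python sketch. It uses Ollama's local HTTP API (`GET /api/tags` to list installed models, `POST /api/generate` to run a prompt) at its default address. The helper names, the choice of the first installed model, and the `CLEANUP_PROMPT` text are assumptions for illustration; the listing does not describe how Jarvis itself calls Ollama.

```python
from __future__ import annotations

import json
import urllib.request

OLLAMA_URL = "http://localhost:11434"  # Ollama's default local address

# Hypothetical cleanup prompt (an assumption); in Jarvis the prompts are user-defined.
CLEANUP_PROMPT = "Fix grammar and punctuation. Return only the corrected text."


def installed_models(tags_response: dict) -> list[str]:
    """Extract model names from the JSON body of GET /api/tags."""
    return [m["name"] for m in tags_response.get("models", [])]


def build_generate_request(
    model: str, transcript: str, system_prompt: str = CLEANUP_PROMPT
) -> dict:
    """Build the JSON body for a non-streaming POST /api/generate call."""
    return {
        "model": model,
        "system": system_prompt,
        "prompt": transcript,
        "stream": False,
    }


def post_process(transcript: str) -> str:
    """Send a transcript to the first locally installed model and return its reply."""
    with urllib.request.urlopen(f"{OLLAMA_URL}/api/tags") as resp:
        models = installed_models(json.load(resp))
    if not models:
        raise RuntimeError("no Ollama models installed")

    body = json.dumps(build_generate_request(models[0], transcript)).encode()
    req = urllib.request.Request(
        f"{OLLAMA_URL}/api/generate",
        data=body,
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)["response"]
```

Calling `post_process("so the meeting is moved to friday")` requires a running Ollama instance with at least one model pulled. Swapping `CLEANUP_PROMPT` for a different system prompt is the kind of prompt-level customization the paragraph above describes.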

Related tools

OpenClaw 377,486

Jan 42,933

Khoj 34,992

Project stats

Stars: 552
Forks:
License: MIT

Metadata

Alternative to: Wispr Flow
Category: AI Personal Assistants