screenpipe

Open-source, AI-powered screen recorder that captures screen and audio 24/7. Search your digital history with natural language, get AI assistance, and automate workflows. 100% local and private.

At a Glance:

screenpipe is an open-source, local-first application that continuously captures your computer screen and audio into a searchable, AI-powered memory, serving as a leading alternative to Rewind.ai, Microsoft Recall, and Granola.

Overview:

screenpipe is an open-source personal AI memory tool that records your screen and audio 24/7 to create a private, searchable timeline of everything you do on your computer. It captures screen content using an event-driven engine that prioritizes accessibility tree data over resource-intensive OCR, alongside system and microphone audio transcribed locally by Whisper. Users can search their history with natural language, use AI coding assistants to query their context via an MCP server, or run scheduled AI agents called "Pipes" for automation. All data is stored in a local SQLite database by default, with optional encrypted cloud sync available. It is designed for knowledge workers, developers, and anyone seeking a self-hosted alternative to cloud-based memory tools, with specific support for those with ADHD.

Key Decision Points:

Deployment & Data Control: Operates 100% locally by default, storing all captured data in a local SQLite database. Optional end-to-end encrypted cloud sync is available with a Pro subscription, so users can choose between complete offline privacy and multi-device access.
Capture Method for CPU/Storage Efficiency: Uses event-driven capture triggered by meaningful OS events (app switches, clicks) and prioritizes the accessibility tree for text extraction. This keeps CPU usage at 5-10% and storage at roughly 5-10 GB per month, lower than continuous recording would require.
Extensibility and Agentic Automation: Features a "Pipes" plugin system in which scheduled AI agents are defined as markdown files. YAML-configurable, deterministic data permissions control what data each agent can access, and they are enforced at the OS level rather than through prompts.
Platform and Multi-Monitor Support: Provides full support for macOS, Windows, and Linux, and captures all connected monitors simultaneously, unlike some alternatives that are OS-specific or limited to the active window.
AI Model Choice and Integration: Supports local models via Ollama for complete privacy, as well as cloud models, and integrates with AI coding assistants like Cursor and Claude Desktop through a native MCP server for querying screen history.
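
The capture method described above (capture only on meaningful events, prefer accessibility data, fall back to OCR) can be sketched roughly as follows. This is an illustrative sketch, not screenpipe's actual implementation: the event names, the `Capture` type, and the two callables `read_accessibility_text` and `run_ocr` are all hypothetical stand-ins.

```python
from dataclasses import dataclass
from typing import Callable, Optional

# Hypothetical event names; the text only mentions app switches,
# clicks, and typing pauses as capture triggers.
CAPTURE_EVENTS = {"app_switch", "click", "typing_pause"}


@dataclass
class Capture:
    text: str
    source: str  # "accessibility" or "ocr"


def capture_on_event(
    event: str,
    read_accessibility_text: Callable[[], Optional[str]],
    run_ocr: Callable[[], str],
) -> Optional[Capture]:
    """Return a Capture for a meaningful event, or None if the event is ignored."""
    if event not in CAPTURE_EVENTS:
        return None  # e.g. plain mouse movement: nothing is captured
    text = read_accessibility_text()
    if text:
        # Accessibility data is available, so the costlier OCR step is skipped.
        return Capture(text=text, source="accessibility")
    # No accessibility data for this window: fall back to OCR.
    return Capture(text=run_ocr(), source="ocr")
```

The point of the design is that OCR runs only when the cheaper text source returns nothing, and nothing runs at all between meaningful events.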

Core Features:

Event-Driven Capture: Captures screenshots paired with the accessibility tree only on meaningful OS events (app switches, clicks, typing pauses) to minimize resource usage, falling back to OCR when accessibility data is unavailable.
Local Audio Transcription: Performs real-time speech-to-text on captured system and microphone audio using OpenAI Whisper running locally, with support for speaker diarization and app-specific audio exclusion on macOS 14.4+.
AI-Powered Search: Provides natural language and semantic search across captured screen text and audio transcriptions, with filters for application name, window title, browser URL, and date range.
Plugin System (Pipes): Enables users to create scheduled AI agents from markdown files that can query screen data, call APIs, and write files, with built-in YAML frontmatter for deterministically gating access to specific apps, content types, and time ranges.
MCP Server Integration: Runs as a Model Context Protocol server, allowing compatible AI assistants like Claude Desktop and Cline to directly query the user's screen history and meeting transcriptions.
Developer API: Offers a full local REST API on port 3030 with endpoints for searching screen content, audio, and frames, plus raw SQL access to the underlying database and a JavaScript/TypeScript SDK.
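
As a rough illustration of querying the local REST API, the sketch below builds a search URL against port 3030. Only the port comes from the description above; the `/search` path and the parameter names (`q`, `limit`, `content_type`, `app_name`) are assumptions that should be checked against the project's API documentation before use.

```python
from typing import Optional
from urllib.parse import urlencode

BASE_URL = "http://localhost:3030"  # port taken from the feature description


def build_search_url(
    query: str,
    content_type: Optional[str] = None,
    app_name: Optional[str] = None,
    limit: int = 10,
) -> str:
    """Build a search URL; the path and parameter names are assumed, not confirmed."""
    params = {"q": query, "limit": limit}
    if content_type is not None:
        params["content_type"] = content_type
    if app_name is not None:
        params["app_name"] = app_name
    return f"{BASE_URL}/search?{urlencode(params)}"
```

With screenpipe running, the resulting URL could be fetched with `urllib.request.urlopen` or any HTTP client. The response format is not covered here.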

Use Cases:

Knowledge Workers and ADHD Users: Lets individuals recall anything they have seen or heard on their computer, such as a past document, a specific tab, or a detail from a meeting, by searching a personal timeline that works like a DVR.
Developers Augmenting AI Coding Assistants: Provides context to tools like Cursor or Claude Code about recent on-screen work, such as error messages, documentation, or application state, via the MCP server.
Meeting Transcription and Auditing: Automatically transcribes meetings from any application (Zoom, Google Meet, Teams) and provides a searchable record of conversations with speaker identification.
Automating Workflows with Scheduled AI Agents: Developers can configure "Pipes" to run on a schedule, for example, to automatically sync daily activity logs to an Obsidian vault or scan browsing history for to-dos.
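
The deterministic data gating described for Pipes (by app, content type, and time range) amounts to a fixed filter applied before an agent sees any data. The sketch below shows only that filtering logic. It is not screenpipe's enforcement mechanism, which the description says operates at the OS level, and the `Record` and `Policy` types and their field names are hypothetical.

```python
from dataclasses import dataclass
from datetime import time
from typing import FrozenSet, Iterable, List


@dataclass(frozen=True)
class Record:
    app: str
    content_type: str  # e.g. "screen" or "audio" (hypothetical labels)
    at: time


@dataclass(frozen=True)
class Policy:
    allowed_apps: FrozenSet[str]
    allowed_content_types: FrozenSet[str]
    start: time
    end: time


def permitted(record: Record, policy: Policy) -> bool:
    """A record passes only if the app, content type, and time all match the policy."""
    return (
        record.app in policy.allowed_apps
        and record.content_type in policy.allowed_content_types
        and policy.start <= record.at <= policy.end
    )


def filter_records(records: Iterable[Record], policy: Policy) -> List[Record]:
    """Return only the records the agent is allowed to see."""
    return [r for r in records if permitted(r, policy)]
```

Because the check is plain set membership and a range comparison, the same policy always yields the same result, which is what distinguishes it from asking a model to respect limits through a prompt.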

Open-Source Alternative Value:

As an MIT-licensed project, screenpipe provides a transparent and inspectable codebase in a category dominated by proprietary systems like Rewind.ai and Microsoft Recall. Its core architecture keeps data local by default, giving users direct control over their database without mandatory cloud uploads. Two further points define its value: flexible AI integration, which lets users choose local models via Ollama for a fully offline setup, and a plugin system that adds a programmable layer on top of personal data, with deterministic permissions enforced at the OS level, moving beyond basic screen recording.
