Overview:
Hatchet is an open-source orchestration engine for background tasks, AI agents, and durable workflows. It provides queuing, automatic retries, durable execution, real-time monitoring, alerting, and logging for applications written in Python, TypeScript, Go, and Ruby, and can be used as a managed service (Hatchet Cloud) or self-hosted. It is intended for systems where correctness, reliability, horizontal scalability, and observability are essential, such as those involving long-running workflows, data pipelines, or event-driven architectures.
Core Features:
Durable Tasks: Build long-running, fault-tolerant workflows that recover from failure, positioned as a drop-in replacement for Temporal or DBOS workflows.
Task Queues and Routing: Supports fire-and-forget and fire-and-wait tasks, with configurable retry policies, cron jobs, scheduled runs, and strict routing based on worker labels or weighted scheduling via worker affinity.
Scaling Controls: Includes priority scheduling, rate limiting (including dynamic per-user rate limits), fair scheduling via concurrency policies, and worker slot management to prevent overload.
Observability and Management: Offers a real-time web UI with alerting, monitoring, and logging; supports OpenTelemetry and Prometheus metrics; is multi-tenant by default with users and roles.
Event and Webhook Triggering: Enables event-based triggering and listeners for building highly distributed systems, plus webhook-based triggering from upstream data sources.
DAG Workflows: Supports directed acyclic graphs (DAGs) for building data pipelines and simple multi-step workflows, letting you choose between DAGs and durable tasks for each workflow.
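The retry behavior described under Task Queues and Routing can be sketched in plain Python. This is a conceptual model of a retry policy with exponential backoff, not Hatchet's actual SDK; the decorator name, parameters, and the flaky task are all illustrative:

```python
import time

def with_retries(max_retries=3, base_delay=0.01):
    """Conceptual retry policy: re-run a failing task with exponential
    backoff (base_delay, 2*base_delay, 4*base_delay, ...). Illustrative
    names only -- this is not Hatchet's API."""
    def decorator(task):
        def run(*args, **kwargs):
            for attempt in range(max_retries + 1):
                try:
                    return task(*args, **kwargs)
                except Exception:
                    if attempt == max_retries:
                        raise  # retries exhausted; surface the failure
                    time.sleep(base_delay * 2 ** attempt)
        return run
    return decorator

# A task that fails twice before succeeding, to exercise the policy.
attempts = {"count": 0}

@with_retries(max_retries=3, base_delay=0.001)
def flaky_task():
    attempts["count"] += 1
    if attempts["count"] < 3:
        raise RuntimeError("transient failure")
    return "ok"
```

In Hatchet itself, the retry count and backoff are declared on the task definition and enforced by the engine, so the worker code stays free of retry loops.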
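The dynamic per-user rate limits mentioned under Scaling Controls can be modeled as one token bucket per key: each key accumulates capacity over time and spends one unit per task. A minimal sketch of the idea, not Hatchet's implementation (class and parameter names are assumptions):

```python
import time

class TokenBucket:
    """Per-key token bucket: each key may consume up to `capacity` units,
    refilled at `rate` units per second. A conceptual sketch of dynamic
    per-user rate limiting, not Hatchet's actual implementation."""

    def __init__(self, capacity, rate):
        self.capacity = capacity
        self.rate = rate
        self.buckets = {}  # key -> (tokens, last_seen_timestamp)

    def try_acquire(self, key, now=None):
        now = time.monotonic() if now is None else now
        tokens, last = self.buckets.get(key, (self.capacity, now))
        # Refill proportionally to elapsed time, capped at capacity.
        tokens = min(self.capacity, tokens + (now - last) * self.rate)
        if tokens >= 1:
            self.buckets[key] = (tokens - 1, now)
            return True
        self.buckets[key] = (tokens, now)
        return False

limiter = TokenBucket(capacity=2, rate=1.0)
```

Because each key has its own bucket, one noisy tenant exhausting its budget does not block tasks queued for other tenants.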
Use Cases:
Developers building high-volume applications: Orchestrate background tasks, such as processing jobs or handling event-driven workflows, with rate limiting and priority controls.
Teams managing AI agents or long-running processes: Run durable workflows that must survive failures and provide full execution history for debugging and replay.
System administrators self-hosting task orchestration: Deploy a platform using Postgres as a durability layer for both the task runtime and observability, simplifying self-hosting.
Data engineers building data pipelines: Construct DAGs for data workflows whose throughput requirements (over 100 tasks per second) exceed what traditional DAG-based platforms are built for.
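At their core, the DAG pipelines above run each step once all of its parents have completed, with downstream steps reading upstream results. A minimal topological-order runner sketching that execution model (the step names and `run_dag` helper are illustrative, not Hatchet's SDK):

```python
from graphlib import TopologicalSorter

def run_dag(steps, deps):
    """Run a DAG of steps in dependency order, passing each step the dict
    of results produced so far. `steps` maps name -> callable; `deps` maps
    name -> set of parent step names. Illustrative, not Hatchet's API."""
    results = {}
    # static_order() yields each node only after all of its parents.
    for name in TopologicalSorter(deps).static_order():
        results[name] = steps[name](results)
    return results

# A tiny ETL-shaped pipeline: extract -> transform -> load.
steps = {
    "extract": lambda r: [1, 2, 3],
    "transform": lambda r: [x * 10 for x in r["extract"]],
    "load": lambda r: sum(r["transform"]),
}
deps = {"extract": set(), "transform": {"extract"}, "load": {"transform"}}
```

An orchestrator like Hatchet adds what this sketch omits: persisting each step's result, retrying failed steps, and fanning independent branches out across workers.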
Why It Matters:
Hatchet combines features of durable execution platforms, task queues, and DAG-based orchestrators into a single, self-hostable system backed by Postgres. Unlike traditional task queues (e.g., Celery, BullMQ) that trade durability for throughput, Hatchet persists all execution history, enabling easy monitoring, debugging, and durable task recovery. It also provides multi-tenancy and role-based access out of the box, making it suitable for teams needing centralized async processing without relying on cloud-only services.