Open-source container orchestrator designed for AI teams. Manage GPU workloads, dev environments, and model deployments across any cloud or on-prem cluster.

Overview:

dstack is an open-source control plane for provisioning and orchestrating GPU workloads across GPU clouds, Kubernetes, and on-prem clusters. It handles the development, training, and inference lifecycle by automatically managing provisioning, job queuing, auto-scaling, networking, and failure recovery. Compatible with NVIDIA, AMD, Google TPU, and Tenstorrent accelerators, it is designed for developers and AI teams that need to manage compute resources across hybrid environments without manual infrastructure coordination.
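
For a sense of the workflow: each workload is described by a small YAML file and submitted with the dstack CLI (dstack apply -f <config>.dstack.yml). The sketch below is illustrative only; the run name and resource size are placeholder assumptions.

```yaml
# Minimal illustrative configuration (field names follow dstack's documented
# YAML schema; the name and GPU size are placeholders).
type: task
name: gpu-smoke-test        # placeholder name
commands:
  - nvidia-smi              # print the GPUs the run was allocated
resources:
  gpu: 24GB                 # any GPU with at least 24 GB of memory
```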

Core Features:

  • Fleet-first UX: Define and manage cloud and on-prem clusters as YAML fleet configurations, including SSH-based fleets that pool existing on-prem servers.

  • Dev environments: Provision interactive development environments accessible from a desktop IDE.

  • Tasks: Schedule and run individual or distributed jobs, including web applications.

  • Services: Deploy models and web apps with auto-scaling and authorization.

  • Volumes: Manage and persist storage volumes across runs.

  • Agent skills: Enable AI agents like Claude, Codex, and Cursor to create fleets and submit workloads via the CLI.
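
The sketches below illustrate how these building blocks are typically declared as YAML configurations; field names follow dstack's documented schema, while all names, hosts, and resource sizes are illustrative placeholders. An SSH fleet pools existing on-prem servers, and a cloud fleet provisions nodes on demand:

```yaml
# Illustrative SSH fleet over existing on-prem servers.
type: fleet
name: on-prem-fleet          # placeholder name
ssh_config:
  user: ubuntu               # SSH user on each host
  identity_file: ~/.ssh/id_rsa
  hosts:
    - 192.0.2.10             # placeholder host addresses
    - 192.0.2.11
```

```yaml
# Illustrative cloud fleet provisioned on demand.
type: fleet
name: cloud-fleet            # placeholder name
nodes: 2
placement: cluster           # request interconnected nodes
resources:
  gpu: 24GB
```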
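
A dev environment is declared the same way; the ide field determines which desktop IDE attaches to it (values here are again placeholders):

```yaml
# Illustrative dev environment configuration.
type: dev-environment
name: cuda-dev               # placeholder name
python: "3.11"
ide: vscode                  # attach from desktop VS Code
resources:
  gpu: 24GB
```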
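
A task runs a one-off or long-running job, and ports can be forwarded so a web application started by the task is reachable locally. The app, port, and sizes below are assumptions for illustration:

```yaml
# Illustrative task running a small web app.
type: task
name: demo-app               # placeholder name
python: "3.11"
commands:
  - pip install streamlit
  - streamlit run app.py --server.port 8501
ports:
  - 8501                     # forward the app's port
resources:
  gpu: 24GB
```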
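
A service exposes an endpoint and can scale on load. The serving command and scaling targets below are placeholders; the replicas range and scaling block follow dstack's documented service options:

```yaml
# Illustrative auto-scaling inference service.
type: service
name: llm-service            # placeholder name
python: "3.11"
commands:
  - pip install vllm
  - vllm serve Qwen/Qwen2.5-7B-Instruct   # placeholder model and server
port: 8000
resources:
  gpu: 24GB
replicas: 1..4               # scale between one and four replicas
scaling:
  metric: rps                # scale on requests per second
  target: 10
```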
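
A volume is created once and then mounted into runs by name. The backend, region, and size below are placeholder assumptions:

```yaml
# Illustrative volume configuration.
type: volume
name: training-data          # placeholder name
backend: aws
region: us-east-1
size: 100GB
```

```yaml
# Illustrative task that mounts the volume by name.
type: task
name: preprocess             # placeholder name
commands:
  - python prepare.py --out /data
volumes:
  - name: training-data
    path: /data              # mount point inside the run
```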

Use Cases:

  • GPU provisioning and orchestration: Allocate and manage GPU resources across multiple cloud providers, Kubernetes clusters, or on-prem servers.

  • Training and inference: Run distributed training jobs and serve model inference workloads with auto-scaling.

  • Interactive development: Spin up dev environments with GPU access, linked to a desktop IDE for iterative work.

  • Managed web app deployment: Deploy web applications or model services with built-in authorization and scaling.
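
To complement the training use case above, a distributed job can be expressed as a multi-node task. This sketch assumes dstack's documented nodes field and the DSTACK_* environment variables it injects for multi-node runs; the script, GPU sizes, and names are placeholders:

```yaml
# Illustrative multi-node training task.
type: task
name: multi-node-train       # placeholder name
nodes: 2                     # run across two interconnected nodes
python: "3.11"
commands:
  - pip install -r requirements.txt
  - torchrun --nnodes=$DSTACK_NODES_NUM --node-rank=$DSTACK_NODE_RANK --master-addr=$DSTACK_MASTER_NODE_IP --nproc-per-node=$DSTACK_GPUS_PER_NODE train.py
resources:
  gpu: 80GB:8                # eight GPUs with at least 80 GB each, per node
```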

Why It Matters:

dstack provides a single control plane abstraction over diverse GPU infrastructure, reducing the operational overhead of managing resources across cloud and on-prem environments. By supporting YAML-defined configurations and offering agent skills for AI-driven orchestration, it enables developers to integrate GPU resource management directly into their workflow. Its compatibility with multiple accelerator types and explicit support for both provisioning and job scheduling make it a practical open-source option for teams seeking infrastructure automation without vendor-specific tooling.


Project Stats:

  • Stars: 2,126

  • Forks: 223

  • License: MPL-2.0

Metadata:

  • Alternative to: Kubernetes