dstack

Open-source container orchestrator designed for AI teams. Manage GPU workloads, dev environments, and model deployments across any cloud or on-prem cluster.

At a Glance:

dstack is an open-source, unified control plane for GPU provisioning and orchestration. It works across any GPU cloud, Kubernetes, and on-prem clusters, and streamlines development, training, and inference workflows through a CLI, an API, or AI agent skills.

Overview:

dstack is an open-source GPU orchestration control plane designed to provision and manage compute resources across diverse infrastructures, including any GPU cloud provider, Kubernetes clusters, and on-prem servers. It acts as a unified layer for streamlining development, training, and inference tasks. Users interact with dstack by defining fleets, dev environments, tasks, services, and volumes as YAML files within a repository and applying them via its CLI, a programmatic API, or integrated AI agent skills. The platform automates underlying complexities such as provisioning, job queuing, auto-scaling, and networking, making it suitable for developers and teams needing a consistent workflow for hardware-accelerated workloads without being tied to a specific cloud vendor.

Key Decision Points:

Unified multi-cloud & on-prem provisioning: dstack works with any GPU cloud, Kubernetes, or on-prem clusters, allowing users to manage diverse compute pools from a single control plane.
Workload definition via YAML in-repo: All compute needs, including fleets, dev environments, tasks, and services, are defined as YAML configuration files stored directly in the project repository.
Interaction through CLI, API, and AI agents: Users can apply configurations and manage resources using the dstack CLI, a programmatic API, or by installing agent skills that enable AI tools like Claude and Cursor to act on their behalf.
Server deployed by the user: Users must launch and configure their own dstack server to connect to their backends, with the server supported on Linux, macOS, and Windows.
Online compute broker model: The dstack server must be running for users to provision and orchestrate compute resources; it is not a local-only tool.
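
To make the YAML-in-repo model above concrete, here is a minimal Python sketch that renders a task definition as a small YAML file. The field names (`type`, `name`, `commands`, `resources.gpu`) follow dstack's task configuration format as commonly documented, but the values (`train-example`, `python train.py`, `24GB`) are illustrative assumptions, and the tiny `to_yaml` helper is hypothetical, not part of dstack. Check the current dstack docs before relying on any field.

```python
# Hypothetical task definition; values are examples, not taken from the dstack docs.
task = {
    "type": "task",
    "name": "train-example",
    "commands": ["python train.py"],
    "resources": {"gpu": "24GB"},
}


def to_yaml(value, indent=0):
    """Render nested dicts and lists of scalars as simple block-style YAML lines."""
    pad = "  " * indent
    lines = []
    if isinstance(value, dict):
        for key, item in value.items():
            if isinstance(item, (dict, list)):
                lines.append(f"{pad}{key}:")
                lines.extend(to_yaml(item, indent + 1))
            else:
                lines.append(f"{pad}{key}: {item}")
    elif isinstance(value, list):
        for item in value:
            lines.append(f"{pad}- {item}")
    return lines


yaml_text = "\n".join(to_yaml(task)) + "\n"
print(yaml_text)
```

A file like this would typically be committed to the repository (for example as `train.dstack.yml`, an assumed name) and submitted with `dstack apply -f train.dstack.yml` while the dstack server is running.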

Core Features:

Fleets: Management of cloud and on-prem computing clusters.
Dev environments: Interactive development setups intended for use with a desktop IDE.
Tasks: Scheduling of batch and distributed jobs, and running of web apps.
Services: Deployment of models and web apps with auto-scaling and authorization.
Volumes: Management of persistent disk volumes.
AI agent skills: A mechanism to allow AI agents like Claude, Codex, and Cursor to manage fleets and submit workloads by editing configuration files and using the CLI.
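
Each of the resource features above corresponds to a kind of configuration file, distinguished by its `type` field. The sketch below assumes the `type` values `fleet`, `dev-environment`, `task`, `service`, and `volume`; the `describe` helper is hypothetical and only shows how a tool or agent might check which feature a parsed configuration belongs to.

```python
# Assumed mapping from a config's `type` value to the feature it describes.
CONFIG_TYPES = {
    "fleet": "Fleets",
    "dev-environment": "Dev environments",
    "task": "Tasks",
    "service": "Services",
    "volume": "Volumes",
}


def describe(config: dict) -> str:
    """Return a short label for a parsed configuration, or raise on an unknown type."""
    kind = config.get("type")
    if kind not in CONFIG_TYPES:
        raise ValueError(f"unknown configuration type: {kind!r}")
    return f"{config.get('name', '<unnamed>')} -> {CONFIG_TYPES[kind]}"


print(describe({"type": "service", "name": "llm-endpoint"}))
```

The names `llm-endpoint` and `<unnamed>` are placeholders chosen for the example.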

Use Cases:

Developers who need to provision GPUs across different cloud providers and on-prem servers using a single, consistent YAML-based workflow.
Practitioners running machine learning training or inference jobs who want to automate provisioning, job queuing, and failure handling across diverse backends.
Teams deploying models and web apps with auto-scaling capabilities without managing each cloud provider's native tooling directly.

Open-Source Alternative Value:

dstack provides an open-source option for GPU orchestration that is not limited to a single cloud provider, allowing developers to configure a unified provisioning layer on their own infrastructure. Its value lies in consolidating multi-cloud and on-prem GPU management into a single server and workflow, with the added ability for AI agents to directly manage resources through its skills system. The configuration-as-code model, where all resource definitions live as YAML files in a repo, gives users a transparent and reproducible way to manage their compute environment.

Project stats

Stars: 2,155
Forks: 233
License: MPL-2.0

Metadata

Alternative to: Kubernetes
Category: Container Orchestration