Open-source container orchestrator designed for AI teams. Manage GPU workloads, dev environments, and model deployments across any cloud or on-prem cluster.

At a Glance:

dstack is an open-source unified control plane for GPU provisioning and orchestration that works across any GPU cloud, Kubernetes, and on-prem clusters, streamlining development, training, and inference workflows through a CLI, API, or AI agent skills.

Overview:

dstack is an open-source control plane that provisions and manages GPU compute across GPU cloud providers, Kubernetes clusters, and on-prem servers, giving development, training, and inference work a single shared layer. Users define fleets, dev environments, tasks, services, and volumes as YAML files in a repository and apply them through the CLI, a programmatic API, or AI agent skills. The platform automates provisioning, job queuing, auto-scaling, and networking, which suits developers and teams that want one consistent workflow for hardware-accelerated workloads without being tied to a specific cloud vendor.

Key Decision Points:

  • Unified multi-cloud & on-prem provisioning: dstack works with any GPU cloud, Kubernetes, or on-prem clusters, allowing users to manage diverse compute pools from a single control plane.

  • Workload definition via YAML in-repo: All compute needs, including fleets, dev environments, tasks, and services, are defined as YAML configuration files stored directly in the project repository.

  • Interaction through CLI, API, and AI agents: Users can apply configurations and manage resources using the dstack CLI, a programmatic API, or by installing agent skills that enable AI tools like Claude and Cursor to act on their behalf.

  • Server deployed by the user: Users must launch and configure their own dstack server to connect to their backends, with the server supported on Linux, macOS, and Windows.

  • Server must stay online: dstack is not a local-only tool; the dstack server has to be running for users to provision and orchestrate compute resources.
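
The YAML-in-repo workflow described above might look like the following for a single task. This is a minimal sketch and is not taken from this listing: the file name `train.dstack.yml`, the task name, the commands, and the GPU size are illustrative assumptions, and the field names follow dstack's task configuration format as best understood here, so they should be checked against the current dstack documentation before use.

```yaml
# train.dstack.yml (hypothetical file name, stored in the project repo)
type: task            # one of the workload types named above
name: train-example   # hypothetical run name
python: "3.11"        # assumed Python version for the default image
commands:             # hypothetical commands; replace with your own
  - pip install -r requirements.txt
  - python train.py
resources:
  gpu: 24GB           # assumed GPU memory requirement
```

With a dstack server running, a file like this would typically be applied from the CLI with `dstack apply -f train.dstack.yml`.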

Core Features:

  • Fleets: Management of cloud and on-prem computing clusters.

  • Dev environments: Interactive development setups intended for use with a desktop IDE.

  • Tasks: Scheduling of batch jobs and distributed jobs, as well as running web apps.

  • Services: Deployment of models and web apps with auto-scaling and authorization.

  • Volumes: Management of persistent disk volumes.

  • AI agent skills: A mechanism to allow AI agents like Claude, Codex, and Cursor to manage fleets and submit workloads by editing configuration files and using the CLI.

Use Cases:

  • Developers who need to provision GPUs across different cloud providers and on-prem servers using a single, consistent YAML-based workflow.

  • Practitioners running machine learning training or inference jobs who want to automate provisioning, job queuing, and failure handling across diverse backends.

  • Teams deploying models and web apps with auto-scaling capabilities without managing each cloud provider's native tooling directly.

Open-Source Alternative Value:

dstack provides an open-source option for GPU orchestration that is not limited to a single cloud provider, allowing developers to configure a unified provisioning layer on their own infrastructure. Its value lies in consolidating multi-cloud and on-prem GPU management into a single server and workflow, with the added ability for AI agents to directly manage resources through its skills system. The configuration-as-code model, where all resource definitions live as YAML files in a repo, gives users a transparent and reproducible way to manage their compute environment.


Project Stats:

  • Stars: 2,155

  • Forks: 233

  • License: MPL-2.0

Alternative to: Kubernetes