GPU Infrastructure On Cycle
As developers build increasingly complex and resource-intensive applications on Cycle, it has become clear that enabling organizations to leverage GPUs — especially for machine learning (ML) workloads — is essential.
Whether utilizing virtual machines or bare metal, Cycle fully abstracts the underlying infrastructure, providing a single standardized path to deployment. Additionally, this integration supports applications built against both glibc (Ubuntu, CentOS, etc.) and musl (Alpine), giving developers the flexibility to continue using the technologies they know and love.
Startup Tier and Above
GPU-powered infrastructure is available to hubs on the Startup tier and above. For more information about the different tiers, head over to our pricing page or reach out to our technical support.
Deploying GPU-Enabled Infrastructure
Deploying GPU-enabled infrastructure follows the same process as any other infrastructure type:
- Head to the Infrastructure tab in the main navigation.
- Click Add Servers.
- Select a provider and location (currently, AWS and Vultr offer GPU servers).
- Select the server from the list and deploy it.
Google Cloud Platform Quotas
For users looking to deploy GPU infrastructure from GCP, you will need to request an increase to the GPUS_ALL_REGIONS and NVIDIA_A100_GPUS quotas on your account if you have not already done so. As of this writing, those quotas default to 0 and will prevent you from deploying GPU-powered infrastructure.
Containers Needing GPU Resources
When GPU-powered images are uploaded to Cycle, the platform looks for special environment variables in the image and posts an NVIDIA GPU badge next to the image, signifying that it needs a GPU server.
The environment variables that the platform looks for are:
- NVIDIA_GPU
- NVIDIA_REQUIRE_CUDA
- CUDA_VERSION
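Official NVIDIA CUDA base images already set variables like CUDA_VERSION and NVIDIA_REQUIRE_CUDA, so images built on top of them inherit the flags automatically. A minimal sketch of a Dockerfile — the base image tag, paths, and explicit values are illustrative, not prescriptive:

```dockerfile
# Building on an official CUDA base image inherits the GPU-related
# environment variables (CUDA_VERSION, NVIDIA_REQUIRE_CUDA, etc.).
FROM nvidia/cuda:11.4.3-runtime-ubuntu20.04

# They can also be set explicitly in a custom image if needed:
ENV NVIDIA_GPU=1
ENV NVIDIA_REQUIRE_CUDA="cuda>=11.4"

# Hypothetical application files and entrypoint.
COPY ./app /app
CMD ["/app/run"]
```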
This information is also used by the platform when deciding which servers qualify as potential targets for a given image. For example, an image with the NVIDIA GPU badge can only be deployed to a server with GPU resources, not to a server that has only CPU resources.
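The matching behavior described above can be sketched in a few lines. This is a hypothetical illustration of the logic, not the platform's actual implementation — the function and field names are invented for the example:

```python
# Environment variables that flag an image as requiring a GPU server.
GPU_ENV_VARS = {"NVIDIA_GPU", "NVIDIA_REQUIRE_CUDA", "CUDA_VERSION"}


def needs_gpu(image_env: dict) -> bool:
    """Return True if any GPU-related variable is present in the image's env."""
    return bool(GPU_ENV_VARS & image_env.keys())


def eligible_servers(image_env: dict, servers: list) -> list:
    """GPU-flagged images may only target servers with GPU resources."""
    if needs_gpu(image_env):
        return [s for s in servers if s.get("has_gpu")]
    return servers  # CPU-only images are not restricted


servers = [
    {"id": "gpu-1", "has_gpu": True},
    {"id": "cpu-1", "has_gpu": False},
]
cuda_env = {"CUDA_VERSION": "11.4.0", "NVIDIA_REQUIRE_CUDA": "cuda>=11.4"}
print([s["id"] for s in eligible_servers(cuda_env, servers)])  # → ['gpu-1']
```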
Deploying A GPU Sample Container
After provisioning GPU-powered infrastructure, GPU workloads can be deployed as soon as the server is online. To streamline this process, Cycle provides a "Sample Container" with the NVIDIA GPU samples repository pre-compiled and ready for execution.
To use this container, create a new Docker Hub image source with the image name `cycleplatform/gpu` and the tag `samples-cuda`.
Next, move to an environment (or create a new one) that has access to a cluster with a GPU powered server. Use the deploy containers form to deploy a stateless copy of this image, and start the container.
First Starts
The first time this container is started can take a few minutes: the container image is quite large and must be copied to the server before it can start.
After the container starts, navigate to the Instances tab and use the two-way console to connect to the container. Once connected, locate the `release` directory, which contains a compiled copy of every available sample from the NVIDIA repository. Note that not all samples will work on every machine, since the servers have no display attached. The most useful binary to run is `deviceQuery`, which provides detailed information about the GPU in use. Explore some of the other binaries to see different outputs.
It's also possible to run the NVIDIA System Management Interface (`nvidia-smi`) command and most of its associated subcommands. More information about `nvidia-smi` can be found here.