Fly.io is getting rid of GPU-accelerated machines

Fly GPUs · Fly Docs


**GPUs are deprecated and will be unavailable after August 1.**

Fly.io has GPUs! If you have workloads that would benefit from GPU acceleration, Fly GPU Machines may be for you.

## What can I use Fly GPUs for?

Four NVIDIA GPU models are available: A10, L40S, A100 40G PCIe, and A100 80G SXM.

A100 units are all about the tensor cores, and are positioned for inference, model training, and intensive high-precision computation tasks like scientific simulations. As their names suggest, they have 40GB and 80GB of GPU memory. ([A100 datasheet](https://www.nvidia.com/content/dam/en-zz/Solutions/Data-Center/a100/pdf/nvidia-a100-datasheet-nvidia-us-2188504-web.pdf))

L40S cards are all-rounders; they've got tensor cores, RT cores, and NVENC/NVDEC, and have 48GB of GPU RAM. Choose the L40S to accelerate graphics or video workloads, as well as for inference. ([L40S datasheet](https://resources.nvidia.com/en-us-l40s/l40s-datasheet-28413))

A10 cards are all-rounders with less GPU RAM. They've got tensor cores, shader cores, NVENC/NVDEC, and can run Llama 3 8B at float16 without breaking the bank. Choose the A10 when you don't need more than 8 billion parameters. It works great for smaller large language models, Stable Diffusion, and similar workloads. ([A10 datasheet](https://www.nvidia.com/content/dam/en-zz/Solutions/Data-Center/a10/pdf/a10-datasheet.pdf))
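The "Llama 3 8B at float16" sizing above comes down to simple arithmetic: float16 takes 2 bytes per parameter. Here is a rough sketch in Python that counts model weights only; activations, KV cache, and framework overhead need extra headroom on top. The A100 and L40S memory sizes come from this page. The A10's 24GB is an assumption taken from NVIDIA's published specs, and the function names are purely illustrative.

```python
# Bytes per parameter for common numeric formats.
BYTES_PER_PARAM = {"float32": 4, "float16": 2, "int8": 1}

# GPU memory in GB. The A100 and L40S figures come from this page;
# the A10 figure (24 GB) is an ASSUMPTION based on NVIDIA's published specs.
GPU_MEMORY_GB = {"a10": 24, "l40s": 48, "a100-40gb": 40, "a100-80gb": 80}


def weights_gb(params_billions: float, dtype: str = "float16") -> float:
    """Approximate size of the model weights alone, in GB (1 GB = 1e9 bytes)."""
    return params_billions * BYTES_PER_PARAM[dtype]


def gpus_that_fit(params_billions: float, dtype: str = "float16") -> list[str]:
    """GPU kinds whose memory exceeds the weight size, smallest GPU first."""
    need = weights_gb(params_billions, dtype)
    fits = [kind for kind, gb in GPU_MEMORY_GB.items() if gb > need]
    return sorted(fits, key=GPU_MEMORY_GB.get)


if __name__ == "__main__":
    print(weights_gb(8))       # weights for an 8B model at float16
    print(gpus_that_fit(13))   # which GPUs hold a 13B model's weights at float16
```

By this estimate an 8B model at float16 needs about 16GB for weights, which leaves some room on an A10, while a 13B model at float16 (about 26GB) would call for one of the larger cards.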

Right now, each Fly GPU Machine uses a single full GPU. A single GPU is well suited to rendering, encoding/decoding, inference, and a smidgen of fine-tuning. Training large models from scratch requires much, much beefier resources.

Go to the [GPU Quickstart](https://fly.io/docs/gpus/gpu-quickstart/) to get off the ground fast, or read more practicalities in [Getting started with Fly GPUs](/docs/gpus/getting-started-gpus/).

## Regions with GPUs

Currently GPUs are available in the following regions:

- `a10`: `ord`
- `l40s`: `ord`
- `a100-40gb`: `ord`
- `a100-80gb`: `iad`, `sjc`, `syd`, `ams`
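If you need to check availability from a script, the list above can be captured as a small lookup table. This is a minimal sketch: the mapping is copied from this page and should be treated as a snapshot, and the helper names `regions_for` and `gpus_in` are hypothetical, not part of any Fly.io tooling.

```python
# Region availability copied from the list on this page; treat it as a snapshot.
GPU_REGIONS = {
    "a10": ["ord"],
    "l40s": ["ord"],
    "a100-40gb": ["ord"],
    "a100-80gb": ["iad", "sjc", "syd", "ams"],
}


def regions_for(gpu_kind: str) -> list[str]:
    """Regions listed for a GPU kind, or an empty list if the kind is unknown."""
    return GPU_REGIONS.get(gpu_kind.lower(), [])


def gpus_in(region: str) -> list[str]:
    """GPU kinds listed for a region, in the order they appear above."""
    code = region.lower()
    return [kind for kind, regions in GPU_REGIONS.items() if code in regions]


if __name__ == "__main__":
    print(regions_for("a100-80gb"))
    print(gpus_in("ord"))
```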

## Examples

Here's some more inspiration for your GPU Machines project:

- [Python GPU Dev Machine](/docs/gpus/python-gpu-example/)
- [Elixir Llama2-13b on Fly.io GPUs](https://gist.github.com/chrismccord/59a5e81f144a4dfb4bf0a8c3f2673131)
- [Fly.io CUDA example](https://gist.github.com/dangra/f8123001fe0f2453a8cd638b89738465)
- [Deploying CLIP on Fly.io](https://gist.github.com/simonw/52c7734e34cac2b26ea1378845674edc)
- [GitHub `fly-apps` repos with the `gpu` topic](https://github.com/orgs/fly-apps/repositories?q=topic%3Agpu)

