Stretching a point: the economics of elastic infrastructure

What is autoscaling?

The timeline of commercial computing can be roughly summarized as: buying computers → renting computers → renting partial computers (virtual servers) → renting partial computers for very short periods (cloud computing). The model that you are used to if you joined the tech industry after, say, 2010, is that you can request a virtual machine (VM) at any time, it will be available almost instantly, can be disposed of equally quickly, and will be billed by the second only when it was actually around.

The idea many people had in this context is simple and obvious: don’t pay for capacity you’re not using. If your application sees significantly less use at certain times – say, an online retail business only active in a single country when it is the middle of the night in that country – then reduce the provisioned compute capacity for the application accordingly. Ideally, this would happen without human intervention, automatically.

Economics of public cloud

At a high level, running a business is a simple calculation: you provide a product or service, you incur certain costs in making that product or providing that service, and you try to charge your customers slightly more than that and call the difference your profit. (I assume my economics Nobel will be in the mail.)

For the discussion in the rest of this article, it will be useful to have a basic understanding of the inherent costs and pricing models of public cloud computing.

The cost of cloud

The basic cost factors for running a public cloud are:

- You need to buy a number of computers.
- You need somewhere to put these computers: a datacenter building, and the land that it is on.
- The computers are going to need electricity to run.
- You’re going to need high bandwidth internet connectivity.
- You need some staff to install, maintain, operate and secure the computers.

A few additional thoughts on these.

Computers are mainly a capital expense, but one with a relatively short useful service life.
So if you spend, say, $20k per machine and assume a five-year service life, you can calculate with $4k/year for having one machine.

It may have occurred to you that I forgot cooling, but I just don’t think it matters much for this discussion. You can cool computers anywhere – if you happen to build in a place that is particularly bad for cooling, you will just use more electricity to run heat pumps. But overall, that’s a constant factor of at most 2 or 3× and doesn’t affect any of the following arguments. (If you’re actually good at running a public cloud, the factor could be as low as 1.02×, giving you a substantial profitability bump.)

For similar reasons, we can mostly ignore electricity use by the equipment itself. Modern machines do have power management and use less electricity when not being used at full capacity, but the difference between actual power draw and theoretical max power draw tends to be a fairly small and mostly constant factor. So you end up with a single figure: it costs approximately $x/year to house, power, and cool one machine in this location, and you add this to your cost for the machine itself existing.

At cloud provider scale, nobody is interested in selling you internet by the gigabyte: you’ll be paying a certain amount of money a year per gigabit/second of potential bandwidth that exists, whether you use it or not.

The interesting thing about staffing costs is that they are highly sublinear. As companies acquire more computers, they invest in better automation, so you will find that, when comparing a small corporate datacenter with a thousand machines and a public cloud datacenter with one hundred thousand machines, the cloud datacenter does not need 100× as many employees to maintain 100× as many machines.

There is an ongoing theme throughout this section, which I hope you’ve picked up on: as a cloud provider, your costs are dominated by cloud capacity existing and barely influenced by how much capacity you manage to sell.
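
The cost structure described above fits into a small back-of-the-envelope model. This is only a sketch under stated assumptions, not a calculation taken from the article: the $20k purchase price, the five-year service life, and the overhead factors (1.02× and 3×) come from the text, while `ASSUMED_IT_POWER_COST` is a made-up placeholder, network and staffing costs are left out, and the function names are hypothetical.

```python
HOURS_PER_YEAR = 24 * 365


def annual_machine_cost(purchase_price, service_life_years,
                        it_power_cost_per_year, cooling_overhead):
    """Yearly cost of one machine existing, whether or not anyone rents it."""
    amortized_hardware = purchase_price / service_life_years
    # Cooling is modeled as a constant multiplier on the electricity bill.
    facility_power = it_power_cost_per_year * cooling_overhead
    return amortized_hardware + facility_power


def cost_per_sold_hour(annual_cost, sold_fraction):
    """Spread the fixed yearly cost over only the hours that were actually sold."""
    return annual_cost / (HOURS_PER_YEAR * sold_fraction)


# Hardware alone: $20k over five years, as in the text.
hardware_only = annual_machine_cost(20_000, 5, 0, 1.0)

# Placeholder value, NOT from the article: yearly electricity for the machine itself.
ASSUMED_IT_POWER_COST = 1_000

efficient = annual_machine_cost(20_000, 5, ASSUMED_IT_POWER_COST, 1.02)
inefficient = annual_machine_cost(20_000, 5, ASSUMED_IT_POWER_COST, 3.0)

print(f"hardware only:  ${hardware_only:,.0f}/year")
print(f"1.02x overhead: ${efficient:,.0f}/year")
print(f"3x overhead:    ${inefficient:,.0f}/year")

# The yearly cost never changes; only the number of hours it is spread over does.
for sold_fraction in (1.0, 0.5, 0.25):
    per_hour = cost_per_sold_hour(efficient, sold_fraction)
    print(f"sold {sold_fraction:.0%} of the time -> ${per_hour:.2f} per sold hour")
```

Utilization appears nowhere in `annual_machine_cost`; it only enters when the fixed cost is divided among the hours that were sold, which is the point the section is making.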
If your customers turn some of their machines off overnight, those machines are all still in the rack, burning money.

The price of cloud

The model users are most familiar with is on-demand pricing, also known as pay for what you use. For each cloud resource, a usage-based price is listed; for example, on AWS a 16-core Graviton2 virtual machine in Oregon costs $0.616 for each hour it exists. That seems cheap enough if you only need the VM for an hour, but if you do some basic math and figure out how much that is for a month, or a year, or over the five-year lifespan of a physical machine, you’ll find it’s actually really expensive.

Why is that? From the cloud provider’s point of view, they have the problem that when you’re not renting the machine, and even when nobody is renting the machine, it’s still there and they’re still incurring most of the cost of the machine existing. They need to adjust their pricing accordingly, so while the headline might be “You only pay for what you use”, the subtext is: “…at a...
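
The basic math mentioned above for the on-demand price is short enough to write out. This is a minimal sketch: the hourly price and the five-year horizon come from the text, while the 30-day month, the 365-day year, and the assumption that the VM runs continuously are simplifications chosen here.

```python
HOURLY_PRICE = 0.616       # USD per hour, the on-demand price quoted in the text
HOURS_PER_DAY = 24
DAYS_PER_MONTH = 30        # simplification: a 30-day month
DAYS_PER_YEAR = 365        # simplification: no leap years
SERVICE_LIFE_YEARS = 5     # lifespan of a physical machine assumed in the text


def on_demand_cost(hours):
    """Cost in USD of keeping the VM running for the given number of hours."""
    return HOURLY_PRICE * hours


per_month = on_demand_cost(HOURS_PER_DAY * DAYS_PER_MONTH)
per_year = on_demand_cost(HOURS_PER_DAY * DAYS_PER_YEAR)
per_service_life = per_year * SERVICE_LIFE_YEARS

print(f"per month:      ${per_month:,.2f}")         # $443.52
print(f"per year:       ${per_year:,.2f}")          # $5,396.16
print(f"per five years: ${per_service_life:,.2f}")  # $26,980.80
```

The five-year figure is for a single VM left running the whole time, which is the comparison the text invites against the cost of the hardware itself.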
