The AWS Service Quotas That Will Take Down Your Production at 3 AM (And You Cannot Raise Them Fast Enough)

By Illya Yalovoy | May, 2026 | Medium


Hard limits, scaling lags, and the architectural walls that no support ticket can fix.

Your pager fires at 3:12 AM. Users are getting 503s. You check the dashboard: Lambda is throttling at 1,000 concurrent executions. You open a quota increase request and see the estimated response time: 1–3 business days. It is Saturday.

This is the moment you learn the difference between a soft limit and a hard wall.

The Two Categories of AWS Limits That Matter

Every AWS account ships with two kinds of limits, and the difference between them is the difference between a minor inconvenience and a 3 AM architecture redesign.

The first kind is adjustable quotas. Lambda concurrent executions (default 1,000 per region), EC2 on-demand vCPUs per instance family, API Gateway requests per second. You can raise these through the Service Quotas console or a support ticket. This sounds reassuring until you learn the timeline: most increases take 1–3 business days. Some, like GPU instances or dedicated hosts, take weeks. If your traffic spike is happening right now, a support ticket is not a solution. It is a post-mortem action item.

The second kind is hard limits. NAT Gateway caps at 55,000 simultaneous connections per destination. S3 gives you 5,500 GET and 3,500 PUT requests per second per prefix. DynamoDB on-demand tables will…
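To make the two categories concrete, here is a minimal sketch of a headroom check built only on the figures quoted above. It is not an AWS API: the names `Limit`, `LIMITS`, `classify`, `estimated_concurrency`, and `s3_prefixes_needed`, and the 80% warning threshold, are all hypothetical choices for illustration. The concurrency estimate assumes the usual rule of thumb that concurrent executions are roughly requests per second multiplied by average duration.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class Limit:
    name: str
    value: float
    adjustable: bool  # True: a quota increase can be requested; False: hard wall


# Figures as quoted in the article; verify them against current AWS docs.
LIMITS = {
    "lambda_concurrency": Limit("Lambda concurrent executions per region", 1_000, True),
    "nat_connections": Limit("NAT Gateway connections per destination", 55_000, False),
    "s3_get_per_prefix": Limit("S3 GET requests/s per prefix", 5_500, False),
    "s3_put_per_prefix": Limit("S3 PUT requests/s per prefix", 3_500, False),
}


def estimated_concurrency(requests_per_second: float, avg_duration_s: float) -> float:
    """Rough Lambda concurrency estimate: request rate times average duration."""
    return requests_per_second * avg_duration_s


def classify(key: str, peak: float, warn_at: float = 0.8) -> str:
    """Return 'ok', 'request-increase', or 'redesign' for a peak load.

    warn_at is an assumed threshold: act once utilization reaches 80%.
    """
    limit = LIMITS[key]
    utilization = peak / limit.value
    if utilization < warn_at:
        return "ok"
    # An adjustable quota needs a ticket filed days ahead of the spike;
    # a hard limit needs an architectural change instead.
    return "request-increase" if limit.adjustable else "redesign"


def s3_prefixes_needed(get_rps: float, put_rps: float) -> int:
    """Minimum number of prefixes needed to stay under the per-prefix rates."""
    gets = math.ceil(get_rps / LIMITS["s3_get_per_prefix"].value)
    puts = math.ceil(put_rps / LIMITS["s3_put_per_prefix"].value)
    return max(1, gets, puts)


if __name__ == "__main__":
    peak = estimated_concurrency(requests_per_second=400, avg_duration_s=2.5)
    print(peak, classify("lambda_concurrency", peak))
    print(classify("nat_connections", 50_000))
    print(s3_prefixes_needed(get_rps=20_000, put_rps=5_000))
```

The point of separating `adjustable` from the numeric value is that the two outcomes call for different lead times: a "request-increase" result is only useful if it fires days before the spike, while a "redesign" result means no ticket will help at all.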
