From Christmas Outage to #1 App Store Ranking: An Aura Frames Postgres Scaling Retrospective

What’s Aura Frames?
Disclosures
What causes the sharp increase in traffic?
Scaling over the years
Postgres Scaling Challenges and Solutions
Christmas 2024 Retrospective
Postgres Christmas 2025
Workload-driven “Whole table sharding”
Postgres and Infra Metrics Christmas 2025
Switchover to New DBs
Reflecting back on the plan
Thank You and Looking Forward

📌 Overview

On Christmas Day 2024, the Postgres infrastructure powering the Aura Frames API struggled under peak load and was unavailable for three hours, disrupting the experience for new customers. The team knew it would need improvements to handle the surge for Christmas 2025 and beyond.

One year later, much of the resource-intensive data access had been reworked and the Postgres infrastructure upsized. This approach not only survived, but thrived, providing reliable service through the holiday season.

The sum of Transactions Per Second (TPS) across the DBs peaked at 226,000. More than 100K TPS was sustained for 10 hours, a pattern that repeated on multiple days after Christmas, with an average query time of 25 microseconds.
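For a rough sense of scale, the sustained figure implies a lower bound on the number of transactions processed in that window. The following is a minimal back-of-the-envelope sketch, assuming exactly 100,000 TPS held for exactly 10 hours (the post says "more than", so the real count would be higher). The function and variable names are illustrative only.

```python
# Back-of-the-envelope lower bound using the rounded figures quoted in this post.
SUSTAINED_TPS = 100_000   # "more than 100K TPS", treated here as a floor
SUSTAINED_HOURS = 10      # length of the sustained window
PEAK_TPS = 226_000        # peak of the summed TPS across the DBs

SECONDS_PER_HOUR = 3600


def min_transactions(tps: int, hours: int) -> int:
    """Transactions processed if `tps` were held constant for `hours`."""
    return tps * hours * SECONDS_PER_HOUR


floor_total = min_transactions(SUSTAINED_TPS, SUSTAINED_HOURS)
peak_to_floor = PEAK_TPS / SUSTAINED_TPS

print(f"{floor_total:,}")      # 3,600,000,000
print(f"{peak_to_floor:.2f}")  # 2.26
```

In other words, a single 10-hour window at that floor is at least 3.6 billion transactions, and the peak was a little over twice the sustained floor.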

The improved reliability meant customers could smoothly set up new frames and add photos, and they did so more than ever, with the Aura Frames app reaching #1 in the U.S. and Canadian Apple and Android app stores on Christmas Day.

In this post we’ll look back at the months of planning and execution that went into achieving that outcome!

A second post in this series will dig into the Ruby on Rails side, while this one will focus on Postgres.

What’s Aura Frames?

Aura Frames (Aura Home, Inc.) is the company behind modern, high-quality, Wi-Fi connected digital photo frames that customers love.

The frames are easy to use via free iOS and Android apps, don’t require a subscription, and offer unlimited cloud storage for photos and videos. Once set up, family members can be invited to contribute photos and videos via the app from anywhere. Aura frames have an average of 4 contributors adding content.

In 2025, more than 1 billion photos were shared to Aura frames globally.

While public engineering blog posts are limited, Aura was featured on the AWS Storage Blog in the past, in the post “How Aura improves database performance using Amazon S3 Express One Zone for caching.”

Disclosures

I began working with Aura in 2025. Aura does not have a public engineering blog, so we discussed my writing a post here, where I regularly write about Postgres, Ruby on Rails, and scaling databases.

This post was written by me and I do not speak for the company. The company had the opportunity to review and make minor edits before publication.

The Christmas Day outage was a painful reality of scaling fast, and I appreciate Aura’s willingness to discuss it here.

I’m biased, but from my view the company is dedicated to continually improving the customer experience, in part with strategic investments in technical infrastructure.

With that covered, let’s take a look at how the frames are used and what drives the traffic.

What causes the sharp increase in traffic?

On Christmas Day, millions of customers set up hundreds of thousands of new Aura frames. The backend platform needs to work well for existing customers while also handling the load from new customer activity. For new customers, it’s especially critical that they have a good experience from their first moments with the product.

While the holiday timing is predictable, the rate of new frames and new photos added increases each year, putting new pressure on infrastructure components. Postgres is not easily horizontally scalable, and is costly to operate.

On Christmas Day, peak TPS increased by an average of ~4.5x across all DBs, with the largest increase being ~18x the normal value. To meet this demand, advanced financial planning and vertical scaling were needed. Resources were all shrunk back down afterward to save on costs.
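To make the multiplier arithmetic concrete, here is a small sketch of how a per-database peak multiplier, the average across databases, and the largest one could be computed. The database names and TPS values are entirely hypothetical (they are not Aura’s numbers), and it assumes the ~4.5x figure is a mean of per-DB ratios, which the post does not specify.

```python
from statistics import mean

# Hypothetical (normal_peak_tps, christmas_peak_tps) pairs per database.
# Names and values are made up for illustration; they are not Aura's figures.
peaks = {
    "db_a": (10_000, 20_000),
    "db_b": (4_000, 10_000),
    "db_c": (500, 4_500),
}

# Per-database multiplier: Christmas peak divided by the normal peak.
multipliers = {name: xmas / normal for name, (normal, xmas) in peaks.items()}

avg_multiplier = mean(multipliers.values())       # mean of per-DB ratios
biggest_db = max(multipliers, key=multipliers.get)

for name, m in multipliers.items():
    print(f"{name}: {m:.1f}x")
print(f"average: {avg_multiplier:.1f}x, biggest: {biggest_db}")
```

A small, normally quiet database can show a very large multiplier while adding little absolute load, which is one reason the average and the maximum can sit far apart.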

Scaling over the years

Over the last half decade, the team has executed a variety of scaling tactics, both in-house and in conjunction with Postgres consultants. Scaling efforts often focused on reducing pressure on Postgres within the constraints of a single primary instance, while preserving its operational simplicity. (See “Squeeze the hell out of the system you have” for a similar philosophy.)

Scaling is more straightforward on the stateless, HTTP side. Aura uses AWS and has leveraged Auto Scaling Groups (ASGs), which can scale up to thousands of EC2 instances running the web application stack, image processing, PgBouncer, and other services.

For Postgres, vertical scaling of a single primary instance was leveraged as long as possible.

Here’s a look at the primary database instance at peak for Christmas Day 2024. Note that the db.r6g.48xlarge instance was the largest instance available for RDS.

Postgres Version<br>RDS...
