Designing a Better Podcast Editor



A better concept model and more efficient tools for editing spoken word audio.

Contents: Audio layout · From absolute to magnetic time · From splits to skip regions · Pin-based alignment · Navigating audio · Conclusion and what’s next · About me

By Adam Solove · Published 3 Jun 2026 · Reading time 9 min

For the past few years, my partner has recorded and edited a niche podcast while I’ve helped a bit with selecting music and EQ settings. Our workflow was painful: one shared iCloud file, emails of notes keyed to timestamps, and careful checking that our changes didn’t conflict.

Editing audio was like traveling back in time twenty years: no track changes, no comments, and no multiplayer editing.

The friction wasn’t just in the file-oriented workflow. The normal interaction model of Digital Audio Workstations (or DAWs) is not particularly well-suited for the task of editing spoken word audio.

So I decided to build the podcast editor we wanted: Ducking. It has a UI purpose-built for laying out spoken word audio, plus multiplayer editing, collaboration tools, and history management. In this post, I’ll talk about the improvements it makes to editing tools. Future posts will discuss the engineering challenges of multiplayer audio editing and the pleasure of building software for just a few users with design sketching techniques and LLM assistance.

Fig. 1. Screenshot of Ducking in use, with the comments and effects panels open.

How does Ducking make editing podcasts easier? It focuses on providing better tools for the two most common recurring tasks:

Audio layout: specifying how bits of audio should stick together as things around them change.

Navigation: finding the right bit of audio. This happens at a lot of levels of precision, from “roughly where does act 2 start?” down to “exactly which millisecond is the beginning of that background noise?”

Ducking itself was built specifically for our podcast workflow, serves its purpose doing that, and won’t be public anytime soon. But I hope that some of these ideas will spread into other tools and be more broadly useful.

Throughout this post, I’ll show simplified animations of the features in action to avoid distracting with other parts of the editing UI.

Audio layout

Like laying out a newspaper or a webpage, audio editing starts with roughly trying out how different parts fit together. The main challenge is then to specify the arrangement more precisely without messing up the choices already made.

Ducking provides an audio layout concept model that is much faster to work with, by borrowing ideas from other DAWs, text editors, and even further afield.

From absolute to magnetic time

In a traditional DAW, every clip has an absolute start time. When one clip is moved or edited for length, everything after drifts out of alignment.

Fig. 2. Absolute layout: trimming any clip leaves a silent gap or overlaps the next clip.

Absolute layout is the right model for writing songs, where material in one measure should stay there. But it’s the wrong model for editing spoken word material, where the default is to reflow later material as earlier bits change.
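To make the drift concrete, here is a minimal sketch of an absolute layout in Python. It is not taken from Ducking or any particular DAW; the names `AbsoluteClip` and `spacing_after` are hypothetical and chosen only for illustration. Each clip stores its own start time, so changing one clip's length does nothing to its neighbors.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass
class AbsoluteClip:
    """Hypothetical clip in an absolute layout: the start time is stored."""
    name: str
    start: float     # seconds from the start of the timeline
    duration: float  # seconds

    @property
    def end(self) -> float:
        return self.start + self.duration


def spacing_after(clips: list[AbsoluteClip]) -> list[float]:
    """Space between each clip's end and the next clip's start.

    Positive values are silent gaps, negative values are overlaps.
    """
    ordered = sorted(clips, key=lambda c: c.start)
    return [b.start - a.end for a, b in zip(ordered, ordered[1:])]


timeline = [
    AbsoluteClip("intro", start=0.0, duration=10.0),
    AbsoluteClip("interview", start=10.0, duration=30.0),
    AbsoluteClip("outro", start=40.0, duration=5.0),
]
print(spacing_after(timeline))  # [0.0, 0.0]: everything lines up

# Trim two seconds off the intro. Nothing else moves, so a gap opens.
timeline[0].duration = 8.0
print(spacing_after(timeline))  # [2.0, 0.0]

# Lengthen the intro instead. It now overlaps the interview.
timeline[0].duration = 11.5
print(spacing_after(timeline))  # [-1.5, 0.0]
```

In this model the editor has to repair every later start time by hand after each trim, which is the chore a spoken word workflow wants to avoid.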

The right layout model is a magnetic timeline, where clips are ordered, not positioned. Each clip’s place in time is computed from the lengths of the items before it. So when one clip is added, removed, or edited, everything after just re-flows automatically.

Gap clips allow adding explicitly-timed silence when that is needed.
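The magnetic model can be sketched in a few lines of Python. This is an illustration under assumed names (`Item`, `layout`), not Ducking's actual data model: items are kept in order, no start time is stored, and each start is derived as the running sum of the durations before it. A gap is just another item with a duration.

```python
from __future__ import annotations

from dataclasses import dataclass
from itertools import accumulate


@dataclass
class Item:
    """Hypothetical timeline item: either a clip or an explicit gap."""
    name: str
    duration: float       # seconds
    is_gap: bool = False  # True for an explicitly timed stretch of silence


def layout(items: list[Item]) -> list[tuple[str, float]]:
    """Return (name, start) pairs; each start is the sum of earlier durations."""
    durations = [item.duration for item in items]
    # Running totals give the end of each item; shift by one to get the starts.
    starts = [0.0, *accumulate(durations)][:-1]
    return [(item.name, start) for item, start in zip(items, starts)]


timeline = [
    Item("intro", 10.0),
    Item("pause", 1.5, is_gap=True),
    Item("interview", 30.0),
    Item("outro", 5.0),
]
print(layout(timeline))
# [('intro', 0.0), ('pause', 10.0), ('interview', 11.5), ('outro', 41.5)]

# Trim two seconds off the intro: every later item moves earlier by two seconds.
timeline[0].duration = 8.0
print(layout(timeline))
# [('intro', 0.0), ('pause', 8.0), ('interview', 9.5), ('outro', 39.5)]
```

Because positions are computed rather than stored, there is nothing to repair after an edit. The 1.5 second pause keeps its length while everything around it reflows.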

Fig. 3. Magnetic layout: clips and gaps reflow when you trim.

This is the model used by many video editing tools as well as audio tools that focus on spoken word, like Hindenburg. So the idea itself isn’t new. But it is a useful first step, and it suggests that pushing the idea of an automated layout model further might pay off.

From splits to skip regions

The vast majority of podcast editing is repeatedly removing tiny bits of unwanted material like filler words, long pauses, or a flubbed sentence. In most audio editors, that means splitting each recording clip into lots of tiny parts and adjusting their alignment. After doing that dozens of times, the timeline view becomes a huge set of disconnected clips that are hard to scan or reorganize.

Fig. 4. Without skip regions, every filler removal splits a clip in two. One more cut and you're up to ten detached fragments, none of them carrying any indication that they belong to the same original take.

Ducking uses “skip regions” as a better solution. The editor can leave a clip as a single unit while marking part of it as not to be used. This keeps a mostly-intact recording together as one unit, so it’s easier to understand and rearrange, while still indicating where material has been removed.
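One way to picture skip regions is as a list of time ranges attached to an otherwise untouched clip. The sketch below is a guess at the general shape of the idea, not Ducking's implementation; `Clip`, `played_segments`, and `played_duration` are hypothetical names, and skips are assumed to be `(start, end)` offsets in seconds from the start of the recording.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class Clip:
    """Hypothetical clip that stays whole; removed material is only marked."""
    name: str
    duration: float  # length of the full recording, in seconds
    skips: list[tuple[float, float]] = field(default_factory=list)


def played_segments(clip: Clip) -> list[tuple[float, float]]:
    """Return the (start, end) ranges of the recording that are actually played."""
    segments: list[tuple[float, float]] = []
    cursor = 0.0  # everything before the cursor is already accounted for
    for start, end in sorted(clip.skips):
        # Clamp each skip to the bounds of the recording.
        start = min(max(start, 0.0), clip.duration)
        end = min(max(end, 0.0), clip.duration)
        if end <= start:
            continue  # empty or out-of-range skip: nothing to remove
        if start > cursor:
            segments.append((cursor, start))
        cursor = max(cursor, end)  # also merges overlapping skips
    if cursor < clip.duration:
        segments.append((cursor, clip.duration))
    return segments


def played_duration(clip: Clip) -> float:
    """Length of the clip as heard, after skips are removed."""
    return sum(end - start for start, end in played_segments(clip))


take = Clip("take 1", duration=60.0)
take.skips.append((12.0, 12.5))  # a filler word
take.skips.append((30.0, 34.0))  # a long pause

print(played_segments(take))  # [(0.0, 12.0), (12.5, 30.0), (34.0, 60.0)]
print(played_duration(take))  # 55.5
```

The timeline still holds one clip, however many skips it carries, and deleting an entry from `skips` restores the audio. In a magnetic layout, the played duration is the length that would feed into the positions of later items.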

Fig. 5. Skip regions: fold a portion of a clip without splitting it in two.

The skip region acts like...
