Designing Software for Software Factories
Shrivu’s Substack
Reflections on turning software engineering into agentic loops.
Shrivu Shankar<br>Jun 13, 2026
Since my AI-powered Software Engineering (2024) post, the Overton window has shifted from whether AI can aid software development at all to which parts of it remain human. At work, we’ve been building out what I call our “software factory”, and in this post I want to walk through how it’s come together, the hardest parts we’ve run into, and what actually works.<br>What is a software factory?
It's becoming a bit of a buzzword, with different definitions depending on who you ask. I define it as an AI-driven system, plus the organization that surrounds it, that solutions, designs, builds, tests, and deploys software products. If you've read The Transposed Organization, a "software factory" is just one of many "loops" a modern AI-native company must develop as a core part of EPD (eng, product, design) operations.
The software factory absorbs the raw, unscoped stream of customer requests and resolves it into shipped software, with the human entering (for now) at exactly one point. Style inspired by background-agents.com.<br>IMO a full software factory must:<br>Be able to operate on the raw distribution of customer-generated RFEs and bug reports as input. It does not count if a PM needs to scope every ticket or an engineer needs to break the solution into smaller pieces.
Only require humans for off-ramping (pressing the big red stop button) and review at certain stages. It does not count if there’s an explicit “pairing” step anywhere in the loop or if the system runs on an individual’s laptop.
Feed every review back into the system such that the review gate deprecates itself over time. It does not count (or work well) if reviews only apply to a single instance of software generation.
Be able to run many requests through the loop concurrently, not one ticket at a time. It does not count if requests must be serialized because stages share mutable state (one test/staging env, one branch, one deploy slot) or if throughput is capped by a human-owned resource rather than by spend.
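To make the four requirements above a bit more concrete, here is a minimal Python sketch. Everything in it is hypothetical and for illustration only: the `Request` fields, the stage names, the `workspace_for` naming scheme, and the budget numbers are my assumptions, not a description of any real system. It tries to show three things: each request gets its own branch, env, and deploy slot so nothing is serialized on shared state; concurrency is capped by a spend budget rather than by a human; and review feedback is written to shared guidance so it applies to future runs, not just one.

```python
import asyncio
from dataclasses import dataclass, field

# Hypothetical stage names, loosely following "solutions, designs, builds, tests, deploys".
STAGES = ["solution", "design", "build", "test", "deploy"]


@dataclass
class Request:
    id: str
    text: str         # raw, unscoped customer RFE or bug report
    est_cost: float   # assumed estimate of spend for one trip through the loop


@dataclass
class Factory:
    budget: float                                        # throughput is capped by spend
    guidance: list[str] = field(default_factory=list)    # shared across all future runs
    stop_requested: set[str] = field(default_factory=set)  # the "big red stop button"
    off_ramped: list[str] = field(default_factory=list)

    def workspace_for(self, req: Request) -> dict:
        # One branch/env/slot per request, so stages never share mutable state.
        return {
            "branch": f"factory/{req.id}",
            "env": f"env-{req.id}",
            "slot": f"slot-{req.id}",
        }

    def record_review(self, stage: str, note: str) -> None:
        # Feedback is kept for every later request, not just the one under review.
        self.guidance.append(f"[{stage}] {note}")

    async def run_stage(self, stage: str, req: Request, ws: dict) -> str:
        # Stand-in for an agent call; a real stage would use self.guidance as context.
        await asyncio.sleep(0)
        return f"{stage} done for {req.id} on {ws['branch']}"

    async def process(self, req: Request) -> str:
        if req.est_cost > self.budget:
            return "deferred"             # out of spend, not blocked on a human
        self.budget -= req.est_cost       # spend is reserved up front in this sketch
        ws = self.workspace_for(req)
        for stage in STAGES:
            if req.id in self.stop_requested:
                self.off_ramped.append(req.id)
                return "off-ramped"       # falls back to a human-in-the-loop process
            await self.run_stage(stage, req, ws)
        return "deployed"


async def run_all(factory: Factory, requests: list[Request]) -> list[str]:
    # All requests move through the loop concurrently, not one ticket at a time.
    return await asyncio.gather(*(factory.process(r) for r in requests))
```

Calling `asyncio.run(run_all(factory, requests))` returns one status per request, in input order. The design choice worth noticing is that the only things shared between requests are the budget and the guidance list; everything a stage mutates lives in the per-request workspace.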
You typically measure the first-order¹ efficacy of a software factory via:<br>Cycle time: the wall-clock time from customer request to deploy (or per stage).
Review volume: how much feedback is given across stages on AI-drafted outputs (per stage).
Off-ramps: how often a request gives up and falls back to a human-in-the-loop development process (per stage). This can also be reframed as the percentage of requests that are factory-applicable.
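As a rough sketch of how these three metrics could be computed, assuming you log one record per request, here is a small Python example. The `RequestRecord` shape and its field names are hypothetical; your own tracking data will look different.

```python
from collections import Counter
from dataclasses import dataclass
from datetime import datetime
from statistics import median
from typing import Optional


@dataclass
class RequestRecord:
    requested_at: datetime
    deployed_at: Optional[datetime]     # None if the factory never shipped it
    review_comments: dict[str, int]     # stage name -> number of review comments
    off_ramped_at: Optional[str]        # stage where it fell back to humans, or None


def median_cycle_hours(records: list[RequestRecord]) -> Optional[float]:
    # Cycle time: wall-clock hours from request to deploy, for shipped requests only.
    hours = [
        (r.deployed_at - r.requested_at).total_seconds() / 3600
        for r in records
        if r.deployed_at is not None and r.off_ramped_at is None
    ]
    return median(hours) if hours else None


def review_volume(records: list[RequestRecord]) -> dict[str, int]:
    # Review volume: total feedback given on AI-drafted outputs, per stage.
    totals: Counter = Counter()
    for r in records:
        totals.update(r.review_comments)
    return dict(totals)


def off_ramp_rate(records: list[RequestRecord]) -> float:
    # Off-ramps: share of requests that gave up and fell back to humans.
    if not records:
        return 0.0
    return sum(1 for r in records if r.off_ramped_at is not None) / len(records)


def factory_applicable(records: list[RequestRecord]) -> float:
    # The same number reframed as the share of requests the factory could handle.
    return 1.0 - off_ramp_rate(records)
```

Tracking review volume per stage, rather than as one total, is what lets you see whether a given review gate is actually shrinking over time.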
Seeding a software factory
At the risk of being less useful, I'm going to focus this post on high-level tips, assuming you already have some semblance of a software factory set up. Exactly how one works, how end-to-end it goes, and whether it's homegrown or purchased is really going to vary company by company. I've mostly seen two buckets:<br>AI-native startups<br>Typically have a lot more room to build an AI-friendly tech stack, and the contractual and compliance risks are typically lower. The downside is they can't afford a dedicated AI DevOps team to build any sort of "software factory platform".
Recommendation: If Claude picked your stack, you can very likely just buy something like a factory-as-a-service. It's also critical to set the expectation up front: with no central platform team, every engineer is the software-factory architect for the systems they own.
Enterprise software companies (me irl)<br>Moving to AI-friendly stacks becomes quite a large migration, and the appetite for risk (often around service stability) is low to none. They do have the resources for, and have started, software-factory-platform-style teams.
Recommendation: Today at least, it's likely easier to build than buy, and to do this via one or more dedicated AI-readiness platform teams. A common failure mode is buying something with a low ceiling that can't get you to the level of compliance and testing needed to ship a "real" feature. Another option, for those bold enough, is to fork all new products onto an AI-native stack, but this only really pays off if it's isolated enough to not share the same compliance and stability risks.
Don’t start a product or platform as a software factory
Another perhaps counterintuitive observation is that software factories work best when there are patterns, contracts, and scaffolding to match against. Pure greenfield projects don't have this, and prematurely factorifying has led to code bloat and a reduced ability to understand the project as it evolves. Another risk is simply a lack of data on how the project has evolved and will evolve: a factory works best when the system itself is...