Shoehorning R-Type into the ESP32

Rui Carmo

Tao of Mac

Jun 17th 2026 · 7 min read · #agents #ai #arcade #emulation #esp32 #hardware #piclaw #retrocomputing #rtype

This is a very quick follow-up to my Mac emulation hacks from a couple of weeks ago, and worth noting for the fun value and a little bit of AI.

I love old arcade games (especially some NeoGeo titles), so it was only natural that I gravitated to them while I was trying to get Mac color rendering to work on an ESP32. If any kind of software was extremely attuned to its hardware, it’s arcade games, which were often written to map directly onto that hardware.

And I love R-Type in particular, so even though I originally thought of getting Metal Slug to run on the ESP32-S3 because of its shared 68000 heritage with the Mac, I ended up wondering how fast I could make R-Type run instead.

Turns out the M72 boards Irem did for R-Type ran an 8086-like CPU (the NEC V30, which has a few extensions) and a Z80 in tandem, and that the emulator wasn’t at all hard to recompile if you stubbed out things like audio (which is done by the Z80).

The Output, So Far

I decided to start with the hardest/smallest target (the plain CYD with a plain ESP32), which can barely run the emulator in one core and has almost no free RAM, to the point where after a few iterations it was rendering something, but clearly wouldn’t make it without rebuilding the whole emulator from scratch.

Getting it to render frames effectively (as in, rendering one frame without any visible stutters inside the frame) is exactly the kind of problem I am having on the Mac emulator, because a) you typically need enough RAM to hold the framebuffer and b) all ESP CYD displays are limited by display (typically SPI) bandwidth.
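To put rough numbers on those two constraints, here is a back-of-the-envelope sketch. The figures are assumptions for illustration, not measurements from these boards: a 320x240 panel, 16 bits per pixel (RGB565), and a 40 MHz single-lane SPI clock. The function names are made up for this example.

```python
def frame_bytes(width: int, height: int, bits_per_pixel: int = 16) -> int:
    """Bytes needed to hold one full frame in RAM."""
    return width * height * bits_per_pixel // 8


def max_spi_fps(width: int, height: int, spi_hz: int,
                bits_per_pixel: int = 16) -> float:
    """Upper bound on full-frame refreshes per second over one-bit-wide SPI.

    Ignores command overhead and gaps between transactions, so the
    real figure will be lower.
    """
    bits_per_frame = width * height * bits_per_pixel
    return spi_hz / bits_per_frame


# Assumed figures: 320x240 panel, RGB565, 40 MHz SPI clock.
print(frame_bytes(320, 240))                         # 153600
print(round(max_spi_fps(320, 240, 40_000_000), 1))   # 32.6
```

Under those assumptions a single full framebuffer costs 150 KiB, and the bus alone caps full-screen refreshes at a little over 30 per second before the emulator has done any work, which is why partial updates, smaller buffers and DMA tend to come up so quickly.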

For a little bit of inside baseball (yeah, I’ve been spending time with US folk again), the real hassle (especially on the smaller ESP32) was handling memory maps, palette RAM, tile/sprite priority, and frame timing. You can finagle things a bit by reassigning one of the cores to "just" do rendering, and there are various DMA modes depending on chipset, but all of that proved to be enough of a distraction for me to upgrade to an S3-powered display as soon as I could.
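As an illustration of the palette and priority part, here is a minimal host-side sketch of one common approach: convert the emulated palette to the display's pixel format once, then turn each scanline into a table lookup. It is not taken from the emulator's code. The 8-bit-per-channel palette, the RGB565 target, and the "sprite index 0 is transparent" rule are all assumptions, and every function name is hypothetical.

```python
def rgb888_to_rgb565(r: int, g: int, b: int) -> int:
    """Pack 8-bit channels into one 16-bit RGB565 value."""
    return ((r & 0xF8) << 8) | ((g & 0xFC) << 3) | (b >> 3)


def build_lut(palette):
    """Convert a list of (r, g, b) entries to RGB565 once, up front."""
    return [rgb888_to_rgb565(r, g, b) for (r, g, b) in palette]


def composite_line(tile_indices, sprite_indices):
    """Overlay sprites on tiles; sprite index 0 is assumed transparent."""
    return [s if s != 0 else t
            for t, s in zip(tile_indices, sprite_indices)]


def render_line(tile_indices, sprite_indices, lut):
    """Produce display-ready pixels for one scanline."""
    return [lut[i] for i in composite_line(tile_indices, sprite_indices)]


# Tiny example: a four-entry palette and a four-pixel scanline.
lut = build_lut([(0, 0, 0), (255, 0, 0), (0, 255, 0), (0, 0, 255)])
print([hex(p) for p in render_line([1, 1, 2, 2], [0, 3, 0, 3], lut)])
```

The point of the table is that the per-pixel cost drops to an index and a store, and the table only needs rebuilding when the game writes to palette RAM.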

So I just focused on clean frame renderings, even if the time required to produce them made it feel like a slideshow, so much so that after figuring out the backgrounds were a static texture composited behind the main sprites, I decided to skip that.

It would have been amazing to see running on the smaller one, though.

Then I got piclaw to port the entire thing to the ESP32-S3, and all of a sudden there was enough horsepower to run and render at around 50fps:

Both boards, starting from the same emulator state but rendering as fast as they can

I’m so happy with the results that I am considering getting this to run on an ESP32-P4 and seeing what we can do about audio, and about using its USB host port for a controller, but I really should focus on backporting the rendering techniques into a Mac emulator…

Either way, this was a great way to refine my approach to getting AI agents to tackle long, grinding, intricate problems, and the code is up on GitHub if anyone cares to check it out.

The Method

However, before handing it over to agents, I had to specify how to do this, and right now, after half a dozen embedded development and hardware porting projects since Christmas, the strategy is pretty well established:

Get something to run on a host harness, running VNC, plain SDL or just framebuffer dumps

Derive milestones from that (still quite manual) job, and maybe even more harnesses (like target CPU opcode harnesses for JITs, sprite subroutines, etc.)

Tackle the first few milestones on a simpler (but also more limited) hardware/software target

Build reusable debugging/introspection tools for each milestone that the agents can use later to have a feedback loop

Expand out from the above.
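The "framebuffer dumps" option in the first step can be very small. As a sketch, assuming the host harness exposes each frame as rows of (r, g, b) tuples (a made-up representation for this example), a frame can be written out as a binary PPM image, which most image viewers and tools can open:

```python
def ppm_bytes(frame) -> bytes:
    """Encode a frame (list of rows of (r, g, b) tuples) as binary PPM (P6)."""
    height = len(frame)
    width = len(frame[0]) if height else 0
    header = f"P6\n{width} {height}\n255\n".encode("ascii")
    body = bytes(channel for row in frame for pixel in row for channel in pixel)
    return header + body


def dump_frame(frame, path: str) -> None:
    """Write one frame to disk so a human or an agent can inspect it."""
    with open(path, "wb") as f:
        f.write(ppm_bytes(frame))


# Example: a 2x1 frame with one red and one blue pixel.
print(ppm_bytes([[(255, 0, 0), (0, 0, 255)]]))
```

Dumps like this double as the "known good" reference outputs for later milestones, since they can be diffed against whatever the target hardware ends up rendering.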

That’s why my first hack for these things is just to point a webcam at the display (or generate a frame, or a known good end-to-end output dump) and get them to render a test pattern:

The M5Stack Tab 5, the highest-end ESP32 device I have, showing a test pattern

From then on, the agents can use the camera and other test patterns to verify that they are rendering correctly (of course it’s useless for video, but any SOTA model these days can take useful feedback from images), and, as a bonus, I get their snapshots on the piclaw web interface and can verify that they are actually doing what I want them to do.
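A rough sketch of what that feedback check could look like on the host side follows. It generates a simple color-bar pattern and scores a captured frame against it with a per-channel tolerance, since a webcam will never reproduce exact colors. It assumes the capture has already been cropped, straightened and scaled to the reference size, which is the hard part in practice; the pattern, the thresholds and the function names are all made up for illustration.

```python
BARS = [(255, 255, 255), (255, 255, 0), (0, 255, 255), (0, 255, 0),
        (255, 0, 255), (255, 0, 0), (0, 0, 255), (0, 0, 0)]


def color_bars(width: int, height: int):
    """Return a frame (rows of (r, g, b)) with eight vertical bars."""
    row = [BARS[x * len(BARS) // width] for x in range(width)]
    return [list(row) for _ in range(height)]


def mismatch_ratio(expected, actual, tolerance: int = 48) -> float:
    """Fraction of pixels where any channel is off by more than tolerance."""
    total = bad = 0
    for row_e, row_a in zip(expected, actual):
        for pe, pa in zip(row_e, row_a):
            total += 1
            if any(abs(ce - ca) > tolerance for ce, ca in zip(pe, pa)):
                bad += 1
    return bad / total if total else 1.0


def looks_right(expected, actual, max_bad: float = 0.02) -> bool:
    """True if the capture is close enough to the reference pattern."""
    return mismatch_ratio(expected, actual) <= max_bad
```

A pass/fail number like this is a cheap first gate; the snapshot itself can still go to the model (and to the web interface) when the check fails and someone needs to see why.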

The Harness

I already knew what I wanted to achieve (in short, to explore and document techniques to render fast graphics on these boards), and I had a camera pointing at the target devices like in previous hacks, but one of the things I wanted to explore with this setup was to mitigate long context problems:

Even if you use things like /goal (which I do, but with...
