I Ported Pixal3D over to Apple Silicon

Chillaid Blog

TL;DR: I ported the Pixal3D inference pipeline over to Apple Silicon. Here's the link: https://github.com/pawel-mazurkiewicz/Pixal3D-mac/tree/master

The beginning (it sounded easy)

I discovered Pixal3D while casually browsing HuggingFace about two days after its release (it actually came out on my birthday, May 12 :D). It looked exciting as hell! The last local 3D model generator I had tried was TRELLIS2, in its ported form by shivampkumar (https://github.com/shivampkumar/trellis-mac). Unfortunately, I quickly found the outputs disappointing: alpha blending was off (the mesh was translucent at certain angles), there were some missing faces (we'll hear more about that), and the overall quality was off. The other one I'd tried was Hunyuan3D 2.1 (they sadly stopped releasing weights after 2.1), but it didn't work for me at all in ComfyUI on an M5 Max, and it was old news by then anyway; as we know, the world of AI moves very fast. I am also very interested in 3D model generation because, you see, I'm developing a game. A game that I consciously decided to build with a lot of AI help in various ways, an experiment of sorts. So this is acutely relevant to my interests.

Pixal3D is a gloriously modern image-to-3D model - you hand it a single PNG of, say, a fairy house, and it hands you back a textured GLB you can spin around in any glTF viewer. It was released in early May by TencentARC and is currently one of the best AI 3D model generators out there - at least when it comes to open weights. It descends from the TRELLIS family (although it started out on Direct3D-S2), and like most of that lineage it was born, raised, and lovingly optimized on NVIDIA CUDA.

Of course I was very disappointed to see that Pixal3D was CUDA dependent. And hard CUDA dependent - it requires nvdiffrast, cumesh, libnatten (compiled for a specific NVIDIA arch), cubvh - basically the full stack was for GPUs that understood CUDA. This made me slightly upset, but I thought to myself - why shouldn't I try to get it working on my Mac? I mean, everything I needed was seemingly open source, and between the TRELLIS2 port Shivam Kumar did and the work of Pedro Naugusto, who had seemingly ported all of these hard CUDA requirements to Metal, this looked like a fun one- or two-evening project when armed with AI. Oh boy, was I mistaken. It bears noting that I was very ignorant of the internals of computer graphics, the math behind it, and the Python + CUDA ecosystem. Basically a clueless fool. But I was very determined to get this working on my computer. It only made sense! Not wanting to pay yet another service for their "credits" turns out to be a very effective motivation. I lost some sleep, but I didn't lose my drive. I even entertained the idea of accelerating things with the Apple Neural Engine (ANE) and CoreML.

The plan was simple though:

1. Swap cuda for mps in a few .to(device) calls.

2. Maybe recompile a couple of custom CUDA kernels into Metal.

3. Generate a fairy house.

4. Brag on the internet.
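For the curious, step 1 in its most naive form looks roughly like the sketch below. This is not code from the Pixal3D repo or from my port; pick_device and detect_device are made-up helper names for illustration, and the only PyTorch calls assumed are torch.cuda.is_available() and torch.backends.mps.is_available().

```python
def pick_device(cuda_ok: bool, mps_ok: bool) -> str:
    """Choose the string you would hand to .to(device).

    Hypothetical helper: prefers CUDA, then Apple's MPS backend, then CPU.
    """
    if cuda_ok:
        return "cuda"
    if mps_ok:
        return "mps"
    return "cpu"


def detect_device() -> str:
    """Probe PyTorch if it is installed; otherwise fall back to CPU."""
    try:
        import torch
    except ImportError:
        return "cpu"
    # Older PyTorch builds have no torch.backends.mps, so guard the lookup.
    mps = getattr(torch.backends, "mps", None)
    mps_ok = bool(mps is not None and mps.is_available())
    return pick_device(torch.cuda.is_available(), mps_ok)


if __name__ == "__main__":
    device = detect_device()
    print(f"would call model.to({device!r})")
```

The catch, as the rest of this post shows, is that picking the device string only covers plain PyTorch ops. It does nothing for the custom CUDA extensions the pipeline also depends on.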

I mean, I've been running local LLMs for years at this point, how hard could that be, right? ...RIGHT?

Reader, to get to step 4 it took me two weeks of obsessive work.

Act 0: the texturing pipeline I lovingly built and then threw in the bin

The thing is, with that plan the port of this deep generative pipeline was almost good fairly quickly. You know how they say that the first 80% takes 20% of the time but the last 20% takes 80% of the time? Yeah, it was like that here. Same model weights (byte-identical). Same seed. Same input image. And yet the Mac output looked... wrong. Shrunken. Desaturated. Geometry shredded, details lost. At first it also came out without UV unwrapping or textures. Like someone had run that fairy house through a dryer on the hot cycle and then attacked it with a cheese grater.

Before any of the cursed-kernel detective work, there was a more innocent problem: the Mac couldn't texture a mesh at all. The function that turns a raw mesh + voxel colors into a finished, UV-unwrapped, textured GLB - o_voxel.postprocess.to_glb - is CUDA-only. No CUDA, no texture. So the very first job wasn't fixing texturing; it was *inventing* it from parts I already had on the machine. It was kind of a blast.

The plan: skip the unportable native stack entirely and assemble a texturing pipeline out of Python, Blender, and sheer optimism. MacGyver kind of stuff - duct taping things until they worked.

Get a mesh out at all - a CPU fallback for the CUDA-only mesh extractor. It worked, but left a constellation of pinprick holes wherever the iso-surface couldn't close a cell.

Fill the holes - easy, right? Blender has a fill-holes operator. Except the GUI op routes through the undo stack and hung indefinitely on a 2.7M-face mesh. Dropped down to bmesh.ops.holes_fill directly. Fine. Moving on.

UV unwrap - reach for xatlas, the obvious tool, which is actually part of the original stack. xatlas took one look at a Pixal3D-density mesh...
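A quick aside on what a "hole" actually is in the hole-filling step above: in a closed triangle mesh every edge is shared by exactly two faces, so any edge used by only one face sits on the rim of a hole. The sketch below counts those edges in plain Python. It is an illustration only, not what Blender's bmesh.ops.holes_fill does internally and not code from my port; boundary_edges and the tiny tetrahedron are made up for the example.

```python
from collections import Counter
from typing import Iterable, List, Tuple

Face = Tuple[int, int, int]  # three vertex indices per triangle
Edge = Tuple[int, int]       # undirected edge, stored as (low, high)


def boundary_edges(faces: Iterable[Face]) -> List[Edge]:
    """Return the edges that belong to exactly one triangle.

    An empty result means the mesh has no open rims, i.e. no holes.
    """
    counts: Counter = Counter()
    for a, b, c in faces:
        for u, v in ((a, b), (b, c), (c, a)):
            # Normalise direction so (1, 0) and (0, 1) count as one edge.
            counts[(min(u, v), max(u, v))] += 1
    return sorted(edge for edge, n in counts.items() if n == 1)


# A tetrahedron is the smallest closed triangle mesh: 4 faces, 6 edges.
TETRA: List[Face] = [(0, 1, 2), (0, 3, 1), (1, 3, 2), (2, 3, 0)]

if __name__ == "__main__":
    print(boundary_edges(TETRA))       # closed mesh
    print(boundary_edges(TETRA[:-1]))  # one face removed, one hole
```

Dropping one face from the tetrahedron leaves its three edges with a single owner each, which is exactly the rim a fill-holes pass would have to cap. On a 2.7M-face mesh you would want a vectorised version of this, but the idea is the same.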
