Foveon – Bayer to Foveon X3, learned, Mac App using deep learning

Foveon — turn a Bayer photo into a Foveon X3 photo

Foveon

A neural sensor translator. Takes a photo from any Bayer-array camera and renders it as if it were shot on a Sigma DP2 Merrill — the Foveon X3 stacked-sensor look, with the colour and microdetail Foveon is famous for, on hardware you already own.

Under the hood: a modified U-Net with an extra layer injected between the encoder bottleneck and the upsampling decoder. The injected channel carries a one-dimensional encoding of three-layer pixel-stack structure — the B·G·R photodiode column that a Foveon sensor captures and a Bayer sensor can’t. Trained end-to-end against matched Bayer → Merrill scene pairs.

U-Net+1D

Modified U-Net with 3-layer pixel injection at bottleneck

Bayer → X3

Bayer CFA in, Foveon X3 stacked-sensor look out

DP2 Merrill

Trained on matched scene pairs against Sigma DP2 Merrill

⤓ Download Foveon.dmg

macOS 13+ · Apple Silicon

33 MB · signed DMG installer · Gatekeeper flags the developer as unverified: right-click the app and choose Open the first time

Foveon — macOS app. Choose a photo, drag the sliders, save the result.

What it is

Most digital cameras capture colour through a Bayer colour filter array: each photosite sees only one of R, G, or B, and the other two channels are interpolated from the neighbours (demosaiced). It’s efficient, but it costs you. The interpolation introduces colour fringing on sharp edges, smears fine detail, and produces the “digital” micro-contrast that even high-end Bayer cameras can’t fully shake.

The Foveon X3 sensor — Sigma’s now-rare design used in the DP1, DP2, and DP3 Merrill cameras — works the way colour film does. Three photodiode layers are stacked vertically at every single pixel position. The top layer absorbs blue, the middle layer green, the bottom layer red. Every pixel captures the full colour. No interpolation, no demosaicing artefacts, no false detail. The result is the “Foveon look”: extraordinary microdetail and a particular colour rendition — warm, dimensional, almost slide-film — that people build entire camera systems around.

Foveon (the app) is a neural network that learns the mapping between the two. Feed it a JPEG or RAW from a normal Bayer camera (phone, mirrorless, DSLR) and it predicts what the same scene would look like shot on a Foveon X3 sensor. Geometry stays the same; colour, tonality, and micro-detail rendering shift toward the Merrill side of the training distribution.

Bayer vs Foveon — the structural problem

Bayer CFA

One colour per pixel

G ×2 (two green photosites in every 2×2 tile)

Each photosite captures exactly one colour. The other two channels are guessed from the neighbours. The guess is what creates the “digital” signature.
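The guess can be made concrete with a small sketch. It assumes an RGGB tile layout and the simplest possible interpolation (averaging the four direct neighbours); real demosaicing algorithms are more sophisticated, and the function names here are illustrative only. The toy image has a hard vertical edge in the green channel, so the averaged estimate at the edge lands between the two sides instead of on the true value.

```python
# Sketch: sample a full-colour image through an assumed RGGB Bayer mosaic,
# then guess a missing green value from its neighbours.

def bayer_channel(row, col):
    """Return which channel ('R', 'G' or 'B') an RGGB photosite records."""
    if row % 2 == 0:
        return "R" if col % 2 == 0 else "G"
    return "G" if col % 2 == 0 else "B"

def mosaic(image):
    """Keep one channel per pixel; image[r][c] is an (R, G, B) tuple."""
    index = {"R": 0, "G": 1, "B": 2}
    return [
        [pixel[index[bayer_channel(r, c)]] for c, pixel in enumerate(row)]
        for r, row in enumerate(image)
    ]

def interpolate_green(raw, row, col):
    """Estimate green at a red or blue site by averaging its 4-connected neighbours.

    In an RGGB layout those neighbours are all green photosites.
    """
    neighbours = [
        raw[r][c]
        for r, c in ((row - 1, col), (row + 1, col), (row, col - 1), (row, col + 1))
        if 0 <= r < len(raw) and 0 <= c < len(raw[0])
    ]
    return sum(neighbours) / len(neighbours)

# Toy 4x4 scene: green is 0 in the left two columns and 100 in the right two.
image = [[(10, 0 if c < 2 else 100, 20) for c in range(4)] for r in range(4)]
raw = mosaic(image)

true_green = image[2][2][1]               # what a stacked sensor would record here
estimated = interpolate_green(raw, 2, 2)  # what the neighbour average guesses
print(estimated, true_green)              # prints: 75.0 100
```

The red photosite at row 2, column 2 sits just right of the edge, so one of its four green neighbours belongs to the dark side and pulls the estimate down to 75.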

Foveon X3

Three layers per pixel

Stacked photodiodes, one stack per pixel: blue (top), green (middle), red (bottom)

Every pixel records R, G, and B separately at the same location. No interpolation. No false colour. The dimensional quality Merrill shooters chase.

The architecture

The core is a standard convolutional U-Net: an encoder that downsamples the input image into a compact feature bottleneck, paired with a decoder that upsamples back to full resolution, with skip connections at every level so fine spatial detail survives the trip through the bottleneck.

The modification is a single new layer dropped in between the encoder’s final downsampling block and the decoder’s first upsampling block: a 1D pixel-stack injection layer. It concatenates onto the bottleneck features a one-dimensional encoding of how light is absorbed at increasing depth in the silicon of a real Foveon sensor: blue first, then green, then red. The decoder learns to use this prior to reconstruct the kind of inter-channel coupling that real X3 captures exhibit, with chroma that’s registered with luminance instead of interpolated against it.
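As a rough illustration of what "concatenating a one-dimensional encoding" could mean, the sketch below appends one constant plane per photodiode layer to the bottleneck feature maps along the channel axis. This is an assumption, not the app's actual layer: the page does not say what values the encoding holds or how it is broadcast over spatial positions, so `DEPTH_PRIOR`, `inject_depth_prior`, and the 4×4 bottleneck size are all hypothetical.

```python
# Sketch: concatenate a depth prior onto bottleneck features along the channel axis.
# Features are stored as a list of channels, each an H x W grid (list of rows).

# Hypothetical 1D encoding: one value per photodiode layer, ordered top to bottom
# (blue, green, red). The real values are not given in the text.
DEPTH_PRIOR = [0.0, 0.5, 1.0]

def inject_depth_prior(features, prior=DEPTH_PRIOR):
    """Return a new channel list with one constant plane per prior entry appended."""
    height = len(features[0])
    width = len(features[0][0])
    prior_planes = [[[value] * width for _ in range(height)] for value in prior]
    return features + prior_planes

# A 1024-channel bottleneck at an arbitrary 4 x 4 spatial size.
bottleneck = [[[0.0] * 4 for _ in range(4)] for _ in range(1024)]
injected = inject_depth_prior(bottleneck)
print(len(bottleneck), "->", len(injected))  # prints: 1024 -> 1027
```

Concatenation leaves the encoder's features untouched and simply hands the decoder extra channels, so the first decoder block would need to accept the wider input.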

U-Net architecture diagram. Encoder ENC1 (C=64) → ENC2 (C=128) → ENC3 (C=256) → ENC4 (C=512) → Bottleneck (C=1024) → 1D Pixel-Stack Injection (B·G·R depth prior) → DEC4 (C=512) → DEC3 (C=256) → DEC2 (C=128) → DEC1 (C=64) → output, with skip connections from each encoder level to the matching decoder level.

Encoder (blue) downsamples the Bayer input. The 1D injection layer (orange) concatenates the Foveon B·G·R depth prior at the bottleneck. Decoder (purple) upsamples back to full resolution. Skip connections (dashed) carry pre-bottleneck spatial detail across to the matching decoder level — standard U-Net, drawn here for completeness. The novel piece is the orange block.
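To make the diagram's channel counts easier to follow, here is a small shape trace. The channel widths come from the diagram; the 2× downsampling per level, the 256×256 input, and the three extra channels added by the injection are assumptions typical of a U-Net with a concatenated prior, not details confirmed by the page.

```python
# Sketch: trace (stage, channels, height, width) through the diagrammed network.

def trace_shapes(height, width, prior_channels=3):
    """List feature-map shapes stage by stage, assuming 2x resampling per level."""
    encoder_channels = [64, 128, 256, 512]  # ENC1..ENC4, from the diagram
    stages = []
    h, w = height, width
    for level, channels in enumerate(encoder_channels, start=1):
        stages.append((f"ENC{level}", channels, h, w))
        h, w = h // 2, w // 2  # assumed 2x downsampling after each encoder level
    stages.append(("Bottleneck", 1024, h, w))
    # Assumed: the injection concatenates prior_channels planes, widening the features.
    stages.append(("Injection", 1024 + prior_channels, h, w))
    for level, channels in zip(range(4, 0, -1), reversed(encoder_channels)):
        h, w = h * 2, w * 2  # assumed 2x upsampling before each decoder level
        stages.append((f"DEC{level}", channels, h, w))
    return stages

for name, channels, h, w in trace_shapes(256, 256):
    print(f"{name:<10} C={channels:<5} {h}x{w}")
```

Under these assumptions each decoder level ends up at the same spatial size as the encoder level with the matching number, which is what lets the skip connections join the two.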

Why inject at the bottleneck

The encoder has just stripped away spatial resolution to focus on semantic content; the decoder is about to rebuild it. That’s exactly the moment to inject the prior that says “reconstruct as if the sensor were stacked, not mosaiced.” Inject earlier and the encoder learns to ignore it; inject later and the decoder has already committed to a demosaic-style...
