80386 Microcode Disassembled



Reenigne blog

Stuff I think about

After I posted 8086 microcode disassembled, Ken Shirriff sent me a high-resolution image of the microcode ROM from the 80386. I didn't expect I would ever do anything with it for a couple of reasons: one is that it's absolutely huge (94720 bits) compared to the 8086 one (10752 bits) so (even with bitract or similar) it would be extremely tedious to transcode and check. The other reason is that I wouldn't know where to start with it - at least with the 8086 there was a patent which gave the general outline and some chunks of code which I could search for. The 80386 was a complete black box. I knew what it did and had a rough idea of how it might work, but turning that into something that I could search for in a big blob of binary seemed like an insurmountable challenge.

Some years later, I was talking to GloriousCow and Smartest Blob (possibly amongst others) on Discord and they mentioned that it would be interesting to get high resolution images of the 80386 die and try to extract the microcode from it. I mentioned that the first part had already been done but that turning the image into a binary blob and a binary blob into intelligible microcode seemed too hard. Well, they may have taken that as a bit of a challenge - they threw various bits of image processing, AI, and human-aided automation at the problem and a few days later had the binary blob extracted from the image and cross-checked.

Disassembling it was still quite a challenge, though! We found various patterns and gradually figured out how to rearrange it into μ-ops on one axis and μ-op bits on the other. Then we worked out the order in which to read the μ-ops (helped by a block of unused μ-ops at one end), and how to divide up the μ-op bits into fields. From the 8086 microcode work I assumed that two of the fields would be source and destination registers to copy from and to. I also knew that the 80386 could do an ALU operation in 2 cycles, suggesting that there had to be a field to specify a second input to the ALU, so that the microcode for these operations could load both operands into the ALU in the first cycle and then write the output to the destination in the second cycle. There was also a pattern that occurred with some regularity that we suspected might indicate the end of an instruction (we were right).
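To make the "μ-ops on one axis, bits on the other, then split into fields" idea concrete, here is a minimal Python sketch. The field names, widths and ordering below are invented for illustration (a toy 15-bit word); they are not the real 80386 layout, which you can find in fields.txt.

```python
from typing import Dict, List, Tuple

# HYPOTHETICAL field layout: (name, width in bits), most significant first.
# These names and widths are illustrative only, not the real 80386 layout.
FIELDS: List[Tuple[str, int]] = [
    ("src", 5),    # register to copy from
    ("dst", 5),    # register to copy to
    ("alu_b", 4),  # second ALU input
    ("end", 1),    # end-of-instruction marker
]
WORD_BITS = sum(width for _, width in FIELDS)  # 15 in this toy layout


def split_words(bits: str, word_bits: int = WORD_BITS) -> List[str]:
    """Cut a flat string of '0'/'1' characters into fixed-width micro-op words."""
    if len(bits) % word_bits != 0:
        raise ValueError("bit count is not a multiple of the word width")
    return [bits[i:i + word_bits] for i in range(0, len(bits), word_bits)]


def decode_word(word: str) -> Dict[str, int]:
    """Slice one micro-op word into named integer fields."""
    fields: Dict[str, int] = {}
    pos = 0
    for name, width in FIELDS:
        fields[name] = int(word[pos:pos + width], 2)
        pos += width
    return fields


# Two toy micro-ops back to back; the second one has the "end" bit set.
blob = "00001" "00010" "0011" "0" + "00100" "00001" "0000" "1"
words = split_words(blob)
decoded = [decode_word(w) for w in words]
```

With a table like this, trying a different guess at the field boundaries is just a matter of editing FIELDS and re-running the decode, which is roughly the kind of trial and error described above.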

Ken helped too by tracing various lines and bits of logic on the 80386 die so that we could see how things were connected up. Gradually the picture became clearer. Each time we figured something out it gave a clue as to the meaning of other chunks of microcode that used the same construct. At the same time we were working on decoding the instruction decoder (which consists of multiple smaller PLAs) and the protection test PLA. Eventually we got to the point where we could associate 386 instructions with chunks of microcode, and things became much clearer.

The 80386 is much faster on a per-cycle basis than the 8086 for most instructions, a feat which it achieves by throwing a lot more transistors at the problem - many algorithms which are implemented by microcode in the 8086 are essentially "hardware accelerated" in the 80386 so I realised early on that more of the 80386 microcode would be setting up these accelerators instead of embodying algorithms directly. Figuring out the interfaces between the accelerators (like the multiply and divide hardware, the barrel shifter, and the protection test unit) and the microcode was a lot of the work.

How many different instructions does the 80386 have, according to the microcode? What are they?

The microcode has 215 entry points from the decoding ROM - quite an increase over the 60 of the 8086! Part of this is new instructions, and part is that instructions are handled by different routines depending on such things as whether their operands are registers or memory, whether the CPU is in real or protected mode, and whether REP prefixes are in operation. I won't list them all here but you can find them in the fields.txt file if you're interested (along with all the subroutines and shared code). It's not very meaningful to list the sizes of the top-level microcode routines, since many of them do a small amount of work and then jump to a routine shared with another entry point. It's also not meaningful to list the number of opcodes each entry point handles, as the instruction decoder uses more than just the opcode to determine which routine to use.
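As a rough illustration of why "opcodes per entry point" isn't a meaningful count, here is a toy Python model in which the routine is selected by the opcode plus the other conditions mentioned above. The routine names and the table contents are hypothetical, and the real decoder is a set of PLAs rather than a lookup table; only the opcodes themselves (0x89 MOV, 0xA4 MOVSB, 0x8E MOV to a segment register) are real.

```python
from collections import defaultdict
from typing import Dict, NamedTuple, Set


class DecodeKey(NamedTuple):
    """Everything this toy decoder looks at when picking a routine."""
    opcode: int
    operand_is_memory: bool
    protected_mode: bool
    rep_prefix: bool


# HYPOTHETICAL table: the routine names are invented for illustration.
ENTRY_POINTS: Dict[DecodeKey, str] = {
    DecodeKey(0x89, False, False, False): "mov_reg_reg",
    DecodeKey(0x89, True, False, False): "mov_mem_reg",
    DecodeKey(0xA4, True, False, False): "movs_once",
    DecodeKey(0xA4, True, False, True): "movs_rep",
    DecodeKey(0x8E, False, False, False): "load_sreg_real",
    DecodeKey(0x8E, False, True, False): "load_sreg_protected",
}


def routines_by_opcode(table: Dict[DecodeKey, str]) -> Dict[int, Set[str]]:
    """Group routine names by opcode alone, ignoring the other conditions."""
    grouped: Dict[int, Set[str]] = defaultdict(set)
    for key, routine in table.items():
        grouped[key.opcode].add(routine)
    return dict(grouped)


by_opcode = routines_by_opcode(ENTRY_POINTS)
```

In this toy table three opcodes fan out into six routines, so neither "routines per opcode" nor "opcodes per routine" says much without the other decode inputs.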

Are there any instructions not handled by the microcode?

Surprisingly, no! Unlike the 8086 (and also unlike modern CPUs), the 80386 is always executing a μ-op and there is microcode for every instruction.

Does the microcode contain any "junk code" that doesn't do anything?

The routine from 0x849 to 0x856 inclusive (marked as "unused?" in the microcode disassembly) doesn't seem to have any entry points associated with it. I'm not...
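One way to look for this kind of orphaned code is a reachability walk from the known entry points. The Python sketch below runs on a tiny made-up control-flow graph (the addresses and edges are hypothetical, not taken from the 80386 microcode) and also works out how many μ-ops the 0x849 to 0x856 range covers.

```python
from typing import Dict, Iterable, List, Set


def reachable(entry_points: Iterable[int],
              successors: Dict[int, List[int]]) -> Set[int]:
    """Return every address reachable from the given entry points."""
    seen: Set[int] = set()
    stack = list(entry_points)
    while stack:
        addr = stack.pop()
        if addr in seen:
            continue
        seen.add(addr)
        stack.extend(successors.get(addr, ()))
    return seen


# HYPOTHETICAL toy graph: address -> addresses it can continue or jump to.
toy_successors: Dict[int, List[int]] = {
    0: [1], 1: [],   # routine A
    2: [3], 3: [1],  # routine B, which shares its tail with A
    4: [5], 5: [],   # nothing points here
}
toy_entries = [0, 2]

unreached = set(toy_successors) - reachable(toy_entries, toy_successors)

# Size of the range quoted above: 0x849 to 0x856 inclusive.
unused_len = len(range(0x849, 0x856 + 1))
```

Here `unreached` comes out as `{4, 5}` and `unused_len` as 14. A walk like this only shows that no known entry point leads to a routine; it can't rule out an entry point that hasn't been identified yet.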
