A CPU reads machine code and decodes it into internal control signals that send the right data to the right execution units.
Most instructions map to one internal operation, and can be decoded directly. (e.g. on x86, add eax, edx just sends eax and edx to the integer ALU for an ADD operation, and puts the result in eax.)

Some other single instructions do much more work: e.g. x86's rep movs implements memcpy(edi, esi, ecx), and requires the CPU to loop.

When the instruction decoders see an instruction like that, instead of just producing internal control signals directly, they read micro-code out of the microcode ROM.
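The copy loop that rep movsb effectively performs can be sketched in plain C. (This is an illustrative model only; the function name rep_movsb_like is made up, and real implementations copy in chunks much wider than one byte.)

```c
#include <stddef.h>

/* Sketch of the loop rep movsb performs: copy ecx bytes from [esi]
 * to [edi], advancing both pointers.  (Assumption: simplified model;
 * the microcode sequencer issues the equivalent load/store uops
 * internally, typically in wider chunks.) */
static void rep_movsb_like(unsigned char *edi, const unsigned char *esi,
                           size_t ecx)
{
    while (ecx--) {
        *edi++ = *esi++;  /* one load uop + one store uop per byte */
    }
}
```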
A micro-coded instruction is one that decodes to many internal operations.
Modern x86 CPUs always decode x86 instructions to internal micro-operations. In this terminology, it still doesn't count as "micro-coded" even when add [mem], eax decodes to a load from [mem], an ALU ADD operation, and a store back into [mem]. Another example is xchg eax, edx, which decodes to 3 uops on Intel Haswell, and interestingly not exactly the same kind of uops you'd get from using 3 MOV instructions to do the exchange through a scratch register, because they aren't zero-latency.

On Intel / AMD CPUs, "micro-coded" means the decoders turn on the micro-code sequencer to feed uops from the ROM into the pipeline, instead of producing multiple uops directly.
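The decomposition of add [mem], eax into separate uops can be modelled in C. (A simplified illustration; mem_add is a made-up name, and the real uops also carry addressing and flag-update work not shown here.)

```c
/* C model of the three uops that add [mem], eax decodes to. */
int mem_add(int *mem, int eax)
{
    int tmp = *mem;  /* uop 1: load from [mem] */
    tmp += eax;      /* uop 2: ALU ADD */
    *mem = tmp;      /* uop 3: store back to [mem] */
    return tmp;
}
```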
In current Intel CPUs, the limit on what the decoders can produce directly, without going to micro-code ROM, is 4 uops (fused-domain). AMD similarly has FastPath single or double instructions, and beyond that it's VectorPath or Microcode, as explained in David Kanter's in-depth look at AMD Bulldozer, specifically talking about its decoders.
Another example is x86's integer DIV instruction, which is micro-coded even on modern CPUs like Intel Haswell. See my answer on Why is this C++ code faster than my hand-written assembly for testing the Collatz conjecture? for the numbers.
FP division is also slow, but is decoded to a single uop so it doesn't bottleneck the front-end. If FP division is rare and not part of a latency bottleneck, it can be as cheap as multiplication. (But if execution does have to wait for its result, or bottlenecks on its throughput, it's much slower.)
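The latency-vs-throughput distinction can be illustrated with a minimal C sketch. (The function names are hypothetical, and actual cycle counts depend on the microarchitecture; the point is only the dependency structure.)

```c
/* Latency-bound: each divide depends on the previous result, so the
 * chain pays the full divider latency on every iteration. */
double div_chained(double x, int n)
{
    for (int i = 0; i < n; i++)
        x = x / 1.000000001;
    return x;
}

/* Throughput-bound: the divides are independent, so out-of-order
 * execution can overlap them; cost is limited by divider throughput. */
double div_independent(const double *a, int n)
{
    double sum = 0.0;
    for (int i = 0; i < n; i++)
        sum += a[i] / 3.0;
    return sum;
}
```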
Integer division and other micro-coded instructions can give the CPU a hard time, and create effects that make code alignment matter where it wouldn't otherwise.
To learn more about x86 CPU internals, see the x86 tag wiki, and especially Agner Fog's microarch guide.
In some older / simpler CPUs, every instruction was effectively micro-coded. For example, the 6502 executed 6502 instructions by running a sequence of internal instructions from a PLA decode ROM. This works well for a non-pipelined CPU, where the order of using the different parts of the CPU can vary from instruction to instruction.
Historically, there was a different technical meaning for "microcode", meaning something like the internal control signals decoded from the instruction word. Especially in a CPU like MIPS where the instruction word mapped directly to those control signals, without complicated decoding. (I may have this partly wrong; I read something like this (other than in the deleted answer on this question) but couldn't find it again later.)