Understanding branch prediction

Posted 2020-03-26 09:41

Question:

I have some questions about branch prediction that I cannot confidently answer. Assume that I have to work with a static branch predictor.

  1. At which stage of the pipeline should branch prediction happen?
  2. How is a wrong prediction detected? How does the datapath learn that a misprediction has happened?
  3. Once a misprediction is detected, how is the signal sent to switch to the path that was not predicted?
  4. After a misprediction, execution has to continue from the address that was not chosen earlier. What if a memory write or register write has happened in the meantime? How can that be prevented?

Pointers to good references that include a datapath would also be very helpful. Thanks in advance.

Answer 1:

I took my time reading the reference manual for the Cortex-A8: http://infocenter.arm.com/help/topic/com.arm.doc.ddi0344k/DDI0344K_cortex_a8_r3p2_trm.pdf

From section 5.1:

The processor contains program flow prediction hardware, also known as branch prediction. With program flow prediction disabled, all taken branches incur a 13-cycle penalty. With program flow prediction enabled, all mispredicted branches incur a 13-cycle penalty.

Basically, this means that static branch prediction always assumes a branch is not taken. This differs from PowerPC, which lets the programmer hint on the branch instruction itself whether it is likely to be taken or not taken (the +/- suffix).
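To make the cost of a static guess concrete, here is a minimal Python sketch that counts penalty cycles over a trace of branch outcomes. The 13-cycle figure is the one quoted from the manual above. The trace format, the function names, and the backward-taken heuristic are illustrative assumptions for comparison only, not something the Cortex-A8 manual describes.

```python
MISPREDICT_PENALTY = 13  # cycles, the figure quoted from the Cortex-A8 manual above


def always_not_taken(pc, target):
    """Static guess: every branch falls through."""
    return False


def backward_taken(pc, target):
    """Hypothetical alternative static guess: backward branches (loops) are taken."""
    return target < pc


def penalty_cycles(trace, predictor, penalty=MISPREDICT_PENALTY):
    """Sum the flush penalty over a trace of (pc, target, actually_taken) tuples."""
    return sum(penalty for pc, target, taken in trace
               if predictor(pc, target) != taken)


# Assumed example: a loop-closing branch at 0x100 that jumps back to 0x80
# nine times and then falls through once.
trace = [(0x100, 0x80, True)] * 9 + [(0x100, 0x80, False)]
print(penalty_cycles(trace, always_not_taken))  # 9 mispredictions
print(penalty_cycles(trace, backward_taken))    # 1 misprediction
```

Under these assumptions, always guessing not-taken pays the penalty on every loop iteration, while a static rule that looks at the branch direction pays it only on loop exit.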

From section 1.3.1:

The instruction fetch unit predicts the instruction stream, fetches instructions from the L1 instruction cache, and places the fetched instructions into a buffer for consumption by the decode pipeline.

  1. Instruction Fetch, the first stage, makes the prediction.

From section 7.6.2:

An instruction can remain in the pipeline between being fetched and being executed. Because there can be several unresolved branches in the pipeline, instruction fetches are speculative, meaning there is no guarantee that they are executed. A branch or exceptional instruction in the code stream can cause a pipeline flush, discarding the currently fetched instructions. Fetches or instruction table walks that begin without an empty pipeline are marked speculative. If the pipeline contains any instruction up to the point of branch and exception resolution, then the pipeline is considered not empty.

I interpret this as meaning that nothing reaches the execute stage while a branch is being processed. If a misprediction occurs, which is discovered when the branch executes in the Instruction Execute stage, all instructions in the pipeline are "flushed". They are never executed. That should answer questions 2 and 4. I am not sure how the "marking" is performed.
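The flush behaviour described above can be sketched with a toy three-stage pipeline in Python. This is only a sketch under stated assumptions: the three stages, the two-instruction mini ISA (`addi`, `bnez`), and every name in it are hypothetical, and it does not model the Cortex-A8 pipeline. Registers are written only in the execute stage, so instructions squashed in fetch or decode never change any state.

```python
def run(program, regs, predict_taken=False):
    """Run `program` on a toy 3-stage pipeline; return (registers, squashed count)."""
    regs = dict(regs)
    pipe = [None, None, None]  # [fetch, decode, execute]
    pc, squashed = 0, 0
    while pc < len(program) or any(slot is not None for slot in pipe):
        # Execute: the only stage that writes registers and resolves branches.
        slot = pipe[2]
        if slot is not None:
            addr, (op, reg, value), predicted_next = slot
            if op == "addi":
                regs[reg] += value
            elif op == "bnez":
                actual_next = value if regs[reg] != 0 else addr + 1
                if actual_next != predicted_next:
                    # Misprediction: squash the younger instructions, redirect fetch.
                    squashed += sum(s is not None for s in pipe[:2])
                    pipe[0] = pipe[1] = None
                    pc = actual_next
        # Advance every instruction by one stage.
        pipe[2], pipe[1], pipe[0] = pipe[1], pipe[0], None
        # Fetch: follow the static prediction without waiting for the branch.
        if pc < len(program):
            op, reg, value = program[pc]
            predicted_next = value if (op == "bnez" and predict_taken) else pc + 1
            pipe[0] = (pc, program[pc], predicted_next)
            pc = predicted_next
    return regs, squashed


program = [
    ("bnez", 1, 3),    # 0: if r1 != 0, jump to instruction 3
    ("addi", 2, 100),  # 1: wrong path when the branch is taken
    ("addi", 2, 100),  # 2: wrong path when the branch is taken
    ("addi", 3, 1),    # 3: branch target
]
print(run(program, {1: 1, 2: 0, 3: 0}))  # branch taken, predicted not taken
print(run(program, {1: 0, 2: 0, 3: 0}))  # branch not taken, prediction correct
```

In the first call the two wrong-path `addi` instructions are fetched speculatively, but r2 stays 0 because they are squashed before reaching the execute stage.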

  3. I don't know how it sends the signal. As far as I can tell, the reference manual does not cover that part. I guess it's magic.

(For the record, I find the PowerPC reference manuals (e500/e600) that I am used to much easier to understand because of their many instruction timing examples.)



Answer 2:

I guess that there are many different mechanisms that are possible, but some quick answers:

  1. Branch prediction needs to happen during the fetch stages, before the instructions are decoded. Otherwise, you will decode instructions from the wrong path.
  2. The predicted branch normally carries extra information with it, such as the target that was predicted. When the branch executes, the real target is compared with the predicted target, and if they do not match, the pipeline must be flushed.
  3. It depends on the implementation. Once the branch has executed, the real target is known and can be used to redirect fetch, just as for a branch that was not predicted.
  4. You need either a recovery mechanism or a way to hold back results until the branches are resolved before writing them. This loses some time, but not as much as a branch that was not predicted.
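Points 2 and 4 can be illustrated with a small Python sketch: the branch carries its predicted target, and writes issued after it are held in a buffer until the branch resolves. The class and its method names are hypothetical, chosen for this illustration; real hardware uses structures such as store buffers or reorder buffers whose details vary by implementation.

```python
class SpeculativeWrites:
    """Holds memory writes issued after an unresolved branch."""

    def __init__(self, memory):
        self.memory = memory
        self.pending = []  # (address, value) pairs not yet visible in memory

    def write(self, address, value):
        """Record a speculative write without touching memory."""
        self.pending.append((address, value))

    def resolve(self, predicted_target, actual_target):
        """Commit on a correct prediction, discard on a misprediction.

        Returns (next_pc, mispredicted); fetch always continues from the
        real target computed when the branch executed.
        """
        mispredicted = predicted_target != actual_target
        if not mispredicted:
            for address, value in self.pending:
                self.memory[address] = value
        self.pending.clear()
        return actual_target, mispredicted


memory = {0x10: 0}
buffer = SpeculativeWrites(memory)
buffer.write(0x10, 99)  # issued on the predicted path
next_pc, mispredicted = buffer.resolve(predicted_target=0x104, actual_target=0x200)
print(hex(next_pc), mispredicted, memory[0x10])
```

Because the write only reaches `memory` after the comparison succeeds, a wrong guess leaves memory untouched and fetch simply restarts at the real target.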