AN2203 Freescale Semiconductor / Motorola, AN2203 Datasheet - Page 17
Manufacturer Part Number: AN2203
Description: MPC7450 RISC Microprocessor Family Software Optimization Guide
Manufacturer: Freescale Semiconductor / Motorola
Part III
MPC7450 Microprocessor Details
This section describes many architectural details of the MPC7450 and gives examples of the pipeline
behavior. These attributes are also described in the MPC7450 RISC Microprocessor Family User’s Manual.
3.1 Fetch/Branch Considerations
The following is a list of branch instructions and the resources required to avoid stalling the fetch unit in the
course of branch resolution:
• The bclr instruction requires LR availability for resolution. However, it uses the link stack to
predict the target address in order to avoid stalling fetch.
• The bcctr instruction requires CTR availability.
• The branch conditional on counter decrement and the CR condition requires CTR availability or
the CR condition must be false.
• A fourth conditional branch instruction cannot be executed following three unresolved predicted
branch instructions.
3.1.1 Fetching
Branches that target an instruction at or near the end of a cache block can cause instruction supply problems.
Consider a tight loop branch where the loop entry point is the last word of the cache block, and the loop
contains a total of four instructions (including the branch). For this code, any MPC750/MPC7400 class
machine needs at least two cycles to fetch the four instructions, because the cache block boundary breaks
the fetch group into two groups of accesses. For the MPC750/MPC7400, realigning this loop so that it does
not cross the cache block boundary significantly increases the instruction supply.
Additionally, on the MPC7450 this tight loop encounters the branch-taken bubble problem. That is, the
BTIC supplies instructions one cycle after the branch executes. In the cache-block-crossing case, four
instructions are fetched every three cycles. Aligning the instructions to lie within a single cache
block increases the fetch rate to four instructions every two cycles. For loops with more
instructions, this branch-taken bubble overhead can be better amortized or in some cases can disappear
(because the branch is executed early and the bubble disappears by the time the instructions reach the
dispatch point). One way to increase the number of instructions per branch is software loop unrolling.
NOTE
The BTIC on all MPC750/MPC7400/MPC7450 microprocessors contains
targets for only b and bc branches. Indirect branches (bcctr and bclr) must
go to the instruction cache for instructions, which incurs an additional
cycle of fetch latency (another branch-taken bubble).
In future generations of these high performance microprocessors, expect a further bias: instruction fetch
groupings that do not cross quad-word boundaries are preferable. In particular, this means that branch
targets should be biased to be the first instruction in a quad word (instruction address = 0xxxxx_xxx0) when
optimizing for performance (as opposed to code footprint).
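The fetch arithmetic for the four-instruction loop discussed under Fetching can be illustrated with a small C model. This is only a sketch under stated assumptions, not a description of the actual fetch hardware: it assumes 32-byte cache blocks, 4-byte instructions, at most four instructions per fetch access, and that an access never crosses a cache block boundary. The function names (fetch_accesses, cycles_per_iteration, is_quad_word_aligned) and the example addresses are hypothetical, chosen for illustration.

```c
#include <stdint.h>

/* Assumed parameters for this sketch (illustrative, not quoted from the text):
 * 32-byte cache blocks, 4-byte instructions, up to four instructions per fetch. */
enum {
    INSN_BYTES        = 4,
    CACHE_BLOCK_BYTES = 32,
    FETCH_WIDTH       = 4
};

/* Count the fetch accesses needed to supply `count` sequential instructions
 * starting at `addr`, where one access delivers at most FETCH_WIDTH
 * instructions and stops at a cache block boundary. */
unsigned fetch_accesses(uint32_t addr, unsigned count)
{
    unsigned accesses = 0;

    addr &= ~(uint32_t)(INSN_BYTES - 1);   /* instructions are word aligned */
    while (count > 0) {
        unsigned left_in_block =
            (CACHE_BLOCK_BYTES - (addr % CACHE_BLOCK_BYTES)) / INSN_BYTES;
        unsigned got = left_in_block < FETCH_WIDTH ? left_in_block : FETCH_WIDTH;

        if (got > count)
            got = count;
        count -= got;
        addr  += got * INSN_BYTES;
        accesses++;
    }
    return accesses;
}

/* Cycles per iteration of a tight loop: one cycle per fetch access plus a
 * caller-supplied branch-taken bubble (1 models the MPC7450 case in the text;
 * 0 yields the bare fetch access count). */
unsigned cycles_per_iteration(uint32_t entry, unsigned count, unsigned bubble)
{
    return fetch_accesses(entry, count) + bubble;
}

/* A branch target is the first instruction of a quad word when its low
 * four address bits are zero (address = 0xxxxx_xxx0). */
int is_quad_word_aligned(uint32_t addr)
{
    return (addr & 0xFu) == 0;
}
```

With a hypothetical loop entry at 0x101C (the last word of a 32-byte block), cycles_per_iteration(0x101C, 4, 1) evaluates to 3, matching the four-instructions-every-three-cycles case; moving the entry to 0x1000 gives 2, matching the aligned case. The model ignores everything else in the pipeline, so it is useful only for reasoning about alignment, not for predicting measured performance.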
Freescale Semiconductor, Inc.
For More Information On This Product,
Go to: www.freescale.com