mpc5632m Freescale Semiconductor, Inc, mpc5632m Datasheet - Page 12

no-image

mpc5632m

Manufacturer Part Number
mpc5632m
Description
Mpc5634m Microcontroller Data Sheet
Manufacturer
Freescale Semiconductor, Inc
Datasheet
Overview
1.3
MPC5634M Feature Details
1.3.1
e200z335 Core
The e200z335 processor utilizes a four stage pipeline for instruction execution. The Instruction Fetch (stage 1), Instruction
Decode/Register file Read/Effective Address Calculation (stage 2), Execute/Memory Access (stage 3), and Register Writeback
(stage 4) stages operate in an overlapped fashion, allowing single clock instruction execution for most instructions.
The integer execution unit consists of a 32-bit Arithmetic Unit (AU), a Logic Unit (LU), a 32-bit Barrel shifter (Shifter), a
Mask-Insertion Unit (MIU), a Condition Register manipulation Unit (CRU), a Count-Leading-Zeros unit (CLZ), a 32×32
Hardware Multiplier array, result feed-forward hardware, and support hardware for division.
Most arithmetic and logical operations are executed in a single cycle with the exception of the divide instructions. A
Count-Leading-Zeros unit operates in a single clock cycle. The Instruction Unit contains a PC incrementer and a dedicated
Branch Address adder to minimize delays during change of flow operations. Sequential prefetching is performed to ensure a
supply of instructions into the execution pipeline. Branch target prefetching is performed to accelerate taken branches.
Prefetched instructions are placed into an instruction buffer capable of holding six instructions.
Branches can also be decoded at the instruction buffer and branch target addresses calculated prior to the branch reaching the
instruction decode stage, allowing the branch target to be prefetched early. When a branch is detected at the instruction buffer,
a prediction may be made on whether the branch is taken or not. If the branch is predicted to be taken, a target fetch is initiated
and its target instructions are placed in the instruction buffer following the branch instruction. Many branches take zero cycle
to execute by using branch folding. Branches are folded out from the instruction execution pipe whenever possible. These
include unconditional branches and conditional branches with condition codes that can be resolved early.
Conditional branches which are not taken and not folded execute in a single clock. Branches with successful target prefetching
which are not folded have an effective execution time of one clock. All other taken branches have an execution time of two
clocks. Memory load and store operations are provided for byte, halfword, and word (32-bit) data with automatic zero or sign
extension of byte and halfword load data as well as optional byte reversal of data. These instructions can be pipelined to allow
effective single cycle throughput. Load and store multiple word instructions allow low overhead context save and restore
operations. The load/store unit contains a dedicated effective address adder to allow effective address generation to be
optimized. Also, a load-to-use dependency does not incur any pipeline bubbles for most cases.
The Condition Register unit supports the condition register (CR) and condition register operations defined by the Power
Architecture. The condition register consists of eight 4-bit fields that reflect the results of certain operations, such as move,
integer and floating-point compare, arithmetic, and logical instructions, and provide a mechanism for testing and branching.
Vectored and autovectored interrupts are supported by the CPU. Vectored interrupt support is provided to allow multiple
interrupt sources to have unique interrupt handlers invoked with no software overhead.
The hardware floating-point unit utilizes the IEEE-754 single-precision floating-point format and supports single-precision
floating-point operations in a pipelined fashion. The general purpose register file is used for source and destination operands,
thus there is a unified storage model for single-precision floating-point data types of 32 bits and the normal integer type.
Single-cycle floating-point add, subtract, multiply, compare, and conversion operations are provided. Divide instructions are
multi-cycle and are not pipelined.
The Signal Processing Extension (SPE) Auxiliary Processing Unit (APU) provides hardware SIMD operations and supports a
full complement of dual integer arithmetic operation including Multiply Accumulate (MAC) and dual integer multiply (MUL)
in a pipelined fashion. The general purpose register file is enhanced such that all 32 of the GPRs are extended to 64 bits wide
and are used for source and destination operands, thus there is a unified storage model for 32 x 32 MAC operations which
generate greater than 32-bit results.
The majority of both scalar and vector operations (including MAC and MUL) are executed in a single clock cycle. Both scalar
and vector divides take multiple clocks. The SPE APU also provides extended load and store operations to support the transfer
of data to and from the extended 64-bit GPRs. This SPE APU is fully binary compatible with e200z6 SPE APU used in
MPC5554 and MPC5553.
MPC5634M Microcontroller Data Sheet, Rev. 3
12
Preliminary—Subject to Change Without Notice
Freescale Semiconductor

Related parts for mpc5632m