mpc5632m Freescale Semiconductor, Inc, mpc5632m Datasheet - Page 12

Manufacturer Part Number: mpc5632m

Description: MPC5634M Microcontroller Data Sheet

Manufacturer: Freescale Semiconductor, Inc

Datasheet: 1.MPC5632M.pdf (112 pages)

Overview

1.3 MPC5634M Feature Details

1.3.1 e200z335 Core

The e200z335 processor uses a four-stage pipeline for instruction execution. The Instruction Fetch (stage 1), Instruction Decode/Register File Read/Effective Address Calculation (stage 2), Execute/Memory Access (stage 3), and Register Writeback (stage 4) stages operate in an overlapped fashion, allowing single-clock instruction execution for most instructions.
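
To illustrate why overlapping the four stages gives roughly one instruction per clock, the following is a minimal sketch of an idealized, stall-free pipeline. The function name `pipeline_cycles` is hypothetical, and the model assumes every instruction spends exactly one clock in each stage, so it ignores multi-cycle instructions such as divides.

```python
def pipeline_cycles(num_instructions: int, stages: int = 4) -> int:
    """Clocks needed to complete instructions in an ideal pipeline.

    Assumption (not from the data sheet): no stalls, and every
    instruction spends exactly one clock in each stage.
    """
    if num_instructions <= 0:
        return 0
    # The first instruction needs `stages` clocks to pass through the
    # pipeline; each following instruction completes one clock later.
    return stages + (num_instructions - 1)


for n in (1, 4, 100):
    print(n, "instructions ->", pipeline_cycles(n), "clocks")
```

For long instruction sequences the fill cost of three extra clocks becomes negligible, which is what "single-clock execution" means in practice.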

The integer execution unit consists of a 32-bit Arithmetic Unit (AU), a Logic Unit (LU), a 32-bit Barrel Shifter (Shifter), a Mask-Insertion Unit (MIU), a Condition Register manipulation Unit (CRU), a Count-Leading-Zeros unit (CLZ), a 32×32 Hardware Multiplier array, result feed-forward hardware, and support hardware for division.
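
As a software illustration of what the Count-Leading-Zeros unit computes, the sketch below counts the zero bits above the most significant set bit of a 32-bit value. The name `clz32` is hypothetical; returning 32 for a zero input is an assumption that follows the usual Power Architecture convention.

```python
def clz32(value: int) -> int:
    """Count leading zero bits of a 32-bit unsigned value.

    Assumption: an input of zero yields 32.
    """
    value &= 0xFFFFFFFF  # keep only the low 32 bits
    # bit_length() is the position of the highest set bit (0 for zero).
    return 32 - value.bit_length()


def normalize(value: int) -> int:
    """Shift a nonzero 32-bit value left until its top bit is set."""
    return (value << clz32(value)) & 0xFFFFFFFF


print(clz32(0x00010000), hex(normalize(0x00010000)))
```

A typical use is normalization, as in `normalize`, where the leading-zero count gives the shift amount directly.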

Most arithmetic and logical operations are executed in a single cycle, with the exception of the divide instructions. The Count-Leading-Zeros unit operates in a single clock cycle. The Instruction Unit contains a PC incrementer and a dedicated Branch Address adder to minimize delays during change-of-flow operations. Sequential prefetching is performed to ensure a supply of instructions into the execution pipeline. Branch target prefetching is performed to accelerate taken branches. Prefetched instructions are placed into an instruction buffer capable of holding six instructions.

Branches can also be decoded at the instruction buffer and branch target addresses calculated before the branch reaches the instruction decode stage, allowing the branch target to be prefetched early. When a branch is detected at the instruction buffer, a prediction may be made on whether the branch is taken or not. If the branch is predicted to be taken, a target fetch is initiated and its target instructions are placed in the instruction buffer following the branch instruction. Many branches take zero cycles to execute by using branch folding. Branches are folded out of the instruction execution pipe whenever possible. These include unconditional branches and conditional branches with condition codes that can be resolved early.

Conditional branches that are not taken and not folded execute in a single clock. Branches with successful target prefetching that are not folded have an effective execution time of one clock. All other taken branches have an execution time of two clocks.

Memory load and store operations are provided for byte, halfword, and word (32-bit) data, with automatic zero or sign extension of byte and halfword load data as well as optional byte reversal of data. These instructions can be pipelined to allow effective single-cycle throughput. Load and store multiple word instructions allow low-overhead context save and restore operations. The load/store unit contains a dedicated effective address adder to allow effective address generation to be optimized. Also, a load-to-use dependency does not incur any pipeline bubbles in most cases.
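
The zero extension, sign extension, and byte reversal of load data described above can be modeled in a few lines. This is a hedged sketch, not a description of specific instructions: the function `load` and its parameters are hypothetical, memory is assumed to be big-endian by default, and the model does not restrict which combinations of options the instruction set actually offers.

```python
def load(memory: bytes, addr: int, size: int,
         signed: bool = False, byte_reversed: bool = False) -> int:
    """Model a byte (1), halfword (2), or word (4) load into a 32-bit register.

    Assumptions: big-endian memory by default; `byte_reversed` swaps
    the byte order of the loaded datum.
    """
    if size not in (1, 2, 4):
        raise ValueError("size must be 1, 2, or 4 bytes")
    raw = memory[addr:addr + size]
    order = "little" if byte_reversed else "big"
    value = int.from_bytes(raw, order, signed=signed)
    # Masking to 32 bits turns a negative value into its sign-extended
    # register image; zero-extended values pass through unchanged.
    return value & 0xFFFFFFFF


mem = bytes([0x80, 0x01, 0xFF, 0x7F])
print(hex(load(mem, 0, 1)))                      # zero-extended byte
print(hex(load(mem, 0, 1, signed=True)))         # sign-extended byte
print(hex(load(mem, 0, 2, byte_reversed=True)))  # byte-reversed halfword
```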

The Condition Register unit supports the condition register (CR) and condition register operations defined by the Power Architecture. The condition register consists of eight 4-bit fields that reflect the results of certain operations, such as move, integer and floating-point compare, arithmetic, and logical instructions, and provide a mechanism for testing and branching.
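
To make the "eight 4-bit fields" layout concrete, the sketch below packs and unpacks fields of a 32-bit condition register value. The helper names are hypothetical. The field numbering (field 0 in the most significant four bits) and the LT/GT/EQ/SO bit order are assumptions based on the usual Power Architecture convention for integer compares; they are not stated in the text above.

```python
# Assumed bit meanings inside one 4-bit field, most significant first.
LT, GT, EQ, SO = 0b1000, 0b0100, 0b0010, 0b0001


def compare_signed(a: int, b: int, summary_overflow: bool = False) -> int:
    """Return the 4-bit field value for a signed integer compare."""
    if a < b:
        field = LT
    elif a > b:
        field = GT
    else:
        field = EQ
    return field | (SO if summary_overflow else 0)


def set_cr_field(cr: int, n: int, field: int) -> int:
    """Write a 4-bit value into field n (0..7) of a 32-bit CR image."""
    shift = 4 * (7 - n)  # assumed: field 0 is the most significant nibble
    cleared = cr & ~(0xF << shift)
    return (cleared | ((field & 0xF) << shift)) & 0xFFFFFFFF


def get_cr_field(cr: int, n: int) -> int:
    """Read field n (0..7) from a 32-bit CR image."""
    return (cr >> (4 * (7 - n))) & 0xF


cr = set_cr_field(0, 0, compare_signed(1, 2))
print(hex(cr), bool(get_cr_field(cr, 0) & LT))
```

A conditional branch then only needs to test one bit of one field, which is why compares that resolve early allow the branch to be folded.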

Vectored and autovectored interrupts are supported by the CPU. Vectored interrupt support is provided to allow multiple interrupt sources to have unique interrupt handlers invoked with no software overhead.

The hardware floating-point unit uses the IEEE-754 single-precision floating-point format and supports single-precision floating-point operations in a pipelined fashion. The general purpose register file is used for source and destination operands, so there is a unified storage model for 32-bit single-precision floating-point data and the normal integer type. Single-cycle floating-point add, subtract, multiply, compare, and conversion operations are provided. Divide instructions are multi-cycle and are not pipelined.
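
The unified storage model means a single-precision value occupies a general purpose register as its raw 32-bit IEEE-754 bit pattern. The sketch below shows that mapping; the helper names `float_to_gpr` and `gpr_to_float` are hypothetical and only model the encoding, not the floating-point unit itself.

```python
import struct


def float_to_gpr(x: float) -> int:
    """Return the 32-bit IEEE-754 single-precision bit pattern of x."""
    return struct.unpack(">I", struct.pack(">f", x))[0]


def gpr_to_float(bits: int) -> float:
    """Interpret a 32-bit register image as a single-precision value."""
    return struct.unpack(">f", struct.pack(">I", bits & 0xFFFFFFFF))[0]


print(hex(float_to_gpr(1.0)))    # sign 0, exponent 127, fraction 0
print(gpr_to_float(0xC0000000))  # sign 1, exponent 128, fraction 0
```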

The Signal Processing Extension (SPE) Auxiliary Processing Unit (APU) provides hardware SIMD operations and supports a full complement of dual integer arithmetic operations, including Multiply Accumulate (MAC) and dual integer multiply (MUL), in a pipelined fashion. The general purpose register file is enhanced such that all 32 of the GPRs are extended to 64 bits wide and are used for source and destination operands, so there is a unified storage model for 32 × 32 MAC operations, which generate results greater than 32 bits.
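
The following sketch models a 64-bit GPR holding two 32-bit elements and shows why a 32 × 32 multiply-accumulate needs more than 32 bits for its result. All function names are hypothetical and are not SPE instruction mnemonics; wrap-around (modulo) lane arithmetic and signed operands are assumptions made for the illustration.

```python
MASK32 = 0xFFFFFFFF
MASK64 = 0xFFFFFFFFFFFFFFFF


def to_signed32(x: int) -> int:
    """Interpret the low 32 bits of x as a two's-complement integer."""
    x &= MASK32
    return x - (1 << 32) if x & 0x80000000 else x


def pack_lanes(upper: int, lower: int) -> int:
    """Place two 32-bit elements into one 64-bit register image."""
    return ((upper & MASK32) << 32) | (lower & MASK32)


def unpack_lanes(reg: int) -> tuple[int, int]:
    """Split a 64-bit register image into (upper, lower) elements."""
    return (reg >> 32) & MASK32, reg & MASK32


def vector_add(ra: int, rb: int) -> int:
    """Add both lanes independently; each lane wraps modulo 2**32."""
    au, al = unpack_lanes(ra)
    bu, bl = unpack_lanes(rb)
    return pack_lanes(au + bu, al + bl)


def mac32x32(acc: int, a: int, b: int) -> int:
    """Return acc + a*b for signed 32-bit a and b, wrapped to 64 bits."""
    return (acc + to_signed32(a) * to_signed32(b)) & MASK64


print(hex(vector_add(pack_lanes(1, 0xFFFFFFFF), pack_lanes(2, 1))))
print(hex(mac32x32(0, 0x10000, 0x10000)))  # product does not fit in 32 bits
```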

The majority of both scalar and vector operations (including MAC and MUL) are executed in a single clock cycle. Both scalar and vector divides take multiple clocks. The SPE APU also provides extended load and store operations to support the transfer of data to and from the extended 64-bit GPRs. This SPE APU is fully binary compatible with the e200z6 SPE APU used in the MPC5554 and MPC5553.

MPC5634M Microcontroller Data Sheet, Rev. 3

Preliminary—Subject to Change Without Notice

Freescale Semiconductor
