SPC563Mxx Device overview
Doc ID 13850 Rev 6 19/48
3.3 Feature details
3.3.1 e200z335 core
The e200z335 processor utilizes a four stage pipeline for instruction execution. The
Instruction Fetch (stage 1), Instruction Decode/Register file Read/Effective Address
Calculation (stage 2), Execute/Memory Access (stage 3), and Register Writeback (stage 4)
stages operate in an overlapped fashion, allowing single clock instruction execution for most
instructions.
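The overlap described above can be illustrated with a small timing model. This is only a sketch of an ideal, stall-free four-stage pipeline; the stage abbreviations and function names are hypothetical and are not taken from the device documentation.

```python
# Idealized model of a four-stage pipeline with no stalls.
# Stage abbreviations are illustrative names for the four stages listed above.
STAGES = ["IF", "ID", "EX", "WB"]


def pipeline_schedule(n_instructions):
    """Map each instruction index to its (cycle, stage) pairs.

    Instruction i enters the first stage in cycle i and advances one
    stage per clock, so consecutive instructions overlap.
    """
    return {
        i: [(i + s, name) for s, name in enumerate(STAGES)]
        for i in range(n_instructions)
    }


def total_cycles(n_instructions):
    """Clocks needed to retire n instructions in the ideal case."""
    if n_instructions == 0:
        return 0
    # The first instruction needs one clock per stage; each later
    # instruction completes one clock after its predecessor.
    return n_instructions + len(STAGES) - 1
```

In this model ten instructions complete in 13 clocks, so the cost per instruction approaches one clock as the sequence grows. Real code also sees stalls (divides, some branches) that the model ignores.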
The integer execution unit consists of a 32-bit Arithmetic Unit (AU), a Logic Unit (LU), a
32-bit Barrel shifter (Shifter), a Mask-Insertion Unit (MIU), a Condition Register manipulation
Unit (CRU), a Count-Leading-Zeros unit (CLZ), a 32×32 Hardware Multiplier array, result
feed-forward hardware, and support hardware for division.
Most arithmetic and logical operations are executed in a single cycle with the exception of
the divide instructions. The Count-Leading-Zeros unit operates in a single clock cycle. The
Instruction Unit contains a PC incrementer and a dedicated Branch Address adder to
minimize delays during change of flow operations. Sequential prefetching is performed to
ensure a supply of instructions into the execution pipeline. Branch target prefetching is
performed to accelerate taken branches. Prefetched instructions are placed into an
instruction buffer capable of holding six instructions.
Branches can also be decoded at the instruction buffer and branch target addresses
calculated prior to the branch reaching the instruction decode stage, allowing the branch
target to be prefetched early. When a branch is detected at the instruction buffer, a
prediction may be made on whether the branch is taken or not. If the branch is predicted to
be taken, a target fetch is initiated and its target instructions are placed in the instruction
buffer following the branch instruction. Many branches take zero cycles to execute by using
branch folding. Branches are folded out from the instruction execution pipe whenever
possible. These include unconditional branches and conditional branches with condition
codes that can be resolved early.
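The branch costs quoted in this section (zero clocks when folded, plus the one-clock and two-clock cases given in the next paragraph) can be summarized as a small lookup function. This is a sketch for reasoning about timing only; the function and parameter names are hypothetical.

```python
def branch_clocks(taken, folded, target_prefetched):
    """Effective execution time of a branch, in clocks.

    Encodes only the cases stated in the text:
      - folded branches cost zero clocks
      - not-taken, not-folded branches cost one clock
      - taken branches with a successful target prefetch cost one clock
      - all other taken branches cost two clocks
    """
    if folded:
        return 0
    if not taken:
        return 1
    return 1 if target_prefetched else 2
```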
Conditional branches which are not taken and not folded execute in a single clock. Branches
with successful target prefetching which are not folded have an effective execution time of
one clock. All other taken branches have an execution time of two clocks. Memory load and
store operations are provided for byte, halfword, and word (32-bit) data with automatic zero
or sign extension of byte and halfword load data as well as optional byte reversal of data.
These instructions can be pipelined to allow effective single cycle throughput. Load and
store multiple word instructions allow low overhead context save and restore operations.
The load/store unit contains a dedicated effective address adder to allow effective address
generation to be optimized. Also, a load-to-use dependency does not incur any pipeline
bubbles for most cases.
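The zero extension, sign extension, and byte reversal applied to loaded data can be sketched as follows. The model assumes big-endian byte order for normal loads, which is an assumption of this sketch and not stated in this section. The function name and its parameters are hypothetical, and the code makes no claim about which instruction variants the core actually offers.

```python
def load(memory, addr, size, signed=False, byte_reverse=False):
    """Return the 32-bit register image produced by loading `size` bytes.

    memory       -- bytes-like object standing in for device memory
    size         -- 1 (byte), 2 (halfword) or 4 (word)
    signed       -- sign-extend instead of zero-extend
    byte_reverse -- interpret the bytes in reversed order
    """
    raw = bytes(memory[addr:addr + size])
    # Assumption: normal loads are big-endian; reversal reads little-endian.
    order = "little" if byte_reverse else "big"
    value = int.from_bytes(raw, order, signed=signed)
    # Keep the low 32 bits, which turns a negative value into its
    # two's-complement register image.
    return value & 0xFFFFFFFF
```

For example, loading the single byte 0x80 gives 0x00000080 with zero extension and 0xFFFFFF80 with sign extension.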
The Condition Register unit supports the condition register (CR) and condition register
operations defined by the Power Architecture. The condition register consists of eight 4-bit
fields that reflect the results of certain operations, such as move, integer and floating-point
compare, arithmetic, and logical instructions, and provide a mechanism for testing and
branching. Vectored and autovectored interrupts are supported by the CPU. Vectored
interrupt support is provided to allow multiple interrupt sources to have unique interrupt
handlers invoked with no software overhead.
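The layout of the condition register as eight 4-bit fields can be sketched as below. The sketch follows the Power Architecture convention that field CR0 occupies the most significant four bits, and that for integer compares the four bits of a field mean less-than, greater-than, equal, and summary overflow. The helper names are hypothetical.

```python
def cr_field(cr, n):
    """Extract 4-bit field CRn (n = 0..7) from a 32-bit CR value.

    CR0 is the most significant nibble, CR7 the least significant.
    """
    if not 0 <= n <= 7:
        raise ValueError("CR field index must be 0..7")
    shift = (7 - n) * 4
    return (cr >> shift) & 0xF


def decode_integer_compare(field):
    """Name the four bits of a field as set by an integer compare."""
    return {
        "LT": bool(field & 0x8),  # first operand less than second
        "GT": bool(field & 0x4),  # first operand greater than second
        "EQ": bool(field & 0x2),  # operands equal
        "SO": bool(field & 0x1),  # summary overflow copy
    }
```

A conditional branch then tests one of these bits, for example the EQ bit of CR0 after a compare.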
The hardware floating-point unit utilizes the IEEE-754 single-precision floating-point format
and supports single-precision floating-point operations in a pipelined fashion. The general
purpose register file is used for source and destination operands, thus there is a unified
storage model for single-precision floating-point data types of 32 bits and the normal integer
type. Single-cycle floating-point add, subtract, multiply, compare, and conversion operations
are provided. Divide instructions are multi-cycle and are not pipelined.
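Because single-precision values live in the same 32-bit general purpose registers as integers, a register simply holds the IEEE-754 bit pattern. The following sketch shows that correspondence on a host machine; the helper names are hypothetical and nothing here models the floating-point unit itself.

```python
import struct


def float_to_gpr(x):
    """Return the 32-bit register image of x as an IEEE-754 single."""
    return struct.unpack(">I", struct.pack(">f", x))[0]


def gpr_to_float(bits):
    """Interpret a 32-bit register image as an IEEE-754 single."""
    return struct.unpack(">f", struct.pack(">I", bits & 0xFFFFFFFF))[0]
```

For example, 1.0 is held as 0x3F800000 and -2.0 as 0xC0000000, the same 32 bits an integer instruction would see in that register.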
The Signal Processing Extension (SPE) Auxiliary Processing Unit (APU) provides hardware
SIMD operations and supports a full complement of dual integer arithmetic operations,
including Multiply Accumulate (MAC) and dual integer multiply (MUL), in a pipelined fashion.
The general purpose register file is enhanced such that all 32 of the GPRs are extended to
64 bits wide and are used for source and destination operands, thus there is a unified
storage model for 32×32 MAC operations which generate greater than 32-bit results.
The majority of both scalar and vector operations (including MAC and MUL) are executed in
a single clock cycle. Both scalar and vector divides take multiple clocks. The SPE APU also
provides extended load and store operations to support the transfer of data to and from the
extended 64-bit GPRs.
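The role of the 64-bit registers can be illustrated with a short model: one register holds two 32-bit lanes for dual (vector) operations, and a 32×32 multiply-accumulate needs more than 32 bits for its result. This is a sketch with hypothetical helper names. It does not correspond to any specific SPE instruction and ignores signed, fractional, and saturating variants.

```python
MASK32 = 0xFFFFFFFF
MASK64 = 0xFFFFFFFFFFFFFFFF


def pack(hi, lo):
    """Place two 32-bit lanes into one 64-bit register image."""
    return ((hi & MASK32) << 32) | (lo & MASK32)


def unpack(reg):
    """Split a 64-bit register image into (upper, lower) 32-bit lanes."""
    return (reg >> 32) & MASK32, reg & MASK32


def vector_add(a, b):
    """Dual 32-bit add: each lane wraps independently, no carry between lanes."""
    ah, al = unpack(a)
    bh, bl = unpack(b)
    return pack(ah + bh, al + bl)


def mac32(acc, x, y):
    """Unsigned 32x32 multiply-accumulate into a 64-bit accumulator."""
    return (acc + (x & MASK32) * (y & MASK32)) & MASK64
```

Multiplying 0xFFFFFFFF by itself gives 0xFFFFFFFE00000001, which shows why a 32-bit destination would not be sufficient.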
The CPU includes support for Variable Length Encoding (VLE) instruction enhancements.
This enables the classic Power Architecture instruction set to be represented by a modified
instruction set made up from a mixture of 16- and 32-bit instructions. This results in a
significantly smaller code size footprint without noticeably affecting performance. The Power
Architecture instruction set and VLE instruction set are available concurrently. Regions of
the memory map are designated as PPC or VLE using an additional configuration bit in
each of the Translation Look-aside Buffer (TLB) entries in the MMU.
The CPU core is enhanced by two additional interrupt sources: Non-Maskable Interrupt and
Critical Interrupt. These two sources are routed directly from package pins, via edge
detection logic in the SIU, to the CPU, completely bypassing the Interrupt Controller.
Once the edge detection logic is programmed, it cannot be disabled except by reset. The
Non-Maskable Interrupt is, as the name suggests, completely unmaskable, and when asserted
it always results in the immediate execution of the respective interrupt service routine.
The Non-Maskable Interrupt is not guaranteed to be recoverable.
The Critical Interrupt is very similar to the non-maskable interrupt, but it can be masked by
other exceptional interrupts in the CPU and is guaranteed to be recoverable (code execution
may be resumed from where it stopped).
The CPU core has an additional ‘Wait for Interrupt’ instruction that is used in conjunction
with low-power STOP mode. When low-power STOP mode is selected, this instruction is
executed to allow the system clock to be stopped. An external interrupt source or the system
wake-up timer is used to restart the system clock and allow the CPU to service the interrupt.
3.3.2 Crossbar
The XBAR multi-port crossbar switch supports simultaneous connections between three
master ports and four slave ports. The crossbar supports a 32-bit address bus width and a
64-bit data bus width.
The crossbar allows three concurrent transactions to occur from the master ports to any
slave port, but each master must access a different slave. If a slave port is simultaneously
requested by more than one master port, arbitration logic selects the higher priority master
and grants it ownership of the slave port. All other masters requesting that slave port are
stalled until the higher priority master completes its transactions. Requesting masters are
treated with equal priority and are granted access to a slave port in round-robin fashion,
based upon the ID of the last master to be granted access. The crossbar provides the
following features:
3 master ports:
  – e200z335 core complex Instruction port
  – e200z335 core complex Load/Store port
  – eDMA
4 slave ports:
  – FLASH
  – calibration bus
  – SRAM
  – Peripheral bridge A/B (eTPU2, eMIOS, SIU, DSPI, eSCI, FlexCAN, eQADC, BAM,
    decimation filter, PIT, STM and SWT)
32-bit internal address, 64-bit internal data paths
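The round-robin arbitration described above, where the next grant depends on the ID of the last master granted access, can be sketched as a small model. The master ID numbering (0 for the core instruction port, 1 for the core load/store port, 2 for eDMA) is an assumption made for illustration, as are the function and variable names.

```python
# Hypothetical master IDs: 0 = core instruction, 1 = core load/store, 2 = eDMA
MASTERS = (0, 1, 2)


def arbitrate(requests, last_granted):
    """Resolve one round of slave port requests.

    requests     -- {master_id: slave_name} for masters requesting access
    last_granted -- {slave_name: master_id}, updated in place
    Returns {slave_name: master_id} for the grants made this round.
    Masters that request different slaves are all granted concurrently.
    """
    grants = {}
    for slave in sorted(set(requests.values())):
        contenders = {m for m, s in requests.items() if s == slave}
        # Start searching just after the master granted last time.
        start = last_granted.get(slave, MASTERS[-1])
        for offset in range(1, len(MASTERS) + 1):
            candidate = (start + offset) % len(MASTERS)
            if candidate in contenders:
                grants[slave] = candidate
                last_granted[slave] = candidate
                break
    return grants
```

When masters 0 and 2 repeatedly contend for the same slave, the model alternates grants between them, while a master addressing a different slave proceeds without waiting.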
3.3.3 eDMA
The enhanced direct memory access (eDMA) controller is a second-generation module
capable of performing complex data movements via 32 programmable channels, with
minimal intervention from the host processor. The hardware microarchitecture includes a
DMA engine which performs source and destination address calculations, and the actual
data movement operations, along with an SRAM-based memory containing the transfer
control descriptors (TCD) for the channels. This implementation is utilized to minimize the
overall block size. The eDMA module provides the following features:
All data movement via dual-address transfers: read from source, write to destination
Programmable source and destination addresses, transfer size, plus support for enhanced addressing modes
Transfer control descriptor organized to support two-deep, nested transfer operations
  – An inner data transfer loop defined by a “minor” byte transfer count
  – An outer data transfer loop defined by a “major” iteration count
Channel activation via one of three methods:
  – Explicit software initiation
  – Initiation via a channel-to-channel linking mechanism for continuous transfers
  – Peripheral-paced hardware requests (one per channel)
Support for fixed-priority and round-robin channel arbitration
Channel completion reported via optional interrupt requests
  – 1 interrupt per channel, optionally asserted at completion of major iteration count
  – Error termination interrupts are optionally enabled
Support for scatter/gather DMA processing
Channel transfers can be suspended by a higher priority channel
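The two-deep nested transfer structure listed above (an inner loop bounded by a "minor" byte count, an outer loop bounded by a "major" iteration count) can be sketched as a behavioral model. The descriptor fields below are hypothetical names, not the hardware TCD layout. The model also runs every minor loop back to back, whereas the hardware waits for a channel activation between them.

```python
from dataclasses import dataclass


@dataclass
class Descriptor:
    """Illustrative stand-in for a transfer control descriptor."""
    src: int            # source start address
    dst: int            # destination start address
    minor_bytes: int    # bytes moved per minor (inner) loop
    major_count: int    # number of minor loops per major (outer) loop
    width: int = 4      # bytes per read/write pair


def run_channel(memory, tcd):
    """Execute one full major loop and return the number of bytes moved.

    Each step is a dual-address transfer: read from source, then write
    to destination.
    """
    src, dst = tcd.src, tcd.dst
    moved = 0
    for _ in range(tcd.major_count):                      # outer loop
        for _ in range(tcd.minor_bytes // tcd.width):     # inner loop
            data = bytes(memory[src:src + tcd.width])     # read
            memory[dst:dst + tcd.width] = data            # write
            src += tcd.width
            dst += tcd.width
            moved += tcd.width
    return moved
```

With `minor_bytes=8` and `major_count=2`, the model moves 16 bytes in total, and the point where the function returns corresponds to the optional completion interrupt.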
3.3.4 Interrupt controller
The INTC (interrupt controller) provides priority-based preemptive scheduling of interrupt
requests, suitable for statically scheduled hard real-time systems. The INTC allows interrupt
request servicing from up to 191 peripheral interrupt request sources (plus 165 sources
reserved for compatibility with other family members).
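Priority-based preemptive scheduling can be sketched in a few lines. The model assumes that a larger number means a higher priority and that a request preempts only when its priority is strictly higher than that of the code currently running. Both points, and the names used, are assumptions of this sketch and are not taken from this section.

```python
def next_to_service(pending, current_priority):
    """Pick the pending request that would preempt the current context.

    pending          -- {source_name: priority} of asserted requests
    current_priority -- priority of the code currently executing
    Returns the source name, or None if nothing may preempt.
    """
    if not pending:
        return None
    source = max(pending, key=pending.get)
    # Assumption: only a strictly higher priority preempts.
    return source if pending[source] > current_priority else None
```

With requests at priorities 3 and 7 pending while priority-5 code runs, only the priority-7 source is serviced; the priority-3 request waits.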
