Safe IO sharing can be accomplished through the use of a hypervisor; however, there is a performance penalty associated
with virtual IO, as the hypervisor must consume CPU cycles to schedule the IO requests and get the results back to the right
software partition.
The DPAA (described in "Data Path Acceleration Architecture (DPAA)") was designed to allow multiple partitions to
efficiently share accelerators and IOs, with its major capabilities centered around sharing Ethernet ports. These capabilities
were enhanced in the chip with the addition of FMan storage profiles. The chip's FMans perform classification prior to buffer
pool selection, allowing Ethernet frames arriving on a single port to be written to the dedicated memory of a single software
partition. This capability is fully described in Receiver functionality: parsing, classification, and distribution."
The addition of the RMan extends the chip's IO virtualization by allowing many types of traffic arriving on Serial RapidIO to
enter the DPAA and take advantage of its inherent virtualization and partitioning capabilities.
The PCI Express protocol lacks the PDU semantics found in Serial RapidIO, making it difficult to interwork between PCI
Express controllers and the DPAA; however, PCI Express has made progress in other areas of partitioning. The Single Root IO
Virtualization (SR-IOV) specification, which the chip supports as an endpoint, allows external hosts to view the chip as four
physical functions (PFs), where each PF supports up to 64 virtual functions (VFs). Having multiple VFs on a PCI Express
port effectively channelizes it, so that each transaction through the port is identified as belonging to a specific PF/VF
combination (with associated and potentially dedicated memory regions). Message Signaled Interrupts (MSIs) allow the
external host to generate interrupts associated with a specific VF.
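For illustration, the sketch below shows how a Linux-based external host might enable virtual functions on one of the chip's PFs through the kernel's standard sriov_numvfs sysfs attribute; the PCI address (0000:01:00.0) and the VF count of 8 are placeholder values for this example, not T2080-specific details.

/* Minimal sketch: an external Linux host enabling SR-IOV virtual
 * functions on the endpoint's physical function. The PCI address and
 * VF count below are placeholders for illustration only. */
#include <stdio.h>
#include <stdlib.h>

int main(void)
{
    /* Hypothetical bus/device/function of the endpoint's PF. */
    const char *path = "/sys/bus/pci/devices/0000:01:00.0/sriov_numvfs";
    FILE *f = fopen(path, "w");
    if (!f) {
        perror("sriov_numvfs");
        return EXIT_FAILURE;
    }
    fprintf(f, "%d\n", 8); /* request 8 VFs (each PF supports up to 64) */
    fclose(f);
    return EXIT_SUCCESS;
}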
4.13.4 Secure boot and sensitive data protection
The core MMUs and PAMU allow the SoC to enforce a consistent set of memory access permissions on a per-partition basis.
When combined with an embedded hypervisor for safe sharing of resources, the SoC becomes highly resilient to poorly
tested or malicious code. For system developers building high-reliability/high-security platforms, running only rigorously
tested code of known origin is the norm.
For this reason, the SoC offers a secure boot option, in which the system developer digitally signs the code to be executed by
the CPUs, and the SoC ensures that only an unaltered version of that code runs on the platform. The SoC offers both boot
time and run time code authenticity checking, with configurable consequences when the authenticity check fails. The SoC
also supports protected internal and external storage of developer-provisioned sensitive instructions and data. For example, a
system developer may provision each system with a number of RSA private keys to be used in mutual authentication and key
exchange. These values would initially be stored as encrypted blobs in external non-volatile memory, but following secure
boot, these values can be decrypted into on-chip protected memory (a portion of the platform cache dedicated as SRAM). Session
keys, which may number in the thousands to tens of thousands, are not good candidates for on-chip storage, so the SoC offers
session key encryption. Session keys are stored in main memory, and are decrypted (transparently to software and without
impacting SEC throughput) as they are brought into the SEC for decryption of session traffic.
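As a rough sketch of the flow just described, the fragment below distinguishes where each class of key is allowed to live; the helper functions and the protected-SRAM allocator are hypothetical placeholders, not the actual SEC driver API.

#include <stddef.h>
#include <stdint.h>

/* Hypothetical allocator for the on-chip protected memory (the portion
 * of the platform cache dedicated as SRAM); usable only after a
 * successful secure boot. */
extern void *protected_sram_alloc(size_t len);

/* Hypothetical SEC helpers for blob decapsulation/encapsulation. */
extern int sec_decrypt_blob(const void *blob, size_t blob_len,
                            void *out, size_t out_len);
extern int sec_encrypt_session_key(const uint8_t *key, size_t key_len,
                                   void *blob_out, size_t blob_len);

int provision_private_key(const void *nv_blob, size_t blob_len, size_t key_len)
{
    /* Long-lived RSA private keys are decrypted from their external NV
     * blobs directly into on-chip protected memory, never into DDR. */
    void *key = protected_sram_alloc(key_len);
    if (!key)
        return -1;
    return sec_decrypt_blob(nv_blob, blob_len, key, key_len);
}

int store_session_key(const uint8_t *key, size_t key_len,
                      void *ddr_blob, size_t blob_len)
{
    /* Session keys (potentially tens of thousands) remain in main
     * memory in encrypted form only; the SEC decrypts them on the fly
     * as session traffic is processed. */
    return sec_encrypt_session_key(key, key_len, ddr_blob, blob_len);
}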
4.14 Advanced power management
Power dissipation is always a major design consideration in embedded applications; system designers need to balance the
desire for maximum compute and IO density against single-chip and board-level thermal limits.
Advances in chip and board level cooling have allowed many OEMs to exceed the traditional 30 W limit for a single chip,
and Freescale's flagship T4240 multicore chip has consequently retargeted its maximum power dissipation. A top-speed-bin
T4240 dissipates approximately 2x the power of the P4080; however, the T4240 increases computing
performance by ~4x, yielding a 2x improvement in DMIPS per watt.
Junction temperature is a critical factor in comparing embedded processor specifications. Freescale specifies maximum power
at a 105°C junction temperature, the standard for commercial embedded operating conditions. Not all multicore chips adhere
to a 105°C junction temperature when specifying worst-case power. In the interest of normalizing power comparisons, the
chip's typical and worst-case power (all CPUs at 1.8 GHz) are shown at alternate junction temperatures.
To achieve the previously stated 2x increase in performance per watt, the chip implements a number of software-transparent
and performance-transparent power management features. Non-transparent power management features are also available,
allowing for significant reductions in power consumption when the chip is under lighter loads; however, non-transparent
power savings are not assumed in chip power specifications.
4.14.1 Transparent power management
This chip's commitment to low power begins with the decision to fabricate the chip in 28 nm bulk CMOS. This process
technology offers low leakage, reducing both static and dynamic power. While 28 nm offers inherent power savings,
transistor leakage varies from lot to lot and device to device. Leakier parts are capable of faster transistor switching, but they
also consume more power. By running devices from the leakier end of the process spectrum at less than nominal voltage and
devices from the slower end of the process spectrum at a higher voltage, T2080-based systems can achieve the
required operating frequency within the specified max power. During manufacturing, Freescale will determine the voltage
required to achieve the target frequency bin and program this Voltage ID into each device, so that initialization software can
program the system's voltage regulator to the appropriate value.
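A minimal sketch of that initialization step is shown below, assuming a fuse-read helper and a board-level regulator call; the register access, the 5-bit VID field, and the VID-to-millivolt mapping are all placeholders rather than T2080 definitions.

#include <stdint.h>

extern uint32_t read_vid_fuse(void);           /* hypothetical fuse read      */
extern int      regulator_set_mv(unsigned mv); /* hypothetical board/PMIC API */

int apply_voltage_id(void)
{
    /* Assume the factory-programmed VID sits in the low bits of a fuse
     * status register; the real field location is in the reference manual. */
    uint32_t vid = read_vid_fuse() & 0x1Fu;

    /* Illustrative linear mapping from VID code to core voltage; the
     * actual encoding and step size come from the data sheet. */
    unsigned mv = 900u + vid * 25u;

    /* Program the external regulator before ramping the cores to the
     * target frequency bin. */
    return regulator_set_mv(mv);
}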
Dynamic power is further reduced through fine-grained clock control. Many components and subcomponents in the chip
automatically sleep (turn off their clocks) when they are not actively processing data. Such blocks can return to full operating
frequency on the clock cycle after work is dispatched to them. A portion of these dynamic power savings is built into the
chip's max power specification, on the basis that it is impossible for all processing elements and interfaces in the chip to
switch concurrently. The assumed percentage switching factors are considered quite conservative, and measured typical power
consumption on QorIQ chips is well below the data sheet maximum.
As noted in "Frame Manager and network interfaces," the chip supports Energy-Efficient Ethernet (EEE). During periods of extended
inactivity on the transmit side, the chip transparently sends a low power idle (LPI) signal to the external PHY, effectively
telling it to sleep.
Additional power savings can be achieved by users statically disabling unused components. Developers can turn off the
clocks to individual logic blocks (including CPUs) within the chip that the system is not using. Because the number of SerDes
lanes is finite, it is expected that any given application will leave some Ethernet MACs, PCI Express controllers, or Serial
RapidIO controllers inactive. Re-enabling clocks to a logic block generally requires a chip reset, which makes this type of
power management infrequent (effectively static) and transparent to runtime software.
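A sketch of that static configuration, under the assumption of a memory-mapped device-disable register, might look like the following; the address and bit assignments are placeholders, and the real controls are documented in the reference manual.

#include <stdint.h>

/* Placeholder address and bit masks for a device-disable control. */
#define DEVDISR        ((volatile uint32_t *)0xFE0E0070u)
#define DISR_SRIO      (1u << 3)   /* unused Serial RapidIO controller */
#define DISR_PCIE3     (1u << 5)   /* unused PCI Express controller    */
#define DISR_MAC10     (1u << 12)  /* unused Ethernet MAC              */

void disable_unused_blocks(void)
{
    /* Done once at boot; because re-enabling generally requires a chip
     * reset, this is effectively a static, software-transparent setting. */
    *DEVDISR |= DISR_SRIO | DISR_PCIE3 | DISR_MAC10;
}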
4.14.2 Non-transparent power management
Many load-based power savings are use-case-specific static configurations (and thereby software transparent); these were described
in the previous section. This section focuses on SoC power management mechanisms that software can dynamically
leverage to reduce power when the system is lightly loaded. The most important of these mechanisms involves the cores.
A full description of core low power states, with their proper names, is provided in the SoC reference manual. At a high level, the
most important of these states can be viewed as "PH10" and "PH20," described as follows. Note that these are relative terms
that do not perfectly correlate with previous uses of these terms in Power Architecture and other ISAs:
• In the PH10 state, the CPU stops instruction fetches but still performs L1 snoops. The CPU retains all state, and instruction
fetching can be restarted instantly.
• In the PH20 state, the CPU stops instruction fetches and L1 snooping, and turns off all clocks. Supply voltage is reduced, using
a technique Freescale calls State Retention Power Gating (SRPG). In this "napping" state, a CPU uses ~75% less power
than a fully operational CPU, but can still return to full operation quickly (~100 platform clocks).
The core offers two ways to enter these (and other) low power states: registers and instructions.
As the name implies, register-based power management means that software writes to registers to select the CPU and its low
power state. Any CPU with write access to power management registers can put itself, or another CPU, into a low power
state; however, a CPU put into a low power state by way of register write cannot wake itself up.
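The register-based path can be pictured with the sketch below, in which one CPU requests a low power state for another; the register names, base address, and bit layout are placeholders standing in for the chip's run control/power management block.

#include <stdint.h>

#define RCPM_BASE        0xFE0E2000u           /* placeholder base address */
#define RCPM_PH20_SET    (RCPM_BASE + 0x0D0u)  /* placeholder offsets      */
#define RCPM_PH20_CLR    (RCPM_BASE + 0x0D4u)

static inline void reg_write(uint32_t addr, uint32_t val)
{
    *(volatile uint32_t *)(uintptr_t)addr = val;
}

/* Request PH20 for the given CPU. A CPU put to sleep this way cannot
 * wake itself; another running CPU must clear the request. */
void core_request_ph20(unsigned cpu) { reg_write(RCPM_PH20_SET, 1u << cpu); }
void core_release_ph20(unsigned cpu) { reg_write(RCPM_PH20_CLR, 1u << cpu); }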
Instruction-based power management means that software executes a special WAIT instruction to enter a low power state.
CPUs exit the low power state in response to external triggers: interrupts, doorbells, stashes into the L1 data cache, or a
reservation cleared by a snoop. Each vCPU (thread) can independently execute WAIT instructions; however, the physical CPU
enters the PH20 state only after the second vCPU executes its WAIT. This instruction-based entry into PH20 is particularly well-suited for
use in conjunction with Freescale's patented Cascade Power Management, which is described in the next section.
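A minimal sketch of an instruction-based idle loop is shown below; the inline assembly simply issues the Power ISA wait instruction with no hint operands, and the work-queue callbacks are placeholders for whatever dispatch mechanism (for example, a QMan portal) the application uses.

/* Idle loop for one vCPU: drain available work, then execute "wait".
 * Execution resumes after the wait when an external event (interrupt,
 * doorbell, L1 data cache stash, or reservation clear) arrives. The
 * physical CPU only drops to PH20 once both of its vCPUs are waiting. */
static inline void cpu_wait(void)
{
    __asm__ volatile("wait" ::: "memory");
}

void idle_loop(int (*have_work)(void), void (*do_work)(void))
{
    for (;;) {
        while (have_work())
            do_work();
        cpu_wait(); /* no work queued: nap until something wakes us */
    }
}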
While significant power savings can be achieved through individual CPU low power states, the SoC also supports a register-
based cluster-level low power state. After software puts all CPUs in a cluster into the PH10 state, it can additionally flush the L2
cache and have the entire cluster enter the PH20 state. Because the L2 arrays have relatively low static power dissipation, this
state provides incremental additional savings over having four napping CPUs with the L2 on.
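The cluster-level sequence can be summarized with the sketch below; all three helpers are hypothetical placeholders for the register writes and cache-flush operations documented in the reference manual.

/* Hypothetical helpers for the per-core, L2, and cluster controls. */
extern void core_enter_ph10(unsigned cpu);
extern void l2_flush_and_disable(unsigned cluster);
extern void cluster_enter_ph20(unsigned cluster);

void cluster_sleep(unsigned cluster, const unsigned *cpus, unsigned ncpus)
{
    /* 1. Park every CPU in the cluster in PH10. */
    for (unsigned i = 0; i < ncpus; i++)
        core_enter_ph10(cpus[i]);

    /* 2. Push dirty L2 lines to memory so the arrays can be gated. */
    l2_flush_and_disable(cluster);

    /* 3. Drop the whole cluster into PH20 for the incremental savings. */
    cluster_enter_ph20(cluster);
}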
4.14.3 Cascade power management
Cascade power management refers to the concept of allowing SoC load, as defined by the depth of queues managed by the
Queue Manager, to determine how many vCPUs need to be awake to handle the load. Recall from "Queue Manager" that the
QMan supports both dedicated and pool channels. Pool channels are channels of frame queues consumed by parallel workers
(vCPUs), where any worker can process any packet dequeued from the channel.
Cascade Power Management exploits the QMan's awareness of vCPU membership in a pool channel and overall pool
channel queue depth. The QMan uses this information to tell vCPUs in a pool channel (starting with the highest-numbered
vCPU) that they can execute instructions to enter PH10 mode. When pool channel queue depth exceeds configurable
thresholds, the QMan wakes up the lowest-numbered vCPU.
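From the software side, enabling the scheme amounts to configuring the pool channel's queue-depth thresholds; the structure and function in the sketch below are hypothetical placeholders rather than the actual QMan driver interface, and the threshold values are illustrative only.

#include <stdint.h>

struct cascade_cfg {
    uint16_t pool_channel;   /* channel shared by the worker vCPUs        */
    uint32_t nap_threshold;  /* depth below which surplus vCPUs may nap   */
    uint32_t wake_threshold; /* depth above which a napping vCPU is woken */
};

extern int qman_configure_cascade(const struct cascade_cfg *cfg); /* hypothetical */

int enable_cascade(uint16_t pool_channel)
{
    struct cascade_cfg cfg = {
        .pool_channel   = pool_channel,
        .nap_threshold  = 64,    /* illustrative values only */
        .wake_threshold = 512,
    };
    return qman_configure_cascade(&cfg);
}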
The SoC's dynamic power management capabilities, whether using the Cascade scheme or a master control CPU running load-to-power
matching software, enable up to a 75% reduction in power consumption versus the data sheet max power.
4.15 Debug support
The reduced number of external buses enabled by the move to multicore chips greatly simplifies board-level layout and
eliminates many concerns over signal integrity. While the board designer may embrace multicore CPUs, software engineers
have real concerns over the potential to lose debug visibility.
Processing on a multicore chip with shared caches and peripherals also leads to greater concurrency and an increased
potential for unintended interactions between device components. To ensure that software developers have the same or better
visibility into the device as they would with multiple discrete communications processors, Freescale developed an Advanced
Multicore Debug Architecture.
The debugging and performance monitoring capability enabled by the device hardware coexists within a debug ecosystem
that offers a rich variety of tools at different levels of the hardware/software stack. Software development and debug tools
from Freescale (CodeWarrior), as well as third-party vendors, provide a rich set of options for configuring, controlling, and
analyzing debug and performance related events.
Appendix A T2081
A.1 Introduction
The T2081 QorIQ advanced multicore processor combines four dual-threaded e6500 Power Architecture® processor cores
with high-performance datapath acceleration logic and the network and peripheral bus interfaces required for networking,
telecom/datacom, wireless infrastructure, and mil/aerospace applications.
This figure shows the major functional units within the chip.