Lecture 2 ARM MPU Subsystem NCHUEE 720A Lab Prof. Jichiang Tsai
Cortex-A8 Subsystem
• The Microprocessor Unit (MPU) subsystem
  • Handles transactions between the ARM core (ARM® Cortex™-A8 Processor), the L3 interconnect, and the interrupt controller (INTC)
  • Integrates the ARM® Cortex™-A8 Processor with additional logic for protocol conversion, emulation, interrupt handling, and debug enhancements
• Cortex™-A8 is an ARMv7-compatible, dual-issue, in-order execution engine with integrated L1 and L2 caches
  • With NEON™ SIMD Media Processing Unit
  • Provides a high processing capability for mobile multimedia acceleration
  • Communicates through an AXI bus with the AXI2OCP bridge
  • Receives interrupts from the MPU subsystem interrupt controller (MPU INTC)
Cortex-A8 Subsystem (cont.)
• Includes the VFP (Vector Floating Point) coprocessor, which implements the VFPv3 architecture and is fully compliant with the IEEE 754 standard
• Uses the AXI (Advanced eXtensible Interface) protocol configured to 128-bit data width
• Includes Embedded Trace Macrocell (ETM) support for debugging
• Implements ARMv7 debug with watchpoint and breakpoint registers and a 32-bit Advanced Peripheral Bus (APB) slave interface to CoreSight debug systems
• AXI2OCP bridge
  • Allows communication between the ARM (AXI), the INTC (OCP), and the modules (OCP L3)
• I2Async bridge
  • An asynchronous interface providing an asynchronous OCP (Open Core Protocol) to OCP interface
  • Sits between the AXI2OCP bridge within the MPU subsystem and the T2Async bridge external to the MPU subsystem
Cortex-A8 Subsystem (cont.)
• Clock Divider
  • Provides the required divided clocks to the internal modules
  • Has a clock input from SYSCLK2 fed by the power, reset, and clock management (PRCM) module
• An Interrupt Controller is included
  • Handles the module interrupts
  • Supports up to 128 interrupt requests
• The MPU allows the Debug Subsystem access to the Cortex-A8 debug and emulation resources, including the Embedded Trace Macrocell
  • The in-circuit emulator is fully compatible with the CoreSight architecture and enables debugging capabilities
• The MPU has three functional clock domains
  • Including a high-frequency clock domain used by the Cortex™-A8
  • The high-frequency domain is isolated from the rest of the system by asynchronous bridges
Clock and Reset Distribution
• Clock Distribution
  • An embedded DPLL (Digital Phase-Locked Loop) sources the clock for the ARM Cortex-A8 processor
  • A clock divider is used for deriving the clocks for other internal modules
    • All major modules are clocked at half the frequency of the ARM core
    • The divider of the output clock can be programmed
    • The frequency is relative to the ARM core
• The clock generator generates the following functional clocks:
  • ARM (ARM_FCLK): The core clock
    • The base fast clock routed internally to the ARM logic and internal RAMs, including NEON, L2 cache, the ETM core (emulation), and the ARM core
  • AXI2OCP Clock (AXI_FCLK): Half the frequency of ARM_FCLK
    • The OCP interface thus runs at one half the frequency of the ARM core
Clock and Reset Distribution (cont.)
• Interrupt Controller Functional Clock (MPU_INTC_FCLK):
  • Part of the INTC module
  • Half the frequency of the ARM clock (ARM_FCLK)
• ICE-Crusher Functional Clock (ICECRUSHER_FCLK):
  • Operates on the APB interface, using the ARM core clocking
  • This clock is half the frequency of the ARM clock (ARM_FCLK)
• I2Async Clock (I2ASYNC_FCLK):
  • Half the frequency of the ARM clock (ARM_FCLK)
  • Matches the OCP interface of the AXI2OCP bridge
  • The second half of the asynchronous bridge (T2ASYNC) is clocked directly by the PRCM with the core clock
  • T2ASYNC is not part of the MPU subsystem
• Emulation Clocking: Distributed by the PRCM module
  • Asynchronous to the ARM core clock (ARM_FCLK)
  • Can run at a maximum of 1/3 the ARM core clock
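The clock ratios listed above can be summarized in a short Python sketch. The function name `mpu_clocks`, the dictionary key `EMU_CLK_MAX`, and the 600 MHz input are illustrative assumptions, and the divider is assumed to stay at its divide-by-2 setting even though it is programmable.

```python
def mpu_clocks(arm_fclk_hz):
    """Return the MPU functional clocks (in Hz) derived from ARM_FCLK.

    Assumes the divide-by-2 ratio described in the slides.
    """
    half = arm_fclk_hz / 2
    return {
        "ARM_FCLK": arm_fclk_hz,
        "AXI_FCLK": half,
        "MPU_INTC_FCLK": half,
        "ICECRUSHER_FCLK": half,
        "I2ASYNC_FCLK": half,
        # The emulation clock is asynchronous to ARM_FCLK;
        # this entry is only its upper bound (hypothetical key name).
        "EMU_CLK_MAX": arm_fclk_hz / 3,
    }


clocks = mpu_clocks(600_000_000)  # 600 MHz is an example value only
```

With a 600 MHz core clock, every divided clock in this sketch comes out at 300 MHz, and the emulation clock may not exceed 200 MHz.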
Clock and Reset Distribution (cont.)
• Reset Distribution
  • Resets to the MPU subsystem are provided by the PRCM
  • Controlled by the clock generator module
ARM Subchip
• The ARM Cortex-A8 processor incorporates the technologies available in the ARMv7 architecture
  • NEON™ for media and signal processing
  • Jazelle™ RCT for acceleration of real-time compilers
  • Thumb®-2 technology for code density
  • The VFPv3 floating point architecture
• The AXI bus interface is the main interface to the ARM system bus
  • Performs L2 cache fills and noncacheable accesses for both instructions and data
  • Supports 128-bit and 64-bit wide input and output data buses
  • Supports multiple outstanding requests on the AXI bus
ARM Subchip (cont.)
• Supports a wide range of bus clock to core clock ratios
  • The bus clock is synchronous with the core clock
• Special secure monitor functions are supported
  • Allows access to certain ARM core registers in privileged mode
  • Provides functions to write to CP15 registers
    • Auxiliary Control Register, Nonsecure Access Control Register, and the L2 Cache Auxiliary Control Register
Interrupt Controller
• The Host ARM Interrupt Controller (AINTC) is responsible for
  • Prioritizing all service requests from the system peripherals
  • Generating either nIRQ or nFIQ to the host
• The type of the interrupt (nIRQ or nFIQ) and the priority of the interrupt inputs are programmable
• Attached to the ARM Cortex-A8 via the AXI port through an AXI2OCP bridge
  • Runs at half the processor speed
• Has the capability to handle up to 128 requests
  • Level-sensitive interrupt inputs
  • Individual priority for each interrupt input
  • Each interrupt can be steered to nFIQ or nIRQ
  • Independent priority sorting for nFIQ and nIRQ
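The steering and independent priority sorting described above can be modelled with a small Python sketch. The `IrqLine` class and `sort_pending` function are hypothetical names, not part of any vendor API. The sketch also assumes that a lower priority value means a higher priority and that ties go to the lowest line number; the slides state neither.

```python
from dataclasses import dataclass


@dataclass
class IrqLine:
    number: int            # interrupt input, 0..127
    priority: int          # assumption: lower value = higher priority
    fiq: bool = False      # True steers the line to nFIQ, False to nIRQ
    active: bool = False   # level-sensitive: true while the source asserts it


def sort_pending(lines):
    """Return (irq_winner, fiq_winner); each is a line number or None."""
    def winner(want_fiq):
        candidates = [l for l in lines if l.active and l.fiq == want_fiq]
        if not candidates:
            return None
        # Tie-break on line number is an assumption of this sketch.
        return min(candidates, key=lambda l: (l.priority, l.number)).number

    # nIRQ and nFIQ are sorted independently of each other.
    return winner(False), winner(True)


lines = [
    IrqLine(number=5, priority=3, active=True),
    IrqLine(number=9, priority=1, active=True),
    IrqLine(number=40, priority=7, fiq=True, active=True),
    IrqLine(number=41, priority=0, fiq=True, active=False),
]
irq, fiq = sort_pending(lines)
```

Here line 9 wins the nIRQ sort and line 40 wins the nFIQ sort, because line 41 is not asserted even though its priority value is better.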
Power Management
• The MPU subsystem is divided into 4 power domains
  • Controlled by the PRCM
• MPU subsystem domain
  • ARM, AXI2OCP, I2Async bridge, ARM L1 and L2 periphery logic and array, ICE-Crusher, ETM, APB modules
  • L1 and L2 array memories have separate control signals into the MPU subsystem, and are thus directly controlled by the PRCM
• MPU NEON domain
  • ARM NEON accelerator
• CORE domain
  • MPU interrupt controller
• EMU domain
  • EMU: ETB (Embedded Trace Buffer) and DAP (Debug Access Port)
Power Management (cont.)
• Each power domain can be driven by the PRCM in 4 different states
  • Depending on the functional mode required by the user
• For each domain, the PRCM manages all transitions
  • By controlling domain clocks, domain resets, domain logic power switches and memory power switches
• The major part of the MPU subsystem belongs to the MPU power domain
  • The modules inside this power domain can be powered off when the ARM processor is in OFF or standby mode
Power Management (cont.)
• IDLE/WAKEUP control is managed by the clock generator block but initiated by the PRCM module
• For the MPU to be on, the core power must be on
• Device power management does not allow the INTC to go to the OFF state when the MPU domain is on
  • "On" here means active or one of the retention modes
• The NEON core has an independent power-off mode for when it is not in use
  • Enabling and disabling of NEON can be controlled by software
• The L1 cache memory does not support retention mode
• The ARM L2 can be put into retention independently of the other domains
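The dependency between the MPU and CORE domains can be expressed as a small check, sketched here in Python. The function name `power_violations` and the three state names are assumptions for illustration. Only the rule stated in the slides is encoded, so this is not a substitute for the full table of legal power modes.

```python
STATES = ("active", "retention", "off")


def power_violations(mpu, core, neon):
    """Return a list of rule violations for a proposed domain-state combination."""
    for name, state in (("mpu", mpu), ("core", core), ("neon", neon)):
        if state not in STATES:
            raise ValueError(f"unknown state for {name}: {state}")

    problems = []
    # The INTC sits in the CORE domain, so CORE may not be off
    # while the MPU domain is on (active or retention).
    if mpu != "off" and core == "off":
        problems.append("CORE (INTC) must not be off while the MPU domain is on")
    # NEON has an independent power-off mode, so this sketch
    # places no constraint on it.
    return problems
```

For example, `power_violations("active", "off", "off")` reports one problem, while `power_violations("retention", "active", "off")` reports none.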
Power Management (cont.)
• The supported operational power modes
  • All other combinations are illegal
  • The ARM L2, NEON, and ETM/Debug can be powered up/down independently
  • The APB/ATB ETM/Debug column refers to all three features
    • ARM emulation, trace, and debug
• The MPU subsystem must be in a power mode where the MPU power domain, NEON power domain, debug power domain, and INTC power domain are in standby or off state
MPU Power Mode Transitions
• Basic Power-On Reset
  • Initial power-up and wakeup from device off mode
  • Reset the INTC and the MPU subsystem modules
    • CORE_RST and MPU_RST
  • The clocks must be active during the MPU reset and CORE reset
• MPU Into Standby Mode
  • Initial power-up and wakeup from device off mode
  • The core initiates entering into standby via software only (CP15 - WFI)
  • MPU modules are requested, internally to the MPU subsystem, to enter idle
  • The "MPU is in standby" output is asserted for the PRCM
    • All outputs are guaranteed to be at reset values
  • The PRCM can now request the INTC to enter idle mode
  • The acknowledge from the INTC goes to the PRCM
MPU Power Mode Transitions (cont.)
• MPU Out Of Standby Mode
  • Initial power-up and wakeup from device off mode
  • The PRCM must start clocks through DPLL programming
  • Detect active clocking via the status output of the DPLL
  • Initiate an interrupt through the INTC to wake up the ARM core from STANDBYWFI mode
• MPU Power On From a Powered-Off State
  • MPU Power On, NEON Power On, Core Power On (INTC) should follow the ordered sequence per the power switch daisy chain to minimize current peaking during power-up
  • The core domain must be on, and reset, before the MPU can be reset
  • Follow the same reset sequence as the Basic Power-On Reset
ARM7 Architecture
• From the programmer’s point of view, the ARM7 can be in one of two states
  • ARM state executes 32-bit, word-aligned ARM instructions
  • THUMB state operates with 16-bit, half-word-aligned THUMB instructions
• ARM7 views memory as a linear collection of bytes
  • Numbered upwards from zero
  • Bytes 0 to 3 hold the first stored word, bytes 4 to 7 the second, and so on
• ARM7 can treat words in memory as being stored in either Big-Endian or Little-Endian format
• ARM7 supports byte (8-bit), half-word (16-bit) and word (32-bit) data types
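The byte-addressed memory view and the two endian formats can be illustrated with a short Python sketch. The `load_word` helper and the sample byte values are made up for illustration.

```python
# Eight bytes of example memory: bytes 0..3 form the first word,
# bytes 4..7 the second.
memory = bytes([0x11, 0x22, 0x33, 0x44, 0x55, 0x66, 0x77, 0x88])


def load_word(mem, addr, big_endian=False):
    """Load the 32-bit word stored at a word-aligned byte address."""
    if addr % 4:
        raise ValueError("word accesses must be word-aligned")
    order = "big" if big_endian else "little"
    return int.from_bytes(mem[addr:addr + 4], order)


first_little = load_word(memory, 0)                  # 0x44332211
first_big = load_word(memory, 0, big_endian=True)    # 0x11223344
second_little = load_word(memory, 4)                 # 0x88776655
```

The same four bytes yield different word values depending on the endian format, which is why the setting matters whenever words are shared with other bus masters or written out to storage.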
ARM7 Architecture (cont.)
• Operating modes
  • User (usr): The normal ARM program execution state
  • FIQ (fiq): To support a data transfer or channel process
  • IRQ (irq): Used for general-purpose interrupt handling
  • Supervisor (svc): Protected mode for the operating system
  • Abort mode (abt): Entered after a data or instruction prefetch abort
  • System (sys): A privileged user mode for the operating system
  • Undefined (und): Entered when an undefined instruction is executed
• Mode changes may be made under software control, or may be brought about by external interrupts or exception processing
ARM7 Architecture (cont.)
• Registers
  • ARM7 has a total of 37 registers
    • 31 general-purpose 32-bit registers and six status registers
  • These cannot all be seen at once
    • The processor state and operating mode dictate which registers are available to the programmer
  • In ARM state, 16 general registers and one or two status registers are visible at any one time
  • In privileged (non-User) modes, mode-specific banked registers are switched in
• The ARM state register set contains 16 directly accessible registers
  • R0 to R15
  • All of these except R15 are general-purpose, and may be used to hold either data or address values
ARM7 Architecture (cont.)
• There is a seventeenth register used to store status information
  • Register 16 is the CPSR (Current Program Status Register)
  • This contains condition code flags and the current mode bits
• Register 14 is used as the subroutine link register
  • This receives a copy of R15 when a subroutine is invoked
  • At all other times, it may be treated as a general-purpose register
  • The corresponding banked registers R14_svc, R14_irq, R14_fiq, R14_abt and R14_und are also used to hold the return values of R15
    • When interrupts and exceptions arise, or when branch and link instructions are executed within interrupt or exception routines
• Register 15 holds the Program Counter (PC)
  • In ARM state, bits [1:0] of R15 are zero and bits [31:2] contain the PC
  • In THUMB state, bit [0] is zero and bits [31:1] contain the PC
ARM7 Architecture (cont.)
• FIQ mode has seven banked registers mapped to R8-R14 (R8_fiq-R14_fiq)
  • In ARM state, many FIQ handlers do not need to save any registers
• IRQ, Supervisor, Abort and Undefined modes each have two banked registers mapped to R13 and R14
  • Allowing each of these modes to have a private stack pointer and link register
• The THUMB state register set is a subset of the ARM state set
  • The programmer has direct access to eight general registers, R0–R7, as well as the Program Counter (PC), a stack pointer register (SP), a link register (LR), and the CPSR
  • There are banked stack pointers, link registers and Saved Program Status Registers (SPSRs) for each privileged mode
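The banking rules above are enough to reproduce the count of 37 registers. The following Python sketch maps an architectural register number to a physical register for each mode; the function name `physical_register` and the naming scheme for unbanked registers are illustrative choices.

```python
MODES = ("usr", "sys", "svc", "abt", "und", "irq", "fiq")


def physical_register(mode, n):
    """Map architectural register Rn (0..15) in `mode` to a physical register name."""
    if mode not in MODES:
        raise ValueError(f"unknown mode: {mode}")
    if not 0 <= n <= 15:
        raise ValueError(f"no such register: R{n}")
    if mode == "fiq" and 8 <= n <= 14:
        return f"R{n}_fiq"            # seven FIQ-banked registers
    if mode in ("svc", "abt", "und", "irq") and n in (13, 14):
        return f"R{n}_{mode}"         # private SP and LR per mode
    return f"R{n}"                    # shared with User/System mode


general = {physical_register(m, n) for m in MODES for n in range(16)}
status = {"CPSR"} | {f"SPSR_{m}" for m in ("svc", "abt", "und", "irq", "fiq")}
total = len(general) + len(status)
```

Enumerating every mode gives 31 distinct general-purpose registers (16 unbanked names, 7 for FIQ, and 2 each for the four other exception modes), and the CPSR plus five SPSRs bring the total to 37.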
ARM7 Architecture (cont.)
• The ARM7 contains a Current Program Status Register (CPSR)
  • Plus five Saved Program Status Registers (SPSRs) for use by exception handlers
• These registers
  • Hold information about the most recently performed ALU operation
  • Control the enabling and disabling of interrupts
  • Set the processor operating mode
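The three roles of the status registers map onto distinct bit fields. The Python sketch below decodes a CPSR value using the standard ARM layout (N, Z, C, V in bits 31 to 28, the I, F and T bits in bits 7 to 5, and the mode in bits 4 to 0). That layout comes from the ARM architecture documentation rather than from the slide text, and the `decode_cpsr` helper is a hypothetical name.

```python
MODE_BITS = {
    0b10000: "usr",
    0b10001: "fiq",
    0b10010: "irq",
    0b10011: "svc",
    0b10111: "abt",
    0b11011: "und",
    0b11111: "sys",
}


def decode_cpsr(value):
    """Split a 32-bit CPSR value into flags, interrupt masks, state and mode."""
    return {
        # Condition code flags from the most recent ALU operation
        "N": bool((value >> 31) & 1),
        "Z": bool((value >> 30) & 1),
        "C": bool((value >> 29) & 1),
        "V": bool((value >> 28) & 1),
        # Interrupt disable bits
        "irq_disabled": bool((value >> 7) & 1),
        "fiq_disabled": bool((value >> 6) & 1),
        # State bit: set in THUMB state, clear in ARM state
        "thumb": bool((value >> 5) & 1),
        # Operating mode
        "mode": MODE_BITS.get(value & 0x1F, "reserved"),
    }


fields = decode_cpsr(0x600000D3)  # example value, chosen for illustration
```

For the example value, the Z and C flags are set, both interrupt types are disabled, and the processor is in Supervisor mode in ARM state.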
ARM7 Architecture (cont.)
• An exception arises when the normal flow of program execution is interrupted
  • e.g., processing is diverted to handle an interrupt from a peripheral
• The processor state just prior to handling the exception must be preserved
  • The program flow can be resumed when the exception routine is completed
• The system uses the banked core registers to save the current state
  • The old PC value and the CPSR contents are copied into the appropriate banked R14 (LR) and SPSR registers
  • The PC and mode bits in the CPSR are adjusted to the value corresponding to the type of exception being processed
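The state-saving steps above can be sketched as a simplified Python model. The dictionary-based `cpu` and the `enter_exception` function are hypothetical. The model also sets the IRQ-disable bit and clears the THUMB bit on entry, which is standard ARM behaviour but is not stated in the slides, and it treats the return address as a given value instead of modelling the per-exception PC offset.

```python
MODE_ENCODING = {"fiq": 0b10001, "irq": 0b10010, "svc": 0b10011,
                 "abt": 0b10111, "und": 0b11011}
I_BIT = 1 << 7   # IRQ disable
T_BIT = 1 << 5   # THUMB state


def enter_exception(cpu, mode, vector, return_addr):
    """Model exception entry: bank the old state, then switch mode and PC."""
    cpu[f"R14_{mode}"] = return_addr     # old PC value -> banked link register
    cpu[f"SPSR_{mode}"] = cpu["CPSR"]    # old CPSR -> banked SPSR
    cpsr = (cpu["CPSR"] & ~0x1F) | MODE_ENCODING[mode]  # new mode bits
    cpsr = (cpsr | I_BIT) & ~T_BIT       # mask IRQs, handler runs in ARM state
    cpu["CPSR"] = cpsr
    cpu["PC"] = vector                   # jump to the exception vector


# Example: an IRQ arrives while running in User mode (values are illustrative).
cpu = {"CPSR": 0x60000010, "PC": 0x8004}
enter_exception(cpu, "irq", vector=0x18, return_addr=0x8004)
```

After the call, `R14_irq` and `SPSR_irq` hold what the handler needs to resume the interrupted program, and the CPSR shows IRQ mode with IRQs masked.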
ARM7 Architecture (cont.)
• There are seven types of exceptions
  • Each has a fixed priority and a privileged processor mode
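For reference, the seven exception types can be tabulated with their vector addresses, entry modes and fixed priorities. The values in this Python sketch follow the ARM7TDMI documentation rather than the slide text, and the `highest_priority` helper is a hypothetical name. Undefined instruction and software interrupt share the lowest priority because they cannot occur at the same time.

```python
# name: (vector address, mode entered, priority; 1 = highest)
EXCEPTIONS = {
    "reset":                 (0x00, "svc", 1),
    "data_abort":            (0x10, "abt", 2),
    "fiq":                   (0x1C, "fiq", 3),
    "irq":                   (0x18, "irq", 4),
    "prefetch_abort":        (0x0C, "abt", 5),
    "undefined_instruction": (0x04, "und", 6),
    "software_interrupt":    (0x08, "svc", 6),
}


def highest_priority(pending):
    """Pick which of several simultaneously pending exceptions is taken first."""
    return min(pending, key=lambda name: EXCEPTIONS[name][2])


first = highest_priority(["irq", "data_abort", "fiq"])
vector, mode, _ = EXCEPTIONS[first]
```

If a data abort, an FIQ and an IRQ are all pending, the data abort is taken first, entering Abort mode at vector 0x10.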