ARM Processor Overview Prof. Taeweon Suh Computer Science Education Korea University
ARM Source: 2008 Embedded SW Insight Conference
ARM Partners Source: 2008 Embedded SW Insight Conference
ARM (as of 2008) Source: 2008 Embedded SW Insight Conference
ARM Brief • ARM architecture was first developed in the 1980s by Acorn • Spun off from Acorn in 1990 • Released ARM6 in early 1992 • … • As of 2013, the ARM architecture is the most widely used 32-bit ISA in terms of quantity produced • In 2010 alone, 6.1 billion ARM-based processors shipped, representing • 95% of smartphones • 35% of digital TVs and set-top boxes • 10% of mobile computers Source: Wikipedia
ARM Architecture • ARM is a RISC (Reduced Instruction Set Computer) architecture • The x86 ISA is CISC (Complex Instruction Set Computer), even though modern x86 processors internally translate instructions into RISC-like micro-operations and pipeline them • Suitable for embedded systems • Very small die size (low price) • Low power consumption (longer battery life)
ARM Processor Portfolio Source: 2008 Embedded SW Insight Conference
Product Code • T: Thumb • T2: Thumb-2 Enhancement • D: Debug • M: Multiplier • I: Embedded ICE (In-Circuit Emulation) • E: Enhanced DSP Extension • J: Jazelle • Direct execution of 8-bit Java bytecode in hardware • S: Synthesizable core • Z: TrustZone
ARM Cortex Series • ARM Cortex-A family: • Applications processors for feature-rich OS and 3rd party applications • ARM Cortex-R family: • Embedded processors for real-time signal processing, control applications • ARM Cortex-M family: • Microcontroller-oriented processors for MCU, ASSP, and SoC applications • Figure ("Unparalleled Applicability"): Cortex lineup spanning from 12k gates at the low end to 2.5GHz at the high end: Cortex-A15 (x1-4), Cortex-A9 (x1-4), Cortex-A8, Cortex-A5 (x1-4), Cortex-R7 (1-2), Cortex-R5 (1-2), Cortex-R4, Cortex-M4, SC300, Cortex-M3, Cortex-M1, SC000, Cortex-M0 Source: ARM Processor Portfolio 2011
ARMv7-A • ACP: Accelerator Coherency Port • SCU: Snoop Control Unit Source: www.arm.com
ARM Processor Brief OOO: Out Of Order
ARM Instruction Overview • ARM is a RISC machine, so the instruction length is fixed • In ARM mode, instructions are 32 bits wide • In Thumb mode, instructions are 16 bits wide • Most ARM instructions can be conditionally executed • This means that they have their normal effect only if the N (Negative), Z (Zero), C (Carry) and V (Overflow) flags in the CPSR satisfy a condition specified in the instruction • If the flags do not satisfy this condition, the instruction acts as a NOP (No Operation) • In other words, the instruction has no effect and execution advances to the next instruction
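The conditional-execution rule above can be sketched in a few lines of Python. This is a minimal model, not an emulator: the condition mnemonics (EQ, NE, GE, ...) and their flag tests follow the ARM Architecture Reference Manual rather than the slide, and `condition_passed` and `execute_add` are hypothetical helper names chosen for illustration.

```python
# Minimal model of ARM conditional execution (illustrative only).
# Each predicate receives the CPSR flags N, Z, C, V as 0/1 values and
# returns True when the instruction should have its normal effect.
CONDITIONS = {
    "EQ": lambda n, z, c, v: z == 1,              # equal
    "NE": lambda n, z, c, v: z == 0,              # not equal
    "CS": lambda n, z, c, v: c == 1,              # carry set
    "CC": lambda n, z, c, v: c == 0,              # carry clear
    "MI": lambda n, z, c, v: n == 1,              # negative
    "PL": lambda n, z, c, v: n == 0,              # positive or zero
    "VS": lambda n, z, c, v: v == 1,              # overflow
    "VC": lambda n, z, c, v: v == 0,              # no overflow
    "HI": lambda n, z, c, v: c == 1 and z == 0,   # unsigned higher
    "LS": lambda n, z, c, v: c == 0 or z == 1,    # unsigned lower or same
    "GE": lambda n, z, c, v: n == v,              # signed >=
    "LT": lambda n, z, c, v: n != v,              # signed <
    "GT": lambda n, z, c, v: z == 0 and n == v,   # signed >
    "LE": lambda n, z, c, v: z == 1 or n != v,    # signed <=
    "AL": lambda n, z, c, v: True,                # always
}


def condition_passed(cond, n, z, c, v):
    """Return True if condition `cond` holds for the given flags."""
    return CONDITIONS[cond](n, z, c, v)


def execute_add(regs, cond, rd, rn, rm, flags):
    """Model ADD<cond> Rd, Rn, Rm on a list of 16 register values.

    `flags` is an (N, Z, C, V) tuple. When the condition fails, the
    instruction behaves like a NOP and the registers are left untouched.
    """
    if condition_passed(cond, *flags):
        regs[rd] = (regs[rn] + regs[rm]) & 0xFFFFFFFF
    return regs
```

For example, with Z set, `execute_add(regs, "EQ", 3, 1, 5, (0, 1, 0, 0))` writes R1 + R5 into R3, while the same call with Z clear leaves R3 unchanged.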
ARM Instructions • For the complete instruction set, refer to the “ARM Architecture Reference Manual” • We are going to cover essential and important instructions in this course • If you completely understand one CPU, it is pretty straightforward to understand other CPUs
Essential Instructions • Instruction categories • Data processing instructions: add, sub, cmp, and, orr • Memory access instructions: ldr, str • Branch instructions: b, bl • Miscellaneous instructions: • Figure: a real PC system (CPU, FSB (Front-Side Bus), North Bridge, main memory (DDR), DMI (Direct Media I/F), South Bridge) simplified to an ARM CPU connected to memory (instructions, data) through an address bus and a data bus
A Memory Hierarchy DDR3 HDD 2nd Gen. Core i7 (2011)
A Memory Hierarchy • Levels, from higher to lower: CPU core register file, L1I (instruction) and L1D (data) caches, L2, L3 (all on-chip components), main memory (DRAM), secondary storage (disk) • Speed (cycles), from higher to lower level: ½’s, 1’s, 10’s, 100’s, 10,000’s • Size (bytes), from higher to lower level: 100’s, 10K’s, M’s, G’s, T’s • Cost: highest to lowest
ARM Registers • ARM has 31 general purpose registers and 6 status registers (32-bit each)
ARM Registers • Unbanked registers: R0 ~ R7 • Each of them refers to the same 32-bit physical register in all processor modes. • They are completely general-purpose registers, with no special uses implied by the architecture • Banked registers: R8 ~ R14 • R8 ~ R12 have no dedicated special purposes • FIQ mode has dedicated registers for fast interrupt processing • R13 and R14 are dedicated for special purposes for each mode
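The register counts quoted on the previous slide (31 general-purpose and 6 status registers) follow from the banking rules above. The sketch below tallies them under some assumptions not spelled out in the slides: the seven classic processor modes (the short names `usr`, `sys`, `svc`, `abt`, `und`, `irq`, `fiq` are labels chosen here), R15 counted as a shared general-purpose register, and one SPSR per exception mode. `physical_register` is a hypothetical helper name.

```python
# Tally the physical registers implied by ARM register banking
# (classic ARM without security extensions; illustrative only).
MODES = ["usr", "sys", "svc", "abt", "und", "irq", "fiq"]


def physical_register(mode, n):
    """Return a label for the physical register behind Rn in `mode`."""
    if n <= 7 or n == 15:
        return f"R{n}"                      # shared by all modes
    if 8 <= n <= 12:
        # Only FIQ mode has its own copies of R8 ~ R12.
        return f"R{n}_fiq" if mode == "fiq" else f"R{n}"
    # R13 (SP) and R14 (LR): User and System share one copy,
    # every other mode has its own.
    bank = "usr" if mode in ("usr", "sys") else mode
    return f"R{n}_{bank}"


general_purpose = {physical_register(m, n) for m in MODES for n in range(16)}

# CPSR plus one SPSR for each exception mode (not User or System).
status = {"CPSR"} | {f"SPSR_{m}" for m in MODES if m not in ("usr", "sys")}
```

`len(general_purpose)` and `len(status)` can then be compared against the 31 and 6 stated earlier in the deck.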
R13, R14, and R15 • Some registers in ARM are used for special purposes • R15 == PC (Program Counter) • x86 uses a terminology called IP (Instruction Pointer) • R14 == LR (Link Register) • R13 == SP (Stack Pointer)
ARM9 Register File • The set of architectural (programmer-visible) registers inside the CPU is called the register file • A register file can be implemented with flip-flops or SRAM • The ARM9 register file has 16 32-bit registers • 3 read ports • 2 write ports • Register file access is much faster than main memory or cache access because there are very few registers and they reside inside the CPU • So, compilers strive to keep values in the register file when translating high-level code to assembly code • Figure: register file (R0 ~ R15, 32 bits each) with 4-bit address inputs src1 addr, src2 addr, src3 addr, dst1 addr, dst2 addr; 32-bit outputs src1 data, src2 data, src3 data; 32-bit inputs write1 data, write2 data; and control signals write1, write2
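A behavioral sketch of such a register file, assuming 16 registers, three read ports and two write ports as listed above. The class and method names are hypothetical, and the choice that write port 2 wins when both ports target the same register is an assumption of this sketch, not something the slides specify.

```python
class RegisterFile:
    """Behavioral model of a 16 x 32-bit register file (illustrative only)."""

    def __init__(self):
        self.regs = [0] * 16

    def read(self, src1, src2, src3):
        """Three read ports: return the values at three 4-bit addresses."""
        return self.regs[src1], self.regs[src2], self.regs[src3]

    def write(self, dst1=None, data1=0, dst2=None, data2=0):
        """Two write ports: a port is enabled when its address is not None.

        Assumption: if both ports address the same register, port 2 wins.
        """
        for dst, data in ((dst1, data1), (dst2, data2)):
            if dst is not None:
                self.regs[dst] = data & 0xFFFFFFFF   # keep 32 bits
```

A single "cycle" in this model is one `read` call followed by one `write` call, for example reading R1, R5 and R2, then writing a result to R3 through the first port only.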
CPSR • The Current Program Status Register (CPSR) is accessible in all modes • Contains all condition flags, the interrupt disable bits, and the current processor mode
CPSR bits • ARM: 32-bit mode • Thumb: 16-bit mode • Jazelle: Special mode for Java acceleration
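Since the bit-layout figure is not reproduced here, the following sketch decodes a CPSR value using the bit positions from the ARM Architecture Reference Manual: N, Z, C, V in bits 31 to 28, J in bit 24, I and F in bits 7 and 6, T in bit 5, and the mode in bits 4 to 0. The function name `decode_cpsr` and the dictionary keys are hypothetical.

```python
# Mode field encodings (CPSR bits 4:0) for the classic ARM modes.
MODE_NAMES = {
    0b10000: "User",
    0b10001: "FIQ",
    0b10010: "IRQ",
    0b10011: "Supervisor",
    0b10111: "Abort",
    0b11011: "Undefined",
    0b11111: "System",
}


def decode_cpsr(cpsr):
    """Split a 32-bit CPSR value into named fields (illustrative only)."""
    j = (cpsr >> 24) & 1
    t = (cpsr >> 5) & 1
    # J and T together select the instruction set state.
    state = {(0, 0): "ARM", (0, 1): "Thumb", (1, 0): "Jazelle"}.get(
        (j, t), "other"
    )
    return {
        "N": (cpsr >> 31) & 1,
        "Z": (cpsr >> 30) & 1,
        "C": (cpsr >> 29) & 1,
        "V": (cpsr >> 28) & 1,
        "irq_disabled": bool((cpsr >> 7) & 1),
        "fiq_disabled": bool((cpsr >> 6) & 1),
        "state": state,
        "mode": MODE_NAMES.get(cpsr & 0x1F, "unknown"),
    }
```

For instance, `decode_cpsr(0x600000D3)` describes Supervisor mode in ARM state with both interrupt types disabled and the Z and C flags set.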
ARM Instruction Format Arithmetic and Logical Instructions Memory Access Instructions (Load/Store) Branch Instructions Software Interrupt Instruction
ARM Instruction Fields (32-bit instructions) • opcode: operation code • Rn (4 bits): first source register • Rm (4 bits): second source register • Rs (4 bits): third source register • Rd (4 bits): destination register • shift (2 bits): shift type* • shift amount (5 bits): shift by how many bits * Shift type: arithmetic, logical (left, right)
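To make the fields concrete, here is a sketch that packs and unpacks them for one format only: a data-processing instruction whose second operand is a register shifted by an immediate amount. The bit positions come from the ARM Architecture Reference Manual, not from the slide; the Rs field (register-specified shift) is left out, and the function names are hypothetical.

```python
def encode_data_processing(cond, opcode, s, rn, rd, shift_amount, shift_type, rm):
    """Pack fields into a 32-bit data-processing instruction word.

    Covers only the register operand with immediate shift form
    (bits 27:25 and bit 4 are zero). Illustrative only.
    """
    return (
        (cond << 28)            # condition, bits 31:28
        | (opcode << 21)        # operation code, bits 24:21
        | (s << 20)             # update flags, bit 20
        | (rn << 16)            # first source register, bits 19:16
        | (rd << 12)            # destination register, bits 15:12
        | (shift_amount << 7)   # shift amount, bits 11:7
        | (shift_type << 5)     # shift type, bits 6:5
        | rm                    # second source register, bits 3:0
    )


def decode_data_processing(word):
    """Unpack the same fields from a 32-bit instruction word."""
    return {
        "cond": (word >> 28) & 0xF,
        "opcode": (word >> 21) & 0xF,
        "S": (word >> 20) & 0x1,
        "Rn": (word >> 16) & 0xF,
        "Rd": (word >> 12) & 0xF,
        "shift_amount": (word >> 7) & 0x1F,
        "shift_type": (word >> 5) & 0x3,
        "Rm": word & 0xF,
    }


# add R3, R1, R5 with the "always" condition (0b1110) and ADD opcode (0b0100)
ADD_R3_R1_R5 = encode_data_processing(0b1110, 0b0100, 0, 1, 3, 0, 0, 5)
```

Decoding `ADD_R3_R1_R5` returns Rn = 1, Rd = 3 and Rm = 5, matching the operand order of the assembly form.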
Overview of ARM Operation • ARM arithmetic in assembly form add R3, R1, R5 # R3 = R1 + R5 • R1 and R5 are source operands, and R3 is the destination • # indicates a comment in these slides, so the assembler ignores the text after it (the GNU assembler for ARM uses @ and ARM's armasm uses ; for this) • Operands of arithmetic instructions come from special locations inside the CPU called registers, or from the immediate field of the instruction • All CPUs (x86, PowerPC, MIPS, ARM…) have registers inside • Registers are visible to programmers • ARM has a register file with 16 registers (R0 ~ R15) visible at any one time
Simplified Version of CPU Internals • add R3, R1, R5 # R3 = R1 + R5 • Figure: the ARM CPU fetches add R3, R1, R5 from memory over the address bus and data bus, reads R1 and R5 from its 32-bit registers (R0 ~ R15), adds them, and writes the sum to R3
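The data flow in this slide can be imitated with a toy interpreter that works on the assembly text directly. It is a sketch under stated assumptions: only three-register `add` and `sub` are handled, `#` starts a comment as in the slides, and the function name `execute` is hypothetical.

```python
def execute(instruction, regs):
    """Execute one 'op Rd, Rn, Rm' instruction on 16 register values.

    Toy model: supports only add and sub with three register operands.
    """
    text = instruction.split("#", 1)[0]          # drop the slide-style comment
    mnemonic, operands = text.strip().split(None, 1)
    # "R3, R1, R5" -> 3, 1, 5
    rd, rn, rm = (int(name.strip()[1:]) for name in operands.split(","))
    operations = {
        "add": lambda a, b: a + b,
        "sub": lambda a, b: a - b,
    }
    # Results wrap around to 32 bits, like a real 32-bit register.
    regs[rd] = operations[mnemonic.lower()](regs[rn], regs[rm]) & 0xFFFFFFFF
    return regs
```

With R1 and R5 preloaded, `execute("add R3, R1, R5 # R3 = R1 + R5", regs)` leaves their sum in `regs[3]` and every other register unchanged.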
ARM Processor Family Source: Wikipedia
NEON & VFP www.arm.com
Register Mapping • NEON Advanced SIMD and VFP use the same register set
NEON • Advanced SIMD (Single Instruction Multiple Data) • It supports 8-, 16-, 32- and 64-bit integer and single-precision (32-bit) floating point data • Up to 16 operations at the same time (sixteen 8-bit lanes in one 128-bit register) • 1B x 16 = 16B (= 1 quad word) http://en.wikipedia.org/wiki/ARM_architecture
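The "16 operations at the same time" case corresponds to one instruction working on sixteen 8-bit lanes of a quad-word register. The sketch below only imitates the arithmetic of such a lane-wise add (comparable to NEON's `VADD.I8` on Q registers) with a Python loop; it does not run in parallel, and the function name `vadd_i8` is chosen here for illustration.

```python
def vadd_i8(a, b):
    """Lane-wise add of two 16-byte vectors.

    Each of the 16 lanes is added independently and wraps modulo 256,
    so a carry never spills into the neighbouring lane.
    """
    if len(a) != 16 or len(b) != 16:
        raise ValueError("both operands must be 16 bytes (one quad word)")
    return bytes((x + y) & 0xFF for x, y in zip(a, b))
```

Calling `vadd_i8(bytes(range(16)), bytes([250] * 16))` shows the per-lane wraparound: lanes holding 6 or more overflow past 255 and restart from 0 without affecting the other lanes.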
VFP (Vector Floating Point) • FPU (Floating Point Unit) coprocessor extension to ARM architecture • Single-precision and double-precision FP computation • Compliant with IEEE 754-1985 • Intended to support execution of short “vector mode” instructions, but operated on “each” vector element sequentially • Thus, did not offer the performance of true SIMD • This vector mode was thus removed shortly after its introduction, to be replaced with the much more powerful NEON Advanced SIMD http://en.wikipedia.org/wiki/ARM_architecture
ARM Processor Selector www.arm.com