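Andes (晶心科技) SoC Development Solution Training Course (For University). Outline. Introduction to ANDES in-house developed processors. Introduction to the ADP-XC5FF76 Evaluation Board. AndesCore instruction set architecture. Operating the AndeSight integrated development environment. Principles of embedded software programming. Hello World. GPIO control principles. SUM control principles. MP3. ADP-XC5FF76 Evaluation Board Totally Labs. Reference design: developing a digital photo frame with AndESLive.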
Andes (晶心科技) SoC Development Solution Training Course (For University)
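Outline • Introduction to ANDES in-house developed processors • Introduction to the ADP-XC5FF76 Evaluation Board • AndesCore instruction set architecture • Operating the AndeSight integrated development environment • Principles of embedded software programming • Hello World • GPIO control principles • SUM control principles • MP3 • ADP-XC5FF76 Evaluation Board Totally Labs. • Reference design: developing a digital photo frame with AndESLive ANDES Confidential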
Introduction to ANDES In-House Developed Processors
Introduction • What are embedded systems? • Challenges in embedded system design. • Design methodologies.
Embedding a computer [Figure: an embedded computer, consisting of a CPU and memory (mem), connected to an analog input and an analog output.]
Embedding a computer (cont.) Examples • Personal digital assistant (PDA). • Printer. • GPS. • Cell phone. • Automobile: engine, brakes, etc. • Television. • Household appliances.
Connect Your Life [Figure: devices in the Consumer Electronics domain and the PC domain connected through the Internet. Devices shown: BT stereo headset, BT keyboard and mouse, media center PC, PC, notebook PC, digital cable-ready TV, HDTV, cable STB, DVD+PVR, media player, MP3 player, game system, camera, printer, home stereo, media phone, IP phone, VoIP phone, mobile phone, 802.11 router, network storage. Labels also include MAC and MCU.]
Characteristics of Embedded Systems • Sophisticated functionality. • Real-time operation. • Low manufacturing cost. • Low power. • Designed to tight deadlines by small teams.
Design methodologies • A procedure for designing a system. • Understanding your methodology helps you ensure you didn’t skip anything. • Compilers, software engineering tools, computer-aided design (CAD) tools, etc., can be used to:
SoC Development Flow [Figure: the flow starts from SoC definition and splits into a SW track and a HW track (high-level modeling, logic design). The Andes/partners' solution provides target SW, toolchains (compiler, assembler/linker, debugger), the Andes Virtual Platform (AndesCore and essential IP models), and the evaluation board (AndesCore SoC and essential IPs). The customer's design adds application models to "Your Virtual SoC" and application IPs to the customer SoC. Figure note: AICE™, ADP-AG101™, and ADP™-XC5 added in v1.3.1.]
Summary • Embedded computers are all around us. • Many systems have complex embedded hardware and software. • Embedded systems pose many design challenges: design time, deadlines, power, etc. • Design methodologies help us manage the design process.
Overview of Andes Technology • Andes highlights: founded in March 2005; first-tier investors and partners (government VC, MediaTek, and Faraday); US$20M capital for financial stability • Andes' mission: provide the best processor-based SoC solution • Market opportunities: demand for multi-standard, multi-function products across applications, driven by device convergence in consumer electronics; the BRIC countries demand large volumes of low-cost products; Asia is a fast-growing market, and worldwide IC design is moving to Asia
Andes' Main Lines of Business (Andes Embedded™) • AndeStar™: Andes 16/32-bit mixable ISA • AndesCore™: CPU core family • AndESLive™: ESL integrated virtual environment • AndeShape™: SoC + EVB + ICE • AndeSight™: integrated development environment • AndeSoft™: optimized target SW such as Linux/RTOS, middleware, and application software
Feature Set • AndeSight™: coder; debugger; profiler; target manager • Toolchains: compiler; assembler; linker; debugger • AndESLive™: pre-built models of AndesCore™; pre-built peripheral IPs and bus; virtual SoC builder; visibility of debugging; simulation of I/O devices
AndesCore™ Market Segments • High-end N12 series: MID/netbook; MFP; networking; gateway/router; home entertainment; smartphone/mobile phone • Mid-range N10 series: portable audio/media player; DVB/DMB baseband; DVD; DSC; toys, games • Low-end N9 series: MCU; storage; automotive control; toys
AndesCore™ – Configurable Options • Instruction extensions: audio extensions; performance extensions; floating-point coprocessor; string processing acceleration; user-defined extensions • Debugging support: embedded debug module with HW breakpoints; embedded program tracer; embedded performance monitor • Core: big/little endian; static/dynamic branch prediction; BTB size 32/64/128/256 entries; 2/3 nested interrupt levels; 16/32 GPRs; 2R1W/3R2W register file • Cache: instruction queue size 2/4/8; 8KB ~ 64KB, 1/2/4 ways; 16B/32B cache line size; replacement policy pseudo-LRU or random • Local memory: internal or external, 4KB ~ 1MB • Memory management: simplest 2/4 partitions; MPU with 8 segments; MMU (microTLB size 4/8 entries; mainTLB size 32/64/128 entries; page table walking in hardware or software) • Bus interfaces: AHB/AHB-Lite/APB/AMI; HSMP bus
N903: Low-power Cost-efficient Embedded Controller • Features: Harvard architecture, 5-stage pipeline; 16 general-purpose registers; static branch prediction; fast MAC; hardware divider; fully clock-gated pipeline; 2-level nested interrupt; external instruction/data local memory interface; instruction/data cache; APB/AHB/AHB-Lite/AMI bus interface; power management instructions; 45K ~ 110K gate count; 250MHz @ 130nm • Applications: MCU; storage; automotive control; toys [Figure: block diagram with JTAG/EDM, the N9 uCore, instruction LM/IF, instruction cache, data cache, data LM/IF, and an external bus interface (APB/AHB/AHB-Lite/AMI).]
N903 Competition *TSMC free library with max speed synthesis constraint
N1033A: Low-power Cost-efficient Application Processor • Features: Harvard architecture, 5-stage pipeline; 32 general-purpose registers; dynamic branch prediction; fast MAC; hardware divider; audio acceleration instructions; fully clock-gated pipeline; 3-level nested interrupt; instruction/data local memory; instruction/data cache; DMA support for 1-D and 2-D transfer; AHB/AHB-Lite/APB bus; MMU/MPU; power management instructions • Applications: portable audio/media player; DVB/DMB baseband; DVD; DSC; toys, games
N1033A Competition *TSMC free library with max speed synthesis constraint
N1213 – High Performance Application Processor • Features: Harvard architecture, 8-stage pipeline; 32 general-purpose registers; dynamic branch prediction; multiply-add and multiply-subtract instructions; divide instructions; instruction/data local memory; instruction/data cache; MMU; AHB or HSMP (AXI-like) bus; power management instructions • Applications: portable media player; MFP; networking; gateway/router; home entertainment; smartphone/mobile phone [Figure: block diagram with JTAG/EDM, EPT I/F, the N12 execution core, MMU with ITLB and DTLB, instruction cache, instruction LM, data LM, data cache, DMA, and an external bus interface (AHB, HSMP).]
N1213 Competition *TSMC free library with max speed synthesis constraint
Pipeline Overview
Computer architecture taxonomy • von Neumann architecture
Computer architecture taxonomy (cont.) • Harvard architecture [Figure: the CPU, with its PC, connects to a data memory and to a separate program memory, each over its own address and data lines.]
8-stage pipeline
Instruction Fetch Stage • F1 – Instruction Fetch First: instruction tag/data arrays; ITLB address translation; branch target buffer prediction • F2 – Instruction Fetch Second: instruction cache hit detection; cache way selection; instruction alignment [Figure: pipeline stage diagram with labels IF1, IF2, ID, RF, AG, DA1, DA2, WB, EX, MAC1, MAC2.]
Instruction Issue Stage • I1 – Instruction Issue First / Instruction Decode: 32/16-bit instruction decode; return address stack prediction • I2 – Instruction Issue Second / Register File Access: instruction issue logic; register file access
Execution Stage • E1 – Instruction Execute First / Address Generation / MAC First: data access address generation; multiply operation (if a MAC is present) • E2 – Instruction Execute Second / Data Access First / MAC Second / ALU Execute: ALU; branch/jump/return resolution; data tag/data arrays; DTLB address translation; accumulation operation (if a MAC is present) • E3 – Instruction Execute Third / Data Access Second: data cache hit detection; cache way selection; data alignment
Write Back Stage • E4 – Instruction Execute Fourth / Write Back: interruption resolution; instruction retire; register file write back
Instruction Fetch Unit • F1 – Instruction Fetch First: instruction tag/data arrays; ITLB address translation; branch target buffer prediction • F2 – Instruction Fetch Second: instruction cache hit detection; cache way selection; instruction alignment
Branch Prediction Overview • Why is branch prediction required? A deep pipeline is required for high speed. Increasing the number of stages between fetch and branch resolution increases the taken-branch penalty. Prediction allows the penalty to be avoided in the majority of cases. • Why dynamic branch prediction? Static branch prediction requires knowledge of the type of branch and the target address before a prediction can be made. This information is not available before the decode stage, so every branch would still pay an increased penalty. Dynamic branch prediction is performed at the instruction fetch stage based purely on fetch addresses; no knowledge of the incoming instructions is required.
Branch Prediction Unit • Branch Target Buffer (BTB): 128 entries of 2-bit saturating counters (strongly-taken, weakly-taken, weakly-not-taken, strongly-not-taken); 128 entries, each with a 32-bit predicted PC and a 26-bit address tag; call-return and alignment flags • Return Address Stack (RAS): four entries • BTB and RAS are updated by committing branches/jumps
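The behavior of a single 2-bit saturating counter can be sketched in a few lines of Python. This is an illustrative model only: the numeric encoding of the four states, the initial state, and the function names are assumptions, not details of the N12 hardware.

```python
# The four states of a 2-bit saturating counter, encoded 0..3 (encoding assumed).
STATES = ["strongly-not-taken", "weakly-not-taken", "weakly-taken", "strongly-taken"]

def predict(counter):
    """Predict taken when the counter is in one of the two 'taken' states."""
    return counter >= 2

def update(counter, taken):
    """Move one step toward the actual outcome, saturating at 0 and 3."""
    return min(counter + 1, 3) if taken else max(counter - 1, 0)

def count_mispredictions(outcomes, counter=1):
    """Replay a sequence of branch outcomes and count wrong predictions."""
    misses = 0
    for taken in outcomes:
        if predict(counter) != taken:
            misses += 1
        counter = update(counter, taken)
    return misses

# A loop branch: taken 9 times, then not taken at loop exit; the loop runs twice.
loop_outcomes = ([True] * 9 + [False]) * 2
mispredictions = count_mispredictions(loop_outcomes)
print(mispredictions, "mispredictions in", len(loop_outcomes), "branches")
```

Because the counter needs two wrong outcomes in a row to flip its prediction, the single not-taken outcome at loop exit does not disturb the prediction for the next run of the loop.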
BTB Instruction Prediction • BTB predictions are based on the previous PC rather than on actual instruction decoding information, so the BTB may make the following two mistakes: • Wrongly predicting a non-branch/jump instruction as a branch/jump instruction • Wrongly predicting the instruction boundary (32-bit -> 16-bit) • If either case is detected, the IFU triggers a BTB instruction misprediction in the I1 stage and restarts the program sequence from the recovered PC, which introduces a 2-cycle penalty.
RAS Prediction • When return instructions are present in the instruction sequence, RAS predictions are performed and the fetch sequence is changed to the predicted PC. • Because the RAS prediction is performed in the I1 stage, return instructions incur a 2-cycle penalty: the sequential fetches in between are not used.
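A return address stack can be modeled as a small bounded stack: calls push the address to return to, and returns pop the predicted target. The sketch below assumes the four-entry depth stated for the N12; the class name and the overflow policy (dropping the oldest entry) are assumptions for illustration.

```python
class ReturnAddressStack:
    """Bounded stack of predicted return addresses."""

    def __init__(self, depth=4):
        self.depth = depth
        self.entries = []

    def push(self, return_pc):
        """On a call, record the return address; drop the oldest entry when full (assumed policy)."""
        if len(self.entries) == self.depth:
            self.entries.pop(0)
        self.entries.append(return_pc)

    def pop(self):
        """On a return, predict the most recent return address, or None if the stack is empty."""
        return self.entries.pop() if self.entries else None

# Five nested calls overflow a four-entry stack, so the outermost return is not predicted.
ras = ReturnAddressStack()
for pc in (0x100, 0x200, 0x300, 0x400, 0x500):
    ras.push(pc)
predictions = [ras.pop() for _ in range(5)]
print([hex(p) if p is not None else None for p in predictions])
```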
Branch Misprediction • In the N12 processor core, branch/return instructions are resolved by the ALU in the E2 stage, and the result is used by the IFU in the next (F1) stage. The misprediction penalty is therefore 5 cycles.
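The effect of this penalty on throughput can be estimated as an average stall per instruction. The sketch below uses the 5-cycle penalty stated above; the branch fraction and misprediction rate are hypothetical workload figures, not measurements of the N12.

```python
def branch_stall_cycles(branch_fraction, mispredict_rate, penalty_cycles=5):
    """Average stall cycles added per instruction by mispredicted branches."""
    return branch_fraction * mispredict_rate * penalty_cycles

# Hypothetical workload: 20% of instructions are branches, 10% of those are mispredicted.
stall = branch_stall_cycles(0.20, 0.10)
print(f"extra cycles per instruction: {stall:.2f}")
```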
Cache
Cache and CPU [Figure: the CPU sends addresses to a cache controller, which sits between the cache and main memory; data moves between the CPU, the cache, and main memory.]
Cache operation • Many main memory locations are mapped onto one cache entry. • May have caches for: instructions; data; data + instructions (unified).
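The many-to-one mapping is easiest to see in a direct-mapped cache, where the entry is chosen by the address alone. The following is a minimal sketch; the line size, the number of lines, and the function names are assumed values for illustration.

```python
LINE_SIZE = 16    # bytes per cache line (assumed)
NUM_LINES = 128   # number of cache entries (assumed)

def cache_index(address):
    """Direct-mapped placement: drop the byte offset, then wrap by the number of lines."""
    return (address // LINE_SIZE) % NUM_LINES

def cache_tag(address):
    """The remaining upper address bits tell apart the locations that share an entry."""
    return address // (LINE_SIZE * NUM_LINES)

# Addresses that are a multiple of LINE_SIZE * NUM_LINES (2 KB) apart share one entry.
for address in (0x0000, 0x0800, 0x1000):
    print(hex(address), "-> index", cache_index(address), "tag", cache_tag(address))
```

All three addresses land on index 0 and differ only in the tag, which is why the cache must store the tag next to the data.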
Multiple levels of cache [Figure: the CPU connects to an L1 cache, which connects to an L2 cache.]
Replacement policy • Replacement policy: strategy for choosing which cache entry to throw out to make room for a new memory location. • Two popular strategies: random; least-recently used (LRU).
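The two strategies can be compared by replaying a block access trace against a small fully associative cache and counting misses. This is a simplified software model, not the hardware mechanism; the trace, the capacity, and the function names are illustrative assumptions.

```python
import random
from collections import OrderedDict

def misses_lru(trace, capacity):
    """Count misses when the least recently used block is evicted."""
    cache = OrderedDict()
    misses = 0
    for block in trace:
        if block in cache:
            cache.move_to_end(block)          # mark as most recently used
        else:
            misses += 1
            if len(cache) == capacity:
                cache.popitem(last=False)     # evict the least recently used block
            cache[block] = True
    return misses

def misses_random(trace, capacity, seed=0):
    """Count misses when a randomly chosen block is evicted."""
    rng = random.Random(seed)
    cache = []
    misses = 0
    for block in trace:
        if block not in cache:
            misses += 1
            if len(cache) == capacity:
                cache.pop(rng.randrange(capacity))
            cache.append(block)
    return misses

trace = [1, 2, 3, 1, 4, 1, 2]
lru_result = misses_lru(trace, capacity=3)
random_result = misses_random(trace, capacity=3)
print("LRU misses:", lru_result, "random misses:", random_result)
```

LRU keeps block 1 resident because it is reused often; random replacement may or may not, depending on the seed.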
Write operations • Write-through: immediately copy write to main memory. • Write-back: write to main memory only when location is removed from cache.
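The difference in memory traffic shows up when the same cached location is written repeatedly. A minimal model of one cache line with a dirty bit, where the class and method names are illustrative assumptions:

```python
class OneLineCache:
    """Model of a single cached line that counts writes reaching main memory."""

    def __init__(self, policy):
        if policy not in ("write-through", "write-back"):
            raise ValueError(policy)
        self.policy = policy
        self.dirty = False
        self.memory_writes = 0

    def store(self):
        """A store that hits the cached line."""
        if self.policy == "write-through":
            self.memory_writes += 1   # copy to main memory immediately
        else:
            self.dirty = True         # defer the write until eviction

    def evict(self):
        """Remove the line from the cache, writing it back if it was modified."""
        if self.policy == "write-back" and self.dirty:
            self.memory_writes += 1
            self.dirty = False

results = {}
for policy in ("write-through", "write-back"):
    line = OneLineCache(policy)
    for _ in range(10):
        line.store()
    line.evict()
    results[policy] = line.memory_writes
print(results)
```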
Improving Cache Performance • Goal: reduce the Average Memory Access Time (AMAT) • AMAT = Hit Time + Miss Rate × Miss Penalty • Approaches: reduce hit time; reduce miss penalty; reduce miss rate • Notes: there may be conflicting goals; keep track of clock cycle time, area, and power consumption
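A short worked example of the AMAT formula follows. The cycle counts and miss rates are hypothetical values chosen for easy arithmetic, not figures for any Andes core.

```python
def amat(hit_time, miss_rate, miss_penalty):
    """Average memory access time, in the same unit as hit_time and miss_penalty."""
    return hit_time + miss_rate * miss_penalty

# Hypothetical cache: 1-cycle hit, 5% miss rate, 20-cycle miss penalty.
baseline = amat(hit_time=1, miss_rate=0.05, miss_penalty=20)
halved_miss_rate = amat(hit_time=1, miss_rate=0.025, miss_penalty=20)
halved_penalty = amat(hit_time=1, miss_rate=0.05, miss_penalty=10)
print(baseline, halved_miss_rate, halved_penalty)
```

Halving the miss rate and halving the miss penalty give the same improvement here, because the two factors enter the formula as a product.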
Tuning Cache Parameters • Size: must be large enough to fit the working set (temporal locality); if too big, hit time degrades • Associativity: needs to be large to avoid conflicts, but 4-8 way is as good as fully associative (FA); if too big, hit time degrades • Block: needs to be large to exploit spatial locality and reduce tag overhead; if too large, there are few blocks ⇒ higher misses and miss penalty • A configurable architecture allows designers to make the best performance/cost trade-offs
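The tag overhead mentioned for block size can be quantified. The sketch below assumes 32-bit physical addresses, power-of-two sizes, and one tag per line, and it ignores valid, dirty, and replacement bits; the function names are illustrative.

```python
def tag_bits(cache_bytes, ways, address_bits=32):
    """Tag width: the address bits left after the index and byte-offset bits."""
    bytes_per_way = cache_bytes // ways
    index_and_offset_bits = bytes_per_way.bit_length() - 1   # log2 for powers of two
    return address_bits - index_and_offset_bits

def tag_overhead(cache_bytes, ways, line_bytes, address_bits=32):
    """Tag storage as a fraction of data storage."""
    return tag_bits(cache_bytes, ways, address_bits) / (8 * line_bytes)

# A 16KB, 4-way cache with 16-byte and with 32-byte lines.
small_lines = tag_overhead(16 * 1024, ways=4, line_bytes=16)
large_lines = tag_overhead(16 * 1024, ways=4, line_bytes=32)
print(f"16B lines: {small_lines:.1%}  32B lines: {large_lines:.1%}")
```

Doubling the line size halves the relative tag storage, while raising associativity at a fixed cache size widens each tag by one bit per doubling.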
Cache configuration • Cache lines per way: 128/256/512/1024 • Cache ways: 2/4 ways • Cache line size: 16B/32B • Cache size combination: 8KB/16KB/32KB/64KB • Replacement policy: pseudo-LRU (default), 3 bits per cache line; random, 2 bits per cache line
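The listed cache sizes follow from the other three parameters if total size is taken as lines per way × ways × line size. The sketch below rests on that assumption and enumerates which combinations yield one of the listed sizes; the variable names are illustrative.

```python
from itertools import product

LINES_PER_WAY = (128, 256, 512, 1024)
WAYS = (2, 4)
LINE_BYTES = (16, 32)
SIZES_KB = (8, 16, 32, 64)

def cache_size_kb(lines_per_way, ways, line_bytes):
    """Total data capacity in KB, assuming size = lines per way x ways x line size."""
    return lines_per_way * ways * line_bytes // 1024

valid = [
    (lines, ways, line_bytes)
    for lines, ways, line_bytes in product(LINES_PER_WAY, WAYS, LINE_BYTES)
    if cache_size_kb(lines, ways, line_bytes) in SIZES_KB
]

for lines, ways, line_bytes in valid:
    print(f"{lines:4d} lines x {ways} ways x {line_bytes}B = "
          f"{cache_size_kb(lines, ways, line_bytes)}KB")
```

Under this assumption the smallest combination (128 lines, 2 ways, 16B) gives 4KB and the largest (1024 lines, 4 ways, 32B) gives 128KB, so both fall outside the listed 8KB to 64KB range.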
Cache control: CCTL instruction • I-cache control: fill and lock; unlock; invalidate; read/write tag; read/write word data • D-cache control: invalidate; write back; read/write tag; read/write word data
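Cache data flow [Figure: the CPU issues instruction fetches to the I-Cache and loads and stores to the D-Cache. External memory supplies I-Cache refills and D-Cache refills and serves uncached instruction/data accesses. Write-backs from the D-Cache, uncached writes, and write-through writes go to external memory.]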
Memory Management Units (MMU)
MMU Functionality • Memory management unit (MMU) translates addresses [Figure: the CPU sends a logical address to the memory management unit, which outputs a physical address.]
MMU Functionality • Virtual memory addressing: better memory allocation, less fragmentation; allows shared memory; dynamic loading • Memory protection (read/write/execute): different permission flags for kernel/user mode; the OS typically runs in kernel mode; applications run in user mode • Cache control (cached/uncached): accesses to peripherals and other processors need to be uncached
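The three MMU roles (translation, protection, and cache control) can be combined in one small software model. This is a conceptual sketch only: the page size, the page table contents, and all names are hypothetical and do not describe the Andes MMU or its page table format.

```python
PAGE_SIZE = 4096  # assumed page size

class PageFault(Exception):
    """Raised for an unmapped page or a permission violation."""

# Hypothetical page table:
# virtual page number -> (physical page number, user permissions, kernel permissions, cached)
PAGE_TABLE = {
    0x00010: (0x00200, "rx", "rwx", True),    # application code
    0x00011: (0x00201, "rw", "rw", True),     # application data
    0xF0000: (0x90000, "", "rw", False),      # peripheral registers: kernel only, uncached
}

def translate(vaddr, access, mode):
    """Return (physical address, cached flag) for an access of type 'r', 'w', or 'x'."""
    vpn, offset = divmod(vaddr, PAGE_SIZE)
    entry = PAGE_TABLE.get(vpn)
    if entry is None:
        raise PageFault(f"no mapping for {vaddr:#x}")
    ppn, user_perms, kernel_perms, cached = entry
    perms = kernel_perms if mode == "kernel" else user_perms
    if access not in perms:
        raise PageFault(f"{mode} mode may not '{access}' {vaddr:#x}")
    return ppn * PAGE_SIZE + offset, cached

paddr, cached = translate(0x10123, "r", "user")
print(hex(paddr), cached)
```

In this model a user-mode read of the peripheral page raises `PageFault`, while the same access in kernel mode succeeds and is marked uncached.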