OpenRISC-Based Embedded System Design April 3, 2007 Dynalith Systems www.dynalith.com
Agenda • Introduction to OpenCores • OpenRISC Architecture • OpenIDEA • Example
Introduction to OpenRISC • OpenRISC • Free open-source synthesizable RISC processor • Distributed by OpenCores, http://www.opencores.org • OpenRISC 1000 (or1k) • Target • Medium and high performance networking and embedded computer environments • Architecture • 32/64-bit load/store RISC architecture • Designed with emphasis on performance, simplicity, low power requirements and scalability • Architecture Definition • Instruction set, register set, cache management & coherency, memory model, exception model, addressing modes, operand conventions, application binary interface • Does not define implementation-specific details • Pipeline depth, cache organization, branch prediction, instruction timing, bus interface Reference: OpenRISC 1000 Architecture Manual
OpenRISC 1000 implementations • OpenRISC 1000 architecture: 32/64-bit • OpenRISC 1200 implementation: 32-bit • Naming (OpenRISC 1x00, e.g. OpenRISC 1200): the digits identify the OpenRISC 1000 family, which features are implemented, and how the implementation is configured • OR1200 • Entry-level 32-bit RISC processor • I/D cache, I/D MMU, tick timer, PIC, debug • Internet appliances, networking, handheld • OR1100 • Entry-level 32-bit DSP • I-cache, tick timer, PIC, debug • VoIP, modems, imaging • OR1400 • High-performance superscalar 64-bit RISC • I/D cache, I/D MMU, tick timer, PIC, debug, FPU, vector/DSP • Telecom, home entertainment • OR1500 • A limited-configuration SystemC implementation
OpenRISC 1200 • An implementation of the OR1K architecture • 32-bit scalar RISC • ORBIS32 instruction set • Harvard micro-architecture • Five-stage integer pipeline • Virtual memory support (MMU) • Separate IMMU and DMMU • Separate I-cache and D-cache • Included components • Debug unit • Tick timer • Programmable interrupt controller • Power management • Performance • 250 MIPS Dhrystone 2.1 @ 250 MHz (worst case) • 250 MMAC operations @ 250 MHz (worst case) • <500 mW or <1 W @ 250 MHz, 0.18 µm • <0.4 mm² @ 0.18 µm 6LM (excluding cache & memory)
OpenRISC 1200 • L1 Caches • I/D Cache (1 to 8 KB) • 1-way direct-mapped cache • MMU • With 1-way direct-mapped TLB • Harvard model • TLB, 16 to 64 entries • Power Management • Power save modes • Software controlled clock frequency • Interrupt wake-up • Dynamic clock gating for individual units • Interrupt Controller • 30 maskable interrupt sources • Tick Timer • Task scheduling • Time measurement • Interrupt generation • Single-run, restartable or continuous mode • Debug Unit • JTAG Test Access Port • Non-intrusive Realtime debug/trace for both CPU and System • Accessible via development interface • Links into GDB Reference: OpenRISC 1200 IP Core Specification
Architecture Details • Register Set • Instruction Set • Exception Model • Memory Model • Memory Management • Cache • Debug Unit • Performance Counter Unit • Power Management • Timer/PIC • Application Binary Interface
Register Set • Register set • Thirty-two or sixteen 32/64-bit general purpose registers • All other registers are special purpose registers • User-level/supervisor-level register • Multiple sets of GPRs (not implemented in or1200) • Special-purpose registers • 32 groups, up to 2048 registers in a group • l.mtspr/l.mfspr instruction Special purpose registers Reference: OpenRISC 1000 Architecture Manual
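The figures on this slide (32 groups of up to 2048 registers) imply a 16-bit SPR address made of a 5-bit group number and an 11-bit register index. The sketch below shows one way to pack and unpack such an address on the host. The placement of the group in the upper bits is an assumption to check against the OpenRISC 1000 Architecture Manual, and the helper names `spr_addr`, `spr_group` and `spr_index` are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed layout: group number in bits 15..11, register index in bits 10..0. */
enum { SPR_GROUP_BITS = 5, SPR_INDEX_BITS = 11 };

/* Pack a group number and a register index into one 16-bit SPR address. */
static inline uint16_t spr_addr(unsigned group, unsigned index)
{
    assert(group < (1u << SPR_GROUP_BITS));  /* 32 groups */
    assert(index < (1u << SPR_INDEX_BITS));  /* up to 2048 registers per group */
    return (uint16_t)((group << SPR_INDEX_BITS) | index);
}

/* Recover the group number from an SPR address. */
static inline unsigned spr_group(uint16_t addr)
{
    return (unsigned)addr >> SPR_INDEX_BITS;
}

/* Recover the register index from an SPR address. */
static inline unsigned spr_index(uint16_t addr)
{
    return (unsigned)addr & ((1u << SPR_INDEX_BITS) - 1u);
}
```

On the target, an address built this way would be the operand given to l.mtspr/l.mfspr; that step needs OpenRISC assembly and is not shown here.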
Instruction Set • Simple and uniform-length instruction format • ORBIS32 (or1200) • 32-bit wide, 32-bit boundary aligned, 32-bit data operation • 32-bit integer instructions • Basic DSP instructions • 32-bit load/store instructions • Program flow instructions • Special instructions • ORBIS64 • 32-bit wide, 32-bit boundary aligned, 64-bit data operation • 64-bit integer instructions • 64-bit load/store instructions • ORFPX32 • 32-bit wide, 32-bit boundary aligned, 32-bit data operation • Single-precision floating-point instructions • ORFPX64 • 32-bit wide, 32-bit boundary aligned, 64-bit data operation • Double-precision floating-point instructions • 64-bit load/store instructions • ORVDX64 • 32-bit wide, 32-bit boundary aligned, 8-, 16-, 32-, 64-bit data operation • Vector instructions • DSP instructions • Reserved opcodes for custom instructions
Instruction Set • Mnemonic format: prefix . instruction . postfix • Prefix (instruction set) • l : ORBIS • lf : ORFPX • lv : ORVDX • Postfix (precision/data width) • d : double • s : single • b : byte • h : half-word • n : nibble
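To make the prefix . instruction . postfix naming concrete, here is a small host-side sketch that splits a mnemonic such as lf.add.s into its three parts. The struct and function names are hypothetical, and the field sizes are arbitrary choices for illustration; nothing here comes from an existing OpenRISC tool.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

struct or1k_mnemonic {
    char prefix[4];        /* "l", "lf" or "lv" */
    char instruction[16];  /* e.g. "add" */
    char postfix[4];       /* "d", "s", "b", "h", "n", or "" when absent */
};

/* Copy len characters into dst; fail if the field is empty or does not fit. */
static int copy_field(char *dst, size_t cap, const char *src, size_t len)
{
    if (len == 0 || len >= cap)
        return -1;
    memcpy(dst, src, len);
    dst[len] = '\0';
    return 0;
}

/* Split "prefix.instruction[.postfix]"; returns 0 on success, -1 if malformed. */
static int or1k_split(const char *text, struct or1k_mnemonic *out)
{
    assert(text != NULL && out != NULL);

    const char *dot1 = strchr(text, '.');
    if (dot1 == NULL)
        return -1;                      /* no prefix separator */

    const char *rest = dot1 + 1;
    const char *dot2 = strchr(rest, '.');
    size_t ilen = dot2 ? (size_t)(dot2 - rest) : strlen(rest);

    if (copy_field(out->prefix, sizeof out->prefix, text, (size_t)(dot1 - text)) != 0)
        return -1;
    if (copy_field(out->instruction, sizeof out->instruction, rest, ilen) != 0)
        return -1;

    out->postfix[0] = '\0';             /* the postfix is optional */
    if (dot2 != NULL)
        return copy_field(out->postfix, sizeof out->postfix, dot2 + 1, strlen(dot2 + 1));
    return 0;
}
```

The sketch only separates the fields; it does not check that the prefix or postfix is one of the values listed on the slide.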
Exception Model • Reset vector • 0x100 • Bus error • Caused by a bus interface error • Bus error • Instruction/Data page fault • Caused by access to an invalid virtual address • Segmentation fault • Alignment • Caused by an unaligned access • Bus error • Range • Caused by access to an unavailable register
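As an illustration of how the exceptions listed here map to handler entry points, the sketch below tabulates vector offsets and computes a handler address. Only the reset offset (0x100) appears on the slide; the other offsets are an assumption based on the OpenRISC 1000 Architecture Manual and should be verified there. The names `or1k_vector`, `vector_address` and `vector_name` are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Only VEC_RESET is stated on the slide; the rest are assumed values. */
enum or1k_vector {
    VEC_RESET       = 0x100,
    VEC_BUS_ERROR   = 0x200,
    VEC_DPAGE_FAULT = 0x300,
    VEC_IPAGE_FAULT = 0x400,
    VEC_ALIGNMENT   = 0x600,
    VEC_RANGE       = 0xB00
};

/* Handler entry address = exception base address + vector offset. */
static inline uint32_t vector_address(uint32_t base, enum or1k_vector vec)
{
    assert((base & 0xFFFu) == 0);  /* base must not overlap the offset bits */
    return base + (uint32_t)vec;
}

/* Map a vector offset back to a readable name; NULL if it is not in the table. */
static inline const char *vector_name(uint32_t offset)
{
    switch (offset) {
    case VEC_RESET:       return "reset";
    case VEC_BUS_ERROR:   return "bus error";
    case VEC_DPAGE_FAULT: return "data page fault";
    case VEC_IPAGE_FAULT: return "instruction page fault";
    case VEC_ALIGNMENT:   return "alignment";
    case VEC_RANGE:       return "range";
    default:              return NULL;
    }
}
```

With a base of 0, `vector_address(0, VEC_RESET)` is 0x100, matching the reset vector on the slide.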
Memory Model • Weakly ordered memory model • High performance memory system • The programmer is responsible for enforcing strict access ordering • Memory synchronization instruction • l.msync : completes all load/store operations before the RISC core continues • OR1200 implementation • Strongly ordered memory model • Atomicity • Atomic memory access instructions • l.lwa, l.swa • OR1200 implementation • Not intended for use in multiprocessor environments • No support for coherency between local data cache and caches of other processors or main memory • Write-through cache
Memory Management • Support for implementation-specific physical address space sizes of up to 35 address bits (32 GByte) • Three different page sizes: • Level 0 pages (32 GByte; only with 64-bit EA) translated with D/I Area Translation Buffer (ATB) • Level 1 pages (16 MByte) translated with D/I Area Translation Buffer (ATB) • Level 2 pages (8 KByte) translated with D/I Translation Lookaside Buffer (TLB) • Address translation using one-, two- or three-level page tables • Powerful page-based access protection with support for demand-paged virtual memory • Support for simultaneous multi-threading (SMT) • OR1200 implementation • Only level 2 paging is implemented.
Memory Management • OpenRISC 1000 Specification • 32-bit implementation • 64-bit implementation
Memory Management • OR1200 implementation • Level 2 paging (8 KByte pages) implemented • 1-way direct-mapped TLB
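With 8 KByte pages, a 32-bit effective address splits into a 13-bit page offset and a 19-bit page number. The sketch below shows that split, plus one plausible way a direct-mapped TLB could choose its entry. Indexing by the low bits of the page number is an assumption for illustration, not a statement about the OR1200 RTL, and the function names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

enum { PAGE_SHIFT = 13 };  /* 8 KByte = 2^13 bytes */

/* Virtual page number: the effective address without its offset bits. */
static inline uint32_t page_number(uint32_t ea)
{
    return ea >> PAGE_SHIFT;
}

/* Byte offset inside the 8 KByte page. */
static inline uint32_t page_offset(uint32_t ea)
{
    return ea & ((1u << PAGE_SHIFT) - 1u);
}

/* Assumption: a direct-mapped TLB selects its single candidate entry
 * from the low bits of the page number. */
static inline uint32_t tlb_index(uint32_t ea, uint32_t entries)
{
    assert(entries != 0 && (entries & (entries - 1u)) == 0);  /* power of two */
    return page_number(ea) & (entries - 1u);
}
```

Under this assumption, two pages whose page numbers differ by a multiple of the entry count compete for the same TLB entry, which is the usual cost of a 1-way direct-mapped organization.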
Cache • Cache • 1-way direct-mapped • Up to 8 KB for each I/D cache • Cache control • Block prefetch • Block flush • Block invalidate • Block write-back • Block lock • OR1200 cache implementation • Write-through mode • No support for coherency • No prefetch • No support for cache line lock
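In a 1-way direct-mapped cache, every address maps to exactly one line, so an address can be split into tag, line index and byte offset with simple arithmetic. The sketch below does this for a parameterized cache size and line size. The 16-byte line size used in the usage note is an assumption (the slides give only the cache size), and the names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

struct cache_addr {
    uint32_t tag;     /* identifies which memory block occupies the line */
    uint32_t index;   /* which cache line the address maps to */
    uint32_t offset;  /* byte position inside the line */
};

static inline int is_pow2(uint32_t x)
{
    return x != 0 && (x & (x - 1u)) == 0;
}

/* Split an address for a direct-mapped cache of cache_bytes with line_bytes lines. */
static inline struct cache_addr cache_split(uint32_t addr,
                                            uint32_t cache_bytes,
                                            uint32_t line_bytes)
{
    assert(is_pow2(cache_bytes) && is_pow2(line_bytes) && line_bytes <= cache_bytes);
    struct cache_addr a;
    a.offset = addr % line_bytes;
    a.index  = (addr / line_bytes) % (cache_bytes / line_bytes);
    a.tag    = addr / cache_bytes;
    return a;
}

/* Two addresses evict each other when they share a line but differ in tag. */
static inline int cache_conflict(uint32_t x, uint32_t y,
                                 uint32_t cache_bytes, uint32_t line_bytes)
{
    struct cache_addr a = cache_split(x, cache_bytes, line_bytes);
    struct cache_addr b = cache_split(y, cache_bytes, line_bytes);
    return a.index == b.index && a.tag != b.tag;
}
```

For an 8 KB cache with an assumed 16-byte line, addresses 0x0000 and 0x2000 map to the same line with different tags, so alternating accesses to them would keep evicting each other.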
Quick Memory • Quick memory • On-chip memory • Unified I&D memory
Debug Unit • Eight sets of debug value/compare registers • Match signed/unsigned conditions on • instruction fetch EA • load/store EA • load/store data • Combining match conditions for complex watchpoints • Watch-points can be counted by Performance Counters Unit • Watch-points can generate a breakpoint (trap exception) • Counting watch-points for generation of additional watch-points
Debug Unit • Registers • DVR (Debug Value Register) • DCR (Debug Control Register) • Compare target • Compare condition • DMR (Debug Mode Register) • WP/BP setting • Combination of conditions of DVRs • DWCR (Debug Watch-point Counter Register) • DSR (Debug Stop Register) • Core stop condition • DRR (Debug Reason Register) • OR1200 implementation • Breakpoints are supported, but watch-points are not
Performance Counter Unit • Benefits • To improve performance by developing better application-level algorithms • To better optimize operating system routines • To improve the hardware architecture of these systems • To improve future OpenRISC implementations • To add future enhancements to the OpenRISC architecture • To help system developers debug and test their systems • Performance counter • Eight counters • Counting predefined events • Load/store/instruction fetch • I/D-cache miss • LSU/branch/instruction fetch/data dependency stall • I/D-TLB miss • Watch-point • Not implemented in OR1200
Power Management • Slow down • Supports 16 clock frequency levels (0 to 15) • Needs an external clock synthesizer • Power mode (dynamic clock gating, dynamic voltage scaling) • Normal mode • Doze mode • All units disabled (clock gating) except tick timer and PIC • Returns to normal mode on timer or interrupt • Sleep mode • All units disabled (clock gating) and voltage lowered, except PIC • Returns to normal mode on interrupt • Suspend mode • All units disabled (clock gating) and voltage lowered • Returns to normal mode on reset • OR1200 • Power manager implemented • But no clock gating implemented
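The slow-down level and the power modes above are typically requested through fields of a power management register. The sketch below builds such a register value on the host. The field positions (4-bit slow-down factor in bits 3..0, then doze, sleep, clock-gating and suspend enable bits) are an assumption drawn from the OpenRISC 1000 Architecture Manual's PMR description and should be verified there; the helper names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed PMR field layout; check against the Architecture Manual. */
#define PMR_SDF_MASK 0x0000000Fu  /* slow-down factor, levels 0 to 15 */
#define PMR_DME      0x00000010u  /* doze mode enable */
#define PMR_SME      0x00000020u  /* sleep mode enable */
#define PMR_DCGE     0x00000040u  /* dynamic clock gating enable */
#define PMR_SUME     0x00000080u  /* suspend mode enable */

/* Replace the slow-down level, leaving the other fields unchanged. */
static inline uint32_t pmr_set_slowdown(uint32_t pmr, unsigned level)
{
    assert(level <= 15u);
    return (pmr & ~PMR_SDF_MASK) | level;
}

/* Request exactly one of doze, sleep or suspend, clearing the other two. */
static inline uint32_t pmr_request_mode(uint32_t pmr, uint32_t mode_bit)
{
    assert(mode_bit == PMR_DME || mode_bit == PMR_SME || mode_bit == PMR_SUME);
    return (pmr & ~(PMR_DME | PMR_SME | PMR_SUME)) | mode_bit;
}
```

On the target, the resulting value would be written to the power management SPR with l.mtspr; that step is target-specific and is not shown.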
Custom Instruction • Reserved instructions for custom implementation • ORBIS32/64 • Eight instructions are reserved • l.cust1 to l.cust8 • ORFPX32/64 • Two instructions are reserved • lf.cust1.s, lf.cust1.d • ORVDX64 • Eight instructions are reserved • lv.cust1 to lv.cust8 • Custom instruction implementation • Add decode logic for the instruction (or1200_ctrl.v) • Add processing logic for the instruction (or1200_alu.v)
Outline • Software Development with OpenIDEA • Software Development • Architecture Simulation • Hardware Development with iNSPIRE-Lite • Hardware/Software Co-Verification with OpenIDEA • Verification Flow
Software Development • [Screenshot: Source Browser, Code Editor, Compiler Window]
Software Development • Platform • Windows 2000/XP • Target Processor • or1200 • Code Editor • Source Browser • Syntax Highlighting • Syntax Checking • Block Indent/Dedent/Folding • Comment out/uncomment • Line Number • Find/Replace/Find in Files
Software Development • Compiler • gcc 3.4.4 • Utilities • binutils 2.16.1 • make • bin2c • bin2hex • bin2flash • bin2srec • Newlib • 1.10.0
Software Development • Startup Code • Link Script File
Source-Level Debugging • [Screenshot: Source Browser, Code Debugger, Register, Watch, Stack, Break Point]
Source-Level Debugging • Debugger • gdb 5.0 • C Source-Level Debugging • Break • Step • Watch, etc. • Assembly-Level Debugging • Instruction Step • Register View, etc.
Architecture Simulation • Processor Architecture Exploration • Core • MMU • Cache, etc. • System-Level Simulation with Peripheral Models • Memory • UART • Ethernet • VGA, etc. • Performance Profiling • Execution Log • Memory Profile, etc.
Agenda • Software Development with OpenIDEA • Hardware Development with iNSPIRE-Lite • Automatic Hardware Composition • Open-Source Library • Simulation/Synthesis/Prototyping • Hardware/Software Co-Verification with OpenIDEA • Verification Flow
iNSPIRE • Integrated Design Environment for Hardware Development • Architecture Exploration • Generation of • HDL Simulation Environment • SystemC Simulation Environment • Synthesis & FPGA Mapping • Cycle-Level & Transaction-Level Co-Simulation Environment • Support for Various Libraries
Architecture-Wizard • LEGO-Brick-Like Hardware Composition • GUI-Based Hardware Composition • Various IP Libraries • Open Source IP • Architecture Exploration • IP Properties
Architecture-Wizard • Bus Generation • AMBA AHB • AMBA APB • WISHBONE • Bus Architecture Exploration • Bus Architecture • Address Map • Priority, etc.
IP Library • OpenCores Library (license: (L)GPL) • OR1200 • OR1200 Debug • Audio • Video • Ethernet • CAN • DMA • PS2, etc. • Dynalith Library • FLASH Controller • SRAM Controller • JTAG to USB, etc.
IP Library • RTL Source • Synthesis Script • Example • Hardware • Software
Simulation/Synthesis • Simulation Environment • Top Module Generation • Simulation Model Connection • SRAM, SDRAM, UART, etc. • Simulation Script • Synthesis • Top Module Generation • Synthesis Script • Synthesis Assist • FPGA P&R • FPGA Mapping Script • P&R Assist • [Figure: Simulation, Synthesis, Emulation]
Agenda • Software Development with OpenIDEA • Hardware Development with iNSPIRE-Lite • Hardware/Software Co-Verification with OpenIDEA • Debugger + ISS • Debugger + ISS + HDL Simulator (SystemC) • Debugger + ISS + HDL Simulator + FPGA • Debugger + HDL Simulator (+ FPGA) • Debugger + FPGA Prototyping • Debugger + ASIC • Verification Flow
1. SW Simulation • OpenIDEA (Debugger+ISS) • Software Development
2.1 HW/SW Co-Simulation (IP Verification) • OpenIDEA (Debugger+ISS) + Third Party HDL Simulator • IP Verification • Device Driver Development