
MAIS Project - WP5: Exploration of Alternative Architectures. Report on the Activities Carried Out

MAIS Project - WP5: exploration of alternative architectures. Report on the activities carried out. Andrea Pagni, STMicroelectronics, Advanced System Architectures Group. Milan, 17-18 November 2004. Topics. Part 1: VLIW-SIM Overview. Part 2: VLIW-SIM Performance. Part 3: VLIW-SIM Library.


Presentation Transcript


  1. MAIS Project - WP5: exploration of alternative architectures. Report on the activities carried out. Andrea Pagni, STMicroelectronics, Advanced System Architectures Group. Milan, 17-18 November 2004

  2. Topics • Part 1: VLIW-SIM Overview. • Part 2: VLIW-SIM Performance. • Part 3: VLIW-SIM Library. • Part 4: Next Steps.

  3. Part 1: VLIW-SIM Overview

  4. Part 1: VLIW-SIM Overview • Simulation Approach (1-7). • Modeled Target Architectures. • Supported platforms. • Simulation functionalities.

  5. Simulation Approach 1/7: Overview • Interpretative simulation approach • Simulation technology based on a set of re-usable sub-blocks: pipeline modeling, instruction execution, memory modeling, register file management, I/O simulation • Efficient host resource allocation • Target Architecture Description capability (IS, TAD) • Challenging compromise between speed and accuracy

  6. Simulation Approach 2/7: Pipeline modelling During simulation, the pipeline is represented as a three-dimensional space (phase, operation, time): phase is the pipeline phase, operation is the instruction's position in the bundle, and time is the current time stamp.

  7. Simulation Approach 3/7: Pipeline modelling • The pipeline status is modelled as a two-dimensional array: the first index is the pipeline phase and the second is the position of an instruction in the fetch packet. • The simulation uses two such arrays, one for the current pipeline status and one for the following status. • At each machine cycle the pipeline status is processed: the actions required by the instructions found in each phase are performed, and the instructions are then moved to the next pipeline phase.
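
The two-array scheme on this slide can be sketched as follows. This is a minimal illustration, not VLIW-SIM code: the pipeline depth, the bundle width, the `step` function and the logging "action" are all assumptions made for the example.

```python
from typing import List, Optional, Tuple

NUM_PHASES = 3    # assumed pipeline depth, for illustration only
BUNDLE_WIDTH = 4  # assumed number of slots in a fetch packet

# pipeline[phase][slot] holds an instruction name or None (empty slot)
Pipeline = List[List[Optional[str]]]


def empty_pipeline() -> Pipeline:
    return [[None] * BUNDLE_WIDTH for _ in range(NUM_PHASES)]


def step(current: Pipeline,
         fetched: List[Optional[str]],
         log: List[Tuple[int, int, str]]) -> Pipeline:
    """Process one machine cycle and return the following pipeline status."""
    following = empty_pipeline()
    for phase in range(NUM_PHASES):
        for slot in range(BUNDLE_WIDTH):
            instr = current[phase][slot]
            if instr is None:
                continue
            # Phase-specific action; here we only record what was processed.
            log.append((phase, slot, instr))
            # Move the instruction to the next phase, or retire it
            # when it leaves the last phase.
            if phase + 1 < NUM_PHASES:
                following[phase + 1][slot] = instr
    # The newly fetched packet enters phase 0 of the following status.
    for slot, instr in enumerate(fetched[:BUNDLE_WIDTH]):
        following[0][slot] = instr
    return following
```

Each call reads only `current` and writes only `following`, so the order in which phases and slots are visited does not affect the result of a cycle.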

  8. Simulation Approach 4/7: Pipeline status update At each machine cycle the pipeline status is processed.

  9. Simulation Approach 5/7: Instruction execution Instruction execution is simulated through an Instruction Table, which holds the address of each instruction's routine and its latency value.
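
A table of this kind might look like the sketch below. The opcodes, the latency values and the `execute` helper are hypothetical and do not describe any of the modeled targets; a Python callable stands in for the routine address.

```python
from typing import Callable, Dict, NamedTuple, Tuple

MASK32 = 0xFFFFFFFF  # keep results within 32 bits


class InstructionEntry(NamedTuple):
    routine: Callable[[int, int], int]  # stands in for the routine address
    latency: int                        # cycles before the result is available


# Hypothetical opcodes and latencies, chosen only for illustration.
INSTRUCTION_TABLE: Dict[str, InstructionEntry] = {
    "add": InstructionEntry(lambda a, b: (a + b) & MASK32, 1),
    "sub": InstructionEntry(lambda a, b: (a - b) & MASK32, 1),
    "mul": InstructionEntry(lambda a, b: (a * b) & MASK32, 2),
}


def execute(opcode: str, a: int, b: int) -> Tuple[int, int]:
    """Look up the opcode and return (result, latency)."""
    entry = INSTRUCTION_TABLE[opcode]
    return entry.routine(a, b), entry.latency
```

With this layout, adding an instruction means adding one table entry; the dispatch code itself does not change.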

  10. Simulation Approach 6/7: Register file status update • The simulation environment updates the pipeline status progressively while preserving data coherence in memory locations and in the register file. • To support data coherence, two register files are used: one holds the current register file status and the other the following status. • Each time an instruction is executed, its operands are read from the current register file and its results are written to the following one. • This allows parallel instruction execution to be simulated sequentially.
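
The effect of the two register files can be illustrated with a register swap issued in a single bundle. The sketch below is an assumption-laden example: the `RegisterFile` class, its method names and the tiny `mov`/`add` instruction set are invented for illustration.

```python
from typing import List, Tuple

Op = Tuple[str, int, int, int]  # (opcode, dest, src1, src2), hypothetical format


class RegisterFile:
    def __init__(self, size: int = 8) -> None:
        self.current: List[int] = [0] * size
        self.following: List[int] = [0] * size

    def write(self, index: int, value: int) -> None:
        """Set a register outside of simulation (e.g. initial state)."""
        self.current[index] = value
        self.following[index] = value

    def execute_bundle(self, bundle: List[Op]) -> None:
        """Run the operations one by one: read from current, write to following."""
        for opcode, dest, src1, src2 in bundle:
            a, b = self.current[src1], self.current[src2]
            if opcode == "add":
                result = a + b
            elif opcode == "mov":
                result = a
            else:
                raise ValueError(f"unknown opcode: {opcode}")
            self.following[dest] = result & 0xFFFFFFFF

    def commit(self) -> None:
        """End of cycle: the following status becomes the current one."""
        self.current = list(self.following)


rf = RegisterFile()
rf.write(0, 1)
rf.write(1, 2)
# Both moves belong to the same bundle, so both must see the old values.
rf.execute_bundle([("mov", 0, 1, 1), ("mov", 1, 0, 0)])
rf.commit()
```

With a single register file the second `mov` would read the value just written by the first, and both registers would end up holding 2 instead of being swapped.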

  11. Simulation Approach 7/7: I/O simulation details • Target-architecture-specific I/O features are kept separate from the simulation kernel. • The SYSCALL pseudo-instruction manages the interface between internal I/O instructions (processor side) and file system I/O calls (OS side). • SYSCALL also handles general exception handling. • This mechanism is transparent to the other simulator modules: performance and data flow are not affected when no I/O operations are present.
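
One way to picture the SYSCALL boundary is a small dispatcher that maps call numbers to host-side routines. Everything in this sketch is assumed for illustration: the class name, the call numbers (1 for write, 2 for exit) and the argument layout are not taken from VLIW-SIM.

```python
import io
from typing import Callable, Dict, List, Optional, TextIO


class SyscallInterface:
    """Host-side handlers reached only through the SYSCALL pseudo-instruction."""

    def __init__(self, stdout: TextIO) -> None:
        self.stdout = stdout
        self.exit_code: Optional[int] = None
        # Hypothetical call numbers, chosen only for this example.
        self.handlers: Dict[int, Callable[[bytearray, List[int]], int]] = {
            1: self._write,
            2: self._exit,
        }

    def _write(self, memory: bytearray, args: List[int]) -> int:
        # args = [address, length]: copy simulated memory to the host stream.
        address, length = args
        self.stdout.write(bytes(memory[address:address + length]).decode("ascii"))
        return length

    def _exit(self, memory: bytearray, args: List[int]) -> int:
        self.exit_code = args[0]
        return 0

    def syscall(self, number: int, memory: bytearray, args: List[int]) -> int:
        handler = self.handlers.get(number)
        if handler is None:
            # Stand-in for the general exception path.
            raise ValueError(f"unsupported syscall number: {number}")
        return handler(memory, args)


if __name__ == "__main__":
    out = io.StringIO()
    iface = SyscallInterface(out)
    iface.syscall(1, bytearray(b"hello world"), [0, 5])
    print(out.getvalue())
```

The simulation kernel only needs to call `syscall`; a program that never issues the pseudo-instruction never touches this code, which matches the transparency point on the slide.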

  12. Modeled Target Architectures • TI C62x: 8-issue VLIW core; optional I-cache memory; 11-stage pipeline; RISC-like instruction set; 32 32-bit general registers • TI C64x: 8-issue VLIW core; I/D cache memories; 11-stage pipeline; RISC/SIMD instruction set; 64 32-bit general registers • ST210: multi-cluster architecture; 4-issue VLIW core; I/D cache memories; 6-stage pipeline; RISC-like instruction set; 64 32-bit general registers and 8 1-bit special registers

  13. Supported Platforms • Windows OS (Visual C++): text mode: project file in vliw_sim/vliw_sim; graphical mode: project file in vliw_sim/gui/gui • Windows OS (Cygwin, gcc): text mode: makefile in vliw_sim/vliw_sim; graphical mode (with XWindows on Cygwin) • Linux OS (RedHat, gcc): text mode: makefile in vliw_sim/vliw_sim; graphical mode: makefile in vliw_sim/gui/gui • Sun OS (Solaris, gcc): text mode: makefile in vliw_sim/vliw_sim; graphical mode: makefile in vliw_sim/gui/gui • Directories under vliw_sim: bin_loader, cache, gui/gui, instruction_set, io_interf, memory, pipeline, profdebug, registers, vliw_sim, vliw_sim_dll

  14. Simulation functionalities • Debug Support • Step-by-step execution • Breakpoint • Register & Memory access • Pipeline Visibility (instruction & addresses) • Profiling Application • Code region Profile • Statistics extraction for profiled code • Simulator Dynamic Library • Simulation API • SoC simulation facilities • Exception Handling simulation • Efficient I/O interface simulation

  15. Part 2: VLIW-SIM Performance

  16. Part 2: VLIW-SIM Performance • Tested Applications. • SW apps on TI C62x. • SW apps on TI C64x. • SW apps on ST210 (1-3).

  17. Tested Applications • ST210. • MPEG-2 Intra Video Encoder (0.2 s, 5 frames, 15 Mbit/s). • MPEG-1 Layer 2 Audio Encoder (1 s, 32 kHz, 256 kbit/s). • MPEG-2 M=3 MP@ML Video Decoder (1 s, 25 frames/s, 15 Mbit/s). • MPEG-4 QCIF SP@L3 Video Decoder (1 s, 25 frames/s, 512 kbit/s). • MPEG-4 QCIF SP@L3 Video Encoder (27 frames, 64 kbit/s, QP=12). • H.263+ QCIF Video Encoder (10 frames, no rate control). • G.723.1 Audio Enc-Dec (20 frames, 8 kHz, 5.3 kbit/s). • Automatic Speech Recognition (HMM, 5 words, 8 MEL, 50 active words). • TI C62x & C64x. • H.263+ QCIF Video Encoder (5 frames, no rate control). • G.726 Audio Enc-Dec (10 frames, 8 kHz, 32 kbit/s).

  18. SW apps on TI-C62x • Operation = one syllable (elementary 32-bit RISC instruction)

  19. SW apps on TI-C64x • Bundle = more syllables (max 8 for TI C6xx, max 4 for ST210) per clock cycle

  20. SW apps on ST210 1/3 • HP ISS configured with: • ignore_non_cacheable_areas TRUE • profile_gprof_on FALSE

  21. SW apps on ST210 2/3 • HP ISS configured with: • ignore_non_cacheable_areas TRUE • profile_gprof_on FALSE

  22. SW apps on ST210 3/3 • MOPS = Millions Of Operations Per Sec
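
The units used on the performance slides (operation, bundle, MOPS) can be related with a short calculation. The sketch assumes that MOPS here means simulated operations per second of host run time, which the slides do not state explicitly; the function names and the figures in the usage lines are invented for illustration.

```python
def simulation_speed_mops(operations: int, host_seconds: float) -> float:
    """Millions of simulated operations per second of host time (assumed meaning)."""
    if host_seconds <= 0:
        raise ValueError("host_seconds must be positive")
    return operations / host_seconds / 1e6


def operations_per_bundle(operations: int, bundles: int) -> float:
    """Average number of syllables packed into each bundle."""
    if bundles <= 0:
        raise ValueError("bundles must be positive")
    return operations / bundles


# Made-up figures: 5 million operations simulated in 2 seconds,
# issued in 2 million bundles.
speed = simulation_speed_mops(5_000_000, 2.0)
density = operations_per_bundle(5_000_000, 2_000_000)
```

The second figure is bounded by the issue width quoted on the slides: at most 8 for the TI C6xx targets and at most 4 for the ST210.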

  23. Part 3: VLIW-SIM Library

  24. Part 3: VLIW-SIM Library • VLIW-SIM Library (1-2).

  25. VLIW-SIM Library 1/2 • VLIW-SIM can be configured either as a stand-alone application or as a dynamic library (DLL). • The library form is useful for interfacing VLIW-SIM with other applications (system-on-chip simulation environments, graphical user interfaces, etc.). • The functionalities exported by the simulator fall into two groups: • Command functionalities: used to control the simulation (Run, Stop, Insert/Remove breakpoint, Continue, Step, etc.) • Status functionalities: used to retrieve the simulator's internal status and resource allocation (pipeline status and size, register file content and size, etc.)

  26. VLIW-SIM Library 2/2 The simulator DLL exports the following functionalities: • Control functions • Load • Init • Step / Step N / Stall • Run • Restart • Debug support • View simulator status (pipeline, register file, memory) • Breakpoint • Utility functions • Code profiling • Simulated program arguments
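
The split between command and status functionalities might be organised as in the sketch below. It is a toy stand-in, not the DLL interface: the class, the method names and the idea of a program as a list of strings are assumptions, and "executing" an instruction only advances a program counter.

```python
from typing import Dict, List, Set


class SimulatorSketch:
    """Toy model of a simulator library with command and status functions."""

    def __init__(self) -> None:
        self.program: List[str] = []
        self.pc = 0
        self.breakpoints: Set[int] = set()

    # --- Command functionalities: control the simulation ---
    def load(self, program: List[str]) -> None:
        self.program = list(program)
        self.init()

    def init(self) -> None:
        self.pc = 0

    def step(self, n: int = 1) -> int:
        """Execute up to n instructions; return how many were executed."""
        executed = 0
        while executed < n and self.pc < len(self.program):
            self.pc += 1
            executed += 1
        return executed

    def run(self) -> int:
        """Run until a breakpoint or the end of the program."""
        executed = self.step(1)  # step off a breakpoint at the current pc
        while self.pc < len(self.program) and self.pc not in self.breakpoints:
            executed += self.step(1)
        return executed

    def restart(self) -> None:
        self.init()

    def insert_breakpoint(self, address: int) -> None:
        self.breakpoints.add(address)

    def remove_breakpoint(self, address: int) -> None:
        self.breakpoints.discard(address)

    # --- Status functionalities: inspect without changing state ---
    def status(self) -> Dict[str, object]:
        return {
            "pc": self.pc,
            "finished": self.pc >= len(self.program),
            "breakpoints": sorted(self.breakpoints),
        }
```

A host tool such as a GUI or a system-level simulator would drive the command methods and poll `status` between calls, which is the usage pattern the two slides describe.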

  27. Part 4: Next Steps

  28. Part 4: Next Steps • Where we are. • VLIW-SIM Developments.

  29. Where we are • Released versions 2.0 and 3.0 of VLIW-SIM. • Substantial software engineering work to improve: • Modularity • Readability (doxygen-generated documentation) • Simulation speed • Architectural accuracy: • ST210: IPU, DPU, Interrupt Controller, Core Memory Controller, I-cache, D-cache • TI C6x: I-cache and D-cache for CPU style, program memory and data memory for DSP style • Accurate, non-invasive flat profiling (GNU format compatible) • Flexible architectural reconfigurability • Host platform independence • Future integration into high-level system tools

  30. VLIW-SIM developments • ST220 accurate modelling • Integration inside MaxSim system simulation tools and related experiments

  31. End. Questions?
