Real-Time Address Trace Compression for Emulated and Real System-on-Chip Processor Core Debugging. Bojan Mihajlović, Željko Žilić, McGill University, Dept. of Electrical and Computer Engineering, Montreal, Quebec, Canada. GLSVLSI’11, May 2–4, 2011. Presenter: Shao-Jay Hou.
Abstract • In the multicore era, capturing execution traces of processors is indispensable to debugging complex software. The inability to transfer vast amounts of trace data off-chip without significant slow-down has impeded the debugging of such software, in both pre-silicon emulation and in real designs. We consider on-chip trace compression performed in hardware to reduce data volume, using techniques that exploit inherent higher-order redundancy in address trace data. While hardware trace compression is often restricted to poor or moderate performance due to area and memory constraints, we present a parameterizable scheme that leverages the resources already found on existing platforms. Harnessing resources such as existing trace buffers on CPUs, and unused embedded memory on FPGA emulation platforms, our trace compression scheme requires only a small additional hardware area to achieve superior compression ratios.
What’s the problem? • MPSoCs run multi-threaded programs • Traditional debugging methods cannot be used • A non-invasive method (on-chip emulation) is a good alternative • It produces an immense amount of data that must be either stored on-chip or transferred off-chip in real time • Example: the address trace of a 32-bit processor at 1 clock per instruction and 100 MHz amounts to 400 MB/s of data • The data needs to be compressed
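The 400 MB/s figure follows from simple arithmetic: one 4-byte address per instruction, one instruction per clock, 100 million clocks per second. A minimal sketch of that calculation, where the function name and parameters are illustrative and not taken from the paper:

```python
def trace_bandwidth_bytes_per_s(address_bits, clock_hz, instructions_per_clock=1.0):
    """Raw address-trace bandwidth, assuming one full address is logged per instruction."""
    bytes_per_address = address_bits // 8
    return bytes_per_address * clock_hz * instructions_per_clock


# 32-bit addresses, 100 MHz clock, 1 instruction per clock (the slide's example).
rate = trace_bandwidth_bytes_per_s(32, 100_000_000)
print(rate / 1e6, "MB/s")  # prints "400.0 MB/s"
```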
Related work • Diagram categories: this paper, some example tools, trace compression schemes, compression methods • Items cited: compression algorithms [5], Lempel-Ziv (LZ) [18], multi-stage compression [11], DMTF [17], combined MTF and LZ [1], ARM ETM [2], MCDS [12]
Consecutive Address Elimination • Why? • Instructions execute consecutively until a branch is reached • Only the branch target address needs to be recorded • How? • Each run of consecutive addresses is divided into two parts • address • length
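A small sketch of the idea, assuming a fixed 4-byte instruction length and assuming that the length field counts instructions in the run (the slide does not say which unit the paper uses). The function names are hypothetical:

```python
INSTR_BYTES = 4  # assumption: fixed 32-bit instruction length


def eliminate_consecutive(addresses, step=INSTR_BYTES):
    """Collapse each run of consecutive addresses into a (start address, length) pair."""
    runs = []
    for addr in addresses:
        # The address continues the current run if it is exactly one instruction further on.
        if runs and addr == runs[-1][0] + runs[-1][1] * step:
            runs[-1][1] += 1
        else:
            runs.append([addr, 1])
    return [tuple(run) for run in runs]


def expand_runs(runs, step=INSTR_BYTES):
    """Inverse operation: rebuild the full address trace from (start, length) pairs."""
    return [start + i * step for start, length in runs for i in range(length)]


trace = [0x1000, 0x1004, 0x1008, 0x2000, 0x2004, 0x1000, 0x1004, 0x1008]
runs = eliminate_consecutive(trace)
# runs == [(0x1000, 3), (0x2000, 2), (0x1000, 3)]
```

Eight addresses shrink to three pairs here, and the repeated pair `(0x1000, 3)` is the kind of redundancy a later prediction stage can exploit.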
Finite Context Method • Why? • A branch is either taken or not taken • Sequential locality • How? • Works similarly to a cache • Misses the first time a set of instructions is encountered • Hits on every subsequent encounter that matches the prediction
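The cache-like behaviour can be sketched as a small prediction table indexed by a hash of the most recent symbols. This is an illustration only: the context order, table size, and index function below are assumptions, not the paper's parameters, and symbols are taken to be (address, length) pairs.

```python
class FcmPredictor:
    """Finite-context predictor: a table maps recent history to the expected next symbol."""

    def __init__(self, order=2, table_size=64):
        self.table = [None] * table_size
        self.history = ((0, 0),) * order  # last `order` symbols seen

    def _index(self):
        # Hypothetical index function folding the history into a table slot.
        idx = 0
        for start, length in self.history:
            idx = (idx * 31 + start + length) % len(self.table)
        return idx

    def _advance(self, symbol):
        self.history = self.history[1:] + (symbol,)

    def encode(self, symbol):
        idx = self._index()
        hit = self.table[idx] == symbol
        if not hit:
            self.table[idx] = symbol  # learn the symbol for the next encounter
        self._advance(symbol)
        return ("hit",) if hit else ("miss", symbol)

    def decode(self, token):
        idx = self._index()
        if token[0] == "hit":
            symbol = self.table[idx]  # the decoder's table mirrors the encoder's
        else:
            symbol = token[1]
            self.table[idx] = symbol
        self._advance(symbol)
        return symbol


symbols = [(0x1000, 3), (0x2000, 2)] * 4  # a loop body executed four times
encoder, decoder = FcmPredictor(), FcmPredictor()
tokens = [encoder.encode(s) for s in symbols]
decoded = [decoder.decode(t) for t in tokens]
```

The first passes through the loop miss while the table is being filled; once the contexts have been seen, each symbol is replaced by a bare hit marker.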
Move-to-Front & Address Encoding • Why? • MTF • Increases the relevance of recently seen entries • Prefix • Assists differential compression • How? • Take the input address and the predicted address • Apply differential compression between them
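The two ideas on this slide can be illustrated separately. The sketch below pairs a textbook move-to-front transform with a differential encoder that XORs the input address against the predicted one and emits a prefix byte giving the number of payload bytes that follow. XOR and this prefix layout are stand-ins chosen for illustration; the slide does not specify the paper's exact encoding.

```python
def mtf_encode(symbols, table_size=4):
    """Move-to-front: emit a symbol's position in a recency list, or -1 plus the symbol on a miss."""
    recent, out = [], []
    for sym in symbols:
        if sym in recent:
            pos = recent.index(sym)
            out.append((pos, None))
            recent.pop(pos)
        else:
            out.append((-1, sym))
            if len(recent) == table_size:
                recent.pop()  # evict the least recently used entry
        recent.insert(0, sym)  # the most recent symbol always moves to the front
    return out


def diff_encode(address, predicted):
    """Prefix byte (payload length) followed by the non-zero low bytes of address XOR predicted."""
    delta = address ^ predicted
    payload = delta.to_bytes(4, "little").rstrip(b"\x00")  # drop high-order zero bytes
    return bytes([len(payload)]) + payload


def diff_decode(encoded, predicted):
    length = encoded[0]
    delta = int.from_bytes(encoded[1:1 + length], "little")
    return predicted ^ delta


positions = mtf_encode([1, 2, 1, 1])
near = diff_encode(0x1008, 0x1000)   # b"\x01\x08": one payload byte
exact = diff_encode(0x1000, 0x1000)  # b"\x00": prefix only, no payload
```

When the prediction is close to the real address, most of the 5-byte worst case (1 prefix byte plus 4 address bytes) disappears, and the prefix bytes themselves become highly skewed toward small values.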
Run-length and Prefix Encoding • Why? • Compresses the prefix bytes • Prefix values occur with unequal probabilities • How? • Huffman encoding
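A software sketch of both techniques applied to a stream of prefix bytes. How the paper combines the two, and whether its Huffman table is fixed or adaptive, is not stated on the slide, so they are shown as independent helpers with hypothetical names.

```python
import heapq
from collections import Counter
from itertools import groupby


def run_length(prefixes):
    """Collapse repeated prefix values into (value, repeat count) pairs."""
    return [(value, len(list(group))) for value, group in groupby(prefixes)]


def huffman_code(symbols):
    """Build a Huffman code (symbol -> bit string) from observed symbol frequencies."""
    counts = Counter(symbols)
    if len(counts) == 1:
        return {next(iter(counts)): "0"}
    # Heap entries are (weight, tie-breaker, partial code table).
    heap = [(n, i, {sym: ""}) for i, (sym, n) in enumerate(counts.items())]
    heapq.heapify(heap)
    tie = len(heap)
    while len(heap) > 1:
        n0, _, codes0 = heapq.heappop(heap)
        n1, _, codes1 = heapq.heappop(heap)
        merged = {s: "0" + c for s, c in codes0.items()}
        merged.update({s: "1" + c for s, c in codes1.items()})
        heapq.heappush(heap, (n0 + n1, tie, merged))
        tie += 1
    return heap[0][2]


prefixes = [0, 0, 0, 0, 1, 0, 0, 2, 0, 0]  # mostly "prediction was exact"
rle = run_length(prefixes)
code = huffman_code(prefixes)
```

In this example the dominant prefix value 0 receives a 1-bit code, while the rare values 1 and 2 receive 2-bit codes.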
Data Stream Serializer • Why? • The data coming from the MTF/AE stage is 5 bytes wide • But the LZ stage takes 1 byte at a time • How? • Use a small buffer to hold the data
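The buffering can be modelled as a byte FIFO: the producer pushes a record of up to 5 bytes at once, and the consumer pops one byte per step. The class below is a behavioural sketch only; its name and interface are assumptions, and a hardware buffer would have a fixed depth.

```python
from collections import deque


class ByteSerializer:
    """FIFO that accepts multi-byte records and releases them one byte at a time."""

    def __init__(self):
        self.fifo = deque()

    def push(self, record: bytes):
        self.fifo.extend(record)  # each byte of the record is queued individually

    def pop_byte(self):
        """Return the next byte as an int, or None when the buffer is empty."""
        return self.fifo.popleft() if self.fifo else None


serializer = ByteSerializer()
serializer.push(b"\x01\x08")  # e.g. a prefix byte and one payload byte
first, second, third = (serializer.pop_byte() for _ in range(3))
```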
Lempel-Ziv Encoding of Data Stream • Why? • The input data is highly repetitive • How? • Use LZ compression • Build a dictionary to store the repeated parts • But do not output the dictionary • During decompression, rebuild the same dictionary • Output is not produced every cycle
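The dictionary behaviour described above can be illustrated with LZ78, used here as a convenient textbook variant; the slide does not say which LZ variant or dictionary size the paper implements. Note that the decoder never receives the dictionary: it reconstructs an identical one from the tokens.

```python
def lz78_encode(data: bytes):
    """Emit (dictionary index, next byte) tokens; the dictionary itself is never output."""
    dictionary = {b"": 0}
    phrase = b""
    tokens = []
    for value in data:
        candidate = phrase + bytes([value])
        if candidate in dictionary:
            phrase = candidate  # keep extending the match: nothing is emitted this step
        else:
            tokens.append((dictionary[phrase], value))
            dictionary[candidate] = len(dictionary)
            phrase = b""
    if phrase:  # flush a match that was still pending at end of input
        tokens.append((dictionary[phrase[:-1]], phrase[-1]))
    return tokens


def lz78_decode(tokens):
    """Rebuild the same dictionary on the fly while decoding."""
    entries = [b""]
    out = bytearray()
    for index, value in tokens:
        phrase = entries[index] + bytes([value])
        out += phrase
        entries.append(phrase)
    return bytes(out)


data = b"abababababab"
tokens = lz78_encode(data)
restored = lz78_decode(tokens)
```

Twelve input bytes become six tokens here. The encoder emits a token only when a match ends, which is why output does not appear on every input step.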
Experimental Results • Benchmark: MiBench • CPU: Apple PowerMac G4 with a 1.25 GHz PowerPC 7455, a 32-bit fixed instruction-length processor, running Linux SMP kernel 2.6.32-24 • Simulation software: ModelSim SE-64 v6.5c
Experimental Results(cont.) • Logic utilization • Usage Scenario • JTAG • software fault 10-3
Conclusion • This paper presented a parameterizable microarchitecture for address trace compression, suited to implementation on ASICs and modern FPGAs. • It achieves better compression ratios than other schemes
My comment • The paper uses a dictionary-based, multi-stage compression method that can be used to improve our tracer. • The paper gives inspiration for future work on our tracer • (Diagram labels: CPU, GPU, P.T., P.T., Bus, T.M., B.T.)