
The potential for Software-only thread-level speculation

The potential for Software-only thread-level speculation. Questions and Answers. Co-Supervisors: Prof. Greg Steffan, Prof. Cristina Amza. Committee Members: Prof. Tarek Abdelrahman, Prof. Michael Voss, Prof. Ken Sevick. By: Chuck (Chengyan) Zhao, April 25, 2005.


Presentation Transcript


  1. The potential for Software-only thread-level speculation: Questions and Answers
     • Co-Supervisors: Prof. Greg Steffan, Prof. Cristina Amza
     • Committee Members: Prof. Tarek Abdelrahman, Prof. Michael Voss, Prof. Ken Sevick
     • By: Chuck (Chengyan) Zhao
     • April 25, 2005

  2. Refresh and update on research
     • Tier-1: software TLS groups
       • U. Edinburgh: Cintra’s group
       • … …
     • Tier-2: HW + SW TLS groups
       • Stanford Hydra
       • CMU Stampede
       • UIUC I-ACOMA
       • Wisconsin Multiscalar
       • Purdue Multiplex
       • UMN WaveScalar
       • … …
     where to update knowledge and information

  3. Refresh and update on research (cont)
     • Tier-3: optimizing/parallelizing compiler research groups
       • Stanford SUIF: http://suif.stanford.edu
       • UIUC Impact: http://www.crhc.uiuc.edu/Impact/
       • Rice Parallel Compiler: http://cohesion.rice.edu/engineering/computerscience/research.cfm?doc_id=4355
       • UIUC Polaris: http://polaris.cs.uiuc.edu/newhome/
       • Rutgers Prolangs: http://www.prolangs.rutgers.edu/
       • McGill Sable: http://www.sable.mcgill.ca/
       • … …
     where to update knowledge and information

  4. Highly relevant conferences, journals, workshops...
     • ACM SAC: ACM Symposium on Applied Computing
       • http://www.acm.org/conferences/sac/
     • PPOPP: ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming
       • http://www.acm.org/sigs/sigplan/ppopp.htm
     • OOPSLA: Object-Oriented Programming, Systems, Languages and Applications
       • http://oopsla.acm.org/
     • POPL: ACM Symposium on Principles of Programming Languages
       • http://www.acm.org/sigs/sigplan/popl.htm
     • SIGCSE: Special Interest Group on Computer Science Education
       • http://www.sigcse.org/
     • PLDI: ACM SIGPLAN Conference on Programming Language Design and Implementation
       • http://www.acm.org/sigs/sigplan/pldi.htm
     • LCPC: International Workshop on Languages and Compilers for Parallel Computing
       • http://www.ecn.purdue.edu/LCPC2004/
     conferences, journals, workshops

  5. Highly relevant conferences, journals, workshops...
     • PACT: International Conference on Parallel Architectures and Compilation Techniques
       • http://www.laria.u-picardie.fr/~cerin/pact/
     • ISCA: International Symposium on Computer Architecture
       • http://www.cs.wisc.edu/~isca2005/
     • ASPLOS: International Conference on Architectural Support for Programming Languages and Operating Systems
       • http://www.eecg.toronto.edu/asplos2004/
     • CC: International Conference on Compiler Construction
       • http://cc05.cs.berkeley.edu/
     • SOSP: ACM Symposium on Operating Systems Principles
       • http://portal.acm.org/browse_dl.cfm?idx=SERIES372
     • OSDI: USENIX Symposium on Operating Systems Design and Implementation
       • http://www.usenix.org/events/osdi04/
     • ICSE: International Conference on Software Engineering
       • http://www.icse-conferences.org/
     conferences, journals, workshops (more)

  6. Highly relevant conferences, journals, workshops...
     • POPL: Symposium on Principles of Programming Languages
       • http://www.cs.princeton.edu/~dpw/popl/05/
     • IPDPS: International Parallel and Distributed Processing Symposium
       • http://www.ipdps.org/
     • ICS: International Conference on Supercomputing
       • http://ics05.csail.mit.edu/
     • ICDE: International Conference on Data Engineering
       • http://icde2005.is.tsukuba.ac.jp/
     • HPCA: International Symposium on High Performance Computer Architecture
       • http://www.hpcaconf.org/hpca11/
     • ICPP: International Conference on Parallel Processing
       • http://dynamo.ecn.purdue.edu/~hankd/Tutorials/icpp96prog.html
     conferences, journals, workshops (more)

  7. Highly relevant conferences, journals, workshops...
     • IEEE Computer
       • http://www.computer.org/computer/
     • Supercomputing (SC)
       • http://www.sc-conference.org/sc2005
     • IJPP: International Journal of Parallel Programming
       • http://www.springeronline.com/sgw/cda/frontpage/0,11855,5-40012-70-35620211-0,00.html
     • Computer Science Education
     conferences, journals, workshops (more)

  8. Amdahl's Law
     • Amdahl's law:
       • The performance improvement to be gained from using some faster mode of execution is limited by the fraction of the time the faster mode can be used.
     • Amdahl's law: ideal case
       • If a program spends a fraction f of its time in un-parallelizable code, then its maximal parallel speedup is 1/f (for example, f = 0.1 caps the speedup at 10x).
     • Amdahl's law: realistic case
       • If a program spends a fraction f of its time in un-parallelizable code, then its maximal parallel speedup would be (formula not captured in this transcript)
     Amdahl's law: basic + extended
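The slide's realistic-case formula did not survive in the transcript. As a minimal sketch, assuming the slide meant the standard finite-processor form of Amdahl's law, speedup = 1 / (f + (1 - f) / n), the ideal bound 1/f falls out as the limit of an unbounded processor count. The function name `amdahl_speedup` and the sample value f = 0.1 are illustrative and not taken from the presentation.

```python
import math


def amdahl_speedup(serial_fraction: float, processors: float = math.inf) -> float:
    """Speedup predicted by Amdahl's law (standard form, assumed here).

    serial_fraction: fraction f, 0 < f <= 1, of run time that cannot be parallelized.
    processors: processor count n; the default (infinity) gives the ideal bound 1/f.
    """
    if not 0.0 < serial_fraction <= 1.0:
        raise ValueError("serial_fraction must be in (0, 1]")
    if processors < 1:
        raise ValueError("processors must be at least 1")
    parallel_fraction = 1.0 - serial_fraction
    # The serial part runs at full cost; the parallel part is divided across n processors.
    return 1.0 / (serial_fraction + parallel_fraction / processors)


if __name__ == "__main__":
    f = 0.1  # hypothetical: 10% of the run time is serial
    for n in (2, 4, 8, 16, math.inf):
        print(f"n = {n}: speedup = {amdahl_speedup(f, n):.2f}")
```

With any finite n the result stays strictly below 1/f, which is why the serial fraction, not the core count, sets the ceiling that TLS tries to raise by parallelizing more of the program.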

  9. More on Cell: Sony, Toshiba + IBM
     details of Cell Processor

  10. Cell technical details
     • PPE: PowerPC Processing Engine
       • Control processor
       • In-order, 2-way SMT
       • 64K L1 cache, 512K L2 cache
       • Small, fast, efficient core
       • Supports the VMX ISA extension (AltiVec)
     • SPE: Synergistic Processing Engine
       • SIMD vector processor
       • 128 bits per register, 128 registers
       • 256K L1 cache-like SRAM (local memory)
       • Dual issue
       • No branch predictor (pure software branch prediction)
       • No virtual memory
       • No cache
     details of Cell Processor
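A rough sketch of what the SPE figures above imply, using only the numbers on the slide. Two things are assumptions: "256K" is read as 256 KiB, and the 32-bit SIMD element width is chosen purely for illustration (the slide does not give one).

```python
# Figures taken from the slide.
REGISTER_BITS = 128
REGISTER_COUNT = 128
LOCAL_STORE_BYTES = 256 * 1024  # assumption: "256K" means 256 KiB

# Assumption for illustration only: 32-bit SIMD elements.
ELEMENT_BITS = 32

register_bytes = REGISTER_BITS // 8                       # bytes held by one register
register_file_bytes = register_bytes * REGISTER_COUNT     # whole register file per SPE
local_store_vectors = LOCAL_STORE_BYTES // register_bytes # register-sized vectors in local memory
lanes_per_register = REGISTER_BITS // ELEMENT_BITS        # SIMD lanes per register

print(f"register file per SPE: {register_file_bytes} bytes")
print(f"local store capacity:  {local_store_vectors} register-sized vectors")
print(f"lanes per register:    {lanes_per_register} x {ELEMENT_BITS}-bit")
```

Since the SPE has no cache and no virtual memory, software has to manage this local memory explicitly.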

  11. Cell technical details (cont)
     • SPE connections
       • Multi-level ring connections
     • Overall
       • Size and complexity similar to Intel’s P4-D processors
       • Built-in L2 cache
       • L1 cache on each core
       • On-die memory controller:
         • Dual channel, XDR
         • 25.6 GB/s
       • On-die IO controller:
         • Flex IO interface
         • 76.8 GB/s
       • Transistor count
         • 234 million (vs. 230 million of P4 Celeron D)
     details of Cell Processor

  12. Potential future directions on software TLS
     • Software Check Pointing
     • Compiler for TLS Hardware
     • Debugging, security, Fault tolerance
     • Transactional memory
     • SW backup, HW dependence tracking
     • Software Automatic TLS Parallelization
     where could my research go in the future

  13. How to use a billion transistors on a chip
     • Wide-issue superscalar uni-processor
     • Simultaneous Multi-threading processor (SMT)
     • Chip Multi-Processor (CMP)
     • Speculative superscalar uni-processor
     • Distributed processor
     • Vector IRAM processor
     • RAW multi-processor
     possible future processor paths (back in 1997)
