Single-Chip Multi-Processors (CMP) ELEC6200-001, Fall 08 PRADEEP DANDAMUDI
Microprocessor
Methods to increase performance:
• The number of transistors available has a huge effect on the performance of a processor.
• More transistors also allow for a technique called pipelining.
• Parallelism
Parallelism in Microprocessors
• Pipelining is most prevalent
• Used in everything
• Even microcontrollers
• Decreases cycle time
• Allows up to 1 instruction per cycle (IPC)
• No programming changes
• Some Pentium 4s have more than 30 stages!
• Parallelism classifications:
- Instruction level
- Loop level
- Thread level (future trend)
- Process level (future trend)
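The pipelining claims above (shorter cycle time, at most 1 IPC) can be illustrated with a small timing model. This is a minimal sketch that assumes an ideal pipeline: every stage takes one time unit, and there are no stalls, hazards, or branch mispredictions. The function names are illustrative, not taken from the slides.

```python
def unpipelined_time(n_instructions: int, n_stages: int) -> int:
    """Time in stage-delay units when each instruction finishes before the next starts."""
    return n_instructions * n_stages


def pipelined_time(n_instructions: int, n_stages: int) -> int:
    """Time in stage-delay units for an ideal pipeline with no stalls (assumption)."""
    if n_instructions == 0:
        return 0
    # The first instruction needs n_stages cycles to fill the pipeline;
    # every later instruction completes one cycle after the previous one.
    return n_stages + n_instructions - 1


def ipc(n_instructions: int, n_stages: int) -> float:
    """Instructions completed per cycle; approaches but never exceeds 1."""
    if n_instructions == 0:
        return 0.0
    return n_instructions / pipelined_time(n_instructions, n_stages)


n, k = 1000, 5
print(unpipelined_time(n, k))  # 5000
print(pipelined_time(n, k))    # 1004
print(round(ipc(n, k), 3))     # 0.996
```

In this model a deeper pipeline shortens each cycle but lengthens the fill time, which is one reason real designs with many stages pay more for every flush.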
Instruction Level Parallelism (ILP)
• Competing technology: superscalar
• Executes multiple instructions in the same clock cycle.
• Dynamic scheduling: the ability to execute out of program order.
• Single processor
• Replaces the ALU with multiple functional units
• Dispatches several instructions at once
Superscalar pipeline
Competing technologies
• Simultaneous Multithreading (SMT)
- The SMT architecture is similar to that of a superscalar processor.
- SMT processors extend a wide superscalar core with hardware to execute instructions from multiple threads concurrently.
• Out-of-Order Execution
- Instructions execute in any order that does not violate data dependencies.
- Note that this technique is independent of both pipelining and superscalar issue.
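The rule that instructions may execute in any order that does not violate data dependencies can be sketched as a toy issue scheduler. This is an illustration only, not a model of any real processor: it assumes every instruction has a latency of one cycle, ignores register renaming and structural hazards, and uses hypothetical instruction names (`i1` to `i4`). The `width` parameter stands in for the superscalar issue width.

```python
from typing import Dict, List, Set


def schedule(deps: Dict[str, Set[str]], order: List[str], width: int) -> List[List[str]]:
    """Return the instructions issued in each cycle.

    deps maps an instruction to the set of instructions whose results it needs.
    Each cycle issues up to `width` instructions whose producers finished in an
    earlier cycle, scanning in program order (unit latency is an assumption).
    """
    done: Set[str] = set()
    pending = list(order)
    cycles: List[List[str]] = []
    while pending:
        ready = [i for i in pending if deps.get(i, set()) <= done][:width]
        if not ready:
            raise ValueError("cyclic or unsatisfiable dependency")
        cycles.append(ready)
        done.update(ready)
        pending = [i for i in pending if i not in ready]
    return cycles


program = ["i1", "i2", "i3", "i4"]
deps = {"i2": {"i1"}, "i4": {"i3"}}  # i2 needs i1; i4 needs i3

print(schedule(deps, program, width=2))  # [['i1', 'i3'], ['i2', 'i4']]
print(len(schedule(deps, program, width=1)))  # 4
```

With a width of 2, `i3` issues ahead of `i2` because it has no pending producer, so the four instructions finish in two cycles instead of four.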
Centralized architecture
• Disadvantages of centralized architectures such as SMT and superscalar processors:
- Area increases quadratically with the core's complexity.
- Cycle time increases because of interconnect delays: wire delay dominates the delay of the CPU's critical path. Simpler clusters are possible, but they result in a deeper pipeline and a larger branch misprediction penalty.
- Design verification cost is high, due to the complexity of the single processor.
- Large demand on the memory system.
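The quadratic area claim above can be made concrete with a back-of-the-envelope comparison. This sketch assumes that issue width is a reasonable proxy for core complexity and that area is exactly proportional to its square with a constant of 1; both are simplifying assumptions for illustration, and the numbers are relative units, not measurements.

```python
def relative_area(issue_width: int, cores: int = 1) -> int:
    """Relative silicon area, assuming area grows as issue_width squared per core."""
    return cores * issue_width ** 2


# One wide centralized core versus four narrow cores with the same total issue width.
print(relative_area(8))           # 64
print(relative_area(2, cores=4))  # 16
```

Under these assumptions, four 2-wide cores offer the same aggregate issue width as one 8-wide core in a quarter of the area, provided the workload supplies enough threads to keep them busy.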
Why Multiprocessor Systems?
• Single-core microprocessor performance increases are beginning to slow [1] due to:
- Increasing power consumption (>100 W)
- Increasing heat dissipation
- Diminishing performance gains from ILP and thread-level parallelism (TLP)
• As a result, manufacturers are turning to a multi-core microprocessor approach
- Multiple smaller, energy-efficient processing cores are integrated onto a single chip
- Overall performance improves because more work is performed concurrently
- The latencies associated with chip-to-chip communication disappear, and shared data structures are much less of a problem.
Case for single chip multiprocessors
• Advances in integrated circuit processing:
- Gate density (more transistors per chip)
- Cost of wires
• Large uniprocessors are no longer scaling in performance, because only a limited amount of parallelism can be extracted from a typical instruction stream using conventional superscalar instruction issue techniques.
CMP Architectures
• Two general types of multi-core or chip multiprocessor (CMP) architectures
- Homogeneous CMPs: all processing elements (PEs) are the same
- Heterogeneous CMPs: composed of different PEs
• Homogeneous dual-core processors for PCs are now available from all major manufacturers
• Heterogeneous CMPs are available in the form of multiprocessor systems-on-chips (MPSoCs)
Single chip Multiprocessor architecture
CMP Advantages
• CMPs have several advantages over single processor solutions
• Energy and silicon area efficiency
- By incorporating smaller, less complex cores onto a single chip
- By dynamically switching between cores and powering down unused cores [5]
• Increased throughput performance by exploiting parallelism
- Multiple computing resources can take better advantage of instruction, thread, and process level parallelism
Summary
• The CMP architecture is now the architecture of choice for semiconductor manufacturers
• CMPs are more area and energy efficient than single processor solutions
• CMPs achieve greater throughput than single processor solutions, as more work can be done concurrently
• Custom multi-processor systems can now be designed and simulated from the ground up using software solutions from several companies
References
• http://www.morganclaypool.com/doi/abs/10.2200/S00093ED1V01Y200707CAC003
• L. Hammond, B. A. Nayfeh, K. Olukotun, “A Single-Chip Multiprocessor,” IEEE, Sept 1997.
• http://occs.ieee.org/presentations/2007/070122_Jenks_ParallelMicroprocessors.pdf
• Chip Multiprocessor (CMP) Architectures, web.cecs.pdx.edu/~mperkows/CAPSTONES/DSP1/ELG6163_Burton.ppt
• en.wikipedia.org/wiki/