
Multiclustered and Multithreaded Architecture

Explore the benefits and drawbacks of multithreading, types of architectures, and how simultaneous multithreading (SMT) improves CPU efficiency and power. Learn about cluster computing, grid computing, and the limitations of increasing processor performance. Discover the nuances of multithreading for optimizing system throughput and resource utilization.

briscoej

Presentation Transcript


  1. Multiclustered and Multithreaded Architecture

  2. Multithreading • The ability of a CPU to run multiple processes/threads at the same time, given proper support from the computer’s operating system. • Multithreading is a major way of increasing a system’s throughput, which in turn improves performance. • Differs from Multiprocessing (another throughput-increasing method) in that all threads share the same set of resources. • Often used in conjunction with Multiprocessing: Multithreading optimizes utilization of a single core, while Multiprocessing runs multiple cores in concert with each other.

  3. Advantages • Processes can continue to utilize unused resources if one process stalls • Maximizes usage of CPU resources that would otherwise have been idle • If multiple threads are using the same data, sharing the same cache can lead to better cache usage as well as easier data synchronization

  4. Disadvantages • Potential exists for threads to interfere with each other when sharing hardware resources • Performance gains vary from system to system • Hand-crafted assembly programs can actually see performance degradation • Requires software support at both the operating system and application level to work properly

  5. Types • Temporal Multithreading (two main sub-categories that differ by their granularity) • Coarse-Grained • Fine-Grained (Interleaving) • Simultaneous Multithreading • Distinction between the two is how many threads can be at a given pipeline stage during a cycle: • Temporal: Allows only one thread per execution cycle • Simultaneous: Allows more than one per execution cycle

  6. Coarse-Grained Architecture • When a thread is stalled by a long-latency event (for example, a cache miss), the CPU switches to a different hardware context. • Switches therefore happen at stall boundaries, typically many cycles apart, rather than on every cycle.

  7. Fine-Grained Architecture • Also called Cycle-by-Cycle Interleaved. • One core with separate sets of registers to manage multiple threads • The core can make a context switch from one thread to another at every cycle. • When a long cache miss leaves the current thread idle, the core is still able to run another thread. • Tolerates control and data dependency latencies by overlapping them with useful work from other threads
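
The latency-hiding effect of cycle-by-cycle interleaving can be sketched with a toy simulation. This is a minimal model, not a description of any real core: it assumes a single-issue core, round-robin selection among ready threads, and a made-up workload encoding in which each entry is the number of stall cycles that follow an instruction. The function name `run_interleaved` and the sample workloads are hypothetical.

```python
def run_interleaved(threads):
    """Simulate a single-issue core that picks one ready thread per cycle.

    Each thread is a list of stall lengths (an assumption of this sketch):
    entry k means "issue one instruction, then this thread cannot issue
    again for k extra cycles" (k = 0 for an ordinary instruction,
    k > 0 for something like a cache miss).
    Returns the number of cycles until every thread has finished.
    """
    count = len(threads)
    pc = [0] * count         # index of the next instruction per thread
    ready_at = [0] * count   # first cycle at which each thread may issue
    cycle = 0
    last = -1                # thread that issued most recently

    while any(pc[t] < len(threads[t]) for t in range(count)):
        # Round-robin: start looking just after the thread that issued last.
        for offset in range(1, count + 1):
            t = (last + offset) % count
            if pc[t] < len(threads[t]) and ready_at[t] <= cycle:
                stall = threads[t][pc[t]]
                pc[t] += 1
                ready_at[t] = cycle + 1 + stall
                last = t
                break
        cycle += 1           # the cycle passes whether or not anything issued

    # Include any stall that follows the final instruction.
    return max(cycle, max(ready_at))


# Hypothetical workloads: thread_a hits a 4-cycle stall, thread_b never stalls.
thread_a = [0, 4, 0]
thread_b = [0, 0, 0, 0]

back_to_back = run_interleaved([thread_a]) + run_interleaved([thread_b])
interleaved = run_interleaved([thread_a, thread_b])
print(f"one after the other: {back_to_back} cycles, interleaved: {interleaved} cycles")
```

In this toy model the second thread issues during the cycles in which the first is stalled, which is the overlap the slide describes. Real cores have deeper pipelines and more constraints, so the numbers are illustrative only.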

  8. Fine-Grained Architecture

  9. Simultaneous Multithreading (SMT) • Used exclusively for increasing the efficiency of superscalar CPUs • Initially developed for use in IBM’s supercomputer project during the 1960s • Allows multiple threads to issue instructions per CPU cycle • Enabled without major changes to a processor’s architecture; the main additions are: • Ability to accept instructions from multiple threads • Larger-than-normal register file to accommodate the data from extra threads

  10. Simultaneous Multithreading (SMT)

  11. Simultaneous Multithreading (Cont.) • Advantages: • Increased processor performance (varies, see below) • Increased power efficiency • Hides much of the memory latency, since other threads can issue while one thread waits • Disadvantages: • Can actually decrease performance, depending on processor architecture, if there are resource bottlenecks • Makes software development more difficult: testing is needed to determine whether the application benefits or suffers from the feature, followed by logic to turn it off if necessary • Potential security issues with shared resources

  12. Multithreading architecture summary

  13. How do we increase computing power? • Increasing Performance: • A farmer seeks to increase performance of his ox and plow • Should the farmer try to breed a stronger ox?

  14. How do we increase computing power? • Increasing Performance:

  15. How do we increase computing power? • Increasing Performance: • Or should the farmer use more oxen yoked together?

  16. How do we increase computing power? • Increasing Performance: • Processors have become faster, smaller, and denser in transistors, but these advances will quickly diminish while production costs increase rapidly • Limitations of increasing Processor performance: • Transistor density is limited by electromagnetic and heat interference • The performance gained per unit of cost diminishes when compared with adding more processors

  17. Cluster Computing • What is a cluster? • Commodity computers using customized operating systems, connected by network interconnects, managed by an application

  18. Cluster Computing • What is cluster computing used for? • Distributed computing: • A network of computers that communicate with each other to achieve a common goal • A job to be processed is split into tasks, and the tasks are processed by individual computers or nodes • Amdahl’s Law: every algorithm has a section that must be executed serially, and this limits the speedup that can be achieved through distributed computing
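
Amdahl's Law is usually written as speedup = 1 / ((1 - p) + p / n), where p is the fraction of the job that can run in parallel and n is the number of workers. The sketch below evaluates that formula. The function name and the 90% parallel fraction are illustrative assumptions, not figures from the slides.

```python
def amdahl_speedup(parallel_fraction: float, workers: int) -> float:
    """Upper bound on speedup under Amdahl's Law.

    parallel_fraction: share of the runtime that can be parallelized (0..1).
    workers: number of processors or nodes sharing the parallel part.
    """
    if not 0.0 <= parallel_fraction <= 1.0:
        raise ValueError("parallel_fraction must be between 0 and 1")
    if workers < 1:
        raise ValueError("workers must be at least 1")
    serial_fraction = 1.0 - parallel_fraction
    return 1.0 / (serial_fraction + parallel_fraction / workers)


# Illustrative assumption: 90% of the job parallelizes perfectly.
for nodes in (1, 10, 100, 1000):
    print(f"{nodes:>5} nodes -> speedup {amdahl_speedup(0.9, nodes):.2f}x")
```

With a 10% serial section, the speedup can never exceed 1 / 0.1 = 10x, however many nodes are added.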

  19. Multicluster Architectures • Grid Computing: • Loosely coupled and geographically dispersed clusters • Generally used in scientific research by institutions • Utilizes thousands to hundreds of thousands of processor cores spread across many institutions • Connected via a Storage Area Network (SAN)

  20. Multicluster Architectures • Grid Computing: Tommy Minyard, TACC

  21. Multicluster Architectures • Grid Computing Limitations: • Suitable for computationally intensive jobs, but ill-equipped for handling and transferring large amounts of data • The SAN becomes a bottleneck when large amounts of data must be transferred to multiple clusters

  22. Multicluster Architectures • Supercomputers and High Performance Computing (HPC): • Highly tuned computer clusters using commodity processors, with customized network interconnects and operating systems

  23. Multicluster Architectures • Supercomputers and High Performance Computing (HPC): • FLOPS: Floating-point Operations per second • Currently the fastest Supercomputers operate at peta-scale • Quadrillions of FLOPS or 1,000,000,000,000,000 (10^15)

  24. Multicluster Architectures • China’s Supercomputer Sunway TaihuLight: 93 petaFLOPS (2016) = 93,000,000,000,000,000 FLOPS
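
The unit conversion on this slide can be checked with a few lines of arithmetic. The helper names below are hypothetical, and the 10^18-operation workload is an arbitrary example chosen only to give a sense of scale.

```python
PETA = 10**15  # SI prefix "peta": one quadrillion


def petaflops_to_flops(petaflops: int) -> int:
    """Convert a rating in petaFLOPS to plain FLOPS."""
    return petaflops * PETA


def seconds_for(operations: int, flops: int) -> float:
    """Ideal time to run `operations` floating-point operations at a given rate."""
    return operations / flops


taihulight = petaflops_to_flops(93)   # the 2016 figure quoted on the slide
print(f"{taihulight:,} FLOPS")

# Arbitrary example workload: 10^18 floating-point operations at the peak rate.
print(f"{seconds_for(10**18, taihulight):.2f} s")
```

This treats the quoted figure as a sustained rate, which is an idealization; real applications rarely reach a machine's benchmark throughput.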

  25. Multicluster Architectures • Hadoop Clusters for Big Data: • Data Locality: data is stored locally on the nodes themselves; very fast • Unlike grid architectures, there is no bottleneck in data transfer over SAN • Unlike RDBMS, Hadoop clusters stream through data at disk transfer rate, rather than using point queries at slower disk “seek” rate • 2008 – 1 TB sorted in 209 seconds using 900 nodes • 2009 – 100 TB sorted in 173 minutes using 3400 nodes
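
The two sort results can be turned into rough throughput figures. This is a back-of-the-envelope sketch that assumes decimal terabytes (10^12 bytes) and divides evenly across nodes. The function name is hypothetical, and the rates describe the whole sort job (reading, shuffling, and writing), not raw disk speed.

```python
TB = 10**12  # assumption: decimal terabytes


def sort_throughput(terabytes: float, seconds: float, nodes: int):
    """Return (aggregate bytes/s, average bytes/s per node) for a sort run."""
    aggregate = terabytes * TB / seconds
    return aggregate, aggregate / nodes


# Figures quoted on the slide: (terabytes, seconds, nodes).
runs = {
    "2008": (1, 209, 900),
    "2009": (100, 173 * 60, 3400),
}

for year, (terabytes, seconds, nodes) in runs.items():
    aggregate, per_node = sort_throughput(terabytes, seconds, nodes)
    print(f"{year}: {aggregate / 1e9:.2f} GB/s aggregate, "
          f"{per_node / 1e6:.2f} MB/s per node")
```

The aggregate rate roughly doubles between the two runs, which illustrates how throughput scales when each node works mostly on locally stored data.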

  26. Multicluster Architectures • Common Hadoop Cluster Networking scheme: • Higher latency between racks • Store data locally

  27. Multicluster Architectures • Hadoop Clusters for Big Data: • Fault tolerance • A large number of parts increases the likelihood of hardware failure in the system • Hardware Redundancy: • Data and Task outputs are replicated; three copies are made • Error Detection: transferring large quantities of data increases the likelihood of data corruption in the system • CRC-32 (cyclic redundancy check)
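
The idea behind CRC-32 error detection can be sketched with Python's standard `zlib.crc32`. This is not Hadoop's implementation: the 512-byte chunk size, the helper names, and the sample data are assumptions made for illustration.

```python
import zlib

CHUNK_SIZE = 512  # illustrative chunk size, an assumption of this sketch


def chunk_checksums(data: bytes, chunk_size: int = CHUNK_SIZE):
    """Compute one CRC-32 checksum per fixed-size chunk of data."""
    return [
        zlib.crc32(data[start:start + chunk_size])
        for start in range(0, len(data), chunk_size)
    ]


def find_corrupt_chunks(data: bytes, expected, chunk_size: int = CHUNK_SIZE):
    """Return the indexes of chunks whose checksum no longer matches."""
    actual = chunk_checksums(data, chunk_size)
    return [i for i, (a, e) in enumerate(zip(actual, expected)) if a != e]


# Sample data: 2048 bytes, which splits into four 512-byte chunks.
original = bytes(range(256)) * 8
stored = chunk_checksums(original)      # checksums saved alongside the data

# Simulate corruption in transit: flip a single bit in byte 700 (chunk 1).
received = bytearray(original)
received[700] ^= 0x01

print(find_corrupt_chunks(bytes(received), stored))
```

Once a corrupt chunk is identified, a system that keeps three copies of the data can read the chunk from another replica instead, which is how the redundancy and error-detection bullets fit together.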

  28. Sources • Xie, Maoyuan; Yun, Zhifeng; Lei, Zhou; Allen, Gabrielle. (2007). Cluster Abstraction: Towards Uniform Resource Description and Access in Multicluster Grid. 220-227. 10.1109/IMSCCS.2007.79. • Raicu, I. (2011). Introduction to Distributed Systems [slides]. Illinois Institute of Technology. • White, T. (2012). Hadoop: The Definitive Guide, 3rd ed. • Null, L., Lobur, J. (2015). The Essentials of Computer Organization and Architecture, 4th ed. • Simultaneous Multithreading Project (Information Repository): https://dada.cs.washington.edu/smt/
