Hyper-Threading Technology
Group 1: James, Juan, Mustaali, Raghu, Sumanth
Introduction - A Few Buzzwords
• Process
• Context
• Thread
• Context switches - fooling the processes
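The buzzwords above can be made concrete with a tiny sketch (Python; the names and counts are illustrative, not from the slides): two threads live inside one process and share its memory, while the OS context-switches between them and each thread keeps its own private execution context.

```python
# Minimal sketch: two threads share one process's address space, while
# each keeps its own context (stack, local variables, instruction position).
# The OS switches between them transparently ("fooling" each into thinking
# it runs alone).
import threading

shared_log = []          # heap data: visible to every thread in the process

def worker(name, count):
    for i in range(count):
        # `name` and `i` are per-thread context; `shared_log` is shared.
        shared_log.append((name, i))

t1 = threading.Thread(target=worker, args=("A", 3))
t2 = threading.Thread(target=worker, args=("B", 3))
t1.start(); t2.start()
t1.join(); t2.join()

print(len(shared_log))   # 6: both threads wrote into the same shared list
```

Interleaving order is up to the scheduler, but the total is deterministic because both threads append to the same list.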
Single Threaded CPU
• Each color represents a running program
• White spaces represent pipeline bubbles
• Each running program shares the RAM with other running programs
• Each program waits for its slice of CPU time in order to execute
Single Threaded SMP
• A second CPU is added
• The system executes two processes simultaneously
• The number of empty execution slots is also doubled!
Super Threading
• Threads are executed simultaneously.
• Each processor pipeline stage can contain instructions for one and only one thread.
• Helps immensely in hiding memory access latencies.
• Does not address the waste associated with poor instruction-level parallelism within individual threads.
Hyper-Threading
• Two or more logical processors
• Allows the scheduling logic maximum flexibility to fill execution slots
• A hyper-threaded system uses a fraction of the resources, and has a fraction of the waste, of the SMP system
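A toy model can show why hyper-threading wastes fewer issue slots than super-threading (the slot count and per-cycle instruction counts below are invented for illustration, not real hardware figures): super-threading lets only one thread fill a given cycle's slots, while hyper-threading lets both logical processors fill the same cycle.

```python
# Toy issue-slot model: 3 slots per cycle, 4 cycles.
# ready[t][c] = instructions thread t has ready in cycle c (invented numbers).
SLOTS = 3
CYCLES = 4
ready = {0: [1, 3, 0, 2], 1: [2, 1, 3, 1]}

def super_threading_waste():
    # One thread owns all slots in a given cycle (threads alternate).
    used = sum(min(ready[c % 2][c], SLOTS) for c in range(CYCLES))
    return CYCLES * SLOTS - used        # empty (wasted) slots

def hyper_threading_waste():
    # Both logical processors may fill slots in the same cycle.
    used = sum(min(ready[0][c] + ready[1][c], SLOTS) for c in range(CYCLES))
    return CYCLES * SLOTS - used

print(super_threading_waste(), hyper_threading_waste())   # prints: 9 0
```

With these made-up numbers, mixing both threads into each cycle fills every slot, while one-thread-per-cycle leaves most of them empty, which is exactly the flexibility the slide describes.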
Hyper-Threading – A Timeline
• Year 1995: a seminal paper, "Simultaneous Multithreading: Maximizing On-Chip Parallelism" by Dean M. Tullsen, Susan J. Eggers and Henry M. Levy at the University of Washington
• Year 1997: Digital Equipment Corporation (DEC), together with the University of Washington group, was working on a simultaneous multithreading project; Intel licensed the patents for this technology and, as part of a large legal settlement between the companies, hired most of the people who had worked on the project
• Year 2002: Hyper-Threading is implemented in the Intel® Xeon™ server processor
• Year 2003: Hyper-Threading makes its way to the desktop processor, the Intel® Pentium® 4
Implementing Hyper-Threading
• Replicated: register renaming logic, instruction pointer, ITLB, return stack predictor, various other architectural registers
• Partitioned: re-order buffers, load/store buffers, various queues (scheduling queue, uop queue)
• Shared: caches (Trace Cache, L1, L2, L3), micro-architectural registers, execution units
Replicated Resources
• Necessary in order to maintain two fully independent contexts on each logical processor.
• The most obvious of these is the instruction pointer (IP), the pointer that helps the processor keep track of its place in the instruction stream by pointing to the next instruction to be fetched.
• In order to run more than one process on the CPU, you need as many IPs as there are instruction streams to keep track of. Or, equivalently, you need one IP for each logical processor.
• Similarly, the Xeon has two register allocation tables (RATs), each of which handles the mapping of one logical processor's eight architectural integer registers and eight architectural floating-point registers onto a shared pool of 128 GPRs (general purpose registers) and 128 FPRs (floating-point registers). So the RAT is a replicated resource that manages a shared resource (the microarchitectural register file).
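The RAT arrangement above can be sketched as a toy model (register counts come from the slide; the register names and allocation policy are invented for illustration): each logical processor gets its own mapping table, but both tables draw physical registers from one shared pool.

```python
# Toy RAT sketch: two replicated tables over one shared physical register file.
free_physical = list(range(128))      # shared pool of 128 physical registers

def make_rat():
    # Map each of the 8 architectural registers to a fresh physical register
    # taken from the shared pool.
    return {f"arch_r{i}": free_physical.pop(0) for i in range(8)}

rat0 = make_rat()   # logical processor 0's table (replicated resource)
rat1 = make_rat()   # logical processor 1's table (replicated resource)

# Replicated structure, shared resource: no physical register is mapped twice.
assert set(rat0.values()).isdisjoint(rat1.values())
print(len(free_physical))             # prints: 112 (registers left in the pool)
```

The point of the sketch is the separation of roles: the tables are per-logical-processor state, while the register file they index into is a single shared structure.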
Partitioned Resources
Statically partitioned queue
• Each queue is split in half
• Its resources are solely dedicated to the use of one logical processor
Partitioned Resources (contd.)
Dynamically partitioned queue
• In a scheduling queue with 12 entries, instead of assigning entries 0 through 5 to logical processor 0 and entries 6 through 11 to logical processor 1, the queue allows any logical processor to use any entry but places a limit on the number of entries that any one logical processor can use.
• So in the case of a 12-entry scheduling queue, each logical processor can use no more than six of the entries.
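The dynamically partitioned queue described above can be sketched in a few lines (the class and its sizes mirror the slide's 12-entry, 6-per-thread example; the payload names are invented): any entry may be used by either logical processor, but neither may hold more than half the queue at once.

```python
# Sketch of a dynamically partitioned scheduling queue:
# 12 entries total, at most 6 owned by any one logical processor.
class PartitionedQueue:
    def __init__(self, size=12, per_thread_cap=6):
        self.size = size
        self.cap = per_thread_cap
        self.entries = []            # list of (owner, payload) pairs

    def push(self, owner, payload):
        owned = sum(1 for o, _ in self.entries if o == owner)
        if len(self.entries) >= self.size or owned >= self.cap:
            return False             # queue full, or this thread hit its cap
        self.entries.append((owner, payload))
        return True

q = PartitionedQueue()
accepted = [q.push(0, f"uop{i}") for i in range(8)]  # thread 0 tries 8 pushes
print(accepted.count(True))          # prints: 6 (the per-thread cap kicks in)
assert q.push(1, "uopX")             # thread 1 still has room of its own
```

Unlike a static split, thread 1's six entries are not pre-reserved slots 6-11; they are simply whichever entries happen to be free, which is the flexibility the slide is describing.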
Shared Resources
• Shared resources are at the heart of hyper-threading; they are what makes the technique worthwhile.
• The more resources that can be shared between logical processors, the more efficient hyper-threading can be at squeezing the maximum amount of computing power out of the minimum amount of die space.
• One class of shared resources consists of the execution units: the integer units, floating-point units, and load-store unit.
• Hyper-threading's greatest strength, shared resources, also turns out to be its greatest weakness.
• Problems arise when one thread monopolizes a crucial resource. This is the exact same problem we discussed with cooperative multitasking: one resource hog can ruin things for everyone else. Like a cooperative multitasking OS, the Xeon for the most part depends on each thread to play nicely and refrain from monopolizing any of its shared resources.
Caching and Hyper-Threading
• Since both logical processors share the same cache, the prospect of cache conflicts increases.
• This increase in cache conflicts has the potential to degrade performance seriously.
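A toy direct-mapped cache makes the conflict problem concrete (the cache size, mapping, and access patterns below are invented for illustration; a real shared cache is set-associative and far larger): a working set that hits when one logical processor runs alone can miss constantly when a second logical processor's accesses land in the same sets.

```python
# Toy direct-mapped cache shared by two logical processors.
SETS = 4
cache = [None] * SETS                 # one tag per cache set

def access(addr):
    """Return True on a hit; on a miss, install the line (evicting the tenant)."""
    s, tag = addr % SETS, addr // SETS
    hit = cache[s] == tag
    if not hit:
        cache[s] = tag
    return hit

# One logical processor alone: 4 cold misses, then its working set hits.
alone_hits = sum(access(a) for a in [0, 1, 2, 3] * 2)

# Two logical processors interleaved, working sets colliding in every set:
cache[:] = [None] * SETS
interleaved = [a for pair in zip([0, 1, 2, 3] * 2, [4, 5, 6, 7] * 2) for a in pair]
shared_hits = sum(access(a) for a in interleaved)

print(alone_hits, shared_hits)        # prints: 4 0
```

In the shared run, every access evicts the line the other logical processor is about to reuse, so the hit rate collapses to zero; this is the (worst-case) performance degradation the slide warns about.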
Multi-tasking with Hyper-Threading
• Hyper-Threading speeds up single-threaded applications a little bit by handling OS tasks in the background on the second logical CPU
• Hyper-Threading speeds up multiple single-threaded applications quite a bit
• Hyper-Threading speeds up multithreaded applications a lot
• But it seems to have a little trouble with a multithreaded application and some single-threaded applications running at the same time. The hyper-threaded CPU cannot reach its full potential if one of the applications in the multitasking scenario is multithreaded and tries to keep both logical CPUs to itself.
Conclusions
• It is quite remarkable how almost every single-threaded benchmark still got a small performance boost from Hyper-Threading, between 1 and 5%. This shows that Hyper-Threading has matured, as it almost never decreased performance, as it did in the first hyper-threaded Xeons.
• Most multitasking scenarios were measurably faster with Hyper-Threading on.
• Hyper-Threading is a very smart way to improve CPU performance. But is it more responsive? In some situations, yes: applications tend to load a bit faster, and the performance of the foreground task tends to suffer a bit less.
• Don't expect Hyper-Threading to enable you to run two intensive tasks on your PC at once. It can, however, let you perform relatively light tasks in the background (like playing MP3s) while running games or other CPU-intensive tasks.
Conclusions contd.
• The people who will gain the most from Hyper-Threading are those who like to run typical multithreaded applications on their desktop, not the multitasking people.
• If you like to compile, animate, encode MPEG-4 or render on the same desktop system on which you play games, Hyper-Threading has a lot to offer.
• With Hyper-Threading you get the fast gaming and single-threaded performance of a typical desktop CPU, and at the same time you get a dual-CPU system that is about as fast as a lower-clocked dual system.