
INTRODUCTION

INTRODUCTION. Jehan-François Pâris jparis@uh.edu. An evolving field. Computer architectures keep changing Building faster computers Supercomputers and data centers Building cheaper, smaller computers Laptops, notebooks, netbooks, smartbooks Putting computer systems everywhere


Presentation Transcript


  1. INTRODUCTION Jehan-François Pâris jparis@uh.edu

  2. An evolving field • Computer architectures keep changing • Building faster computers • Supercomputers and data centers • Building cheaper, smaller computers • Laptops, notebooks, netbooks, smartbooks • Putting computer systems everywhere • Cars, cell phones, HDTV: embedded computers

  3. An analogy • Electrical motors • Replaced the single steam engine powering many machines through transmission belts and pulleys • One electrical motor per machine • Domestic appliances, car starters, … • Power tools • Power windows, electrical toothbrushes, …

  4. The coming revolution • Cannot increase CPU clock frequency above 2 GHz without running into unsolvable heat dissipation problems • Switch to multicore architectures • Two, four, eight, … CPUs per chip • Creates new problems • Hardware: cache synchronization • Software: programming these beasts (Ouch!)

  5. Other challenges • Reducing power consumption of data centers • Often contain archival data that are very rarely accessed • Finding new ways to keep increasing magnetic disk capacity • Dealing with physical limits to SDRAM density • Will never get 8 TB SODIMM modules • Finding a replacement for hard drives

  6. Classical computer components • Input • Output • Memory • Datapath • Control • Datapath + Control = Processor • Storage subsystem is missing!

  7. A laptop motherboard

  8. The course philosophy • Showing you how computers work is fine • Showing you how to make them faster is better!

  9. PERFORMANCE ISSUES • Defining performance • Measuring it • Not an easy task • Evaluating the impact of • Amount of work done by each instruction • Time they take to run • CPU clock speed

  10. Measuring Performance • Inverse of execution time of a benchmark: $\text{Performance} = 1 / \text{Execution Time}$ • If computers A and B are such that $\text{Execution Time}_A < \text{Execution Time}_B$ for the same benchmark, then $\text{Performance}_A > \text{Performance}_B$

  11. SPEC CPU Benchmark • SPEC CPU2006 • Set of 12 integer and 17 floating-point benchmarks • Results are normalized: execution time on a reference processor / execution time on the benchmarked processor • Single value is geometric mean of these ratios

  12. How is it computed (I) • Two new processors P and Q compared to a reference processor R • Execution times for n benchmarks • P1, P2, …, Pn • Q1, Q2, …, Qn • R1, R2, …, Rn

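  13. How is it computed (II) • SPEC value for processor P is $\mathrm{SPEC}_P = \sqrt[n]{\prod_{i=1}^{n} R_i / P_i}$ • Observe that $\mathrm{SPEC}_P / \mathrm{SPEC}_Q = \sqrt[n]{\prod_{i=1}^{n} Q_i / P_i}$, which does not depend on the reference processor R • (property of geometric mean)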
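A minimal Python sketch of the computation outlined in slides 11 to 13. The execution times are made-up illustrative numbers, not SPEC results, and the function name `spec_value` is hypothetical. It computes the geometric mean of the reference-to-measured time ratios and checks that the ratio of two SPEC values is independent of the reference processor.

```python
import math

def spec_value(ref_times, times):
    """Geometric mean of the ratios R_i / P_i (reference time over measured time)."""
    if len(ref_times) != len(times):
        raise ValueError("need one reference time per benchmark")
    ratios = [r / t for r, t in zip(ref_times, times)]
    # Geometric mean computed through logarithms to avoid overflow on long lists.
    return math.exp(sum(math.log(x) for x in ratios) / len(ratios))

# Hypothetical execution times (seconds) for n = 3 benchmarks.
R = [100.0, 200.0, 400.0]   # reference processor
P = [50.0, 100.0, 100.0]    # processor P
Q = [100.0, 100.0, 200.0]   # processor Q

spec_p = spec_value(R, P)
spec_q = spec_value(R, Q)

# Comparing P to Q directly (Q playing the role of the reference)
# gives the same number as dividing the two SPEC values.
ratio_direct = spec_value(Q, P)
print(spec_p, spec_q, spec_p / spec_q, ratio_direct)
```

With an arithmetic mean the last two printed values would generally differ, which is one reason a geometric mean suits normalized ratios.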

  14. Impact of Instruction Set • Execution Time = Number of Instructions × Mean Instruction Execution Time • Gave birth to the idea of more complex instruction sets • Each does more • Fewer instructions

  15. Impact of Clock Speed • Execution Time = Number of Clock Cycles × Clock Cycle Time, same as Execution Time = Number of Clock Cycles / Clock Frequency

  16. Putting everything together • Execution Time = Number of Instructions × Number of Clock Cycles per Instruction × Clock Cycle Time • Gives us three ways to reduce program execution time
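The three-factor formula can be sketched in a few lines of Python. The workload numbers (instruction count, cycles per instruction, clock frequency) are hypothetical and chosen only to make the arithmetic easy to follow; `execution_time` is an illustrative name, not something from the slides.

```python
def execution_time(instructions, cycles_per_instruction, clock_hz):
    """Execution time in seconds = instructions x CPI x clock cycle time."""
    cycle_time = 1.0 / clock_hz  # clock cycle time is the inverse of the frequency
    return instructions * cycles_per_instruction * cycle_time

# Hypothetical baseline: 1 billion instructions, 2 cycles each, 1 GHz clock.
base = execution_time(1_000_000_000, 2.0, 1e9)

# The three ways to reduce execution time, each applied on its own.
fewer_instructions = execution_time(800_000_000, 2.0, 1e9)   # fewer instructions
lower_cpi = execution_time(1_000_000_000, 1.5, 1e9)          # fewer cycles per instruction
faster_clock = execution_time(1_000_000_000, 2.0, 2e9)       # shorter clock cycle

for name, t in [("baseline", base),
                ("fewer instructions", fewer_instructions),
                ("lower CPI", lower_cpi),
                ("faster clock", faster_clock)]:
    print(f"{name}: {t:.2f} s")
```

Because the three factors multiply, an improvement in one can be cancelled by a regression in another, for instance when more powerful instructions also need more cycles each.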

  17. 1. Using fewer instructions • VAX • Super minicomputer designed in late 70’s • Had a complicated instruction set (CISC) • Idea was to use more powerful instructions in order to reduce the number of instructions used to perform most frequent tasks • Poor pipelining performance

  18. 2. Using a faster clock • Major reason for explosion of CPU performance in the 80’s and 90’s • IBM PC (1981): Intel 8088 @ 4.77 MHz • IBM PC AT (1984): Intel 80286 @ 6 and 8 MHz • Nowadays up to 3 GHz • Cannot get much higher!

  19. 3. Using better instructions • Best strategy is to reduce the average number of clock cycles per instruction • Favoring fast instructions • Using fixed-size instructions to allow pipelining • Trying to execute as many tasks as possible in parallel

  20. Amdahl’s Law (I) • Examples: • Supersonic jet • Could fly from Houston to Washington in thirty minutes • Total travel time would be dominated by travel time to airport and check in procedures • Today's laptops: • Disk access times are the bottleneck

  21. Amdahl’s Law (II) • Assume that we have a technique for improving the performance of some part of a system. • Let • $T_o$ be the time originally spent in the part of the system that can be improved • $T_i$ be the time spent in that part once the improvement has been applied • $T_n$ be the time spent in the part of the system that remains unaffected

  22. Amdahl’s Law (III) • The total speedup for the whole system will be $S = \dfrac{T_o + T_n}{T_i + T_n}$ • The maximum possible speedup, reached when $T_i \to 0$, is $S_{\max} = \dfrac{T_o + T_n}{T_n}$
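A small Python sketch of Amdahl's Law using the three times defined on the previous slide (original time in the improvable part, improved time in that part, time in the unaffected part). The function names and the sample values are hypothetical and serve only as illustration.

```python
def speedup(t_o, t_i, t_n):
    """Overall speedup: time before (T_o + T_n) over time after (T_i + T_n)."""
    return (t_o + t_n) / (t_i + t_n)

def max_speedup(t_o, t_n):
    """Limit of the speedup as the improved part's time T_i tends to 0."""
    if t_n == 0:
        raise ValueError("no unaffected part: the speedup is unbounded")
    return (t_o + t_n) / t_n

# Hypothetical system: 6 time units in the improvable part, 4 in the rest.
t_o, t_n = 6.0, 4.0
for t_i in (3.0, 1.0, 0.1):
    print(f"T_i = {t_i}: speedup = {speedup(t_o, t_i, t_n):.3f}")
print(f"upper bound = {max_speedup(t_o, t_n):.3f}")
```

However small the improved time becomes, the speedup never exceeds the bound set by the unaffected part.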

  23. An example • Flying to Washington National Airport takes three hours • Going to the airport and waiting for the flight takes a minimum of two hours • Going from the airport to downtown Washington takes a minimum of 30 minutes • What is the maximum speedup that could be achieved using much faster planes? • Answer: 5h30 / 2h30 = 2.2

  24. Answer • Current travel time: • To airport and wait: 2 hours • Plane: 3 hours • To downtown by DC metro: 30 minutes • Total: 5 hours 30 minutes

  25. Answer • Assume plane travels at speed of light: • To airport and wait: 2 hours • Plane: negligible • To downtown by DC metro: 30 minutes • Total: 2 hours 30 minutes • Maximum speedup would be 5h30 / 2h30 = 2.2

  26. Trains and buses • Commuter trains and city buses spend a significant amount of trip time letting travelers get off and on • Have wide doors • Not true for Amtrak trains and intercity buses • Fewer, narrower doors

  27. Trains and buses

  28. A problem • Assume we have a technique that reduces the execution time of floating-point operations by 20 percent • What will be the overall CPU speedup if we expect it to spend 10 percent of its time executing floating-point operations? • How would that speedup be affected if the CPU spends 30 percent of its time executing floating-point operations?

  29. Solution (I) • First case: • Baseline time = 0.9 × 1 + 0.1 × 1 = 1 • After improvement = 0.9 × 1 + 0.1 × 0.8 = 0.98 • Speedup = 1/0.98 = 1.02 • A 2 percent improvement!

  30. Solution (II) • Second case: • Baseline time = 0.7 × 1 + 0.3 × 1 = 1 • After improvement = 0.7 × 1 + 0.3 × 0.8 = 0.94 • Speedup = 1/0.94 = 1.064 • A 6.4 percent improvement!
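Both cases can be checked with a short Python sketch. It follows the solution slides in normalizing the baseline time to 1 and in reading the 20 percent improvement as a time factor of 0.8 on the floating-point share; `overall_speedup` is a hypothetical helper name.

```python
def overall_speedup(fraction, time_factor):
    """Speedup when `fraction` of the run time is multiplied by `time_factor`.

    The baseline run time is normalized to 1, as in the solution slides.
    """
    new_time = (1.0 - fraction) + fraction * time_factor
    return 1.0 / new_time

# Floating-point share of 10 percent, then 30 percent; time factor 0.8.
for fp_share in (0.10, 0.30):
    s = overall_speedup(fp_share, 0.8)
    print(f"{fp_share:.0%} floating point: speedup = {s:.3f}")
```

Tripling the floating-point share roughly triples the gain, yet both results stay far below the 20 percent improvement of the floating-point unit itself.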

  31. REVIEW PROBLEMS

  32. Problem • Consider a huge program that consists of a purely sequential part that takes two hours and another part that takes eight hours. What is the maximum speedup we can achieve by parallelizing the second part of the program?

  33. Answer • Current run time: • Sequential part: 2 hours • Other part: 8 hours • Total: 10 hours • Minimum run time: • Sequential part: 2 hours • Other part: negligible • Total: 2 hours

  34. Answer • Current run time: • Sequential part: 2 hours • Other part: 8 hours • Total: 10 hours • Minimum run time: • Sequential part: 2 hours • Other part: negligible • Total: 2 hours • Maximum speedup: 10/2 = 5
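To see how the bound of 5 is approached, here is a Python sketch that assumes the eight-hour part scales perfectly across processors with no communication or scheduling overhead. That assumption is an idealization added for illustration, not something the slides state, and `parallel_speedup` is a hypothetical name.

```python
def parallel_speedup(sequential_hours, parallel_hours, processors):
    """Speedup if the parallel part is divided evenly among `processors`.

    Assumes ideal scaling: no overhead, no load imbalance.
    """
    before = sequential_hours + parallel_hours
    after = sequential_hours + parallel_hours / processors
    return before / after

# Two sequential hours, eight parallelizable hours, as in the problem.
for p in (2, 4, 16, 1024):
    print(p, "processors:", round(parallel_speedup(2, 8, p), 2))
```

Even under this ideal scaling the speedup stays below 5, because the two sequential hours never shrink.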

  35. Problem • Server motherboard A has a SPEC CPU2006 rating of 31.4 while server motherboard B has a rating of 29.7. Which one of the two motherboards is faster?

  36. Answer • Server motherboard A has a SPEC CPU2006 rating of 31.4 while server motherboard B has a rating of 29.7. Which one of the two motherboards is faster? • Motherboard A because a higher SPEC value is better

  37. Fun problem • Shanghai maglev train runs at 268 mph • How does it compare to airplane for going between Houston and Washington, DC?

  38. Fun answer • Current travel time: • To airport and wait: 2 hours • Plane: 3 hours • To downtown by DC metro: 30 minutes • Total: 5 hours 30 minutes • With maglev: • To station: 1 hour • Train to downtown DC: 6 hours 30 minutes • Total: 7 hours 30 minutes

  39. Fun answer • Current travel time: • To airport and wait: 2 hours • Plane: 3 hours • To downtown by DC metro: 30 minutes • Total: 5 hours 30 minutes • With maglev: • To station: 1 hour • Train to downtown DC: 6 hours 30 minutes • Total: 7 hours 30 minutes • Plane is still faster for very long trips
