Computer Architecture Principles Dr. Mike Frank. CDA 5155 Summer 2003 Module #7 General Quantitative Principles. 1.6. Quantitative Principles of Computer Design. Key points: Make the Common Case Fast! Amdahl’s Law (& its Generalizations) The CPU Performance Equation Measuring & modeling
Computer Architecture Principles Dr. Mike Frank CDA 5155 Summer 2003 Module #7 General Quantitative Principles
1.6. Quantitative Principles of Computer Design Key points: • Make the Common Case Fast! • Amdahl’s Law (& its Generalizations) • The CPU Performance Equation • Measuring & modeling • Principle of Locality (temporal & spatial) • Take Advantage of Parallelism
General Systems Engineering • Let S be a physical system of any kind (electrical, mechanical, manufactory, economic, bureaucratic, political, etc.). • Let S be characterized by one or more numeric system specification variables v1, v2, … • E.g., cost, size, weight, wattage, or amount of resources and/or time required by the system to perform some fixed, important task… • E.g., a run of an application program
Benefit-Cost Model of Utility [Figure: an act of decision initiates the process resulting from the decision (e.g. a design decision). Costs: accessible resources consumed (rendered inaccessible) by the process, here 9 utils. Benefits: resources produced (rendered accessible) by the process, here 11 utils.] • Utility of the decision = Benefit − Cost = 11 u − 9 u = 2 u
Benefit/Cost Ratios [Figure: costs and benefits of three processes.] • Process #1: Benefit:Cost = 5:1 • Process #2: Benefit:Cost = 3:1 • Process #3: Benefit:Cost = 2:1 • When available resources are fixed, utility is maximized by choosing the design with the maximum benefit/cost ratio (not the one with the best benefit − cost difference) and replicating it (as far as possible) until all the available resources are spent. (Note also that B/C [symbol lost in extraction] B−C when B >> C.)
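The ratio-versus-difference point can be illustrated with a minimal sketch. The benefit:cost ratios (5:1, 3:1, 2:1) come from the slide; the absolute costs, the budget, and the names `designs`, `BUDGET`, and `total_utility` are assumptions made up for the illustration.

```python
# Hypothetical designs: (name, cost per copy, benefit per copy).
# Only the ratios 5:1, 3:1, 2:1 come from the slide; the magnitudes are assumed.
designs = [("P1", 1, 5), ("P2", 10, 30), ("P3", 50, 100)]
BUDGET = 100  # assumed fixed pool of available resources


def total_utility(cost, benefit, budget):
    """Utility from replicating one design until the budget is spent."""
    copies = budget // cost
    return copies * (benefit - cost)


# Ranking by ratio and by difference picks different designs.
best_by_ratio = max(designs, key=lambda d: d[2] / d[1])
best_by_difference = max(designs, key=lambda d: d[2] - d[1])

# Total utility of each design when replicated across the whole budget.
utilities = {name: total_utility(c, b, BUDGET) for name, c, b in designs}

print(best_by_ratio[0], best_by_difference[0], utilities)
```

With these assumed numbers, the design with the largest single-copy difference (P3) yields the least total utility once the budget is spent, while the design with the best ratio (P1) yields the most.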
Benefit/Cost Efficiency • We can define the efficiency E of a design as the fraction (0-100%) of the highest possible benefit/cost ratio that is actually attained: $E :\equiv (B/C)_{\text{actual}} / (B/C)_{\max}$ • Goal of all engineering: Maximize system efficiency! (Accounting for all costs & benefits.) • Even if $(B/C)_{\max}$ is unknown, we know that to maximize E, we should try to achieve the largest actual B/C that we know how to produce.
Typical Benefits of Computing • The benefits of computing are associated with the completion of definite computational tasks, such as: • Running a given program • Processing a certain kind of database transaction • Serving up a given web page • Rendering a frame of an animation • Adjusting a carburetor’s fuel-air ratio • The beneficial effect of all computing tasks is to make an informational resource that has been processed into a desired form accessible to some external user or process.
Cost-Efficiency • For a process with fixed benefit (e.g., a computational task), efficiency simplifies nicely to just cost-efficiency: $E = E_{\text{cost}} = C_{\min} / C_{\text{actual}}$, the fraction of the actual cost that really needed to be spent to achieve the fixed benefit. • Even if the minimum cost is unknown, we know that the efficiency (and quality) of the design will still be proportional to $1/C_{\text{actual}}$ • Maximize it by minimizing $C_{\text{actual}}$
Important Cost Categories in Computing • Hardware-Proportional Costs: • Initial Manufacturing Cost • Time-Proportional Costs: • Inconvenience to User Waiting for Result • (Hardware×Time)-Proportional Costs: • Lifetime-Amortized Manufacturing Cost • Maintenance & Operation Costs • Opportunity Costs • Energy-Proportional Costs: • Adiabatic Losses • Non-adiabatic Losses From Bit Erasure • Note: These may both vary independently of (HW×Time)! • [Slide callouts:] Focus of traditional theories of so-called “computational complexity.” These costs need to be included also in practical theoretical models of nanocomputing.
Total Unit Cost • Let $C_M$ be the total cost to create (manufacture) an instance (unit) of the system and deliver it initially to its users. • Let $C_O$ be the total cost to operate a system unit (may include electricity, maintenance, sysadmin labor, floor area rental, etc.) over its entire expected operational lifetime T. (Assuming heavy utilization.) • The total unit cost is $C_{TU} = C_M + C_O$. (Ignoring disposal costs.)
Per-Use Cost • Let the utilization of system units be divided up into individual uses, each conferring roughly a constant benefit B to the user (e.g., one run of one particular program). • This uniformity assumption is for simplicity. • Suppose there are N uses of a system unit over its lifetime if heavily utilized. • Divide up the total unit cost into N equal-sized chunks, and ascribe one to each use. • This is the per-use cost $C_{pu} = C_{TU} / N$.
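A minimal sketch of the per-use cost calculation, combining total unit cost (manufacturing plus operating cost) with the number of uses N. The function name `per_use_cost` and all dollar figures are hypothetical, chosen only to make the arithmetic concrete.

```python
def per_use_cost(c_manufacture, c_operate, n_uses):
    """Per-use cost C_pu = (C_M + C_O) / N, ignoring disposal costs."""
    if n_uses <= 0:
        raise ValueError("n_uses must be positive")
    total_unit_cost = c_manufacture + c_operate  # C_TU
    return total_unit_cost / n_uses


# Assumed figures: $1000 to build, $500 to operate over the lifetime,
# one million uses when heavily utilized.
c_pu = per_use_cost(1000.0, 500.0, 1_000_000)
print(c_pu)
```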
Generalized Cost Component • Definition: A generalized cost component is any specification variable v whose contribution to the per-use cost $C_{pu}$ is roughly proportional to v (all else being equal). • I.e., $C_{pu} = c_v v + (\text{other costs independent of } v)$ • Examples: • Number of devices, chip area, manufacturing cost, average power consumption, benchmark execution time, system footprint, …
Additive Cost Expressions • Let an additive cost expression be an additive expression for $C_{pu}$ in terms of n separate, independent generalized cost components $v_i$: $C_{pu} = c_1 v_1 + c_2 v_2 + \dots + c_n v_n$ • Independence here means that changing one term doesn’t necessarily change another. • There may be many such expressions. E.g., • Break down $C_{pu}$ for running a program into energy cost, manufacturing cost, opportunity cost, … • Break down $C_{pu}$ of a program into time costs for integer instructions, floating-point, branches, …
Degree of Dominance • A given cost component $v_i$ is said to dominate the total cost to the degree $D_i = \frac{c_i v_i}{\sum_{j \neq i} c_j v_j}$, where the numerator is the cost of the particular component in question and the denominator is the total cost of all other components, added together. • E.g., if component i costs 6 units and all other components together cost only 2 units, then component i dominates the total cost to the degree of 6/2 = 3.
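The degree of dominance can be computed directly from a list of per-component costs, as in the following sketch. The function name `degree_of_dominance` is hypothetical; the 6-versus-2 split matches the slide's example, with the split of the remaining 2 units into 1.5 and 0.5 assumed for illustration.

```python
def degree_of_dominance(costs, i):
    """Cost of component i divided by the summed cost of all other components."""
    rest = sum(c for j, c in enumerate(costs) if j != i)
    if rest == 0:
        raise ValueError("all other components cost nothing; dominance is unbounded")
    return costs[i] / rest


# Component 0 costs 6 units; the others together cost 2 units.
costs = [6.0, 1.5, 0.5]
d0 = degree_of_dominance(costs, 0)
print(d0)
```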
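Generalized Amdahl’s Law • Consider reducing a cost component $v_i$ by a factor f, while holding other components fixed. I.e., let $v_i := v_i / f$. • Suppose that initially, the degree of dominance of $v_i$ was $D_i$. • If the other components together cost R, the initial cost is $(D_i + 1)R$ and the final cost is $(D_i/f + 1)R$. • Total per-use cost is then reduced by (and cost-efficiency is increased by) the advantage factor, the ratio of initial cost to final cost: $A = \frac{C_{\text{initial}}}{C_{\text{final}}} = \frac{D_i + 1}{D_i / f + 1}$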
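As a quick numerical check of the generalized law, the sketch below compares the closed-form advantage factor $(D_i + 1)/(D_i/f + 1)$ against a direct ratio of initial to final cost. The function name `advantage_factor` and the numbers (a component costing 6 units, the rest costing 2, a reduction factor of 2) are assumptions for illustration.

```python
def advantage_factor(d_i, f):
    """Ratio of initial to final total cost when one component shrinks by f.

    d_i: initial degree of dominance of the component (its cost / cost of the rest).
    f:   factor by which that component's cost is reduced (f > 0).
    """
    if f <= 0:
        raise ValueError("f must be positive")
    return (d_i + 1.0) / (d_i / f + 1.0)


# Assumed example: component costs 6, everything else costs 2, so D_i = 3.
part, rest, f = 6.0, 2.0, 2.0
direct = (part + rest) / (part / f + rest)   # initial cost / final cost
closed_form = advantage_factor(part / rest, f)
print(direct, closed_form)
```

Both computations give the same factor, since the cost of the unaffected remainder cancels out of the ratio.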
Law of Diminishing Returns [Figure: plot with axes labeled “Part/rest (initial)” ($D_i$) and reduction factor ($f$).]
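The diminishing returns can be seen numerically. The sketch below assumes the advantage factor has the form $(D_i + 1)/(D_i/f + 1)$, where $D_i$ is the initial part/rest ratio and $f$ is the reduction factor; the function name and the choice $D_i = 3$ are illustrative assumptions. As $f$ grows, the advantage approaches but never reaches $D_i + 1$.

```python
def advantage_factor(d_i, f):
    """Assumed form of the advantage factor: (D_i + 1) / (D_i / f + 1)."""
    return (d_i + 1.0) / (d_i / f + 1.0)


D_I = 3.0              # assumed initial part/rest ratio
ceiling = D_I + 1.0    # limit of the advantage factor as f grows without bound

# Advantage obtained for increasingly aggressive reductions of the one component.
table = [(f, advantage_factor(D_I, f)) for f in (1, 2, 10, 100, 1_000_000)]
for f, a in table:
    print(f"f = {f:>9}: advantage = {a:.6f} (ceiling {ceiling})")
```

Going from f = 2 to f = 10 still helps noticeably, but going from f = 100 to f = 1,000,000 buys almost nothing, because the unimproved remainder now accounts for nearly all of the cost.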
Practical Implications • There is little practical benefit from reducing a single cost component, in isolation, far below the point where it is comparable to the sum of the other cost components. • Design effort should always tend to focus on minimizing major (dominant, Di>1), not minor (non-dominant, Di<1), cost components. • Good design will tend to result in systems with roughly comparable cost components, all of which must be improved together to significantly improve the whole product further.
Focus on Execution Time • In computer systems, the execution time of a task is the major generalized cost component in the total per-use (per-task) cost. • Execution time directly & proportionally determines the following real costs: • Opportunity cost to user lost while waiting for the desired result. • Usage of time-amortized manufacturing & installation cost of machine. • Cost to rent square feet of space where machine is located. • Energy consumed by constant-power components within a machine. • However, execution time alone is not really the only determiner of total real cost. Other, non-execution-time-proportional costs: • Cost to software engineering community of developing new software for the machine. Includes education/training costs. (See lecture #1.) • Energy consumption that depends on the type of operation performed - some energies may even be inversely proportional to time. (Adiabatics.) • For now, we ignore these other factors.
Focus of Amdahl’s Law proper • Restricts attention to execution time only. • We assume it has some additive measure t1+t2+t3... • The terms in this measure may consist of: • Time for different sequential stages of a process, e.g. • Delay in different functional units along a critical path. • Time to execute each successive section of code in a program. • Time for different types of operations performed during a process, e.g. • Integer ops vs. floating-point ops vs. load/store • Fetching data vs. instructions • Time for different modes, or ways of performing operations • e.g., fetching data from 1st-level cache, 2nd-level cache, main memory.
Amdahl’s Law - H&P Forms • Notational change vs. my version: $\text{frac}_{\text{enhanced}} = p/(p+1)$, $p = \text{frac}_{\text{enhanced}} / (1 - \text{frac}_{\text{enhanced}})$, $\text{speedup}_{\text{enhanced}} = f$
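The notational change can be checked numerically. The sketch below uses the standard Hennessy & Patterson statement of Amdahl's Law, overall speedup = 1 / ((1 − frac) + frac / speedup_enhanced), and compares it with the part/rest form $(p + 1)/(p/f + 1)$; treating p as the initial part/rest ratio is an assumption based on the surrounding slides, and the function names and sample values are hypothetical.

```python
def speedup_hp(frac_enhanced, speedup_enhanced):
    """Amdahl's Law in the usual H&P form."""
    return 1.0 / ((1.0 - frac_enhanced) + frac_enhanced / speedup_enhanced)


def speedup_part_rest(p, f):
    """Same quantity in terms of p (affected time / unaffected time) and f."""
    return (p + 1.0) / (p / f + 1.0)


p, f = 3.0, 2.0                 # assumed sample values
frac = p / (p + 1.0)            # frac_enhanced = p / (p + 1)
p_back = frac / (1.0 - frac)    # p = frac_enhanced / (1 - frac_enhanced)

print(frac, p_back, speedup_hp(frac, f), speedup_part_rest(p, f))
```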
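Graphical Visualization [Figure: two horizontal bars showing the amount of time taken. Before improvement: unaffected part b followed by affected part r. After improvement: unaffected part b followed by the shortened affected part g.] • My “p” = $r/b$ • $\text{Time}_{\text{old}} = b + r$ • $\text{Time}_{\text{new}} = b + g$ • $\text{Frac}_{\text{enhanced}} = r/(b+r)$ • $\text{Speedup}_{\text{enhanced}} = r/g$ • $\text{Speedup}_{\text{overall}} = (b+r)/(b+g)$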
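Using Visualization (example) • Notation: b = time in the unaffected part; r and g = time in the affected part before and after the improvement. • Given: $r/g$ and $g/(b+g)$, determine: $(b+r)/(b+g)$ & $r/(b+r)$ • Hint on exercise 1.3: it is asking for the above! • Let the unaffected time be b. You are given 2 equations with enough information to compute r and g as multiples of b, and you are asked to compute functions of these 3 values, in which b cancels out. • It’s just an algebra word problem, just do it!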
Algebraic Set-Up • Let (time taken by): r = mode to be enhanced, before speedup; g = mode to be enhanced, after speedup; b = other modes not enhanced at all. • Given: $r/g = 10$ and $g/(g+b) = 50\%$ (i.e., 0.5) • Find: $(b+r)/(b+g) = ?$ and $r/(b+r) = ?$ You can do it!
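One way to carry out the algebra is sketched below. Since b cancels out of both answers, the sketch fixes b = 1 as an arbitrary unit (an assumption for convenience) and solves the two given equations for g and r as multiples of b; the variable names follow the slide.

```python
SPEEDUP_ENHANCED = 10.0    # given: r / g
FRAC_NEW_ENHANCED = 0.5    # given: g / (g + b)

b = 1.0  # arbitrary unit; b cancels out of both results

# g / (g + b) = x  =>  g = x * b / (1 - x)
g = FRAC_NEW_ENHANCED * b / (1.0 - FRAC_NEW_ENHANCED)

# r / g = 10  =>  r = 10 * g
r = SPEEDUP_ENHANCED * g

speedup_overall = (b + r) / (b + g)   # old time / new time
frac_enhanced = r / (b + r)           # share of the old time that was enhanced

print(speedup_overall, frac_enhanced)
```

With the given values, g equals b and r equals 10b, so the overall speedup is 11/2 = 5.5 and the enhanced fraction of the original time is 10/11, about 91%.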
Frequencies of Different FP Ops [What was the point of these figures from the 2nd edition?]
CPU Perf. Equation (review) • Some terms: • IC = Instruction Count (instructions / program) • CPI = Cycles Per Instruction (cycles / instruc.) • Clock period = time / clock cycle • Clock rate = clock cycles / time = 1/period • CPU time / program = 1 / (perf. on that pgm.) = (cycles/program) × (clock period) = IC × CPI × (clock period) = IC × CPI / (clock rate) • CPIs are often different for diff. inst. types
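The CPU performance equation with per-type CPIs can be sketched as follows. The instruction mix, the CPI values, the 1 GHz clock rate, and the function name `cpu_time` are all hypothetical numbers and names chosen for illustration.

```python
def cpu_time(instruction_counts, cpis, clock_rate_hz):
    """CPU time = sum over types of (IC_type * CPI_type) / clock rate."""
    cycles = sum(instruction_counts[kind] * cpis[kind] for kind in instruction_counts)
    return cycles / clock_rate_hz


# Assumed instruction mix and per-type CPIs.
counts = {"alu": 50_000_000, "load_store": 30_000_000, "branch": 20_000_000}
cpis = {"alu": 1, "load_store": 2, "branch": 3}
CLOCK_RATE_HZ = 1e9  # assumed 1 GHz clock

seconds = cpu_time(counts, cpis, CLOCK_RATE_HZ)

# Average CPI over the whole program: total cycles / total instructions.
total_ic = sum(counts.values())
average_cpi = sum(counts[k] * cpis[k] for k in counts) / total_ic

print(seconds, average_cpi)
```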
Cache Review (more later) • CPU tries to access data first in the cache. • If the data isn’t there, it is a miss and the processor stalls until the data can be fetched from memory. • Otherwise, it is a hit and execution continues. • Terms: • Miss rate: (# of misses)/(total # of accesses) • Miss penalty: (stall cycles / miss)
Computing Total Stall Time • Memory stall cycles = (Number of misses) × (miss penalty) = IC × (Misses / inst.) × (miss penalty) = IC × (Mem refs. / inst.) × (miss rate) × (miss penalty) • The number of mem. refs. / instruction may vary: • All instructions: 1 instruction load reference • Load/store instrs.: Also 1 data memory reference. • Note: Miss rates may differ between instruction and data references!
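A minimal sketch of the stall-cycle calculation, keeping instruction and data references separate because their miss rates may differ. The function name `memory_stall_cycles` and every numeric value (instruction count, load/store fraction, miss rates, miss penalty) are assumptions for illustration.

```python
def memory_stall_cycles(ic, data_refs_per_inst, inst_miss_rate,
                        data_miss_rate, miss_penalty):
    """Stall cycles = (instruction misses + data misses) * miss penalty.

    Every instruction makes 1 instruction reference; load/store instructions
    additionally make data references (data_refs_per_inst on average).
    """
    inst_misses = ic * 1.0 * inst_miss_rate
    data_misses = ic * data_refs_per_inst * data_miss_rate
    return (inst_misses + data_misses) * miss_penalty


# Assumed workload: 1M instructions, 30% loads/stores,
# 1% instruction miss rate, 5% data miss rate, 50-cycle miss penalty.
stalls = memory_stall_cycles(
    ic=1_000_000,
    data_refs_per_inst=0.3,
    inst_miss_rate=0.01,
    data_miss_rate=0.05,
    miss_penalty=50,
)
print(stalls)
```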