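Chapter 2: Computer Performance Examples and Cost. Advanced Computer Architecture COE 501. Comparing CPU Time. $\text{CPU time} = \frac{\text{Seconds}}{\text{Program}} = \frac{\text{Instructions}}{\text{Program}} \times \frac{\text{Cycles}}{\text{Instruction}} \times \frac{\text{Seconds}}{\text{Cycle}}$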
Chapter 2: Computer Performance Examples and Cost
Advanced Computer Architecture COE 501
Comparing CPU Time
$\text{CPU time} = \frac{\text{Seconds}}{\text{Program}} = \frac{\text{Instructions}}{\text{Program}} \times \frac{\text{Cycles}}{\text{Instruction}} \times \frac{\text{Seconds}}{\text{Cycle}}$
• A 500 MHz Pentium III processor takes 2 ms to run a program with 200,000 instructions.
• A 300 MHz UltraSparc processor takes 1.8 ms to run the same program with 230,000 instructions.
• What is the CPI for each processor for this program?
$\text{CPI} = \frac{\text{Cycles}}{\text{Instruction count}} = \frac{\text{CPU time} \times \text{Clock rate}}{\text{Instruction count}}$
$\text{CPI}_{\text{Pentium}} = \frac{2 \times 10^{-3} \times 500 \times 10^{6}}{2 \times 10^{5}} = 5.00$
$\text{CPI}_{\text{SPARC}} = \frac{1.8 \times 10^{-3} \times 300 \times 10^{6}}{2.3 \times 10^{5}} = 2.35$
• Which processor is faster and by how much?
The UltraSparc is 2/1.8 = 1.11 times as fast, or 11% faster.
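The CPI arithmetic on this slide can be reproduced in a few lines. This is a minimal sketch; the function and variable names are illustrative and not part of the course material.

```python
def cpi(cpu_time_s, clock_hz, instruction_count):
    """CPI = (CPU time x clock rate) / instruction count."""
    return cpu_time_s * clock_hz / instruction_count

# Figures taken from the slide.
cpi_pentium = cpi(2e-3, 500e6, 200_000)
cpi_sparc = cpi(1.8e-3, 300e6, 230_000)

# Speedup is the ratio of execution times for the same program.
speedup = 2e-3 / 1.8e-3

print(f"Pentium III CPI: {cpi_pentium:.2f}")
print(f"UltraSparc CPI:  {cpi_sparc:.2f}")
print(f"UltraSparc is {speedup:.2f} times as fast")
```

Note that the UltraSparc wins despite its lower clock rate and higher instruction count, because its CPI is less than half that of the Pentium III on this program.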
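Cycles Per Instruction
$\text{CPU time} = \text{Cycle time} \times \text{Number of cycles}$
$\text{CPU time} = \text{Cycle time} \times \sum_{i=1}^{n} \text{CPI}_i \times \text{IC}_i$
“Average Cycles per Instruction”, weighted by “Instruction Frequency”:
$\text{CPI} = \sum_{i=1}^{n} \text{CPI}_i \times F_i$, where $F_i = \frac{\text{IC}_i}{\text{Instruction count}}$
Invest resources where time is spent!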
Example: Calculating CPI
Base Machine (Reg / Reg), Typical Mix

Op        Freq    Cycles    Fi*CPIi    (% Time)
ALU       50%     1         0.5        (33%)
Load      20%     2         0.4        (27%)
Store     10%     2         0.2        (13%)
Branch    20%     2         0.4        (27%)
Total                       1.5
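The weighted sum behind this table can be sketched as follows. The dictionary layout and function names are assumptions made for illustration; the frequencies and cycle counts are the ones on the slide.

```python
# Instruction mix: op -> (frequency, cycles per instruction)
mix = {
    "ALU": (0.50, 1),
    "Load": (0.20, 2),
    "Store": (0.10, 2),
    "Branch": (0.20, 2),
}

def average_cpi(mix):
    """CPI = sum over instruction classes of F_i * CPI_i."""
    return sum(freq * cycles for freq, cycles in mix.values())

def time_share(mix):
    """Fraction of execution time spent in each instruction class."""
    total = average_cpi(mix)
    return {op: freq * cycles / total for op, (freq, cycles) in mix.items()}

print(f"Average CPI: {average_cpi(mix):.2f}")
for op, share in time_share(mix).items():
    print(f"{op:<7}{share:.0%}")
```

The time shares differ from the raw frequencies: loads are 20% of the instructions but about 27% of the time, which is the point of "invest resources where time is spent".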
Example
• Add register / memory ALU operations:
  • One source operand in memory
  • One source operand in register
  • Cycle count of 2
• Branch cycle count increases to 3.
• What fraction of the loads must be eliminated for this to pay off, assuming the clock rate is not affected?

Base Machine (Reg / Reg), Typical Mix

Op        Freq    Cycles
ALU       50%     1
Load      20%     2
Store     10%     2
Branch    20%     2
Example Solution
Exec Time = Instr Cnt x CPI x Clock

Op        Freq    Cycles    Freq x Cycles
ALU       .50     1         .5
Load      .20     2         .4
Store     .10     2         .2
Branch    .20     2         .4
Reg/Mem
Total     1.00              1.5
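Example Solution
Exec Time = Instr Cnt x CPI x Clock

          Old machine                       New machine
Op        Freq    Cycles    Freq x Cycles   Freq      Cycles    Freq x Cycles
ALU       .50     1         .5              .5 - X    1         .5 - X
Load      .20     2         .4              .2 - X    2         .4 - 2X
Store     .10     2         .2              .1        2         .2
Branch    .20     2         .4              .2        3         .6
Reg/Mem                                     X         2         2X
Total     1.00              1.5             1 - X               1.7 - X

$\text{CPI}_{\text{New}} = \frac{\text{Cycles}_{\text{New}}}{\text{Instructions}_{\text{New}}} = \frac{1.7 - X}{1 - X}$
CPI_New must be normalized to the new instruction count. X is the fraction of register-memory instructions, measured against the original instruction count; each one replaces one load and one ALU operation.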
Example Solution
The change pays off once the new execution time is no larger than the old one. At the break-even point:
$\text{Instr Cnt}_{\text{Old}} \times \text{CPI}_{\text{Old}} \times \text{Clock}_{\text{Old}} = \text{Instr Cnt}_{\text{New}} \times \text{CPI}_{\text{New}} \times \text{Clock}_{\text{New}}$
The clock is unchanged, so it cancels:
$1.00 \times 1.5 = (1 - X) \times \frac{1.7 - X}{1 - X}$
Example Solution
The $(1 - X)$ factors cancel in the break-even equation:
$1.5 = 1.7 - X$
$X = 0.2$
Loads make up 20% of the original mix, so ALL loads must be eliminated for this to be a win!
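The break-even point can also be checked numerically. This is a minimal sketch, assuming cycle totals are normalized to the original instruction count as in the solution table; the function names are illustrative.

```python
def base_cycles():
    """Cycles per original instruction on the base (reg/reg) machine."""
    return 0.50 * 1 + 0.20 * 2 + 0.10 * 2 + 0.20 * 2

def new_cycles(x):
    """Cycles per original instruction when a fraction x of instructions
    become reg/mem ops, each replacing one load and one ALU op.
    Branches now take 3 cycles."""
    return (0.50 - x) * 1 + (0.20 - x) * 2 + 0.10 * 2 + 0.20 * 3 + x * 2

def break_even():
    """new_cycles is linear in x, so solve new_cycles(x) == base_cycles()
    from its intercept and slope."""
    intercept = new_cycles(0.0)
    slope = new_cycles(1.0) - intercept
    return (base_cycles() - intercept) / slope

x = break_even()
print(f"Break-even fraction of reg/mem instructions: {x:.2f}")
print(f"Fraction of loads eliminated: {x / 0.20:.0%}")
```

Because the clock is unchanged and the instruction count cancels against CPI, comparing total cycles is enough; no separate CPI computation is needed.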
Integrated Circuits Costs
$\text{IC cost} = \frac{\text{Die cost} + \text{Testing cost} + \text{Packaging cost}}{\text{Final test yield}}$
Final test yield: the fraction of packaged dies that pass the final testing stage.
Integrated Circuits Costs
$\text{Die cost} = \frac{\text{Wafer cost}}{\text{Dies per wafer} \times \text{Die yield}}$
Die yield: the fraction of good dies on a wafer, before packaging.
Integrated Circuits Costs
$\text{Dies per wafer} = \frac{\pi \times (\text{Wafer\_diam} / 2)^2}{\text{Die\_Area}} - \frac{\pi \times \text{Wafer\_diam}}{\sqrt{2 \times \text{Die\_Area}}} - \text{Test dies}$
Integrated Circuits Costs
$\text{Die yield} = \text{Wafer yield} \times \left\{ 1 + \frac{\text{Defects\_per\_unit\_area} \times \text{Die\_Area}}{\alpha} \right\}^{-\alpha}$
• Defects per unit area: 0.6 to 1.2
• Fabrication complexity $\alpha$: about 3
• Die cost increases roughly as (die area)^4
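The cost model on these slides can be sketched as a handful of functions. This assumes the usual form of the yield formula, in which the fabrication-complexity parameter (written `alpha` here) divides the defect term and also appears as a negative exponent. The 20 cm wafer and 1 cm² die in the usage lines are hypothetical values chosen only to exercise the code; the $900 wafer, 360 dies, and 71% yield are the 386DX figures from the August 1993 table.

```python
import math

def dies_per_wafer(wafer_diam_cm, die_area_cm2, test_dies=0):
    """Wafer area over die area, minus dies lost along the edge and test dies."""
    whole = math.pi * (wafer_diam_cm / 2) ** 2 / die_area_cm2
    edge_loss = math.pi * wafer_diam_cm / math.sqrt(2 * die_area_cm2)
    return whole - edge_loss - test_dies

def die_yield(defects_per_cm2, die_area_cm2, wafer_yield=1.0, alpha=3.0):
    """Fraction of good dies on a wafer, before packaging."""
    return wafer_yield * (1 + defects_per_cm2 * die_area_cm2 / alpha) ** (-alpha)

def die_cost(wafer_cost, dies, yield_fraction):
    """Wafer cost spread over the good dies only."""
    return wafer_cost / (dies * yield_fraction)

def ic_cost(die, testing, packaging, final_test_yield):
    """Total cost per good packaged part."""
    return (die + testing + packaging) / final_test_yield

# Hypothetical geometry: 20 cm wafer, 1 cm^2 die, no test dies.
print(f"Dies per wafer: {dies_per_wafer(20, 1.0):.0f}")
print(f"Die yield at 1 defect/cm^2: {die_yield(1.0, 1.0):.0%}")

# 386DX figures from the table: $900 wafer, 360 dies, 71% yield.
print(f"386DX die cost: ${die_cost(900, 360, 0.71):.2f}")
```

Die area enters three times: fewer dies fit on the wafer, more are lost at the edge, and yield falls, which is why die cost rises so steeply with area.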
Real World Examples

Chip           Metal     Line width   Wafer    Defects   Area     Dies/    Yield   Die
               layers    (µm)         cost     /cm2      (mm2)    wafer            cost
386DX          2         0.90         $900     1.0       43       360      71%     $4
486DX2         3         0.80         $1200    1.0       81       181      54%     $12
PowerPC 601    4         0.80         $1700    1.3       121      115      28%     $53
HP PA 7100     3         0.80         $1300    1.0       196      66       27%     $73
DEC Alpha      3         0.70         $1500    1.2       234      53       19%     $149
SuperSPARC     3         0.70         $1700    1.6       256      48       13%     $272
Pentium        3         0.80         $1500    1.5       296      40       9%      $417

• From “Estimating IC Manufacturing Costs,” by Linley Gwennap, Microprocessor Report, August 2, 1993, p. 15
• New products end up being much more expensive to manufacture
Other Costs
$\text{Die test cost} = \frac{\text{Test cost} \times \text{Avg. test time}}{\text{Die yield}}$
Packaging cost: depends on pins, heat dissipation, appearance, ...

Chip           Die      Package   Package   Package   Test &      Total
               cost     pins      type      cost      Assembly
386DX          $4       132       QFP       $1        $4          $9
486DX2         $12      168       PGA       $11       $12         $35
PowerPC 601    $53      304       QFP       $3        $21         $77
HP PA 7100     $73      504       PGA       $35       $16         $124
DEC Alpha      $149     431       PGA       $30       $23         $202
SuperSPARC     $272     293       PGA       $20       $34         $326
Pentium        $417     273       PGA       $19       $37         $473
Chip Prices (August 1993)

Chip           Area     Mfg.     Price    Multiplier   Comment
               (mm2)    cost
386DX          43       $9       $31      3.4          Intense competition
486DX2         81       $35      $245     7.0          No competition
PowerPC 601    121      $77      $280     3.6          Gain market share
DEC Alpha      234      $202     $1231    6.1          Recoup R&D
Pentium        296      $473     $965     2.0          Early in shipments

• Assumes a purchase of 10,000 units
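The multiplier column is the price divided by the manufacturing cost. A short sketch that recomputes it from the table's figures (the data structure and names are illustrative):

```python
# chip -> (manufacturing cost, price), August 1993, from the table.
chips = {
    "386DX": (9, 31),
    "486DX2": (35, 245),
    "PowerPC 601": (77, 280),
    "DEC Alpha": (202, 1231),
    "Pentium": (473, 965),
}

def multiplier(mfg_cost, price):
    """Ratio of selling price to manufacturing cost."""
    return price / mfg_cost

multipliers = {chip: multiplier(*figures) for chip, figures in chips.items()}
for chip, m in multipliers.items():
    print(f"{chip:<12}{m:.1f}")
```

The spread from 2.0 to 7.0 reflects market position more than cost, as the comment column suggests.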
Power Dissipation
• Lead processor power increases every generation
• Compactions provide higher performance at lower power
Source: Intel
Workstation Costs
• DRAM: 50% to 55%
• Color monitor: 15% to 20%
• CPU board: 10% to 15%
• Hard disk: 8% to 10%
• CPU cabinet: 3% to 5%
• Video & other I/O: 3% to 7%
• Keyboard, mouse: 1% to 2%
Volume vs. Cost
• Rule of thumb on applying the learning curve to manufacturing: “When volume doubles, costs reduce 10%” (A DEC View of Computer Engineering by C. G. Bell, J. C. Mudge, and J. E. McNamara, Digital Press, Bedford, MA, 1978).
• 40 MPPs @ 200 nodes = 8,000 nodes/year vs. 100,000 workstations/year
$2^x = 100{,}000 / 8{,}000 = 12.5 \Rightarrow x \approx 3.6$
• Since each doubling of volume reduces cost by 10%, cost falls to $(0.9)^{3.6} \approx 0.68$ of the original (about 1/3 less expensive).
Volume vs. Cost: PCs vs. Workstations

Whole market (units shipped)
         1990          1992          1994          1997
PC       23,880,898    33,547,589    44,006,000    65,480,000
WS       407,624       584,544       679,320       978,585
Ratio    59            57            65            67

• $2^x = 65 \Rightarrow x \approx 6.0$ and $(0.9)^{6.0} \approx 0.53$: PC costs are 47% less than workstation costs for the whole market.

Single company: 20% of the WS market vs. 10% of the PC market
         1990          1992          1994          1997
Ratio    29            29            32            33

• $2^x = 32 \Rightarrow x = 5.0$ and $(0.9)^{5.0} \approx 0.59$: PCs cost 41% less than workstations for a single company.
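The learning-curve arithmetic on the two Volume vs. Cost slides follows one pattern: count the doublings in the volume ratio, then apply the 10% reduction once per doubling. A minimal sketch, with an illustrative function name and the 10% rule of thumb as the default parameter:

```python
import math

def relative_cost(volume_ratio, reduction_per_doubling=0.10):
    """Cost of the high-volume product relative to the low-volume one.

    volume_ratio = 2**x, so x = log2(volume_ratio) doublings, each of
    which multiplies cost by (1 - reduction_per_doubling).
    """
    doublings = math.log2(volume_ratio)
    return (1 - reduction_per_doubling) ** doublings

# MPP nodes vs. workstations: 8,000 vs. 100,000 units per year.
print(f"Workstation vs. MPP node: {relative_cost(100_000 / 8_000):.2f}")

# PCs vs. workstations, whole market (1994 ratio) and single company.
print(f"PC vs. WS, whole market:   {relative_cost(65):.2f}")
print(f"PC vs. WS, single company: {relative_cost(32):.2f}")
```

Using the exact logarithm rather than the rounded exponent gives the same two-digit results as the slides.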
Learning Curve
[Figure: production costs and volume plotted against years; the time to introduce a new product is marked.]
High Margins on High-End Machines
• R&D is considered a return on investment
  • Most companies spend 4% to 12% of income on R&D.
  • Every $1 of R&D must generate $8 to $25 in sales
• High-end machines need more $ for R&D
• Sell fewer high-end machines
  • Fewer to amortize R&D
  • Much higher cost margins
• Cost of 1 MB of memory (January 1994):
  PC               $40 (Mac Quadra)
  WS               $42 (SS-10)
  Mainframe        $1920 (IBM 3090)
  Supercomputer    $600 (M90 DRAM), $1375 (C90 15 ns SRAM)
Cost/Performance: What Is the Relationship of Cost to Price?
• Component costs
• Direct costs (add 25% to 40%): recurring costs such as labor, purchasing, scrap, warranty
• Gross margin (add 82% to 186%): nonrecurring costs such as R&D, marketing, sales, equipment maintenance, rental, financing cost, pretax profits, taxes
• Average discount to get list price (add 33% to 66%): volume discounts and/or retailer markup

Share of list price:
Average discount    25% to 40%
Gross margin        34% to 39%
Direct cost         6% to 8%
Component cost      15% to 33%
(Component cost + direct cost + gross margin = average selling price; adding the average discount gives the list price.)
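The "add N%" steps can be chained to go from component cost to list price. This is a sketch under the assumption that each percentage is applied to the running total of the previous step (direct costs on component cost, gross margin on component plus direct cost, average discount on the average selling price). The function name and defaults are illustrative.

```python
def list_price(component_cost, direct=0.25, gross_margin=0.82, discount=0.33):
    """Chain the markups: components -> direct cost -> selling price -> list."""
    with_direct = component_cost * (1 + direct)
    avg_selling_price = with_direct * (1 + gross_margin)
    return avg_selling_price * (1 + discount)

# Component cost as a share of list price at the two ends of the ranges.
low_markups = 1.0 / list_price(1.0, 0.25, 0.82, 0.33)
high_markups = 1.0 / list_price(1.0, 0.40, 1.86, 0.66)
print(f"Component share, lowest markups:  {low_markups:.0%}")
print(f"Component share, highest markups: {high_markups:.0%}")
```

Under that assumption the two extremes reproduce the 15% to 33% component-cost share shown for list price, which suggests the reading is consistent with the slide.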