Rules of Thumb in Data Engineering Jim Gray CMU 8 Oct 2001 Gray@Microsoft.com, http://research.Microsoft.com/~Gray/Talks/
Outline • Moore’s Law and consequences • Storage rules of thumb • Balanced systems rules revisited • Networking rules of thumb • Caching rules of thumb
Meta-Message: Technology Ratios Matter • Price and Performance change. • If everything changes in the same way, then nothing really changes. • If some things get much cheaper/faster than others, then that is real change. • Some things are not changing much: • Cost of people • Speed of light • … • And some things are changing a LOT
Trends: Moore’s Law • Performance/Price doubles every 18 months • 100x per decade • Progress in next 18 months = ALL previous progress • New storage = sum of all old storage (ever) • New processing = sum of all old processing • E. coli doubles every 20 minutes!
Trends: ops/s/$ Had Three Growth Phases • 1890-1945: mechanical and relay, 7-year doubling • 1945-1985: tube, transistor, …, 2.3-year doubling • 1985-2000: microprocessor, 1.0-year doubling
So: a problem • Suppose you have a ten-year compute job for the world’s fastest supercomputer. What should you do? • Commit 250 M$ now? • Or program for 9 years first? Software speedup: 2^6 = 64x; Moore’s-law speedup: 2^6 = 64x; so ~4,000x speedup: spend 1 M$ (not 250 M$) on hardware and the job runs in 2 weeks, not 10 years. • Homework problem: What is the optimum strategy?
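A quick sketch of the wait-versus-buy trade-off behind the homework question. The inputs (a 10-year job, an 18-month doubling period) are assumptions for illustration, and only the Moore's-law speedup is modelled, not the additional software speedup.

```python
# Back-of-envelope model: wait w years, then run the 10-year job on hardware
# that Moore's law has made 2**(w/1.5) times faster. Assumed inputs only.
def total_time_years(wait_years, job_years_today=10.0, doubling_period=1.5):
    speedup = 2 ** (wait_years / doubling_period)
    return wait_years + job_years_today / speedup

for wait in range(0, 11):
    print(f"wait {wait:2d} y -> finish in {total_time_years(wait):5.2f} y")
# With these inputs the minimum is around a 3-year wait (finishing in ~5.5
# years), which is the flavor of answer the homework problem is after.
```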
Storage capacity beating Moore’s law • 2 k$/TB today (raw disk) • 1 k$/TB by end of 2002
Consequence of Moore’s law: need an address bit every 18 months • Moore’s law gives you 2x more in 18 months. • RAM • Today we have 10 MB to 100 GB machines (24-36 bits of addressing) • In 9 years we will need 6 more bits: 30-42 bit addressing (4 TB RAM). • Disks • Today we have 10 GB to 100 TB file systems/DBs (33-47 bit file addresses) • In 9 years we will need 6 more bits: 40-53 bit file addresses (100 PB files)
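A minimal sketch of the arithmetic behind the address-bit rule; the machine size is the slide's 100 GB upper bound, and the 9-year horizon (six 18-month doublings) is the stated growth assumption.

```python
import math

def bits_needed(n_bytes):
    # smallest address width that can address n_bytes of storage
    return math.ceil(math.log2(n_bytes))

ram_today  = 100 * 2**30        # the slide's 100 GB upper bound
ram_future = ram_today * 2**6   # six 18-month doublings = 9 years of growth
print(bits_needed(ram_today), "->", bits_needed(ram_future), "address bits")
# 37 -> 43 bits for this case, the same ballpark as the slide's 4 TB figure.
```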
Architecture could change this • 1-level store: • System/38 and AS/400 have a 1-level store. • Never re-uses an address. • Needs 96-bit addressing today. • NUMAs and Clusters • Willing to buy a 100 M$ computer? • Then add 6 more address bits. • Only 1-level store pushes us beyond 64 bits • Still, these are “logical” addresses; 64-bit physical will last many years
Trends: Gilder’s Law: 3x bandwidth/year for 25 more years • Today: • 40 Gbps per channel (λ) • 12 channels per fiber (WDM): 500 Gbps • 32 fibers/bundle = 16 Tbps/bundle • In the lab: 3 Tbps/fiber (400x WDM) • In theory: 25 Tbps per fiber • 1 Tbps = USA 1996 WAN bisection bandwidth • Aggregate bandwidth doubles every 8 months!
Outline • Moore’s Law and consequences • Storage rules of thumb • Balanced systems rules revisited • Networking rules of thumb • Caching rules of thumb
How much storage do we need? • (Figure: storage pyramid from kilo, mega, giga, tera, peta, exa, zetta up to yotta: a book, a photo, a movie, all LoC books (words), all books multimedia, everything recorded.) • Soon everything can be recorded and indexed • Most bytes will never be seen by humans. • Data summarization, trend detection, anomaly detection are key technologies • See Mike Lesk: How much information is there: http://www.lesk.com/mlesk/ksg97/ksg.html • See Lyman & Varian: How much information: http://www.sims.berkeley.edu/research/projects/how-much-info/
Storage Latency: How Far Away is the Data? (Clock ticks to reach the data, with a travel analogy:) • Registers: 1 (my head, 1 min) • On-chip cache: 2 (this room) • On-board cache: 10 (this campus, 10 min) • Memory: 100 (Springfield, 1.5 hr) • Disk: 10^6 (Pluto, 2 years) • Tape/optical robot: 10^9 (Andromeda, 2,000 years)
Storage Hierarchy: Speed & Capacity vs Cost Tradeoffs • (Figure: two log-log charts, price ($/MB) vs access time and typical system capacity (bytes) vs access time, for cache, main memory, secondary storage (disc), online tape, nearline tape, and offline tape.)
Disks: Today • Disk is 18 GB to 180 GB • 10-50 MBps • 5k-15k rpm (6 ms-2 ms rotational latency) • 12 ms-7 ms seek • 2 k$/IDE-TB, 7 k$/SCSI-TB • For shared disks most time is spent waiting in queue for access to the arm/controller (wait, then seek, rotate, transfer)
Standard Storage Metrics • Capacity: • RAM: MB and $/MB: today at 512 MB and 200 $/GB • Disk: GB and $/GB: today at 80 GB and 70 k$/TB • Tape: TB and $/TB: today at 40 GB and 10 k$/TB (nearline) • Access time (latency) • RAM: 100 ns • Disk: 15 ms • Tape: 30 second pick, 30 second position • Transfer rate • RAM: 1-10 GB/s • Disk: 10-50 MB/s (arrays can go to 10 GB/s) • Tape: 5-15 MB/s (arrays can go to 1 GB/s)
New Storage Metrics: Kaps, Maps, SCAN • Kaps: How many kilobyte objects served per second • The file server, transaction processing metric • This is the OLD metric. • Maps: How many megabyte objects served per sec • The Multi-Media metric • SCAN: How long to scan all the data • the data mining and utility metric • And • Kaps/$, Maps/$, TBscan/$
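A small sketch showing how the three metrics fall out of a drive's spec sheet; the 80 GB / 10 ms / 40 MB/s / 200 $ drive is an assumed example, not a figure from the talk.

```python
# Kaps, Maps, and SCAN for one (hypothetical) disk drive.
capacity_mb = 80 * 1000    # 80 GB drive (assumed)
access_s    = 0.010        # seconds per random access (assumed)
xfer_mbps   = 40.0         # MB/s sequential transfer (assumed)
price       = 200.0        # $ street price (assumed)

kaps = 1.0 / (access_s + (1 / 1024) / xfer_mbps)   # 1 KB objects served per second
maps = 1.0 / (access_s + 1.0 / xfer_mbps)          # 1 MB objects served per second
scan_hours = capacity_mb / xfer_mbps / 3600        # time to read the whole disk

print(f"{kaps:.0f} Kaps, {maps:.0f} Maps, {scan_hours:.2f} hour scan")
print(f"{kaps / price:.2f} Kaps/$, {maps / price:.3f} Maps/$")
```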
Disk Changes • Disks got cheaper: 20k$ -> 200$ • $/Kaps etc improved 100x (Moore’s law!) (or even 500x) • One-time event (went from mainframe prices to PC prices) • Disk data got cooler (10x per decade): • 1990 disk ~ 1GB and 50Kaps and 5 minute scan • 2001 disk ~160GB and 120Kaps and 1 hour scan • So • 1990: 1 Kaps per 20 MB • 2001: 1 Kaps per 1,000 MB • disk scans take longer (10x per decade) • Backup/restore takes a long time (too long)
Data on Disk Can Move to RAM in 10 Years • The RAM:disk price-per-byte ratio is about 100:1, and prices fall about 100x per decade, so data that is economical on disk today will be economical in RAM about 10 years from now.
The “Absurd” 10x (= 4-year) Disk • 1 TB capacity, 100 MB/s, 200 Kaps • 2.5 hr scan time (poor sequential access) • 1 aps per 5 GB (VERY cold data) • It’s a tape!
Disk vs Tape Guesstimates • Disk: 80 GB, 20 MBps, 5 ms seek time, 3 ms rotate latency, 3 $/GB for drive, 3 $/GB for ctlrs/cabinet, 15 TB/rack, 1 hour scan • Tape: 40 GB, 10 MBps, 10 sec pick time, 30-120 second seek time, 2 $/GB for media, 8 $/GB for drive+library, 10 TB/rack, 1 week scan • (CERN: 200 TB of 3480 tapes; 2 columns = 50 GB; a rack = 1 TB = 8 drives) • The price advantage of tape is narrowing, and the performance advantage of disk is growing • At 10 k$/TB, disk is competitive with nearline tape.
It’s Hard to Archive a Petabyte. It takes a LONG time to restore it. • At 1 GBps it takes 12 days! • Store it in two (or more) places online (on disk?): a geo-plex • Scrub it continuously (look for errors) • On failure, • use the other copy until the failure is repaired, • refresh the lost copy from the safe copy. • Can organize the two copies differently (e.g.: one by time, one by space)
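A sanity check of the restore-time claim, nothing more; the bandwidths are assumed round numbers.

```python
# How long does it take to move a petabyte at a given aggregate bandwidth?
PB = 10**15                      # bytes
for gbps in (1, 10):             # assumed aggregate restore bandwidth, GB/s
    days = PB / (gbps * 10**9) / 86400
    print(f"{gbps} GB/s -> {days:.1f} days to restore 1 PB")
# ~11.6 days at 1 GB/s, i.e. the slide's "12 days"; hence keep two online
# copies (a geo-plex) and scrub them, rather than relying on offline restore.
```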
Auto Manage Storage • 1980 rule of thumb: • A DataAdmin per 10GB, SysAdmin per mips • 2000 rule of thumb • A DataAdmin per 5TB • SysAdmin per 100 clones (varies with app). • Problem: • 5TB is 50k$ today, 5k$ in a few years. • Admin cost >> storage cost !!!! • Challenge: • Automate ALL storage admin tasks
How to cool disk data: • Cache data in main memory • See the 5 minute rule later in the presentation • Fewer, larger transfers • Larger pages (512 B -> 8 KB -> 256 KB) • Sequential rather than random access • Random 8 KB IO is 1.5 MBps • Sequential IO is 30 MBps (the 20:1 ratio is growing) • RAID1 (mirroring) rather than RAID5 (parity).
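A small sketch of why larger, more sequential transfers help. The 5 ms positioning time and 30 MB/s media rate are assumptions chosen to roughly match the slide's 1.5 MBps and 30 MBps figures.

```python
# Effective throughput = bytes moved / (positioning time + transfer time).
seek_s, media_mbps = 0.005, 30.0   # assumed 5 ms seek+rotate, 30 MB/s media rate

for size_kb in (8, 64, 256, 1024):
    xfer_s = (size_kb / 1024) / media_mbps
    eff_mbps = (size_kb / 1024) / (seek_s + xfer_s)
    print(f"{size_kb:5d} KB requests -> {eff_mbps:5.1f} MB/s effective")
# 8 KB random IO comes out near the slide's 1.5 MB/s; megabyte transfers
# approach the 30 MB/s media rate, roughly the 20:1 sequential:random ratio.
```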
Summarizing storage rules of thumb (1) • Moore’s law: 4x every 3 years = 100x more per decade • Implies 2 more address bits every 3 years. • Storage capacities increase 100x/decade • Storage costs drop 100x per decade • Storage throughput increases 10x/decade • Data cools 10x/decade • Disk page sizes increase 5x per decade.
Summarizing storage rules of thumb (2) • RAM:Disk and Disk:Tape cost ratios are 100:1 and 3:1 • So, in 10 years, disk data can move to RAM, since prices decline 100x per decade. • A person can administer a million dollars of disk storage: that is 1 TB - 100 TB today • Disks are replacing tapes as backup devices. You can’t backup/restore a petabyte quickly, so geoplex it. • Mirroring rather than parity, to save disk arms
Outline • Moore’s Law and consequences • Storage rules of thumb • Balanced systems rules revisited • Networking rules of thumb • Caching rules of thumb
Standard Architecture (today) • (Figure: a system bus bridged to PCI Bus 1 and PCI Bus 2.)
Amdahl’s Balance Laws • Parallelism law: If a computation has a serial part S and a parallel component P, then the maximum speedup is (S+P)/S. • Balanced system law: A system needs a bit of IO per second per instruction per second: about 8 MIPS per MBps. • Memory law: α = 1: the MB/MIPS ratio (called alpha, α) in a balanced system is 1. • IO law: Programs do one IO per 50,000 instructions.
Amdahl’s Laws Valid 35 Years Later? • Parallelism law is algebra: so SURE! • Balanced system laws? • Look at TPC results (TPC-C, TPC-H) at http://www.tpc.org/ • Some imagination needed: • What’s an instruction (CPI varies from 1-3)? • RISC, CISC, VLIW, … clocks per instruction, … • What’s an I/O?
TPC systems

                     MHz/cpu   CPI   mips   KB/IO   IO/s/disk   Disks   Disks/cpu   MB/s/cpu   Ins/Byte of IO
Amdahl                  1       1      1      6         -          -         -          -             8
TPC-C = random        550      2.1   262      8       100        397        50         40             7
TPC-H = sequential    550      1.2   458     64       100        176        22        141             3

• Normalize for CPI (clocks per instruction) • TPC-C has about 7 ins/byte of IO • TPC-H has 3 ins/byte of IO • TPC-H needs ½ as many disks, sequential vs random • Both use 9 GB 10 krpm disks (need arms, not bytes)
Amdahl’s Balance Laws Revised • Laws right, just need “interpretation” (imagination?) • Balanced System Law: A system needs 8 MIPS per MBps of IO, but the instruction rate must be measured on the workload. • Sequential workloads have low CPI (clocks per instruction), • random workloads tend to have higher CPI. • Alpha (the MB/MIPS ratio) is rising from 1 to 6. This trend will likely continue. • One random IO per 50k instructions. • Sequential IOs are larger: one sequential IO per 200k instructions
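As a cross-check, the instructions-per-byte figures can be re-derived from the MHz, CPI, and MB/s-per-cpu columns of the TPC table above; this sketch is just that arithmetic.

```python
# ins/byte = (MHz / CPI) mips per cpu, divided by MB/s of IO per cpu
def ins_per_byte(mhz, cpi, mbps_per_cpu):
    return (mhz / cpi) / mbps_per_cpu

print(f"TPC-C: {ins_per_byte(550, 2.1, 40):.1f} ins/byte of IO")   # ~6.5, rounded to 7 above
print(f"TPC-H: {ins_per_byte(550, 1.2, 141):.1f} ins/byte of IO")  # ~3.2, rounded to 3 above
```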
Outline • Moore’s Law and consequences • Storage rules of thumb • Balanced systems rules revisited • Networking rules of thumb • Caching rules of thumb
Networking • WANs are getting faster than LANs: G8 = OC192 = 8 Gbps is “standard” • Link bandwidth improves 4x per 3 years • Speed of light (60 ms round trip in the US) • Software stacks have always been the problem: Time = SenderCPU + ReceiverCPU + bytes/bandwidth, and the two CPU terms have been the problem
How much does wire-time cost? (cost and time to move one megabyte)

                       Cost       Time
Gbps Ethernet          0.2 µ$     10 ms
100 Mbps Ethernet      0.3 µ$     100 ms
OC12 (650 Mbps)        0.003 $    20 ms
DSL                    0.0006 $   25 sec
POTS                   0.002 $    200 sec
Wireless               0.80 $     500 sec
Data delivery costs 1 $/GB today • Rent for “big” customers: 300 $ per megabit per second per month • Improved 3x in the last 6 years (!) • That translates to about 1 $/GB at each end. • You can mail a 160 GB disk for 20 $. • That’s 16x cheaper, and if it arrives overnight that’s about 3 MBps. • Mail three 160 GB disks and you’ve shipped ~½ TB.
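The arithmetic behind this comparison, using the slide's own numbers (300 $/Mbps/month rent, a 160 GB disk mailed for 20 $); the 30-day month and 24-hour delivery window are assumptions.

```python
rent_per_mbps_month = 300.0                 # $/Mbps/month, the slide's "big customer" rent
seconds_per_month   = 30 * 86400
gb_per_month = 1e6 / 8 * seconds_per_month / 1e9      # GB a fully used 1 Mbps link moves
net_cost_per_gb = rent_per_mbps_month / gb_per_month  # ~0.93 $/GB at one end of the wire

disk_gb, shipping = 160.0, 20.0
mail_cost_per_gb = shipping / disk_gb                  # 0.125 $/GB for the mailed disk
mail_mbps = disk_gb * 1e9 / 86400 / 1e6                # ~1.9 MB/s if delivery takes 24 h

print(f"network: {net_cost_per_gb:.2f} $/GB per end, "
      f"mail: {mail_cost_per_gb:.3f} $/GB at ~{mail_mbps:.1f} MB/s")
# Against ~2 $/GB for both ends of the wire, the mailed disk is the slide's
# "16x cheaper", at an effective bandwidth in the slide's few-MBps range.
```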
Outline • Moore’s Law and consequences • Storage rules of thumb • Balanced systems rules revisited • Networking rules of thumb • Caching rules of thumb
The Five Minute Rule • Trade DRAM for Disk Accesses • Cost of an access (Drive_Cost / Access_per_second) • Cost of a DRAM page ( $/MB/ pages_per_MB) • Break even has two terms: • Technology term and an Economic term • Grew page size to compensate for changing ratios. • Now at 5 minutes for random, 10 seconds sequential
The 5 Minute Rule Derived • T = time between references to the page • Cost of keeping the page in RAM: RAM_$_Per_MB / PagesPerMB • Cost of doing the disk access instead, amortized over T: (DiskPrice / AccessesPerSecond) / T • Breakeven: RAM_$_Per_MB / PagesPerMB = DiskPrice / (T x AccessesPerSecond) • Solving for T: T = (DiskPrice x PagesPerMB) / (RAM_$_Per_MB x AccessesPerSecond)
Plugging in the Numbers • The trend is toward longer break-even times, because disk prices are not changing much while RAM prices decline 100x per decade • Today: roughly a 5 minute rule for random IO and a 10 second rule for sequential IO
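A minimal sketch of the break-even formula from the previous slide, with assumed 2001-era prices rather than the slide's exact inputs.

```python
# T = (DiskPrice * PagesPerMB) / (RAM_$_per_MB * AccessesPerSecond)
def breakeven_s(disk_price, accesses_per_s, ram_per_mb, pages_per_mb):
    return (disk_price * pages_per_mb) / (ram_per_mb * accesses_per_s)

# Random 8 KB pages: assumed 200 $ drive, 120 random accesses/s, 0.2 $/MB RAM
print(round(breakeven_s(200, 120, 0.2, 128) / 60), "minutes (random 8 KB pages)")
# Sequential 256 KB pages: the same drive streams roughly 120 such pages/s
print(round(breakeven_s(200, 120, 0.2, 4)), "seconds (sequential 256 KB pages)")
# With these prices the random break-even lands in the tens of minutes and the
# sequential one in the tens of seconds: the same regimes as the 5 minute /
# 10 second rule, drifting longer as RAM keeps getting cheaper than disk arms.
```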
The 10 Instruction Rule • Spend up to ~10 instructions per second to save 1 byte of RAM • Cost of an instruction per second: I = ProcessorCost / (MIPS x LifeTime) • Cost of a byte: B = RAM_$_Per_B / LifeTime • Breakeven: N x I = B, so N = B/I = (RAM_$_Per_B x MIPS) / ProcessorCost • ~ (3E-6 x 5E8) / 500 = 3 ins/B for Intel • ~ (3E-6 x 3E8) / 10 = 10 ins/B for ARM
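The break-even calculation as a one-liner, plugging in the slide's Intel-class figures (3E-6 $/byte of RAM, 5E8 instructions/s, 500 $ processor).

```python
# N = break-even instructions/second worth spending to save one byte of RAM
def ins_per_byte_saved(ram_dollars_per_byte, instructions_per_sec, processor_cost):
    return ram_dollars_per_byte * instructions_per_sec / processor_cost

print(ins_per_byte_saved(3e-6, 5e8, 500), "ins/s per byte saved (Intel-class example)")
# -> 3.0, the slide's "3 ins/B for Intel"; a much cheaper processor
# (the ARM case on the slide) gives a larger break-even N.
```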
When to Cache Web Pages. • Caching saves user time • Caching saves wire time • Caching costs storage • Caching only works sometimes: • New pages are a miss • Stale pages are a miss
Web Page Caching Saves People Time • Assume people cost 20 $/hour (or 0.2 $/hr ???) • Assume a 20% hit rate in the browser, 40% in the proxy • Assume 3 second server time • Caching saves 28 $/year to 150 $/year of people time, or 0.28 $ to 1.5 $/year at the 0.2 $/hr rate.
Web Page Caching Saves Resources • Wire cost is a penny (wireless) to 100 µ$ (LAN) • Storage is 8 µ$/mo • Breakeven (wire cost = storage rent): 4 to 7 months • Add people cost: breakeven is ~4 years; with “cheap people” (0.2 $/hr), 6 to 8 months.
Caching • Disk caching • 5 minute rule for random IO • 10 second rule for sequential IO • Web page caching: • If the page will be re-referenced within 18 months (with free users) or 15 years (with valuable users), then cache the page in the client/proxy. • Challenge: guessing which pages will be re-referenced, and detecting stale pages (page velocity)
Meta-Message: Technology Ratios Matter • Price and Performance change. • If everything changes in the same way, then nothing really changes. • If some things get much cheaper/faster than others, then that is real change. • Some things are not changing much: • Cost of people • Speed of light • … • And some things are changing a LOT
Outline • Moore’s Law and consequences • Storage rules of thumb • Balanced systems rules revisited • Networking rules of thumb • Caching rules of thumb