Challenges in Exploiting Exponential Storage Gains
Seagate Research Lab Grand Opening, Pittsburgh, PA, 21 August 2002
Gordon Bell, Microsoft Bay Area Research Center
http://www.research.microsoft.com/~gbell

Bottom Lines, aka “Killer apps” for storage everywhere we look
• “MyLifeBits”: recording almost everything
• The most cost-effective, highest volume stores: consumer & home PCs – for video.
• Small form factor drives: pocket form factor cameras, phones, tablets, … e-books
• Largest stores include Operating System, database, and interconnection via LANs/WANs and in the “cloud”

MyLifeBits, The Challenge of a One…1K Tbyte, lifetime PCs: Cyberizing everything… I’ve written, said, presented (incl. video), photos of physical objects & a few things I’ve read, heard, seen, and might “want to see” on TV

“The PC is going to be the place where you store the information … really the center of control” Billg 1/7/2001
MyLifeBits is an “on-going” project following CyberAll to “cyberize” all of one’s personal bits!
• Memory recall of books, CDs, communication, papers, photos, video
• Photos of physical object collections
• Elimination of all physical stores & objects
• Content source for home media: ambiance, entertainment, communication, interaction
  • Freestyle for CDs, photos, TV content, videos
Goal: to understand the 1 TByte PC: need, utility, cost, feasibility, challenge & tools.

MyLifeBits charter: Memex (As We May Think, Vannevar Bush)
“A memex is a device in which an individual stores all his books, records, and communications, and which is mechanized so that it may be consulted with exceeding speed and flexibility”
“Selection by association, rather than indexing, may yet be mechanized”

Storing all we’ve read, heard, & seen
Human data-types             /hr      /day (/4yr)   /lifetime
read text, few pictures      200 K    2-10 M/G      60-300 G
speech text @120wpm          43 K     0.5 M/G       15 G
speech @1KBps                3.6 M    40 M/G        1.2 T
stills w/voice @100KB        200 K    2 M/G         60 G
video-like 50Kb/s POTS       22 M     .25 G/T       25 T
video 200Kb/s VHS-lite       90 M     1 G/T         100 T
video 4.3Mb/s HDTV/DVD       1.8 G    20 G/T        1 P

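The per-hour column follows from the stated rates by simple arithmetic. A minimal sketch, assuming 8 bits per byte, decimal units (1 K = 1,000 bytes), and roughly 6 bytes per spoken word; the word size is an assumption, not a figure from the slide, and the function names are illustrative.

```python
SECONDS_PER_HOUR = 3600

def bytes_per_hour_from_bitrate(bits_per_second):
    """Bytes accumulated in one hour at a constant bit rate."""
    return bits_per_second / 8 * SECONDS_PER_HOUR

def bytes_per_hour_from_speech_text(words_per_minute, bytes_per_word=6):
    """Bytes of transcribed text per hour; bytes_per_word is an assumed average."""
    return words_per_minute * 60 * bytes_per_word

# Rates taken from the table; 1 KBps speech is 8,000 bits per second.
rates_bps = {
    "speech @1KBps": 8_000,
    "video-like 50Kb/s POTS": 50_000,
    "video 200Kb/s VHS-lite": 200_000,
    "video 4.3Mb/s HDTV/DVD": 4_300_000,
}

for name, bps in rates_bps.items():
    print(f"{name}: {bytes_per_hour_from_bitrate(bps) / 1e6:.1f} MB/hr")

print(f"speech text @120wpm: {bytes_per_hour_from_speech_text(120) / 1e3:.1f} KB/hr")
```

The results land close to the slide's rounded per-hour figures; the per-day and lifetime columns additionally depend on how many hours a day are recorded, which the slide does not state.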
Character of CyberAll Use
[Figure: user context / timelines. Axes: Personal (including financial) vs. Professional (work related); Archival vs. Working.]

MyLifeBits use scenarios
• Acquire from every potentially useful source, including the web, voice, and instant messages
• Personal use of MLB for work to recall everything
• Provide ambiance & entertainment: personal/home broadcast, CD, Internet radio, TV screen saving
• Creation of photo and video albums: events, places, trips, people, time intervals
-------------- Database land --------------
• Personal/web hosted collections & catalogs
• A person’s (auto- or) biography as a web-hosted time line: historical events by type; personal time line; compile a life’s story (event types, range, etc.)
• Individual… How I spent my year. A personal diary.
• ISBQ: Interactive Story By Query

ISBQ Editor Interface
• Query for media
• Query results can be dragged and dropped into the timeline below
• Video and images can be added to the HTML page
• Audio track for the story

Why annotate… the future?
• Future cameras have:
  • Creation time, content info, e.g. people, scene type
  • GPS: place
  • Voice annotation about the shot and scene
  • Speech recognition of voice
• Is annotation = meta-data about an object?

Imagine the “killer app” for: The One Tbyte, Lifetime, PC
• MyLifeBits demonstrates need for lifetime memory!
• MODI (Microsoft Office Document Imaging)! The most significant Office™ addition since HTML.
• Technology to support the vision:
  • Guarantee that data will live forever!
  • A single index that includes mail, conversations, web accesses, and books!
  • E-books… e-magazines reach critical mass!
  • Telephony and audio capture are needed
  • Photo & video “index serving”
  • More meta-information … Office, photos
  • Lots of GUIs to improve ease-of-use

MyMainBrain storage
• Everything stored in a database to facilitate searching, backup, complex attributes, e.g. photo characteristics
• Audio, video, images(?) may also be stored in the file system (for access).
• Ability to easily “annotate” and form “collections” of all the globs

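The storage approach on this slide can be sketched with a small relational schema. This is a minimal illustration using Python's built-in sqlite3 module, not the actual MyLifeBits schema: the table names, column names, helper functions, and sample rows are all hypothetical. Large media is referenced by file path, as the slide suggests.

```python
import sqlite3

def make_store():
    """Create an in-memory database with a hypothetical item/annotation/collection schema."""
    db = sqlite3.connect(":memory:")
    db.executescript("""
        CREATE TABLE items (
            id INTEGER PRIMARY KEY,
            kind TEXT NOT NULL,      -- e.g. photo, video, mail
            title TEXT,
            file_path TEXT,          -- large media may live in the file system
            created TEXT             -- ISO date string
        );
        CREATE TABLE annotations (
            item_id INTEGER REFERENCES items(id),
            note TEXT NOT NULL
        );
        CREATE TABLE collections (
            id INTEGER PRIMARY KEY,
            name TEXT UNIQUE NOT NULL
        );
        CREATE TABLE collection_members (
            collection_id INTEGER REFERENCES collections(id),
            item_id INTEGER REFERENCES items(id),
            PRIMARY KEY (collection_id, item_id)
        );
    """)
    return db

def add_item(db, kind, title, file_path=None, created=None):
    cur = db.execute(
        "INSERT INTO items (kind, title, file_path, created) VALUES (?, ?, ?, ?)",
        (kind, title, file_path, created),
    )
    return cur.lastrowid

def annotate(db, item_id, note):
    db.execute("INSERT INTO annotations (item_id, note) VALUES (?, ?)", (item_id, note))

def add_to_collection(db, name, item_id):
    # Create the collection on first use, then link the item to it.
    db.execute("INSERT OR IGNORE INTO collections (name) VALUES (?)", (name,))
    (cid,) = db.execute("SELECT id FROM collections WHERE name = ?", (name,)).fetchone()
    db.execute(
        "INSERT OR IGNORE INTO collection_members (collection_id, item_id) VALUES (?, ?)",
        (cid, item_id),
    )

def items_in_collection(db, name):
    rows = db.execute(
        """SELECT i.title FROM items i
           JOIN collection_members m ON m.item_id = i.id
           JOIN collections c ON c.id = m.collection_id
           WHERE c.name = ? ORDER BY i.created, i.id""",
        (name,),
    )
    return [r[0] for r in rows]

def search_annotations(db, text):
    rows = db.execute(
        """SELECT DISTINCT i.title FROM items i
           JOIN annotations a ON a.item_id = i.id
           WHERE a.note LIKE ? ORDER BY i.title""",
        (f"%{text}%",),
    )
    return [r[0] for r in rows]

# Made-up sample data for illustration only.
db = make_store()
photo = add_item(db, "photo", "Lab opening", "photos/opening.jpg", "2002-08-21")
talk = add_item(db, "video", "Storage talk", "video/talk.mpg", "2002-08-22")
annotate(db, photo, "Pittsburgh, ribbon cutting")
add_to_collection(db, "Pittsburgh trip", photo)
add_to_collection(db, "Pittsburgh trip", talk)
print(items_in_collection(db, "Pittsburgh trip"))
print(search_annotations(db, "ribbon"))
```

Keeping annotations and collection membership in separate tables lets any item carry any number of notes and belong to several collections without duplicating the media itself.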
The Home Digital Multimedia Network
Vision: All digital content. IP on everything.
• Content source for home media: ambiance, entertainment, communication, and interaction
  • Freestyle for CDs, photos, TV content, videos
• All listening/viewing stations will be digital.
• In the short term (10+ years), Digital Transformers convert IP content for legacy analog devices.
• Today, Digital Transformers = computers!

The Connected Home
[Figure: inputs from satellite, terrestrial, digital cable, and DSL/telco feed the home's peripherals, digital photos, TVs, gaming, screen devices, and stereo.]

Home Networks: PC-based service
• Servers:
  • Hold & deliver audio, photos, video
  • Encode TV content
• Computers:
  • Control, get content from web, servers
• Monitors: HDTV
• TV-sets: receive encoded & CATV content
• C* = computer. X = digital transformer.
[Figure: home IP network with DSL, etc. input; nodes labeled C.srv, X*, Spkr, Monitor, Rec/AMP, broadcast, TVset, HDTV Tuner, CATV Dist, CATV Network.]

A Digital Transformer for Audio: Gateway’s Connected Home Audio Player
ACTIVY Media Center: One H/W for multiple functions. Reduces the number of devices, remotes, and wires around the TV
Pioneer Plasma Panel with 1280 x 768 pixels. TV & Computer: Web Surfing at 12’
Disks are becoming computers
• Smart drives
• Camera with micro-drive
• Replay / Tivo / Ultimate TV
• Phone with micro-drive
• MP3 players
• Tablet
• Xbox
• Many more…
[Figure: a disk as a computer, in stacked layers: Applications (Web, DBMS, Files); OS; Disk Ctlr + 1 GHz CPU + 1 GB RAM; Comm: Infiniband, Ethernet, radio…]
Courtesy of Jim Gray, Microsoft Bay Area Research

Chameleon: an XP/CE/Cellphone (800x300 pixels, 5 GB; 256 MB computer)
Disk As Tape: What format?
• Today we ship NTFS/SQL disks.
• But that is not a good format for Linux.
• Solution: Ship NFS/CIFS/ODBC servers (not disks)
  • Plug “disk” into LAN.
  • DHCP, then file or DB server via standard interface.
  • Web Service in long term
Courtesy of Jim Gray

Gray’s $2.4 K, 1 TByte Sneakernet, aka Disk Brick
[Tables: Cost to move a Terabyte; Cost, time, and speed to move a Terabyte; Cost of a “Sneaker-Net” Terabyte]
Courtesy of Jim Gray, Microsoft Bay Area Research

Cost, time of Sneaker-net vs Alts Courtesy of Jim Gray, Microsoft Bay Area Research
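The comparison rests on simple arithmetic: the time to send a terabyte over a link versus the effective bandwidth of shipping a disk. A rough sketch follows; the 100 Mbps link and the 24-hour delivery time are illustrative assumptions, not figures from Jim Gray's tables, and 1 TB is taken as 10^12 bytes.

```python
TERABYTE_BITS = 8 * 10**12  # 1 TB = 10^12 bytes, 8 bits per byte

def hours_to_send_terabyte(link_bits_per_second):
    """Hours needed to push one terabyte through a link at full, sustained speed."""
    return TERABYTE_BITS / link_bits_per_second / 3600

def sneakernet_megabits_per_second(delivery_hours):
    """Effective bandwidth of physically shipping a one-terabyte disk."""
    return TERABYTE_BITS / (delivery_hours * 3600) / 1e6

# Assumed values for illustration.
print(f"100 Mbps link: {hours_to_send_terabyte(100e6):.1f} hours per TB")
print(f"24-hour courier: {sneakernet_megabits_per_second(24):.1f} Mbps effective")
```

Under these assumptions an overnight shipment is roughly on par with a dedicated 100 Mbps link running flat out, which is the kind of trade-off the slide's title points at.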
Google: 1.5 PB as of last spring
• 8,000 no-name PCs
  • Each 1/3U, 2 x 80 GB disk, 2 CPUs, 256 MB RAM
• 1.4 PB online.
• 2 TB RAM online
• 8 TeraOps
• Slice-price is 1K$, so 8M$.
• 15 admins (!) (== 1/100 TB).

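The totals on this slide can be cross-checked from the per-PC figures. A quick sketch using decimal units; the variable names are illustrative. Note that the raw disk product comes out somewhat below the 1.4 PB quoted, a gap the slide does not explain.

```python
PCS = 8_000
DISKS_PER_PC = 2
DISK_GB = 80
RAM_MB_PER_PC = 256
SLICE_PRICE_USD = 1_000
ADMINS = 15
ONLINE_TB_QUOTED = 1_400  # the slide's "1.4 PB online"

raw_disk_pb = PCS * DISKS_PER_PC * DISK_GB / 1e6   # GB -> PB
ram_tb = PCS * RAM_MB_PER_PC / 1e6                 # MB -> TB
total_cost_musd = PCS * SLICE_PRICE_USD / 1e6      # USD -> millions of USD
tb_per_admin = ONLINE_TB_QUOTED / ADMINS

print(f"raw disk: {raw_disk_pb:.2f} PB")
print(f"RAM: {ram_tb:.2f} TB")
print(f"cost: {total_cost_musd:.0f} M$")
print(f"storage per admin: {tb_per_admin:.0f} TB")
```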
Bottom Line
• The focus of computation has shifted from processing to storage.
• Every app and price level is storage oriented, from in/on body, personal, and home servers to large scale commercial and scientific apps
• With databases, pre-computed indices beat exhaustive searches every time.
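The last point can be illustrated with a toy inverted index: build it once, then answer word queries by lookup instead of scanning every document. A minimal sketch; the documents and function names are made up for illustration, and a real database index is considerably more elaborate.

```python
from collections import defaultdict

def build_index(docs):
    """Pre-compute a mapping from each word to the ids of documents containing it."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for word in text.lower().split():
            index[word].add(doc_id)
    return index

def search_index(index, word):
    """Answer a query with a single dictionary lookup."""
    return sorted(index.get(word.lower(), set()))

def search_scan(docs, word):
    """Answer the same query by reading every document (exhaustive search)."""
    word = word.lower()
    return sorted(doc_id for doc_id, text in docs.items() if word in text.lower().split())

# Made-up documents for illustration only.
docs = {
    1: "photos of the lab opening",
    2: "video of the storage talk",
    3: "mail about storage costs",
}
index = build_index(docs)
print(search_index(index, "storage"))
print(search_scan(docs, "storage"))
```

Both functions return the same answer; the difference is that the scan's cost grows with the size of the whole store on every query, while the index pays that cost once when it is built.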