Learn about the MyLifeBits project, which aims to fulfill the Memex vision, as presented by Gordon Bell in February 2003. Explore the transition from files to databases, the long-term agenda, and more.
MyLifeBits: Attempting to realize the Memex Vision Gordon Bell February 2003 http://research.microsoft.com/barc/MediaPresence/MyLifeBits.aspx With Jim Gemmell & Roger Lueder
Outline … MyLifeBits • Background…fulfilling the Memex vision • Cyberizing everything • File to database transition • Use…beyond search • Long-term agenda and outlook
Memex • Posited by Vannevar Bush in “As We May Think,” The Atlantic Monthly, July 1945 • “A memex is a device in which an individual stores all his books, records, and communications, and which is mechanized so that it may be consulted with exceeding speed and flexibility” • Supports: annotations, links between documents, and “trails” through the documents • “yet if the user inserted 5000 pages of material a day it would take him hundreds of years to fill the repository, so that he can be profligate and enter material freely”
Memory Overload As hard drives get bigger and cheaper, we're storing way too much. By Jim Lewis There's a famous allegory about a map of the world that grows in detail until every point in reality has its counterpoint on paper; the twist being that such a map is at once ideally accurate and entirely useless, since it's the same size as the thing it's meant to represent.
“The PC is going to be the place where you store the information and really the center of control” Billg 1/7/2001 MyLifeBits is a project to “cyberize” everything! • What? Recall of all articles, books, CDs, photos, video, communication (e.g. mail, phone), web • Why? …“because we can” • Office: communicate, store, & work • Home & Media Center: ambiance & entertainment • Immortality for progeny. Memory aids • Goal: to understand the 1 TByte PC c2006: need, utility, cost, feasibility and tools.
Knowledge worker scenarios Gordon: Researcher, consumer, computer system tester, nerd wanna-be, and average man Melissa: middle manager Patrick: Consultant Nicholas: Analyst Sondra: Office manager
The guinea pig • Gordon Bell is digitizing his life • Has now scanned virtually all: • Books written (and read when possible) • Personal documents (correspondence including memos and email, bills, legal documents, papers written, …) • Photos • Posters, paintings, photos of things (artifacts, …medals, plaques) • Home movies and videos • CD collection • And, of course, all PC files • Now recording: phone, radio, TV (movies), web pages… conversations? • Paperless throughout 2002. 12” scanned, 12’ discarded. • Only 30 GB!!!
Input: tools, time, and cost • Scanners: HP Digital Sender, flat beds with ADF, 2-HP photo, faxing. (Duplex, color, feed-thru, etc.) • A good commercial scanner costs $2K-10K • Photos: $1 or 0.5-5 min. Large posters: ~1-5 hr. Artifacts: ~10 min. including photo • Scanning to TIF, PDF: <1 min/page or $0.10/page • OCR: for MODI or PDF: ~3-5 pages/min (old data) • OCR: to recreate an editable “original”: 10 min/page! • OCR (volume paper files): 400 pages/hr (about 7 ppm) • Books: scanned at CMU ($10-100/book) in 1997 • Videos: tbd
CyberAll, Nov. 1, 2001 [Pie charts: total by size and files by number; file-type labels: .pdf, .tif, .ppt, .ppt/ppt albums, .xls, .gif, .doc/html, .jpg] • Music: 6.9 GB, 1.8K files, 180 CDs • Working: 2.3 GB, 432 folders, 2.9K files • Archive: 5.1 GB, 477 folders, 18.7K files • My Books: 98 MB • Mail: .7 GB, 43K msgs • Video: 2.6 GB, 10 hours, low res • Total: 27.1K files & 42K .msg, 17.7 GB
MyLifeBits organization: time and space [Diagram] • Timeline (time): Archival, Working • Context (space): Personal (some $s); GB Co. (angel, etc.); Professional (ACM, etc., …); @Microsoft.com; New co’s.
MyLifeBits: Some Lives(t) • CGB@ Microsoft • MLB • Clusters • Telepresence • WWW presence • Computer History Museum • BOD member • Fund-raising • CyberMuseum • Startups • Bell-Mason Director • Diamond & Vanguard Brds. • Personal • Parents, children, grandkids • CGB himself • Close friends • GB $s • Personal incl. several legal structures • Investments & boards • Past companies/organiz’ns • DEC • Carnegie-Mellon U. • DEC, NSF, Encore, Ardent, GB_consulting,
MyLifeBits is: • Memex and more (audio and video) • Universal store for all personal stuff • Guiding principles for the system: • Full text search & collections (> hierarchy) • Visualizations for search, display, insight • Annotations and links add value and are essential • They increase searchability and the value of information. • So make many kinds and make them easy to create! • Stories are the ultimate annotation • Keep the links when you author: “transclusion”
MLB database: size and content? • Database features are essential: Consistency, Indexing, Pivoting, Queries, Speed/scalability, Backup, replication. • Folders & Files were the starting point >> database into sets aka “collections” that are identical to the folder structure • Outlook (msgs, attachments, calendar, contacts) • Web trails including voice message annotation • Journal (Outlook), trails: every document use & transaction • What about? • Money (transactions, payees, etc.)…is their lifelog/trail • Streets and trips to cross-index to all docs • Attributes for photos for retrieval? Location, time, settings • Presentations as a report or trail. Each slide an object!
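The step from folders to database “collections” on this slide can be sketched with a small relational schema. This is a hypothetical illustration only, not the actual MyLifeBits schema: the table names (`item`, `collection`, `membership`, `annotation`), the helper functions, and the sample rows are all assumptions made up for the example. It shows that, unlike a folder hierarchy, one item can sit in several collections at once, and that an annotation gives search an extra handle.

```python
import sqlite3

# Hypothetical schema: table and column names are illustrative assumptions,
# not the real MyLifeBits design.
SCHEMA = """
CREATE TABLE item (id INTEGER PRIMARY KEY, name TEXT NOT NULL, kind TEXT, created TEXT);
CREATE TABLE collection (id INTEGER PRIMARY KEY, name TEXT NOT NULL UNIQUE);
CREATE TABLE membership (
    item_id INTEGER REFERENCES item(id),
    collection_id INTEGER REFERENCES collection(id),
    PRIMARY KEY (item_id, collection_id)
);
CREATE TABLE annotation (
    id INTEGER PRIMARY KEY,
    item_id INTEGER REFERENCES item(id),
    body TEXT NOT NULL
);
"""


def build_demo_db():
    """Create an in-memory database with a few made-up sample rows."""
    db = sqlite3.connect(":memory:")
    db.executescript(SCHEMA)
    db.executemany(
        "INSERT INTO item VALUES (?, ?, ?, ?)",
        [(1, "lunch.jpg", "photo", "2000-07-07"),
         (2, "trip-notes.doc", "document", "1998-01-15")],
    )
    db.executemany(
        "INSERT INTO collection VALUES (?, ?)",
        [(1, "Photos"), (2, "BARC"), (3, "Personal")],
    )
    # The photo belongs to two collections at once, which a strict
    # folder hierarchy cannot express without copying the file.
    db.executemany("INSERT INTO membership VALUES (?, ?)", [(1, 1), (1, 2), (2, 3)])
    db.execute(
        "INSERT INTO annotation (item_id, body) VALUES (?, ?)",
        (1, "BARC dim sum intern farewell lunch"),
    )
    db.commit()
    return db


def collections_of(db, item_name):
    """Return the names of all collections containing the named item."""
    rows = db.execute(
        "SELECT c.name FROM collection c "
        "JOIN membership m ON m.collection_id = c.id "
        "JOIN item i ON i.id = m.item_id "
        "WHERE i.name = ? ORDER BY c.name",
        (item_name,),
    )
    return [r[0] for r in rows]


def search_annotations(db, text):
    """Return the names of items whose annotations contain the given text."""
    rows = db.execute(
        "SELECT DISTINCT i.name FROM item i "
        "JOIN annotation a ON a.item_id = i.id "
        "WHERE a.body LIKE ? ORDER BY i.name",
        (f"%{text}%",),
    )
    return [r[0] for r in rows]
```

With the sample rows, `collections_of(build_demo_db(), "lunch.jpg")` lists both “BARC” and “Photos”, and searching annotations for “dim sum” finds the photo even though its file name says nothing about the event.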
[Diagram: Media Center Computer wired to legacy equipment: redundant stereo, cassette, receiver, Wfr, speakers (5 speakers, 5.1 digital), CD, VCR, DVD, two set-top boxes (cable/satellite), plasma panel, SVHS-wide camera, mic, IR, Ethernet, Kbd, Mse. *Video = composite or S-video] Cables/links: • Speaker 5+1 • Plasma 2 or 3 • Cable/Enet 2 • IR 8 • Stereo 4 • 5.1 digital 2 • Comp./S-video 3 • Plasma panel 1 • Power 10 • Kbd/mse 2 • Monitor II (opt.) 4 • Camera 2 • Total 42 – 46 Things: 18 + remotes
Caneel Bay Vacation Jan. 1998 Gordon, Gwen, Brig, Pam, Fiona, Bob, Laura and Kolbe
Searching: the most useful app? • Challenge: What questions for useful results? • Lots of ways to look at what you retrieve • Need for breaking the returns into segments • Searching for an indexer and search engine: index service, Enfish, dtSearch • Stuff I’ve Seen: MSR’s index & search… evolving in the right direction. • Productizing would remove the pressure for Longhorn
Resource explorer: Ancestor (collections), annotations, descendant & preview panes turned on
Visualization • Browsing & searching. “Get me what I want|need!” • Help the user find things among possible items versus • Waiting for an ideal system that can find “what I want” • Publication: Conventional & web, presentations, etc. • Helps understand the nature of the content e.g. histogram of objects in time • Context: Links to help understand the relationship between objects. Provides more search handles. • Information density: what is it? What is its relationship to others? • Content important. Flash and form, less useful.
Value of media depends on annotations • “It’s just bits until it is annotated”
System annotations provide base level of value • Date 7/7/2000
Tracking usage – even better • Date 7/7/2000. Opened 30 times, emailed to 10 people (it’s valued by the user!)
Getting the user to say a little something is a big jump • Date 7/7/2000. Opened 30 times, emailed to 10 people. “BARC dim sum intern farewell Lunch”
Getting the user to tell a story is the ultimate in media value • A story is a “layout” in time and space • Most valuable content (by selection, and by being well annotated) • Stories must include links to any media they use (for future navigation/search – “transclusion”). • Cf: MovieMaker; Creative Memories PhotoAlbums [Story captions] • We took him to lunch at our favorite Dim Sum place to say farewell • Dapeng was an intern at BARC for the summer of 2000 • At table L-R: Dapeng, Gordon, Tom, Jim, Don, Vicky, Patrick, Jim
Value of media depends on annotations “It’s just bits until it is annotated” • Auto-annotate whenever possible e.g. GPS cameras • Make manual annotation as easy as possible. XP photo capture, voice, photos with voice, etc. • Support gang annotation • Make stories easy
The Agenda for the Tbyte(s), Lifetime, PC: The killer app after office and mail. • Guarantee that data will live forever! “dear appy” problem • Cheap, easy, and data-rich (e.g. time, place) capture: GPS and time everywhere; paper capture has to be as easy as discard (scanner/shredder); e-books, e-magazines & journals need to have critical mass!; telephony and audio capture with indexing; Media Center compatible for entertainment (photos, video, TV, radio) • One? dbase for all books, conversations, mail, web pages …vs. long-term use of hierarchical files. Is dbase intuitive? • Annotations/meta-information add ever-increasing value: ease of annotation, because it aids search and becomes the content; content analysis (critical for photo & video!) • Information control: privacy, security, expunge/deniability,… • New “killer apps”: Alzheimer’s, immortality, surrogate memory? • Any GUI to improve use (e.g. time to learn, use, retention)
The “dear appy” problem Dear Appy, How committed are you? Please come back to me, Lost and forgotten data • Who’s responsible? • media • platform, file, and databases • evolving standards and formats • evolving and/or disappearing apps
Digitizing our lives • Right now, it is affordable to buy 100 GB/year • In 5 years 1 TB/year is affordable! • It’s hard to fill a terabyte/year just by keeping what you see or hear, but you can: • Look at 9800 pictures a day (300 KB JPEGs) • Read 2900 documents a day (1 MB files) • Listen to audio or view compressed video 24 hours/day (it takes more than 256 kb/s to fill a TB in a year) • Watch 1.5 Mb/s video 4 hours each day. • As Bush said, we can “be profligate and enter material freely”
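The terabyte-per-year figures on this slide can be checked with a little arithmetic. The sketch below is not from the talk; it assumes decimal units (1 KB = 10^3 bytes, 1 TB = 10^12 bytes) and a 365-day year, and the function names are made up for illustration.

```python
# Assumption: decimal units and a 365-day year (not stated on the slide).
KB, MB, TB = 1e3, 1e6, 1e12
DAYS_PER_YEAR = 365


def yearly_bytes_from_items(items_per_day, bytes_per_item):
    """Bytes accumulated in a year from a fixed number of items per day."""
    return items_per_day * bytes_per_item * DAYS_PER_YEAR


def yearly_bytes_from_stream(bits_per_second, hours_per_day):
    """Bytes accumulated in a year from a stream played some hours per day."""
    seconds_per_day = hours_per_day * 3600
    return bits_per_second / 8 * seconds_per_day * DAYS_PER_YEAR


scenarios = {
    "9800 pictures/day at 300 KB": yearly_bytes_from_items(9800, 300 * KB),
    "2900 documents/day at 1 MB": yearly_bytes_from_items(2900, 1 * MB),
    "256 kb/s stream, 24 h/day": yearly_bytes_from_stream(256e3, 24),
    "1.5 Mb/s video, 4 h/day": yearly_bytes_from_stream(1.5e6, 4),
}

for label, total in scenarios.items():
    print(f"{label}: {total / TB:.2f} TB/year")
```

Under these assumptions each scenario lands within about 10% of 1 TB/year. With a binary terabyte (2^40 bytes) the 256 kb/s stream falls slightly short, which fits the slide's “more than 256 kb/s”.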