Telepresence: An Umbrella Research Topic Jim Gray Microsoft Research Gray@Microsoft.com http://research.Microsoft.com/~Gray/
THE LONG BOOM Federal Research Support: Nerve Center of Science. If it’s not broke, don’t fix it. But… • US science is the engine of progress. BUT… • Best and brightest are spending increasing time fund-raising • Seems excessive to me. • Venture capital community is richer and more generous than federal support
THE LONG BOOM Cyberspace is a New World. • We have discovered a “new continent”. • It is changing how we learn, work, and play. • 1 T$/y industry • 1 T$ new wealth since 1993 • 30% of US economic growth since 1993 • There is a gold rush to stake out territory. • But we also need explorers: • Lewis & Clark expeditions • Universities to teach the next generation(s) • Governments, industry, and philanthropists should fund long-term research.
Research Investments Pay Off [timeline figure, 1960-1990; legend: government funded, industrial, billion dollar/year industry] • Time-sharing: CTSS, Multics, SSD; Unix; SDS 940, 360/67, VMS • Graphics: Sketchpad, Utah; GM/IBM, LucasFilm; E&S, SGI, PIXAR, … • Networking: Arpanet, Internet; Ethernet, Pup, Datakit; DECnet, LANs, TCP/IP • Workstations: Lisp machine, Stanford; Xerox Alto; Apollo, Sun • Windows: Engelbart, Rochester; Alto, Smalltalk; Star, Mac, Microsoft • Source: CSTB-NRC, Evolving the High-Performance Computing and Communications Initiative to Support the Nation’s Information Infrastructure, National Academy Press, Washington DC, 1995.
Research Investments Pay Off [timeline figure, 1970-2000] • Relational Data Bases: Berkeley, Wisc, …; IBM; Oracle, IBM, … • Parallel DBs: Tokyo, Wisconsin, UCLA; ICL, IBM; ICL, Teradata, Tandem • Data Mining (complex queries): Wisc, Stanford, …; IBM, Arbor, …; IRI, Arbor, Plato, …
Why Can’t Industry Fund IT Research? • It does: IBM (5.8%), Intel (13%), Lucent (12%), Microsoft (14%), Sun (12%), ... • R&D is ~5%-15% (50 B$ of 500 B$) • AD is 10% of that (5 B$) • Long-Range Research is 10% of that: 500 M$, 2,500 researchers and university support • Compaq: 4.8% R&D (1.3 B$ of 27.3 B$). AOL: 3.7% D, ?R (96 M$ of 2.6 B$) • Dell: 1.6% R&D (204 M$ of 12.6 B$), EDS, MCI-WorldCom, … • To be competitive, some companies cannot make large long-term research investments. The Xerox/PARC story: created Mac, Adobe, 3Com…
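The cascade on this slide is simple repeated arithmetic, and a short sketch makes it easy to check. The code below takes the slide's round numbers at face value (500 B$ of revenue, roughly 10% at each step, 2,500 researchers); the function name `funding_cascade` and the per-researcher figure are illustrative assumptions derived from those numbers, not something the slide states.

```python
def funding_cascade(industry_revenue_busd=500.0, share=0.10, steps=3):
    """Apply the same fractional share repeatedly, as the slide does.

    Returns the amounts in B$ after each step:
    R&D, advanced development (AD), long-range research.
    """
    amounts = []
    current = industry_revenue_busd
    for _ in range(steps):
        current *= share
        amounts.append(current)
    return amounts


rd, ad, long_range = funding_cascade()

# Assumption: the long-range budget is spread evenly over the 2,500
# researchers mentioned on the slide (1 B$ = 1e6 k$).
per_researcher_kusd = long_range * 1e6 / 2500

print(f"R&D: {rd:.1f} B$, AD: {ad:.1f} B$, long-range: {long_range * 1000:.0f} M$")
print(f"Roughly {per_researcher_kusd:.0f} k$ per researcher per year")
```

Changing `share` to 0.05 or 0.15 shows how sensitive the long-range figure is to the slide's "~5%-15%" range.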
PITAC Report: Presidential IT Advisory Committee http://www.ccic.gov/ac/report/ • Findings: • Software construction is a mess: needs breakthroughs. • We do not know how to scale the Internet 100x • Security, manageability, services, terabit per second issues. • USG needs high-performance computing (simulation), but the market is not providing vector-supers, just processor arrays. • Trained people are in very short supply. • Recommendations: • Lewis & Clark expeditions to 21st century. • Increase long-term research funding by 1.4 B$/y. • Re-invigorate university research & teaching. • Facilitate immigration of technical experts.
Outline (ambitious!) • Microsoft Research (census) • Tele-Presentations (Gordon Bell, Jim Gemmell) • Microsoft Research initiative on Telepresence • What if you could record everything you see & hear? • The architecture revolution: processing moves to transducers
Microsoft Research -- 1991 • Founded in 1991 • Goal: pursue strategic technologies for Microsoft • Original research groups: • Natural Language Processing • Operating Systems • Programming Languages • Overall size < 20 at the end of 1992
Microsoft Research -- 1999 • 400 Researchers in 25 areas • Operating systems to Statistical Physics • Research lab locations: • Redmond, Cambridge, San Francisco, Beijing • Internationally recognized research teams • Hundreds of publications, presentations • Leadership roles in professional societies, journals, conferences
MS Research Areas • Operating systems, languages, compilers, virtual machines, networking, wireless computing, fault-tolerance, large scale servers, security • Natural language, speech, vision, graphics, decision theory, information retrieval, UI, collaboration, statistics, signal processing • Cryptography, statistical physics and discrete mathematics
Growing Fast • Grew 20x from ‘92 to ‘99 • Decided in ‘97 to grow 3x in 3 years • 200 in FY97 => 600 in FY00, primarily in Redmond • Major impact on MS products • Virtually all MS products shipped today use technology from MS Research • Key role in MS growth • Pioneering research in software that allows computers to see, hear, speak and understand
Microsoft Research Philosophy • University organizational model • Flat structure, critical mass groups • Open research environment • Aggressive publication of research results in literature and on world wide web • Frequent visitors, daily seminars • Over 100 visiting professors and interns in 1998 • Over 110 visiting researchers in 1998
What I Do. • Work for the government! • CSTB, PITAC(software, ngi), LoC study, .... • Work on scaleable systems: • 1 Billion Transactions Per Day Cluster • TerraServer • New: Sloan Digital Sky Survey
Outline (ambitious!) • Microsoft Research (census) • Tele-Presentations (Gordon Bell, Jim Gemmell) • Microsoft Research initiative on Telepresence • What if you could record everything you see & hear? • The architecture revolution: processing moves to transducers
Gordon Bell on Tele Presentations http://research.microsoft.com/barc/GBell/
Motivation: Telepresentations • Presenter and/or audience telepresent • NOT: meeting or collaboration settings • Forget the nasty social issues! Mostly one-way
Telepresentation Elements • Slides • Audio • Video • Script, text comments, hyperlinks, etc.
Telepresentations: The Essentials • Slide and audio a must • Add some video (low quality) to make us feel good • Storage and transmission costs low
Telepresentations: The Killer App • Increased attendance & lower travel costs • Practical and low-cost NOW • e.g. ACM97 - 2,000 visitors in real space, 20,000 visitors on Internet http://research.microsoft.com/acm97
Today’s Experiment • Would you like to pause, rewind, browse? • Do you wish you could have seen this • At home? • At another time? • How much does a present speaker add? How much would you pay for real presence?
University Lectures Online • Research lectures on-line & on-demand • http://murl.microsoft.com/ • Will get UVC content • Available to anyone anywhere • T1 good, 28.8 OK • Generated by CMU, MIT, MSR, Stanford, UW, Xerox • Hosted by MSR
Outline (ambitious!) • Microsoft Research (census) • Tele-Presentations (Gordon Bell, Jim Gemmell) • Microsoft Research initiative on Telepresence • What if you could record everything you see & hear? • The architecture revolution: processing moves to transducers
Changing role of computation • Past: Computers for: • computing (Cray) • business data processing (IBM) • “document” creation (PC) • Future: Computers for: • understanding & learning • communicating • consuming & entertaining • Requires new User Interface to machines
Making “Flows” a Reality • Computer Graphics • Creating realistic looking environments, people • Computer Vision • Analyzing posture, gaze, gestures • Speech input/output • Natural Language • Analysis, IR • Implicit requests for information
How to fail at Tele-Conferences • Eliminate gaze awareness and sense of space of a normal group setting • Have long audio latencies & poor audio quality • Use incompatible equipment • Make it much harder to initiate the call than to make a phone call
Gaze Awareness & Sense of Space • Is anyone paying attention? • Who is talking (where is sound coming from)?
Gaze Awareness • Looking at screen: the forehead shot • Looking at camera: the glowering shot • Looking at YOU.
You can’t just move the eyes • Glowering • Surprise • Boredom • Interest
Mona Lisa Effect • Eyes and nose indicate gaze
Spatialized Audio & Video: Pointing “nose vector” at target • Map video onto wire frame • Rotate frame to point in space • Move (fake) eyes in frame (>30°) to point at target • Project voice on that vector.
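The geometry behind "pointing the nose vector at a target" can be sketched in a few lines. This is only an illustration under assumed conventions (a right-handed frame with +x to the right, +y up, +z straight out of the screen); the function names are hypothetical, and the sketch does not attempt the wire-frame mapping, the eye adjustment, or the audio projection listed on the slide.

```python
import math


def nose_vector(head, target):
    """Unit vector from the head position to the target, both given as (x, y, z)."""
    dx, dy, dz = (t - h for h, t in zip(head, target))
    length = math.sqrt(dx * dx + dy * dy + dz * dz)
    if length == 0:
        raise ValueError("head and target coincide")
    return (dx / length, dy / length, dz / length)


def yaw_pitch_degrees(vec):
    """Yaw (left/right) and pitch (up/down) angles for a unit vector.

    Assumed convention: yaw 0 means facing straight out of the screen (+z).
    """
    x, y, z = vec
    yaw = math.degrees(math.atan2(x, z))
    pitch = math.degrees(math.asin(max(-1.0, min(1.0, y))))
    return yaw, pitch


# A listener sitting to the right of the screen, at the same height as the head.
direction = nose_vector(head=(0.0, 0.0, 0.0), target=(1.0, 0.0, 1.0))
yaw, pitch = yaw_pitch_degrees(direction)
print(f"rotate frame by yaw={yaw:.1f} deg, pitch={pitch:.1f} deg")
```

The same unit vector could serve as the direction along which the voice is projected, so that sound and gaze agree.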
Recognizing gestures [figure panels: live video, area of motion, H flow, V flow]
Generating life-like speech from textual data • Data-driven stochastic speech • Natural sounding • Rapid, automatic customizability • Examples • Synthetic voice w/ transplanted speech contours
Artificial singing • AT&T Voder, 1962, by Homer Dudley • Daisy (Inspiration for HAL’s voice in 2001) • Microsoft Research Whistler, 1997 • Scarborough Fair
Analyzing language • Language recognition shipped in Word 97 • General purpose text-critiquing, summarization, Japanese word-breaking
Understanding language: MindNet • A huge language knowledge base • Automatically created from dictionaries • Words (nodes) linked by relationships • Millions of links • Recently added (Encarta) encyclopedia knowledge
MindNet -- “Going to the birds” [figure: semantic network around “bird”] • Nodes: supply, poultry, clean, smooth, keep, duck, meat, preen, quack, plant, chatter, animal, creature, bird, sound, feather, gaggle, goose, limb, peck, claw, beak, hawk, strike, fly, leg, turtle, catch, arm, bill, opening, face, mouth, chicken, hen, egg, make, wing • Link labels: Is_a, Not_is_a, Typ_obj, Typ_obj_of, Typ_subj, Typ_subj_of, Purpose, Cause, Means, Part, Part_of, Locn_of, Quesp
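A network of words linked by labeled relationships can be modeled as a small directed graph. The sketch below is a toy illustration, not MindNet's actual representation: the class name `SemanticNet` is hypothetical, and the edges are made-up examples that only borrow words and link labels (Is_a, Part_of) appearing on the slide.

```python
from collections import defaultdict


class SemanticNet:
    """Words (nodes) connected by labeled, directed relationships."""

    def __init__(self):
        self._out = defaultdict(list)  # word -> list of (relation, word)

    def add(self, source, relation, target):
        self._out[source].append((relation, target))

    def related(self, word, relation):
        """Words directly reachable from `word` via `relation`."""
        return [t for r, t in self._out.get(word, []) if r == relation]

    def closure(self, word, relation="Is_a"):
        """Follow one relation type transitively, breadth-first."""
        seen, queue = [], [word]
        while queue:
            current = queue.pop(0)
            for nxt in self.related(current, relation):
                if nxt != word and nxt not in seen:
                    seen.append(nxt)
                    queue.append(nxt)
        return seen


# Hypothetical edges for illustration only.
net = SemanticNet()
net.add("hen", "Is_a", "chicken")
net.add("chicken", "Is_a", "bird")
net.add("duck", "Is_a", "bird")
net.add("bird", "Is_a", "animal")
net.add("wing", "Part_of", "bird")

print(net.closure("hen"))
print(net.related("wing", "Part_of"))
```

Keeping the relation label on each edge is what lets a query follow only Is_a links, or only Part_of links, through the same graph.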
Changing balance between user & software systems • Yesterday: • Applications were single programs running in isolation • Users used to (more or less) understand systems that they used • Today: • Componentized applications operate in concert • Sophisticated users understand only small percentage of systems they use
Tomorrow’s Systems and Applications • Users will not be able to predict • where computations will be performed, • when they will be performed or • by what software components • Gap between system capabilities and user understanding will grow to the point that the only way user will be able to use system is through assisting agents
Examples of user agents & implicit actions • Lumiere (Office 97) • Monitoring user and program events to provide user help and assistance • Implicit queries • Inferring information needs from browsing • Lookout/SpamKiller • Monitoring mail activity to auto-categorize it
User Modeling • Models of a user’s informational goals • User’s query (when available…) • User’s background • Acute and long-term search activity • Acute actions with objects and documents • Program data structures • Explicit and implicit information access and display
Outline (ambitious!) • Microsoft Research (census) • Tele-Presentations (Gordon Bell, Jim Gemmell) • Microsoft Research initiative on Telepresence • What if you could record everything you see & hear? • The architecture revolution: processing moves to transducers
[Scale figure] • Prefixes: Kilo, Mega, Giga, Tera, Peta, Exa, Zetta, Yotta • Items placed along the scale, smallest to largest: a letter, a novel, a movie, Library of Congress (text), LoC (image), LoC (sound + cinema), all photos, all disks, all tapes, all information!
Allen Newell’s & Michael Lesk’s Points www.lesk.com/mlesk/ksg97/ksg.html • Soon everything can be recorded and kept • Most data will never be seen by humans • Precious resource: human attention • Auto-summarization and auto-search will be key enabling technologies.
Outline (ambitious!) • Microsoft Research (census) • Tele-Presentations (Gordon Bell, Jim Gemmell) • Microsoft Research initiative on Telepresence • What if you could record everything you see & hear? • The architecture revolution: processing moves to transducers
Put Everything in Future (Disk) Controllers (it’s not “if”, it’s “when?”) • Acknowledgements: • Dave Patterson explained this to me a year ago • Kim Keeton, Erik Riedel, and Catharine Van Ingen helped me sharpen these arguments