Important, but not grand challenges, for the database community
Michael Pazzani, Division Director, Information and Intelligent Systems, NSF
Challenge One
• Information Integration: Making “join” and “select” work in the real world.
“We would have liked to determine whether the success rate for CAREER awards at undergraduate institutions was lower than at research institutions, but that information wasn’t in the database.*”
* It was in two databases, but three CS Ph.D.s on a 3.06 GHz dual-processor Pentium with 200 GB of storage, connected to a 10 Gbps network with an 802.11g card, couldn’t figure out how to integrate the information without using a pencil.
P.S. 99% of the information on the computer wasn’t in a structured database. What if it were a spreadsheet and a web page?
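To make the integration problem concrete, here is a minimal sketch in Python of a "join" between a spreadsheet-style export and a lookup table of the kind one might scrape from a web page. Everything in it is an assumption for illustration: the institution names, the award outcomes, and the helper names `normalize` and `success_rates` are invented and do not come from any NSF data. It shows only why a plain equality join fails (the same institution is spelled three ways) and how crude key normalization recovers the match.

```python
import csv
import io
from collections import defaultdict

# Hypothetical source A: award decisions, as they might be exported from a spreadsheet.
# The names and outcomes are invented for illustration only.
AWARDS_CSV = """institution,awarded
Univ. of Example,yes
Sample College,no
UNIVERSITY OF EXAMPLE,no
Sample  College,yes
Sample College,no
"""

# Hypothetical source B: institution types, as they might be scraped from a web page.
INSTITUTION_TYPES = {
    "University of Example": "research",
    "Sample College": "undergraduate",
}


def normalize(name):
    """Crude join key: lowercase, collapse whitespace, expand one abbreviation."""
    key = " ".join(name.lower().split())
    return key.replace("univ.", "university")


def success_rates(awards_csv, institution_types):
    """Join the two sources on the normalized name and compute a rate per type.

    Returns (rates, unmatched), where unmatched lists the names that could
    not be joined, so that silent data loss stays visible.
    """
    types_by_key = {normalize(n): t for n, t in institution_types.items()}
    counts = defaultdict(lambda: [0, 0])  # type -> [awarded, total]
    unmatched = []
    for row in csv.DictReader(io.StringIO(awards_csv)):
        kind = types_by_key.get(normalize(row["institution"]))
        if kind is None:
            unmatched.append(row["institution"])
            continue
        counts[kind][0] += int(row["awarded"].strip().lower() == "yes")
        counts[kind][1] += 1
    rates = {kind: awarded / total for kind, (awarded, total) in counts.items()}
    return rates, unmatched


if __name__ == "__main__":
    rates, unmatched = success_rates(AWARDS_CSV, INSTITUTION_TYPES)
    print(rates, unmatched)
```

Returning the unmatched names alongside the rates is a deliberate choice: in a real integration task the rows that fail to join are usually the interesting ones, and hand-written normalization rules like the one above rarely cover them all.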
Challenge Two
• Change the way the rest of the CS community and the scientific community talk, and therefore think, about databases.
• Data Storage
• Data Archive
• Data Repository
• Data Dump
What’s wrong with “storage”
• It stresses the writing of data, which isn’t important if you aren’t going to retrieve and analyze the data.
• Tangent: In spite of the terminology, while many people access databases daily, few actually know how to create them. There’s a very large gap between MySQL and storing time series, 2-D image, 3-D image, video, 3-D model, sequence, graph, and geospatial data.
• Summary: Many database issues will continue to be de-emphasized if your colleagues think of adding “storage” to their projects.