Database Systems: “Breaking Out of the Box”
Avi Silberschatz (Bell Laboratories) and Stan Zdonik (Brown University)
July 7, 1997
Mehmet Uner
The Paper’s Theme (Strategic Directions)
• Database research should be devoted to the problems of data management, no matter where and in what form the data might be found.
• Database management skills should be applied to new data management environments that potentially require radically new software architectures.
Outline
• Introduction
• Background
• Our Skills
• Scenarios
• Barriers
• Research
• Conclusions
• References
Introduction
• The field of database systems research and development has been very successful over its 30-year history.
• It has led to a $10 billion industry that touches virtually every major company in the world.
• It is unthinkable to manage the large volume of valuable information that keeps corporations running without support from commercial database management systems (DBMS).
• A DBMS is a very complex system incorporating a rich set of technologies.
• It is suited to solving problems of large-scale data management in the corporate setting.
DBMS
DBMS requirements:
• Execution overhead.
• A high level of expertise to install and maintain.
• Manages data only in fairly specific file formats.
Solution
At the same time:
• Data is changing rapidly.
• Data is stored in different places (e.g., files).
• Data is obtained in large volumes from external sources such as sensors.
Solution:
• Not a full-blown DBMS but a lighter-weight solution.
• Instead of using an existing tool in a new application, it is better to embed reusable components.
• Use database system components, techniques, and experience in new ways.
Examples
Applications that could benefit from data management techniques but typically do not make heavy use of database products:
• World Wide Web
• Personal information systems (e-mail)
• News services
• Scientific applications
Background
• The database field was born with the release of IMS in the 1960s.
  • An IBM product.
  • Managed data as hierarchies.
  • Key idea: data has value and should be managed independently of any application.
• CODASYL was the best-known successor.
  • Based on a graph structure.
• Ted Codd published a paper in 1970.
  • It proposed the relational model.
Background
• Object-oriented principles in the 1980s
  • Allow users to create their own application-specific types that can be managed by the DBMS.
• Hybrid model in the 1990s
  • Embeds object-oriented features in a relational context.
Our Skills
• Database management systems have been concerned with the following problems:
  • High performance
  • Correctness
  • Maintainability
  • Reliability
• These problems are viewed from the standpoint of slow memory devices that must be shared by multiple concurrent users.
• This approach leads to a set of skills and techniques that can be applied and extended to other problems.
Skills and Techniques
• Data Modeling
  • A language for defining the structure of the database.
  • A language for manipulating those structures.
• Query Languages
  • A high-level language to retrieve data from the database (e.g., SQL).
  • Query optimization and evaluation.
• State-based views
  • A restricted and reorganized view of the database.
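The skills on this slide can be illustrated with a minimal sketch using Python's standard `sqlite3` module. The `employee` table, its columns, and the `research_staff` view are hypothetical names chosen only for illustration; they do not come from the paper.

```python
import sqlite3

# Hypothetical schema and data, used only to illustrate the slide's terms.
conn = sqlite3.connect(":memory:")

# Data definition: describe the structure of the database.
conn.execute(
    "CREATE TABLE employee ("
    "id INTEGER PRIMARY KEY, name TEXT, dept TEXT, salary REAL)"
)

# Data manipulation: populate the structure that was just defined.
conn.executemany(
    "INSERT INTO employee (name, dept, salary) VALUES (?, ?, ?)",
    [("Ada", "research", 120.0), ("Bob", "sales", 80.0), ("Cy", "research", 100.0)],
)

# View: a restricted and reorganized presentation of the same data.
conn.execute(
    "CREATE VIEW research_staff AS "
    "SELECT name, salary FROM employee WHERE dept = 'research'"
)

# Query: state what is wanted; the engine decides how to evaluate it.
research_names = [
    row[0] for row in conn.execute("SELECT name FROM research_staff ORDER BY name")
]
avg_salary = conn.execute("SELECT AVG(salary) FROM research_staff").fetchone()[0]
```

The queries name the desired result and leave the access path to the engine, which is the separation that query optimization relies on.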
Skills and Techniques
• Data Management
  • Automatic maintenance of data structures.
  • Efficient movement of data.
• Transactions
  • A response to correctness problems introduced by concurrent access and update.
• Distributed Systems
• Scalable Systems
  • Database systems have been tuned to efficiently and reliably handle data volumes that exceed the size of the physical memory by several orders of magnitude.
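As a small illustration of the transaction bullet, the sketch below uses Python's standard `sqlite3` module to show atomicity: either both updates of a transfer take effect or neither does. The `account` table and the `transfer` helper are hypothetical, and the example shows rollback on failure only, not concurrency control.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE account (name TEXT PRIMARY KEY, balance INTEGER)")
with conn:  # commit the initial (hypothetical) balances
    conn.executemany("INSERT INTO account VALUES (?, ?)", [("a", 100), ("b", 0)])


def transfer(conn, src, dst, amount):
    """Move `amount` from src to dst atomically; return True on success."""
    try:
        with conn:  # commits on success, rolls back if an exception escapes
            conn.execute(
                "UPDATE account SET balance = balance - ? WHERE name = ?",
                (amount, src),
            )
            conn.execute(
                "UPDATE account SET balance = balance + ? WHERE name = ?",
                (amount, dst),
            )
            (balance,) = conn.execute(
                "SELECT balance FROM account WHERE name = ?", (src,)
            ).fetchone()
            if balance < 0:
                raise ValueError("insufficient funds")
        return True
    except ValueError:
        return False


def balances(conn):
    """Return the current balances as a dict of name -> balance."""
    return dict(conn.execute("SELECT name, balance FROM account"))


first = transfer(conn, "a", "b", 30)    # succeeds and is committed
second = transfer(conn, "a", "b", 100)  # would overdraw, so it is rolled back
final = balances(conn)
```

The failed transfer leaves no partial update behind, which is the correctness guarantee the slide attributes to transactions.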
Scenarios
• The scenarios point the way for future data management systems.
• The technology that would support these scenarios constitutes a research agenda for the next decade.
1) Instant Virtual Enterprise
2) Personal Information Systems
Instant Virtual Enterprise
• An “instant virtual enterprise” (IVE) is a group of companies that do not routinely function as a unit.
• They come together to respond to a customer order or a request for proposal.
• Computer-integrated manufacturing (CIM) is an example of an environment requiring IVE cooperation.
  • Engineering side: design, production, quality assurance.
  • Administrative side: planning, production control, resource management.
Instant Virtual Enterprise
• Companies in an IVE need to exchange and manage large amounts of data.
• The companies will have many heterogeneous databases.
• Sharing and exchanging data, together with coordinating information, is critical.
IVE Scenario: Building an Oil Pipeline
[Diagram: Company A, Company Q, Company R, and Company S. Labels: engineering firm (IVE), license their design, engineering analysis.]
IVE Scenario
[Diagram: Company T, Company U, Company V, and Company W. Labels: actual fabrication, casting, design file conversion service, documentation and archiving.]
IVE Scenario
Database capabilities needed:
• Executing a query for the design.
• Data translation services for engineering analysis.
• Coordination and configuration management.
  • Changes to an object in one subsystem require changes to one or more related objects in other subsystems.
• Security and access control over the information.
• Archiving of information, even after the IVE disbands.
Personal Information Systems Scenario
• Provides information to an individual.
• Uses a PID (Personal Information Device):
  • PDA
  • Handheld PC
  • Laptop
• Equipped with a wireless network connection.
• Access to the Internet anywhere, anytime.
Personal Information Systems Scenario
• Tightly integrated with the individual’s activities, from morning to bedtime.
• In the morning:
  • Local weather report
  • List of reminders
  • List of morning meetings
  • Best route from home to work
  • Personalized headlines
  • Personalized investment report
Personal Information Systems Scenario
• Throughout the day:
  • Tasks for the day
  • List of customers to contact
  • Summary of breaking news
  • Best driving routes in the city
• At the end of the day:
  • Next day’s activities
  • Appointments
Personal Information Systems Scenario
• The PID must continuously query remote databases and monitor broadcast information.
• The PID will magnify today’s client-server performance, scalability, and reliability problems.
• Where should data reside: on the PID or on the server?
Barriers
• A DBMS provides a tightly controlled and highly uniform environment.
• For the new applications, database functionality should be provided outside the limits of a DBMS.
• For the vision represented in the scenarios, a number of technical barriers must be removed.
Barriers
• Overhead
  • System requirements, expertise, planning, monetary cost.
  • Builders of a personalized newspaper service do not use a DBMS because they do not need many of its advanced features.
  • Many new applications need only a subset of the traditional database services.
• Scale
  • Greater volumes of data (petabytes).
  • Hundreds of servers, with an even larger client population.
Barriers
• Schema Organization
  • Traditionally, a schema is created first to describe the structure of the database, and then the database is populated.
  • Many applications currently create data independently of a database system (scientific applications, web sites).
  • The schema may be incomplete or inconsistent.
  • Schema management facilities are needed to adapt to the dynamic nature of foreign data.
• Data Quality
  • Information accessed from a WAN may be of varying quality.
  • Future information systems must be able to react to the quality of the data source.
Barriers
• Heterogeneity
  • Data exists in many forms.
  • These dissimilar formats must be integrated to allow applications to access data in a high-level and uniform way.
• Query Complexity
  • Future environments have different characteristics.
  • Conventional goal: minimize the number of disk accesses.
  • Future goal: minimize the total “information bill”.
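The shift from counting disk accesses to minimizing an “information bill” can be sketched as a change of cost function in plan selection. Everything below is an illustrative assumption: the `Plan` fields, the prices, and the two candidate plans are hypothetical and are not taken from the paper.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Plan:
    name: str
    disk_accesses: int      # local I/O operations
    bytes_transferred: int  # data moved over the network
    access_fee: float       # fee charged by a remote source (hypothetical)


def information_bill(plan, price_per_disk_access, price_per_byte):
    """Total cost of a plan under assumed per-unit prices."""
    return (
        plan.disk_accesses * price_per_disk_access
        + plan.bytes_transferred * price_per_byte
        + plan.access_fee
    )


plans = [
    Plan("local_scan", disk_accesses=1000, bytes_transferred=0, access_fee=0.0),
    Plan("remote_index", disk_accesses=10, bytes_transferred=50_000, access_fee=0.5),
]

# Conventional objective: fewest disk accesses.
conventional_choice = min(plans, key=lambda p: p.disk_accesses).name

# Future objective: smallest total bill, here with made-up prices.
billed_choice = min(
    plans,
    key=lambda p: information_bill(
        p, price_per_disk_access=0.001, price_per_byte=0.00002
    ),
).name
```

With these assumed prices the two objectives pick different plans, which is why an optimizer tuned only for disk accesses may choose poorly once network and access charges matter.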
Barriers
• Ease of Use
  • A highly trained, full-time staff is assumed to manage a DBMS.
  • Yet most users have no training in database technology.
  • A simple set of interfaces is needed.
• Security
  • As the amount of shared information grows, the need arises to restrict access to specific users or for specific uses.
Barriers
• Guaranteeing Acceptable Outcomes
  • Transaction management is a barrier to both system performance and the ability to specify acceptable outcomes.
  • New or enhanced transaction technology is needed.
  • Making data unavailable is not acceptable.
  • Aborting transactions is unacceptable.
• Technology Transfer
  • There is a barrier between research and industry.
  • Each has insufficient knowledge of the other.
Research
To achieve the vision and overcome these barriers, a number of central research topics must be addressed:
• Extensibility and Componentization
• Imprecise Results
• Schemaless Databases
• Ease of Use
• New Transaction Models
• Query Optimization
• Data Movement
• Security
• Database Mining
Research
• Extensibility and Componentization
  • Build the DBMS in a modular way.
  • Support lighter-weight applications.
• Imprecise Results
  • On the web, search engines do not provide 100% accuracy.
  • A general theory of imprecision must be developed.
• Schemaless Databases
  • Able to work with unstructured data.
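One way to picture the schemaless idea is to query records that were never forced into a common schema and to derive a loose description of their structure afterwards. The sketch below is only an illustration under that assumption; the records, the field names, and the `select` and `inferred_fields` helpers are hypothetical.

```python
def select(records, **conditions):
    """Return records matching every condition; a record that lacks a field does not match."""
    return [
        r for r in records
        if all(k in r and r[k] == v for k, v in conditions.items())
    ]


def inferred_fields(records):
    """Derive a loose schema after the fact: field name -> set of observed type names."""
    fields = {}
    for record in records:
        for key, value in record.items():
            fields.setdefault(key, set()).add(type(value).__name__)
    return fields


# Hypothetical records created independently of any database schema.
records = [
    {"kind": "email", "sender": "ann", "subject": "hi"},
    {"kind": "page", "url": "http://example.org", "title": "hi"},
    {"kind": "email", "sender": "bob", "size": 2048},
]

emails = select(records, kind="email")
from_ann = select(records, sender="ann")
schema = inferred_fields(records)
```

Here the structure is discovered from the data rather than declared up front, the reverse of the schema-first workflow described under Schema Organization.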
Research
• Ease of Use
  • Better database interfaces are required.
• New Transaction Models
  • Overcome blocking.
  • Provide correctness.
• Query Optimization
  • New indexing methods and query processing strategies.
  • Trade response time for cost (cheaper but slower answers).
  • Sensitive to bandwidth and power considerations.
Research
• Data Movement
  • In a distributed environment, the cost of moving data can be extremely high.
  • Asymmetric communication channels (low-bandwidth lines).
• Security
  • Formulation of an authorization model.
  • Interoperability between different security policies.
• Database Mining
  • Machine learning
  • Statistical analysis
  • Database technologies
Conclusions
• Database research must be broadly defined.
• The database community must apply its experience and expertise to new areas, and new solution packages must be found.
• The vision is an integration that supports the application of database functionality in small modules that give just the right capability.
• These modules should also represent a unified theory of information that allows querying information of all types without having to switch languages or paradigms.
References
• E. F. Codd, “A Relational Model of Data for Large Shared Data Banks”, Communications of the ACM, 13:6 (June 1970), pp. 377-387.
• J. Gray, http://www.cs.washington.edu/homes/lazowska/cra/database.html
• A. Silberschatz, M. Stonebraker, and J. Ullman, “Database Systems: Achievements and Opportunities”, SIGMOD Record, 19:4, pp. 6-22.
• A. Silberschatz, M. Stonebraker, and J. Ullman, “Database Systems: Achievements and Opportunities Into the 21st Century”, http://www.cs.stanford.edu/pub/papers/lagii.ps
• J. Toole and P. Young, http://www.hpcc.gov/cic/forum/CIC_Cover.html
Thanks! Any Questions?