160 likes | 303 Views
Using Data Versioning in Database Application Development. Ramkrishna Chatterjee, Gopalan Arun , Sanjay Agarwal, Ben Speckhard , and Ramesh Vasudevan. David Lafontant. Presenter. introduction. D atabase application code runs in one or more tiers within a multi-tiered architecture.
E N D
Using Data Versioning in Database Application Development Ramkrishna Chatterjee, GopalanArun, Sanjay Agarwal, Ben Speckhard, and Ramesh Vasudevan David Lafontant Presenter
introduction • Database application code runs in one or more tiers within a multi-tiered architecture. • Techniques and tools that aid in the development and testing of database applications are extremely important. • However, testing of database applications has not been adequately addressed in the software engineering literature.
Database Testing difficulties • Multiple developers need to use the same database • Parallelism • Easy cleanup of changes after testing
A possible solution: versioning • Support multiple logical databases within one physical database. • Transparent implementation: transparent: No changes should be required in the database application code to accommodate data versioning. • Should not impact widely used database functionalities, such as triggers and constraints. • Any changes should be easily/efficiently reversible.
Long Transaction Framework • A transaction is a unit of work that is • Atomic • Consistent • Isolated • Durable • Short transactions use pessimistic locking • Exclusive locks on read/write until end of operation • Two transactions cannot operate on same data concurrently
Long Transaction Framework • What about longer transactions? • Database applications such as Computer Aided Design (CAD) that run for days and a transaction can run for hours, days, weeks, or even months. • Long Transactions (in this paper) should use optimistic approach • Allow multiple transactions to modify same data. • Conflict resolution at COMMIT stage.. • Relaxes Atomic property of a transaction. • Long transactions should be broken down into sub transactions • Due to their long nature, this can benefit performance. A child commits into their parent transaction. • Developers can work within their own long transaction in isolation.
Long transaction management using row level versioning Version Hierarchy • Table Key is augmented with version number • System Version Tree • Version Set • Elements of version set form a path from parent node to child leaf node. • A new long transaction within a long transaction results in two new leaf nodes. • Root transaction
Long transaction management using row level versioning Hierarchical long transaction isolation • VersionView(T) • Created on a version supporting table T. • Stores long transaction ID • Shows data only from a long transaction’s version hierarchy. • Shows most recent version. Ex: In {V1, V3, V6, V9} set, only V9 is shown.
Long transaction management using row level versioning Hierarchical long transaction isolation • VHRows(T) • Row ‘r’ is an element of T given: • R’s version number is in the current leaf version • R’s ‘isDeleted’ property is set to ‘NO’. • OverrideSet(r) • Row ‘r1’ is an element of T given: • For all keys in row ‘r’, the key is the same as ‘r’ • ‘r1’ version number is current • ‘r1’ is the most recent version of a row
Long transaction management using row level versioning SQL Transparency • Users of versioning should not need to add additional functionality on their own. • VersionView is renamed to the same name as the table it is a version of. • Triggers are used to implement a copy-on-write approach. • new copy of row having the same key is tagged with long transaction’ s leaf version number. [Override set]
Long transaction management using row level versioning ROLlback of versions • Rolling back versions is simple • Delete the versions of rows marked with most recent version. • B tree Index on version number helps make this happen quickly
Implementation and Tests • Implemented in Oracle Workspace Manager (OWM) • Experiment 1 measured the reduction in initialization time achieved by using long transactions. In the absence of long transactions, the initialization time is the time taken to drop, create and populate all the tables in the initial test data set. • [Figure 6 and 7] • Without long transactions, initialization time was proportional to the size of the initial data set • Experiment 2 measuredthe increase in parallelism achieved by using long transactions • [Figure 8 and 9] • Without long transactions, initialization of tests could not be done in parallel. • Also, number of developers would involve working with copies of the data whereas long transactions allow each developer to access the same rows.
Conclusion • Long transactions have a demonstrable benefit in database performance • Future Work • To exploit the intermediate save point feature built into long transactions to improve debugging with different versions of the same database. • Storing different versions of test data with long transaction object hierarchy feature • Database schema versioning
Reference • Chatterjee, Ramkrishna, et al. "Using data versioning in database application development." Software Engineering, 2004. ICSE 2004. Proceedings. 26th International Conference on. IEEE, 2004.