Two Techniques For Improving Distributed Database Performance
ICS 214B Presentation
Ambarish Dey, Vasanth Venkatachalam
March 18, 2004

Issues In Distributed Databases
• fast communication among clients
• data requested by a client can be located and transferred quickly
• good utilization of client CPU and memory resources
• removing I/O bottlenecks
• reducing disk accesses
• reducing communication with servers
• increased scalability

Focus Of This Talk
• two approaches for improving performance of distributed systems
  • client server caching (Franklin and Carey)
  • fast page transfer schemes (Mohan and Narang)
    • shared disk architecture
• similarities

Client Server Caching
• caching of data and locks at multiple clients
• minimizes communication overhead between clients and servers
• reduces contention for server resources
• reduces contention for data
• increases autonomy of clients

Existing Techniques
• existing techniques for distributed data management fall into three categories
  • techniques that avoid caching
  • techniques that cache data but not locks
  • optimistic two-phase locking (O2PL)
    • O2PL-Invalidate (O2PL-I)
    • O2PL-Propagate (O2PL-P)
    • O2PL-Dynamic (O2PL-D)

Novel Techniques
• callback locking
  • an alternative method of maintaining cache consistency
• adaptive locking
  • a protocol that improves upon O2PL-D

Callback Locking
• supports caching of data pages and non-optimistic caching of locks
• locks are obtained before data is accessed
• the server issues callbacks for conflicting locks
• no consistency maintenance operations are needed in the commit phase

Techniques For Callback Locking
• callback read (CB-Read)
  • caches only read locks
  • a requested lock is granted only after all callbacks have completed
  • on commit, pages are sent back to the server, but the client retains its copies and hence a read lock on them
• callback all (CB-All)
  • write locks are cached at clients in addition to read locks
  • information about exclusive copies is stored at the client
  • the server issues downgrade requests when it gets read lock requests for a page

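The CB-Read behavior above can be sketched as a toy lock server. This is a minimal illustration, not the protocol as specified by Franklin and Carey: the class and method names are hypothetical, callbacks are assumed to be honored immediately (a real client may first have to finish a transaction that is using the page), and a conflicting request simply returns False instead of blocking.

```python
from collections import defaultdict


class CallbackReadServer:
    """Toy CB-Read server: tracks which clients cache a read lock per page."""

    def __init__(self):
        self.readers = defaultdict(set)  # page_id -> clients caching a read lock
        self.writer = {}                 # page_id -> client holding the write lock
        self.callbacks_sent = []         # (page_id, client) pairs, kept for inspection

    def read_lock(self, client, page_id):
        """Grant a read lock that the client may keep cached across transactions."""
        if page_id in self.writer and self.writer[page_id] != client:
            return False  # conflicting write in progress; caller would have to wait
        self.readers[page_id].add(client)
        return True

    def write_lock(self, client, page_id):
        """Call back every other cached read lock, then grant the write lock."""
        if page_id in self.writer and self.writer[page_id] != client:
            return False
        for other in sorted(self.readers[page_id] - {client}):
            self.callbacks_sent.append((page_id, other))
            # Assumption: the called-back client drops its copy at once.
            self.readers[page_id].discard(other)
        self.writer[page_id] = client
        return True

    def commit(self, client, page_id):
        """Release the write lock; the committing client keeps a cached read lock."""
        if self.writer.get(page_id) == client:
            del self.writer[page_id]
            self.readers[page_id].add(client)
```

Note that commit involves no messages to other clients: all conflicting copies were already called back before the write lock was granted.
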
Novel Techniques
• callback locking
• adaptive locking

The New Adaptive Heuristic
• the O2PL variants try to optimize the actions they perform at remote sites once a lock has been obtained
• propagate pages only when
  • the page is resident at the site when the consistency operation is attempted
  • if the page was previously propagated to this site, it has been re-accessed since then
  • the page was previously invalidated at the site and that invalidation was a mistake

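The propagate-or-invalidate decision can be written as a small predicate. The slide lists the three conditions without saying how they combine, so the combination below is an assumption: the page must be resident, and then either it was re-accessed after an earlier propagation or an earlier invalidation turned out to be a mistake. The names `RemoteCopyState` and `should_propagate` are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class RemoteCopyState:
    """What the heuristic knows about one page at one remote site."""
    resident: bool = False
    previously_propagated: bool = False
    reaccessed_since_propagation: bool = False
    mistakenly_invalidated: bool = False


def should_propagate(state: RemoteCopyState) -> bool:
    """Return True to propagate the update, False to invalidate the remote copy.

    Assumed combination of the three listed conditions:
    resident AND (re-accessed since last propagation OR mistaken invalidation).
    """
    if not state.resident:
        return False  # nothing to update at that site
    if state.previously_propagated and state.reaccessed_since_propagation:
        return True   # the last propagation was actually used
    return state.mistakenly_invalidated
```

Under this reading a page that has never been propagated or invalidated is invalidated by default, so the heuristic only switches to propagation once it has evidence that the remote copy is in use.
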
Where We Are
• client server caching
• fast page transfer schemes
• shared disk architecture
• comparisons

Motivation
• disk-based data sharing involves a lot of overhead
• example: system A wants to access a page owned by system B
  • the global lock manager (GLM) sends B a lock conflict message
  • B writes the page to disk after forcing its logs, as write-ahead logging (WAL) requires
  • B sends the GLM a message to downgrade its lock, allowing A to read the page
  • A reads the page from disk
• cost is 2 I/Os, 2 messages, and a log force

Alternative: Fast Page Transfer
• systems transfer pages through message passing, rather than disk I/Os
  • improves performance
  • requires buffer coherency protocols
  • requires special recovery protocols
    • what if a message is lost?
    • what if one or more systems fail?
• four schemes for fast page transfer
  • medium, fast, superfast schemes

SuperFast Page Transfer
• pages are transferred from one system to another without writing them or their logs to disk
• the final owner is responsible for writing the page to disk and ensuring that the logs of all updates by all systems are written to disk
• cost is 0 I/Os and 3 messages
• how to deal with system failures?
• how to preserve write-ahead logging?

Recovery
• uses a merged log of all systems that have updated the page
• the recovery LSN (RLSN) is the earliest point in the merged log from which redo processing for a page has to start (LSN = log sequence number)
  • initialized to HIGH (no recovery needed)
  • changed to the next LSN value when a page is locked in update mode
  • reset to HIGH after the updated page is written to disk
• the global lock manager adjusts the RLSN value as it receives information from the systems

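The RLSN life cycle above can be sketched as a small table kept by the global lock manager. The class and method names are hypothetical, and one detail is an assumption rather than something the slide states: a second update lock taken before the page reaches disk leaves the RLSN at its earlier value, so that it keeps marking the earliest point from which redo must start.

```python
import math

HIGH = math.inf  # sentinel: the page needs no recovery


class RlsnTable:
    """Toy per-page RLSN bookkeeping, as a global lock manager might keep it."""

    def __init__(self):
        self._rlsn = {}  # page_id -> RLSN

    def get(self, page_id):
        return self._rlsn.get(page_id, HIGH)  # pages start out at HIGH

    def on_update_lock(self, page_id, next_lsn):
        """A system locked the page in update mode; next_lsn is its next log LSN."""
        # Assumption: only the first update lock since the last disk write
        # moves the RLSN, so the earliest redo point is preserved.
        if self.get(page_id) == HIGH:
            self._rlsn[page_id] = next_lsn

    def on_page_written(self, page_id):
        """The updated page reached disk, so no redo is needed for it."""
        self._rlsn[page_id] = HIGH
```
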
Single System Failure
• locking information is preserved at the GLM
• a single system is responsible for merging logs and doing REDO processing for all pages on behalf of all failed systems
• pages requiring REDO are those locked in update (U) mode whose RLSN < HIGH
• the minimum of these RLSN values is the starting point in the merged log for the REDO pass
• ARIES-style REDO, followed by UNDO
  • if the log record's LSN is greater than the page's LSN (LSN_log > LSN_page), reapply the log record

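A minimal sketch of the REDO pass described above, assuming the merged log is a list of `(lsn, page_id)` records sorted by LSN and that "reapplying" a record just advances the page's LSN. The function names and data layout are hypothetical; the UNDO pass is omitted.

```python
import math

HIGH = math.inf  # sentinel: the page needs no recovery


def redo_start(pages):
    """pages maps page_id -> (lock_mode, rlsn).

    Returns the minimum RLSN over pages locked in U mode with RLSN < HIGH,
    or None when no page needs REDO.
    """
    candidates = [rlsn for mode, rlsn in pages.values() if mode == "U" and rlsn < HIGH]
    return min(candidates) if candidates else None


def redo(merged_log, page_lsns, pages):
    """Reapply log records from the REDO start point; returns the LSNs reapplied."""
    start = redo_start(pages)
    if start is None:
        return []
    needs_redo = {pid for pid, (mode, rlsn) in pages.items() if mode == "U" and rlsn < HIGH}
    applied = []
    for lsn, page_id in merged_log:
        if lsn < start or page_id not in needs_redo:
            continue
        # Reapply only if the page does not already reflect this record.
        if lsn > page_lsns.get(page_id, 0):
            page_lsns[page_id] = lsn
            applied.append(lsn)
    return applied
```

Comparing each record's LSN with the page's LSN makes the pass idempotent: running it again after a crash during recovery reapplies nothing that has already been applied.
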
Complex System Failure
• the GLM crashes and at least one local lock manager (LLM) crashes, so locking information is lost
• each system periodically checkpoints the global lock manager's state
  • write a Begin_GLM_Checkpoint log record
  • request <pageID, RLSN> for all pages with RLSN not equal to HIGH
  • write these into an End_GLM_Checkpoint log record

Complex System Failure
• find the minimum RLSN contained in the End_GLM_Checkpoint log record
• start REDO processing at this RLSN, or at the LSN of the Begin_GLM_Checkpoint log record if all pages have an RLSN of HIGH
• continue until the end of the log is reached
• UNDO processing is done by the individual systems

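The checkpoint and restart rule from the two slides above can be sketched as follows. This is a toy model under stated assumptions: the log is a Python list, a record's LSN is simply its index in that list, RLSN values use the same numbering, and the function names are hypothetical. Falling back to the start of the log when no checkpoint exists is also an assumption.

```python
import math

HIGH = math.inf  # sentinel: the page needs no recovery


def take_glm_checkpoint(log, glm_rlsns):
    """Append Begin/End checkpoint records; returns the LSN of the Begin record."""
    begin_lsn = len(log)  # assumption: LSN == index in the log list
    log.append(("Begin_GLM_Checkpoint", None))
    # Record <pageID, RLSN> for every page whose RLSN is not HIGH.
    pending = {pid: rlsn for pid, rlsn in glm_rlsns.items() if rlsn != HIGH}
    log.append(("End_GLM_Checkpoint", pending))
    return begin_lsn


def redo_start_after_complex_failure(log):
    """Find where REDO starts, using the most recent complete checkpoint."""
    for i in range(len(log) - 1, -1, -1):
        kind, payload = log[i]
        if kind != "End_GLM_Checkpoint":
            continue
        if payload:
            return min(payload.values())  # minimum RLSN in the End record
        # All pages were at HIGH: start at the matching Begin record.
        for j in range(i - 1, -1, -1):
            if log[j][0] == "Begin_GLM_Checkpoint":
                return j
    return 0  # assumption: with no checkpoint, scan the whole log
```
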
Preserving WAL
• pages contain slots for attaching log information
  • <systemID, LSN>
• when transferring a page, a system piggybacks the LSN of the latest log record it has not yet written to disk
• the final owner reads the slots and enforces WAL

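The slot mechanism can be sketched as below. The `Page`, `System`, `transfer`, and `write_page` names are hypothetical, and how the final owner gets another system to force its log is not described on the slide; here it is modeled as a direct method call, where a real system would presumably exchange messages.

```python
from dataclasses import dataclass, field


@dataclass
class Page:
    page_id: str
    slots: dict = field(default_factory=dict)  # system_id -> latest unflushed LSN


@dataclass
class System:
    system_id: str
    flushed_lsn: int = 0  # highest LSN of this system's log already on disk

    def force_log(self, up_to_lsn):
        """Write this system's log to disk up to and including up_to_lsn."""
        self.flushed_lsn = max(self.flushed_lsn, up_to_lsn)


def transfer(page, sender, latest_lsn):
    """Before sending the page, attach <systemID, LSN> if that record is not on disk."""
    if latest_lsn > sender.flushed_lsn:
        page.slots[sender.system_id] = latest_lsn


def write_page(page, systems):
    """Final owner: make every slotted log record durable, then write the page.

    Returns the ids of the systems whose logs had to be forced.
    """
    forced = []
    for system_id, lsn in sorted(page.slots.items()):
        system = systems[system_id]
        if system.flushed_lsn < lsn:
            system.force_log(lsn)  # assumption: modeled as a direct call
            forced.append(system_id)
    page.slots.clear()  # the page is on disk; nothing is pending any more
    return forced
```
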
Conclusion
• the page transfer schemes incorporate ideas from client server caching for buffer coherency
  • central server maintains LSN information and transactions update this information when they commit
  • lock degradation
• caching and fast page transfer can coexist, but both share tradeoffs
  • overhead of maintaining cache/buffer coherency
  • overhead of recovery protocols