Replay Debugging for Distributed Applications
D. Geels, G. Altekar, S. Shenker and I. Stoica
Presented by: Olusanya Soyannwo
Outline
• Introduction
• Design
• Challenges
• Limitations
• Evaluation
• Related Work
• Conclusion
Introduction
• Goal
  • Find non-deterministic failures in deployed, distributed applications
• Motivation
  • Growth of distributed applications
  • Limitations of existing tools
  • Network inconsistency
  • Inadequacy of simulations
  • Reproduction difficulty
Introduction
• Deterministic replay
  • Remote debugging latency
  • Continuous interaction
  • Connection problems
• Continuous logging
  • Performance concerns
• Consistent group replay
  • Multiple snapshots
• Mixed environment
  • Determine (non-)cooperating peers
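The idea behind deterministic replay can be sketched in a few lines: record the result of every non-deterministic call during the live run, then feed the recorded results back during replay. This is only an illustration, not liblog's implementation (liblog works at the libc boundary of C/C++ programs). The `Recorder` class and the `app` function are hypothetical names introduced for this example.

```python
import random
import time


class Recorder:
    """Logs results of non-deterministic calls, or replays them from a log."""

    def __init__(self, log=None):
        self.replaying = log is not None
        self.log = list(log) if log is not None else []
        self.pos = 0

    def call(self, name, fn, *args):
        if self.replaying:
            # Replay: return the logged result instead of calling fn.
            logged_name, result = self.log[self.pos]
            if logged_name != name:
                raise RuntimeError(
                    f"replay diverged: expected {logged_name}, got {name}"
                )
            self.pos += 1
            return result
        # Record: perform the real call and remember its result.
        result = fn(*args)
        self.log.append((name, result))
        return result


def app(rec):
    # Hypothetical application logic that depends on non-deterministic inputs.
    now = rec.call("time", time.time)
    roll = rec.call("randint", random.randint, 1, 6)
    return f"{now:.3f}:{roll}"


live = Recorder()
original = app(live)                      # live run, results are logged
replayed = app(Recorder(log=live.log))    # replay run, results come from the log
```

Because the replay run never touches the clock or the random number generator, it follows the same path as the live run, which is what makes an intermittent failure reproducible under a debugger.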
Introduction
• Liblog
  • Provides consistent replay in mixed environments
  • No additional hardware or patches
  • Works on unmodified C/C++ applications
• Simple
  • Startup script
  • GDB interface
Design
• Shared library implementation
  • Intercepts calls from the application to libc, and vice versa
  • Less complicated
• Message tagging and capture
  • Log messages
  • Time stamps
• Central replay
  • Local replay
  • Network bandwidth, matching hardware, data accessibility
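Message tagging can be illustrated with logical timestamps. The sketch below assumes Lamport-style clocks, where every outgoing message carries the sender's clock and every receiver advances its own clock past the tag before logging the message. The slide only says "time stamps", so the exact scheme here is an assumption, and the `Node` class is a hypothetical stand-in for a logged process.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    name: str
    clock: int = 0
    log: list = field(default_factory=list)

    def send(self, payload):
        # Tag the outgoing message with the sender's logical clock.
        self.clock += 1
        return (self.clock, self.name, payload)

    def receive(self, message):
        stamp, sender, payload = message
        # Advance past the sender's tag so the receive is ordered after the send.
        self.clock = max(self.clock, stamp) + 1
        self.log.append((self.clock, sender, payload))
        return payload


a, b = Node("a"), Node("b")
m1 = a.send("ping")
b.receive(m1)
m2 = b.send("pong")
a.receive(m2)

# Merging the per-node logs by timestamp gives an order in which
# no message is replayed as received before it was sent.
merged = sorted(a.log + b.log)
```

With tags like these, logs collected from separate machines can be merged into one consistent order for group replay.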
Challenges
• Multi-threaded applications
  • Problem: shared memory
  • Solution: implement a new scheduler
• Illegal memory accesses
  • Problem: heap/stack corruption
  • Solution: zero out memory*
• TCP limitation
• Querying for non-cooperating peers
• GDB uniprocess restriction
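The scheduler solution for shared memory can be sketched as follows: run one thread at a time, record every scheduling decision, and replay the same decisions later. This toy version uses Python generators as cooperative threads. The `worker` and `run` functions are hypothetical, and the sketch makes no claim about how liblog's scheduler is actually built.

```python
import random


def worker(shared, steps):
    for _ in range(steps):
        value = shared["x"]       # read shared state
        yield                     # preemption point between read and write
        shared["x"] = value + 1   # racy read-modify-write
        yield


def run(schedule=None, seed=None):
    shared = {"x": 0}
    tasks = {"t1": worker(shared, 3), "t2": worker(shared, 3)}
    rng = random.Random(seed)
    recorded = []
    replay = iter(schedule) if schedule is not None else None
    while tasks:
        # Live run: pick a runnable thread at random. Replay: follow the log.
        if replay is not None:
            name = next(replay)
        else:
            name = rng.choice(sorted(tasks))
        recorded.append(name)
        try:
            next(tasks[name])
        except StopIteration:
            del tasks[name]
    return shared["x"], recorded


live_result, live_schedule = run(seed=1)
replay_result, replay_schedule = run(schedule=live_schedule)
```

The final value of `x` depends on how the two workers interleave, so lost updates may or may not occur in a live run. Replaying the recorded schedule reproduces whichever outcome was observed.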
Limitations
• Log storage
• Host requirements
• Scheduling semantics
• Network overhead
• Limited consistency
• Completeness
• Soundness
Evaluation
• Experiments
  • Dual 3.06 GHz Pentium 4 Xeon, 512 KB L2 cache
  • 2 GB of RAM, 80 GB 7500 rpm ATA/100 disk
  • Broadcom 1000TX gigabit Ethernet
Conclusion
• Related Work
  • Liblog is similar to several other tools (DejaVu, Jockey, Flashback)
• Useful for select applications
• Needs a lot of enhancements
Ideas/Issues
• Useful for simulations
• Restricted to non-resource-intensive applications
• No significant comparison
• How long can logging run for? (4 MB/hr)
• Inadequate citations/references
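The logging-duration question can be given a rough bound with simple arithmetic. The calculation below assumes the whole 80 GB disk of the evaluation machine is available for logs, that 1 GB is 1000 MB, and that the 4 MB/hr rate is constant for a single logged process. None of these assumptions comes from the slides, and `hours_until_full` is a hypothetical helper.

```python
def hours_until_full(disk_gb, rate_mb_per_hr, processes=1):
    """Hours until the log fills the disk, assuming a constant log rate."""
    disk_mb = disk_gb * 1000          # decimal units: 1 GB = 1000 MB
    return disk_mb / (rate_mb_per_hr * processes)


hours = hours_until_full(80, 4)       # one process logging at 4 MB/hr
days = hours / 24
print(f"{hours:.0f} hours, about {days:.0f} days")
```

Under those assumptions a single process could log for 20,000 hours. A busier application or several logged processes per host would shorten this proportionally.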