Learn how to build portable and configurable I/O appliances using commodity systems for network storage technologies. Case studies and storage modules are explored.
NeST: Network Storage Technologies. Building I/O Appliances on Commodity Systems. John Bent, Andrea Arpaci-Dusseau, Remzi Arpaci-Dusseau and Miron Livny. http://www.ioappliance.com
Outline • Introduction • Case studies • Storage modules • Conclusion
Problem Statement Appliances are attractive because they are robust, reliable, available and especially because they are easy to use. To fulfill these criteria, traditional network appliances impose policy decisions on their users and are built either as kernel modules or upon specially designed kernels. “How to build portable, configurable I/O appliances?”
Goal To create a network-storage “template” that produces a range of I/O appliances according to the storage needs of the target application and any constraints of the host system. [Diagram labels: Target App, Host System, Network Storage Technologies, Perfect I/O Appliance]
Host system constraints • Thread support • Raw disk access • Select interface
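The three constraints above could be checked before an appliance is configured. The following is a minimal sketch, not part of NeST: the function names `can_start_thread` and `probe_host` and the `raw_device` parameter are hypothetical, and treating "raw disk access" as read/write permission on a caller-supplied device path is an assumption made only for illustration.

```python
import os
import select
import threading


def can_start_thread() -> bool:
    """Return True if the runtime can start and join a worker thread."""
    ran = []
    try:
        worker = threading.Thread(target=lambda: ran.append(True))
        worker.start()
        worker.join()
    except RuntimeError:
        return False
    return bool(ran)


def probe_host(raw_device=None) -> dict:
    """Report the three host features listed on the slide.

    raw_device is a hypothetical path to a block device; when it is
    omitted, raw disk access is reported as unavailable.
    """
    return {
        "threads": can_start_thread(),
        "select": hasattr(select, "select"),
        "raw_disk": raw_device is not None
        and os.access(raw_device, os.R_OK | os.W_OK),
    }
```

A template could then pick, say, a threaded or a select-based concurrency architecture depending on which entries are true.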
Target app. storage needs • Invariant and variant storage needs • Invariant: reliable, low latency, high bandwidth, easy to administer, cheap
Target app. storage needs • Variant: write concurrency, replacement costs, security and authentication needs, communication protocol, transfer unit
Outline • Introduction • Case studies • Storage modules • Conclusion
Building I/O appliances • Four case studies: ReqEx, WiND, web proxy cache, Condor checkpoint server
What is ReqEx? A robot moves archived data one tape at a time to a temporary staging area. [Diagram labels: Queue of Reqs, Huge tape library (terabytes), Tape Robot, ReqEx Staging Area]
What is ReqEx? Data is transferred and stored locally to facilitate access by compute nodes. [Diagram labels: ReqEx Staging Area, WAN, Perfect I/O Appliance, Condor Manager, Compute cluster]
ReqEx variant storage needs • Write concurrency: no write (or read) concurrency • Replacement costs: tape robot is very slow; objects cannot be lost • Security and authentication needs: only owner can remove object • Protocol: ReqEx can be linked with NeST client library • Transfer unit: whole object transfers only
WiND variant storage needs • Write concurrency: no write concurrency • Replacement costs: unknown • Security and authentication needs: unknown • Protocol: predefined specific WiND protocol • Transfer unit: disk blocks are accessed directly
What is a web proxy cache? Frequently accessed objects can be stored locally to decrease request latencies. [Diagram labels: Internet, Perfect I/O Appliance, Local Area Network]
Cache variant storage needs • Write concurrency: no write concurrency • Replacement costs: negligible • Security and authentication needs: none • Protocol: HTTP • Transfer unit: whole object transfer only
What is the Condor checkpoint server? A Condor job runs on an execute machine. Keyboard activity causes the job to be evicted. A snapshot of the process is sent to the checkpoint server. When the job migrates to another idle machine, the checkpoint file is recovered and progress resumes. [Diagram label: Perfect I/O Appliance]
CCS (Condor checkpoint server) variant storage needs • Write concurrency: no write concurrency • Replacement costs: the running time of the job (could be months) • Security and authentication needs: unauthorized access cannot be allowed • Protocol: can link with NeST client library • Transfer unit: whole file transfer only [Aside on slide: “I see you’re discussing checkpointing. Don’t forget about incremental.”]
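The four case studies can be laid side by side as data. The sketch below is illustrative only: the `VariantNeeds` record and the `common_fields` helper are hypothetical names, and the field values are shortened paraphrases of the slide text rather than anything defined by NeST.

```python
from dataclasses import dataclass, fields


@dataclass(frozen=True)
class VariantNeeds:
    """One appliance profile; fields mirror the five variant needs."""
    write_concurrency: bool
    replacement_cost: str
    security: str
    protocol: str
    transfer_unit: str


# Values paraphrase the four case-study slides.
CASE_STUDIES = {
    "ReqEx": VariantNeeds(False, "tape robot is very slow; objects cannot be lost",
                          "only owner can remove object",
                          "NeST client library", "whole object"),
    "WiND": VariantNeeds(False, "unknown", "unknown",
                         "WiND protocol", "disk block"),
    "Web proxy cache": VariantNeeds(False, "negligible", "none",
                                    "HTTP", "whole object"),
    "Condor checkpoint server": VariantNeeds(False, "running time of the job",
                                             "no unauthorized access",
                                             "NeST client library", "whole file"),
}


def common_fields(profiles):
    """Return the names of fields whose value is identical in every profile."""
    values = list(profiles.values())
    return [f.name for f in fields(VariantNeeds)
            if len({getattr(p, f.name) for p in values}) == 1]
```

Calling `common_fields(CASE_STUDIES)` shows that only write concurrency has the same value in all four profiles: none of them needs it. Every other need differs across the case studies.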
Outline • Introduction • Case studies • Storage modules • Conclusion
Storage modules [Diagram labels: Protocols, Name Space, Static Configuration, Administrative Interface, Concurrency Architectures, Runtime Adaptation, Storage Management, Data Semantics]
Configurable Components • Concurrency architecture • Data semantics • Protocol layer • Namespace • Security and authentication • Storage management
Concurrency architecture “How can multiple storage requests be interleaved to maximize system throughput?” Easy ... but uninteresting. [Diagram labels: NOB, POT, POP]
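One way to interleave requests, using the select interface listed under the host system constraints, is a single-threaded event loop. The sketch below is only an assumption about what such an architecture could look like; the slide does not expand its labels, and `serve_interleaved` is a hypothetical name. It uses Python's standard `selectors` module.

```python
import selectors
import socket


def serve_interleaved(conns):
    """Read from several connections in one thread.

    conns maps a label to a connected socket. Whichever socket is
    readable is serviced next, so a slow client does not block the rest.
    Returns the bytes received per label once every peer has closed.
    """
    sel = selectors.DefaultSelector()
    received = {}
    for name, conn in conns.items():
        conn.setblocking(False)
        sel.register(conn, selectors.EVENT_READ, data=name)
        received[name] = b""

    open_count = len(conns)
    while open_count:
        for key, _events in sel.select():
            chunk = key.fileobj.recv(4096)
            if chunk:
                received[key.data] += chunk
            else:
                # An empty read means the peer closed the connection.
                sel.unregister(key.fileobj)
                open_count -= 1
    sel.close()
    return received
```

A thread-per-request or process-per-request design would replace the loop with workers; which one performs best depends on the host system constraints.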
Data semantics • Must stored objects be protected from concurrent writes? • Is transaction support necessary? • What are the recovery costs for lost objects?
Protocol layer • Most applications cannot link with NeST client libraries • Most applications have their own specific communication protocols “How can a protocol layer easily communicate with arbitrary networking protocols?” Tower of Babel
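One possible shape for such a protocol layer is a set of small per-protocol parsers that all produce the same internal request. This is a sketch under stated assumptions: the `Request` record, the `simple` line protocol, and every function name are hypothetical, and the HTTP parser handles only a bare request line.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Request:
    """Protocol-independent request handed to the storage modules."""
    op: str    # "get" or "put"
    name: str  # object name


def parse_http(line: str) -> Request:
    """Translate an HTTP request line such as 'GET /obj HTTP/1.1'."""
    method, path, _version = line.split()
    ops = {"GET": "get", "PUT": "put"}
    return Request(ops[method], path.lstrip("/"))


def parse_simple(line: str) -> Request:
    """Translate a hypothetical native protocol line: '<op> <name>'."""
    op, name = line.split(maxsplit=1)
    return Request(op.lower(), name)


# Supporting a new protocol means registering one more parser.
PARSERS = {"http": parse_http, "simple": parse_simple}


def to_request(protocol: str, line: str) -> Request:
    return PARSERS[protocol](line)
```

For example, `to_request("http", "GET /ckpt/job42 HTTP/1.1")` and `to_request("simple", "GET ckpt/job42")` yield the same `Request`, so the rest of the appliance never sees the wire format.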
Namespace • Flat • Hierarchical “How do clients uniquely identify their stored objects?”
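The difference between the two options can be shown with two toy in-memory stores. Both classes are hypothetical illustrations, not NeST code. The hierarchical version assumes '/'-separated paths and does not handle a name being used as both a directory and an object.

```python
class FlatNamespace:
    """Every object lives under one opaque name."""

    def __init__(self):
        self._objects = {}

    def put(self, name, data):
        self._objects[name] = data

    def get(self, name):
        return self._objects[name]

    def names(self):
        return sorted(self._objects)


class HierarchicalNamespace:
    """Objects are leaves of a directory tree addressed by paths."""

    def __init__(self):
        self._root = {}

    def _walk(self, parts, create=False):
        node = self._root
        for part in parts:
            node = node.setdefault(part, {}) if create else node[part]
        return node

    def put(self, path, data):
        *dirs, leaf = path.strip("/").split("/")
        self._walk(dirs, create=True)[leaf] = data

    def get(self, path):
        *dirs, leaf = path.strip("/").split("/")
        return self._walk(dirs)[leaf]

    def names(self, directory=""):
        parts = [p for p in directory.strip("/").split("/") if p]
        return sorted(self._walk(parts))
```

In the flat store `"alice/ckpt"` is just a string; in the hierarchical store the same path can be listed per directory, which matters when several clients share one appliance.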
Security and authentication • Ownership • Privacy • Encryption • Authentication • Access rights
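As an illustration of the ownership item, the ReqEx rule "only owner can remove object" could be enforced roughly as follows. This is a minimal sketch: `OwnedStore` is a hypothetical class, and it assumes the user identity has already been authenticated elsewhere.

```python
class OwnedStore:
    """In-memory object store that records an owner per object."""

    def __init__(self):
        self._objects = {}  # name -> (owner, data)

    def put(self, user, name, data):
        # A new object is owned by whoever stores it first.
        owner = self._objects.get(name, (user, None))[0]
        if owner != user:
            raise PermissionError(f"{name!r} is owned by {owner!r}")
        self._objects[name] = (user, data)

    def get(self, name):
        return self._objects[name][1]

    def remove(self, user, name):
        owner, _data = self._objects[name]
        if owner != user:
            raise PermissionError(f"only {owner!r} may remove {name!r}")
        del self._objects[name]

    def names(self):
        return sorted(self._objects)
```

Privacy and encryption are deliberately left out of the sketch; an appliance with no security needs, such as the web proxy cache, could skip this module entirely.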
Storage management • Native filesystem • Raw disk access • Uninteresting from client perspective
Outline • Introduction • Case studies • Storage modules • Conclusion
Conclusions and future work • Conclusions: none • Future work: lots [Aside on slide: “Maybe you should try a little harder.”]
Conclusions and future work • How to most easily identify the variant storage needs of the target application? Config file? Installation script? Run-time monitoring? • How to ensure that performance is at least as good as an appliance specifically designed for the target application?
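To make the "config file" option concrete, here is a minimal sketch of how the variant needs might be declared and read. The file layout, the `[needs]` section, and the key names are all assumptions; the slides do not define a format. The sample values are those of the web proxy cache case study.

```python
import configparser

# Hypothetical config file describing the web proxy cache profile.
SAMPLE = """
[needs]
write_concurrency = no
replacement_cost = negligible
security = none
protocol = HTTP
transfer_unit = whole object
"""

REQUIRED = ("write_concurrency", "replacement_cost", "security",
            "protocol", "transfer_unit")


def load_needs(text: str) -> dict:
    """Parse the variant storage needs, rejecting incomplete files."""
    parser = configparser.ConfigParser()
    parser.read_string(text)
    section = parser["needs"]
    missing = [key for key in REQUIRED if key not in section]
    if missing:
        raise ValueError(f"missing settings: {missing}")
    needs = {key: section[key] for key in REQUIRED}
    # Interpret yes/no, true/false, on/off, 1/0 as a boolean.
    needs["write_concurrency"] = section.getboolean("write_concurrency")
    return needs
```

An installation script could generate the same file interactively. Run-time monitoring would have to infer these values from observed traffic instead.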