NeST: Network Storage Flexible Commodity Storage Appliances


Presentation Transcript


  1. NeST: Network Storage Flexible Commodity Storage Appliances

  2. Terms • Appliance (Merriam-Webster) • b : an instrument or device designed for a particular use; specifically a household or office device • Storage appliance • Storage plus access methods

  3. What storage users want • Reliability and availability • Manageability • cost of management > cost of storage itself • “no futz” computing • Scalability • Performance

  4. What storage vendors have • NetApp, EMC, etc. • Manageable • Just plug it in and it works • Administrative web interface • Reliable and available • Standard RAID techniques • High performance • Specialized, thin OS focused on serving files

  5. What storage vendors get, annual revenues • NetApp: $800 million in 2000 • EMC: $9 billion in 2000

  6. What’s the problem? • False coupling between HW and SW • “Playground syndrome” • Myth of specialization

  7. H/W and S/W are bundled • Hardware decisions are imposed • Hard to ride commodity curve • Example: • NetApp F720 • $35,000.00, 252 GB • $139 / GB • Linux server • $18,000.00, 365 GB • $49 / GB
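
The per-gigabyte figures on slide 7 follow from dividing price by capacity. A minimal sketch of that arithmetic, using only the prices and capacities quoted on the slide (the function and dictionary names are made up for this illustration):

```python
# Prices (USD) and capacities (GB) as quoted on slide 7.
systems = {
    "NetApp F720": (35_000.00, 252),
    "Linux server": (18_000.00, 365),
}


def cost_per_gb(price_usd: float, capacity_gb: float) -> float:
    """Return the purchase price divided by the raw capacity."""
    return price_usd / capacity_gb


for name, (price, capacity) in systems.items():
    print(f"{name}: ${cost_per_gb(price, capacity):.0f} / GB")
```

Rounded to whole dollars, this reproduces the $139 / GB and $49 / GB shown on the slide.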

  8. “Playground syndrome” • “We have storage appliances . . . • if you use these protocols, • if you use these security mechanisms, • if you are comfortable with our data semantics” • Non-flexible software entity

  9. Myth of specialization • Specialize for one protocol on one machine • Specialization decreases over time as • Protocols are added • Product line expands • Example: NetApp software • Generation 1 fit on a single floppy • Generation 2 took six • Generation 3?

  10. Alternatives? • Appliance (Merriam-Webster) • a : a piece of equipment for adapting a tool or machine to a special purpose

  11. Our game? • Flexible, commodity based, software-only storage appliances • Goal • Find a networked machine • “Drop” some software on it • Have a ready to use storage appliance with flexible mechanisms

  12. New worlds, new problems • Diverse hardware, software platforms • NetApp, EMC advantage • fewer platforms, control over OS • Our approach • Automate configuration to each host system • Hardware example - use file system or self-manage • Software example - use either read/write or mmap • Cost of flexibility • Key is design of the software

  13. Outline • Introduction • Building flexible storage modules • Big picture • Protocol layer • Concurrency architecture • Storage layer • Motivations for flexible storage appliances • Conclusion and current status

  14. NeST structure • Cleanly separated modules for communication, transfer and storage • Protocol layer • Maps diverse protocols into common control flows • Concurrency architectures • Different models to maximize system throughput • Storage layer • Provides abstract interface to disks

  15-20. NeST structure (diagram, built up over six slides) • Protocol layer: GFTP, NeST, WiND, HTTP, NFS • Control logic • Concurrency architecture: event driven, multi-process, multi-threaded • Storage layer: raw disk, local FS, RAID • Slides 16-20 add a transfer request to the diagram

  21. Protocol layer • A collection of servers is less than the sum of their parts. • Diagram labels: GFTPd, HTTPd, operating system

  22. Protocol layer • A collection of servers is less than the sum of their parts. • Diagram labels: NFS, GFTPd, HTTPd, operating system; NeST with GFTPd and HTTPd, operating system

  23. Consolidate protocols • Single point of control • Storage quotas and guarantees can be supported across multiple protocols. • Bandwidth can be controlled and quality of service can be guaranteed. • Single administrative interface • Set policies • Manage user accounts
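
One way to picture the "single point of control" on slide 23 is a quota table that every protocol front end charges against, so a user's limit holds no matter which protocol carries the data. The sketch below is an assumption for illustration only; `QuotaManager` and its methods are hypothetical names, not NeST's actual interface.

```python
class QuotaManager:
    """Hypothetical per-user byte quota shared by all protocol front ends."""

    def __init__(self) -> None:
        self.limits: dict[str, int] = {}
        self.used: dict[str, int] = {}

    def set_limit(self, user: str, limit_bytes: int) -> None:
        self.limits[user] = limit_bytes
        self.used.setdefault(user, 0)

    def charge(self, user: str, nbytes: int, protocol: str) -> bool:
        """Record a write of nbytes; refuse it if the quota would be exceeded.

        The protocol argument is only informational: the same counter is
        charged whichever protocol the write arrived on.
        """
        if self.used[user] + nbytes > self.limits[user]:
            return False
        self.used[user] += nbytes
        return True


quotas = QuotaManager()
quotas.set_limit("alice", 100)
print(quotas.charge("alice", 60, protocol="ftp"))   # accepted
print(quotas.charge("alice", 50, protocol="http"))  # refused: 60 + 50 > 100
```

With separate servers per protocol, each would have to keep its own counter and the combined limit could not be enforced this simply.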

  24. Protocol layer implementation • Each protocol listens on well-defined port • Central control accepts connections • Protocol layer reads from connection and returns generic request object • Like Linux V-nodes • Add new protocol by writing a couple of methods
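
The structure on slide 24 can be sketched as an abstract handler class whose subclasses turn protocol-specific bytes into one generic request object. Everything named here (`Request`, `ProtocolHandler`, `LineHandler`, port 5000, the operation names) is an assumption made for illustration; only the FTP verbs `LIST`, `RETR` and `STOR` and FTP's control port 21 are standard.

```python
from abc import ABC, abstractmethod
from dataclasses import dataclass


@dataclass(frozen=True)
class Request:
    """Protocol-independent request handed to the control logic."""
    op: str    # hypothetical operation names such as "LIST", "GET", "PUT"
    path: str


class ProtocolHandler(ABC):
    """A new protocol is added by subclassing and filling in parse()."""
    port: int

    @abstractmethod
    def parse(self, raw: bytes) -> Request:
        ...


class FtpHandler(ProtocolHandler):
    port = 21  # standard FTP control port

    def parse(self, raw: bytes) -> Request:
        verb, _, arg = raw.decode("ascii").strip().partition(" ")
        ops = {"LIST": "LIST", "RETR": "GET", "STOR": "PUT"}
        return Request(ops[verb.upper()], arg or ".")


class LineHandler(ProtocolHandler):
    """Made-up 'op path' text protocol standing in for a second protocol."""
    port = 5000  # arbitrary port chosen for this example

    def parse(self, raw: bytes) -> Request:
        op, _, path = raw.decode("ascii").strip().partition(" ")
        return Request(op.upper(), path or ".")


# Central control: pick the handler by the port the connection arrived on.
HANDLERS = {h.port: h for h in (FtpHandler(), LineHandler())}


def dispatch(port: int, raw: bytes) -> Request:
    return HANDLERS[port].parse(raw)


print(dispatch(21, b"LIST /data\r\n"))
print(dispatch(5000, b"list /data\n"))
```

Both calls yield the same generic request, which is the point of the layer: the control logic below it never sees protocol-specific syntax.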

  25-30. Protocol layer example, directory list request (diagram, built up over six slides) • Requests: FTP sends “31: LIST”; NeST speak sends “5” • The protocol layer passes a generic directory list request to the control logic and storage layer • The storage layer returns a linked list • The protocol layer formats the reply per protocol: “ftp, ftp, ftp” for FTP, “nest, nest” for NeST speak
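
The return path of the directory list example can be sketched the same way: the storage layer produces one protocol-neutral list of names, and each protocol formats it for its own client. The two wire formats below are invented for this sketch (CRLF-terminated names for the FTP-like case, a count line followed by names for the native case) and are not the formats NeST actually uses.

```python
def storage_list(directory: dict[str, set[str]], path: str) -> list[str]:
    """Storage layer stand-in: return the names under path, sorted."""
    return sorted(directory[path])


def format_ftp_like(names: list[str]) -> bytes:
    # Assumed format: one name per CRLF-terminated line.
    return "".join(f"{n}\r\n" for n in names).encode("ascii")


def format_native_like(names: list[str]) -> bytes:
    # Assumed format: a count line, then one name per line.
    return (f"{len(names)}\n" + "\n".join(names)).encode("ascii")


FORMATTERS = {"ftp": format_ftp_like, "native": format_native_like}


def handle_list(protocol: str, path: str, directory: dict[str, set[str]]) -> bytes:
    """Control logic: one storage call, then protocol-specific formatting."""
    names = storage_list(directory, path)
    return FORMATTERS[protocol](names)


directory = {"/": {"b.txt", "a.txt"}}
print(handle_list("ftp", "/", directory))
print(handle_list("native", "/", directory))
```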

  31. Concurrency architecture • Three difficult goals • Low latency • High bandwidth • Multiple simultaneous clients • No single portable solution • Provide multiple models to provide solutions on a range of different platforms • Multi-threaded, multi-process, event driven
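
Offering several concurrency models behind one entry point might look like the sketch below, which uses Python's standard `ThreadPoolExecutor` for the multi-threaded model and `asyncio` for the event-driven one. The `transfer` function is a placeholder for real I/O, and the model names are assumptions; a multi-process variant could be added in the same way with `concurrent.futures.ProcessPoolExecutor`.

```python
import asyncio
from concurrent.futures import ThreadPoolExecutor


def transfer(chunk: bytes) -> int:
    """Placeholder for moving one chunk; returns the bytes 'sent'."""
    return len(chunk)


def run_threaded(chunks: list[bytes]) -> list[int]:
    # Multi-threaded model: a small pool of worker threads.
    with ThreadPoolExecutor(max_workers=4) as pool:
        return list(pool.map(transfer, chunks))


def run_event_driven(chunks: list[bytes]) -> list[int]:
    # Event-driven model: one thread, cooperative tasks on an event loop.
    async def one(chunk: bytes) -> int:
        await asyncio.sleep(0)  # yield to the loop, as non-blocking I/O would
        return transfer(chunk)

    async def main() -> list[int]:
        return list(await asyncio.gather(*(one(c) for c in chunks)))

    return asyncio.run(main())


MODELS = {"threads": run_threaded, "events": run_event_driven}


def serve(chunks: list[bytes], model: str = "threads") -> list[int]:
    """Run the same work under whichever model suits the host platform."""
    return MODELS[model](chunks)


print(serve([b"a", b"bb", b"ccc"], model="threads"))
print(serve([b"a", b"bb", b"ccc"], model="events"))
```

Because callers only see `serve`, the model can be chosen per host without touching the protocol or storage code.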

  32. Storage layer • Three needed areas of flexibility • File system interfaces • Example: read()/write() or mmap() • Abstract storage models • RAID, JBOD, etc. • User account administration • Creation and removal • Quotas and guarantees for users and groups
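
The read()/write() versus mmap() choice on slide 32 can be hidden behind one function so the rest of the system does not care which is used. This is a minimal sketch with Python's standard `mmap` module; `read_file` and its `use_mmap` flag are hypothetical names for this illustration.

```python
import mmap
import os
import tempfile


def read_file(path: str, use_mmap: bool) -> bytes:
    """Return the whole file, through read() or through a memory map."""
    with open(path, "rb") as f:
        if not use_mmap:
            return f.read()
        if os.fstat(f.fileno()).st_size == 0:
            return b""  # an empty file cannot be memory-mapped
        with mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) as m:
            return m[:]


def demo() -> bool:
    """Write a small file and check that both access paths agree."""
    with tempfile.TemporaryDirectory() as d:
        path = os.path.join(d, "blob.bin")
        with open(path, "wb") as f:
            f.write(b"NeST" * 1024)
        return read_file(path, use_mmap=False) == read_file(path, use_mmap=True)


print(demo())
```

Which path is faster depends on the host operating system and workload, which is why a per-host choice is useful.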

  33. Outline • Introduction • Building flexible storage modules • Motivations for flexible storage appliances • Communication protocols • Replacement costs • Data semantics • Security and authentication • Condor NeSTs • Conclusion and current status

  34. Communication protocols • The Esperanto problem • Too many protocols to implement them all • Too many clients use proprietary protocols • Storage must allow pluggable protocols.

  35. Replacement costs • Infinite cost to replace first class data. • Variable cost to replace cached data depending on size and distance. • Variable cost to replace job output files depending on computation cost. • Diagram labels: cheap cached files, first class data

  36. Replacement costs • Infinite cost to replace first class data. • Variable cost to replace cached data depending on size and distance. • Variable cost to replace job output files depending on computation cost. • Diagram labels: cheap cached files, first class data • Cost aware storage can effectively increase its own capacity.
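
One possible reading of cost-aware storage is an eviction policy that frees space by deleting whatever is cheapest to replace per byte and never touches first class data. The sketch below is such a policy under stated assumptions: the cost numbers are arbitrary units, file sizes are positive, and `StoredFile` and `choose_victims` are hypothetical names, not part of NeST.

```python
import math
from dataclasses import dataclass


@dataclass
class StoredFile:
    name: str
    size: int                # bytes, assumed to be greater than zero
    replacement_cost: float  # math.inf marks first class data


def choose_victims(files: list[StoredFile], bytes_needed: int) -> list[StoredFile]:
    """Pick the cheapest-to-replace files (per byte) until enough is freed.

    Returns an empty list if the space cannot be freed without deleting
    first class data.
    """
    candidates = sorted(
        (f for f in files if math.isfinite(f.replacement_cost)),
        key=lambda f: f.replacement_cost / f.size,
    )
    victims: list[StoredFile] = []
    freed = 0
    for f in candidates:
        if freed >= bytes_needed:
            break
        victims.append(f)
        freed += f.size
    return victims if freed >= bytes_needed else []


files = [
    StoredFile("thesis.tex", 10, math.inf),  # first class data
    StoredFile("cache.dat", 100, 5.0),       # cheap to fetch again
    StoredFile("job.out", 50, 500.0),        # expensive to recompute
]
print([f.name for f in choose_victims(files, 80)])
```

Here the cached file is evicted first, the job output only if more space is needed, and the first class file never.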

  37. Data semantics • Must stored objects be protected from read and write dependencies? • Is transaction support desired? • Acceptable replies to storage requests.

  38. Data semantics, example • Problem • PFS on top of FTP fakes open • read may then return file not found • Solution • Mechanisms are needed to support flexible semantics independent of the transfer protocol.

  39. Data semantics, example • Problem • PFS on top of FTP fakes open • read may then return file not found • Solution • Mechanisms are needed to support flexible semantics independent of the transfer protocol. • Divorce semantics from the protocol.

  40. Security and authentication • Ownership • Privacy • Encryption • Authentication • Access rights

  41. Who, when, how and how much? • Who is allowed to use the storage? • Promiscuity and monogamy are easy • Polygamy is also easy • Diagram labels: promiscuous, abstinent

  42. Do I know you? • Problem • Migrant grid users may need temporary, preferential storage access • Solution • Provide mechanisms to • advertise available storage • create self-destructing user accounts • Matchmake applications with storage opportunities.
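
A "self-destructing" account could be modelled as a lease that lapses after a fixed lifetime. The sketch below is an assumption about how such a mechanism might work, not a description of NeST; `LeaseTable` and its methods are hypothetical, and the clock is injectable so the expiry can be exercised without waiting.

```python
import time
from typing import Callable


class LeaseTable:
    """Hypothetical table of temporary accounts that expire on their own."""

    def __init__(self, clock: Callable[[], float] = time.monotonic) -> None:
        self._clock = clock
        self._expiry: dict[str, float] = {}

    def create(self, user: str, lifetime_s: float) -> None:
        """Create (or renew) an account valid for lifetime_s seconds."""
        self._expiry[user] = self._clock() + lifetime_s

    def is_valid(self, user: str) -> bool:
        self._purge()
        return user in self._expiry

    def _purge(self) -> None:
        """Drop every account whose lease has run out."""
        now = self._clock()
        for user in [u for u, t in self._expiry.items() if t <= now]:
            del self._expiry[user]


# Usage with a fake clock so the expiry is visible immediately.
now = [0.0]
leases = LeaseTable(clock=lambda: now[0])
leases.create("visitor", lifetime_s=10)
print(leases.is_valid("visitor"))  # lease still running
now[0] = 11.0
print(leases.is_valid("visitor"))  # lease has lapsed
```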

  43. Condor NeSTs • Better, smarter checkpoint servers • Checkpoints are just another data file • NeST transparently replicates and migrates data files • Condor jobs access data files from closest NeST • Flexible policy support for managing disk and network resources

  44. Condor NeSTs, example (diagram) • Labels: two compute clusters, three NeSTs, tape library, ReqEx, ReqEx scheduler

  45. Outline • Introduction • Building flexible storage modules • Motivations for flexible storage appliances • Conclusion • Current status • Future work • Concluding remarks

  46. Current status • Concurrency architectures are done • Gets, puts, reads and writes perform well • Virtual protocol class interface is built • NeST speak is fully implemented • GridFTP is partially implemented • Simple first implementation of storage reservations and remote quota management is done

  47. Future work • Discovery process of client storage requirements • Quality of service guarantees for bandwidth and storage • Support for transient and opportunistic users • Transparent inter-NeST cooperation

  48. Concluding remarks • Return storage to the commodity curve by creating software-only storage appliances • Allow greater storage flexibility for a wide range of application needs
