NeST: Network Storage Flexible Commodity Storage Appliances John Bent, Miron Livny, Andrea Arpaci-Dusseau and Remzi Arpaci-Dusseau
Terms • Appliance (Merriam-Webster) • b : an instrument or device designed for a particular use; specifically a household or office device • Storage appliance • Storage plus access methods
What storage users want • Reliability and availability • Manageability • cost of management > cost of storage itself • “no futz” computing • Scalability • Performance
What storage vendors have • NetApp, EMC, others make storage appliances (network-attached storage) • Manageable • Just plug it in and it works • Administrative web interface • Reliable and available • Standard RAID techniques • High performance • Specialized, thin OS focused on serving files
What storage vendors get, annual revenues • NetApp: $800 million in 2000 • EMC: $9 billion in 2000
What’s the problem? • False coupling between HW and SW • “Playground syndrome” • Myth of specialization
H/W and S/W are bundled • Hardware decisions are imposed • Hard to ride commodity curve • Example: • NetApp F720 • $35,000.00, 252 GB • about $139 / GB • Maxtor DiamondMax • $279.00, 80 GB • about $3.50 / GB
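The per-gigabyte figures are list price divided by capacity. A quick check in Python, using only the prices and capacities quoted on the slide (the function name is illustrative):

```python
# Prices and capacities are the ones quoted on the slide.
def cost_per_gb(price_dollars: float, capacity_gb: float) -> float:
    """Return the cost of one gigabyte in dollars."""
    return price_dollars / capacity_gb

netapp = cost_per_gb(35_000.00, 252)   # NetApp F720
maxtor = cost_per_gb(279.00, 80)       # Maxtor DiamondMax

print(f"NetApp F720:       ${netapp:.2f} / GB")
print(f"Maxtor DiamondMax: ${maxtor:.2f} / GB")
print(f"Ratio:             {netapp / maxtor:.0f}x")
```

The appliance costs roughly 40 times as much per gigabyte as the commodity disk.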
“Playground syndrome” • “We have storage appliances . . . • if you use these protocols, • if you use these security mechanisms, • if you are comfortable with our data semantics” • Non-flexible software entity
Myth of specialization • Specialize for one protocol on one machine • Specialization decreases over time as • Protocols are added • Product line expands • Example: NetApp software • Generation 1 fit on a single floppy • Generation 2 took six • Generation 3?
Alternatives? • Appliance (Merriam-Webster) • a : a piece of equipment for adapting a tool or machine to a special purpose
Our game? • Flexible, commodity based, software-only storage appliances • Goal • Find a networked machine • “Drop” some software on it • Have a ready to use storage appliance with flexible mechanisms
New worlds, new problems • Diverse hardware, software platforms • NetApp, EMC advantage • fewer platforms, control over OS • Our approach • Automate configuration to each host system • Hardware example - use file system or self-manage • Software example - use either read/write or mmap • Cost of flexibility • Key is design of the software
Outline • Introduction • Building flexible storage modules • Big picture • Protocol layer • Concurrency architecture • Storage layer • Motivations for flexible storage appliances • Conclusion and current status
NeST structure • Cleanly separated modules for communication, transfer and storage • Protocol layer • Maps diverse protocols into common control flows • Concurrency architectures • Different models to maximize system throughput • Storage layer • Provides abstract interface to disks
NeST structure • [Figure: central control routes a transfer request through three modules] • Protocol layer: GFTP, NeST, WiND, HTTP, NFS • Concurrency architecture: event driven, multi-process, multi-threaded • Storage layer: raw disk, local FS, RAID
Protocol layer • [Figure: separate NFSd and HTTPd servers on one operating system, compared with a single NeST serving NFS and HTTP on one operating system] • A collection of servers is less than the sum of their parts.
Consolidate protocols • Single point of control • Storage quotas and guarantees can be supported across multiple protocols. • Bandwidth can be controlled and quality of service can be guaranteed. • Single administrative interface • Set policies • Manage user accounts
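A minimal sketch of the single point of control, assuming a hypothetical `QuotaManager` that every protocol handler consults before accepting a write. The class and method names are illustrative and are not taken from NeST.

```python
class QuotaExceeded(Exception):
    """Raised when a write would push a user past the storage quota."""

class QuotaManager:
    """Hypothetical quota table shared by every protocol handler."""

    def __init__(self):
        self._limit = {}   # user -> bytes allowed
        self._used = {}    # user -> bytes stored

    def set_quota(self, user, limit_bytes):
        self._limit[user] = limit_bytes
        self._used.setdefault(user, 0)

    def charge(self, user, nbytes):
        """Account for a write, whichever protocol it arrived on."""
        used = self._used.get(user, 0)
        limit = self._limit.get(user, 0)   # unknown users get no space
        if used + nbytes > limit:
            raise QuotaExceeded(f"{user}: {used + nbytes} > {limit}")
        self._used[user] = used + nbytes

    def used(self, user):
        return self._used.get(user, 0)

quotas = QuotaManager()
quotas.set_quota("alice", 100)
quotas.charge("alice", 60)        # e.g. a write arriving over HTTP
try:
    quotas.charge("alice", 60)    # e.g. a write arriving over NFS
except QuotaExceeded:
    print("second write rejected: the quota spans both protocols")
```

With separate servers, each would keep its own accounting and the second write would succeed.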
Protocol layer implementation • Each protocol listens on well-defined port • Central control accepts connections • Protocol layer reads from connection and returns generic request object • Like Linux V-nodes • Add new protocol by writing a couple of methods
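The idea of mapping each wire protocol onto a generic request object can be sketched as follows. Both dialects below are toys: `ToyFtp` is not the real FTP wire format, the numeric opcodes in `ToyNumeric` are invented, and all class, field, and function names are assumptions for illustration.

```python
from dataclasses import dataclass
from typing import List

@dataclass
class Request:
    """Generic request object handed to central control (hypothetical fields)."""
    op: str      # "list", "get" or "put"
    path: str

class Protocol:
    """Interface a new protocol implements: a couple of methods."""
    def parse(self, line: str) -> Request:
        raise NotImplementedError
    def format_listing(self, names: List[str]) -> str:
        raise NotImplementedError

class ToyFtp(Protocol):
    """Toy FTP-like dialect with textual verbs."""
    OPS = {"LIST": "list", "RETR": "get", "STOR": "put"}
    def parse(self, line: str) -> Request:
        verb, _, arg = line.strip().partition(" ")
        return Request(self.OPS[verb.upper()], arg or ".")
    def format_listing(self, names: List[str]) -> str:
        return "\r\n".join(names) + "\r\n"

class ToyNumeric(Protocol):
    """Toy dialect with numeric opcodes (the codes are invented)."""
    OPS = {"5": "list", "6": "get", "7": "put"}
    def parse(self, line: str) -> Request:
        code, _, arg = line.strip().partition(" ")
        return Request(self.OPS[code], arg or ".")
    def format_listing(self, names: List[str]) -> str:
        return ",".join(names)

def handle(protocol: Protocol, line: str, storage: dict) -> str:
    """Central control: one code path regardless of the wire protocol."""
    request = protocol.parse(line)
    if request.op == "list":
        return protocol.format_listing(sorted(storage))
    raise ValueError(f"unsupported op in this sketch: {request.op}")

files = {"a.dat": b"", "b.dat": b""}
print(repr(handle(ToyFtp(), "LIST", files)))
print(repr(handle(ToyNumeric(), "5", files)))
```

Only `parse` and `format_listing` differ per protocol. The listing logic in `handle` is written once.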
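Protocol layer example, directory list request • [Figure: an FTP request (“31: LIST”) and a NeST speak request (“5”) each pass through the protocol layer and reach central control as a directory list request; the storage layer returns a linked list, which the protocol layer returns to each client in its own format (“ftp, ftp, ftp” or “nest, nest”)]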
Concurrency architecture • Three difficult goals • Low latency • High bandwidth • Multiple simultaneous clients • No single portable solution • Provide multiple models to suit a range of different platforms • Multi-threaded • Multi-process • Event driven
Concurrency architecture • Central control creates transfer object • Socket descriptor from the protocol layer • File descriptor from the storage layer • Transfer object passed to concurrency architecture • Event driven • Multi-process • Multi-threaded
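A minimal sketch of a transfer object handed to interchangeable concurrency models. In-memory buffers stand in for the socket and file descriptors, and all names are assumptions. The single-threaded model is a simplified round-robin loop; a real event-driven server would wait on descriptor readiness (for example with `select`). The multi-process model is omitted because in-memory buffers cannot be shared across processes.

```python
import io
import threading
from dataclasses import dataclass
from typing import BinaryIO, List

CHUNK = 4096  # bytes copied per step; an arbitrary choice

@dataclass
class Transfer:
    """Pairs a data source with a sink (stand-ins for the two descriptors)."""
    src: BinaryIO
    dst: BinaryIO

    def step(self) -> bool:
        """Copy one chunk; return False once the source is exhausted."""
        data = self.src.read(CHUNK)
        if not data:
            return False
        self.dst.write(data)
        return True

def run_threaded(transfers: List[Transfer]) -> None:
    """One thread per transfer; each runs its transfer to completion."""
    def drain(t: Transfer) -> None:
        while t.step():
            pass
    threads = [threading.Thread(target=drain, args=(t,)) for t in transfers]
    for th in threads:
        th.start()
    for th in threads:
        th.join()

def run_round_robin(transfers: List[Transfer]) -> None:
    """Single thread interleaving one chunk per transfer per pass."""
    pending = list(transfers)
    while pending:
        pending = [t for t in pending if t.step()]

def make_batch(n: int) -> List[Transfer]:
    """Build n transfers, each with 10,000 bytes of distinct payload."""
    return [Transfer(io.BytesIO(bytes([i]) * 10_000), io.BytesIO())
            for i in range(n)]

for model in (run_threaded, run_round_robin):
    batch = make_batch(3)
    model(batch)
    assert all(t.dst.getvalue() == t.src.getvalue() for t in batch)
```

Because both models accept the same `Transfer` objects, the choice of model can be made per platform without touching the protocol or storage code.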
Storage layer • Three needed areas of flexibility • File system interfaces • Example: read()/write() or mmap() • Abstract storage models • RAID, JBOD, etc. • User account administration • Creation and removal • Quotas and guarantees for users and groups
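The read()/write() versus mmap() choice can be hidden behind one interface. The sketch below uses Python's standard `mmap` module; the backend classes and the `pick_backend` hook are hypothetical names, not NeST's.

```python
import mmap
import os
import tempfile

class ReadBackend:
    """Fetch bytes with seek + read on an ordinary file object."""
    def read_range(self, path: str, offset: int, length: int) -> bytes:
        with open(path, "rb") as f:
            f.seek(offset)
            return f.read(length)

class MmapBackend:
    """Fetch bytes by mapping the file into memory and slicing the mapping."""
    def read_range(self, path: str, offset: int, length: int) -> bytes:
        with open(path, "rb") as f:
            # Length 0 maps the whole file; an empty file cannot be mapped.
            with mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) as m:
                return m[offset:offset + length]

def pick_backend(prefer_mmap: bool):
    """Hypothetical configuration hook: choose an implementation per host."""
    return MmapBackend() if prefer_mmap else ReadBackend()

with tempfile.TemporaryDirectory() as d:
    path = os.path.join(d, "blob")
    with open(path, "wb") as f:
        f.write(b"0123456789")
    for prefer in (False, True):
        # Both backends return the same bytes for the same range.
        assert pick_backend(prefer).read_range(path, 2, 4) == b"2345"
```

Callers see only `read_range`, so automated configuration could select whichever backend performs better on a given host.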
Outline • Introduction • Building flexible storage modules • Motivations for flexible storage appliances • Conclusion and current status
Clients have different needs • Communication protocols • Replacement costs • Data semantics • Security and authentication
Communication protocols • The Esperanto problem • Too many protocols to implement them all • Too many clients use proprietary protocols • Storage must allow pluggable protocols.
Replacement costs • Infinite cost to replace first class data. • Variable cost to replace cached data depending on size and distance. • Variable cost to replace job output files depending on computation cost. • [Figure: first class data alongside cheap cached files] • Cost aware storage can effectively increase its own capacity.
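One way cost-aware storage could reclaim space is to evict the files that are cheapest to replace and never touch first class data. The sketch below is an illustrative policy under assumed names and cost units; it is not described in the slides. It assumes every file has a size greater than zero.

```python
import math
from dataclasses import dataclass
from typing import List

@dataclass
class StoredFile:
    name: str
    size: int                 # bytes, assumed > 0
    replacement_cost: float   # math.inf marks first class data

def choose_evictions(files: List[StoredFile], bytes_needed: int) -> List[StoredFile]:
    """Pick files to evict, lowest replacement cost per byte first.

    First class data (infinite cost) is never selected. Returns an empty
    list if replaceable files alone cannot free enough space.
    """
    candidates = [f for f in files if math.isfinite(f.replacement_cost)]
    candidates.sort(key=lambda f: f.replacement_cost / f.size)
    chosen, freed = [], 0
    for f in candidates:
        if freed >= bytes_needed:
            break
        chosen.append(f)
        freed += f.size
    return chosen if freed >= bytes_needed else []

files = [
    StoredFile("thesis.tex", 100, math.inf),   # first class data
    StoredFile("cached.iso", 700, 7.0),        # cheap to fetch again
    StoredFile("sim-output", 300, 900.0),      # expensive to recompute
]
print([f.name for f in choose_evictions(files, 500)])
```

Here the cached file alone covers the 500 bytes requested, so the expensive job output survives.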
Data semantics • Must stored objects be protected from read and write dependencies? • Is transaction support necessary? • Acceptable replies to storage requests.
Data semantics, example • Problem • PFS on top of FTP fakes open • read may then return file not found • Solution • Mechanisms are needed to support flexible semantics independent of the transfer protocol. • Divorce semantics from the protocol.
Security and authentication • Ownership • Privacy • Encryption • Authentication • Access rights
Who, when, how and how much? • [Figure: spectrum of access policies from promiscuous to abstinent] • Who is allowed to use the storage? • Promiscuity and monogamy are easy • Polygamy is also easy
Do I know you? • Problem • Migrant grid users may need temporary, preferential storage access • Solution • Provide mechanisms to • advertise available storage • create self-destructing user accounts • Matchmake applications with storage opportunities.
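A self-destructing account can be sketched as an account with a deadline that is removed on the first access check after it expires. The `AccountTable` class and its methods are hypothetical; a fake clock makes the example deterministic.

```python
import time
from typing import Callable, Dict

class AccountTable:
    """Hypothetical table of temporary accounts that expire on their own."""

    def __init__(self, clock: Callable[[], float] = time.monotonic) -> None:
        self._clock = clock
        self._expires: Dict[str, float] = {}   # user -> deadline

    def create(self, user: str, lifetime_s: float) -> None:
        """Grant access for lifetime_s seconds from now."""
        self._expires[user] = self._clock() + lifetime_s

    def is_valid(self, user: str) -> bool:
        """Check access, removing the account once its lifetime has passed."""
        deadline = self._expires.get(user)
        if deadline is None:
            return False
        if self._clock() >= deadline:
            del self._expires[user]   # the account "self-destructs"
            return False
        return True

# Fake clock: a one-element list the example advances by hand.
now = [0.0]
accounts = AccountTable(clock=lambda: now[0])
accounts.create("visitor", lifetime_s=3600)
assert accounts.is_valid("visitor")
now[0] = 3601.0
assert not accounts.is_valid("visitor")
```

A fuller version would also reclaim the files the account owned when it expires; that step is left out here.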
Outline • Introduction • Building flexible storage modules • Motivations for flexible storage appliances • Conclusion • Current status • Future work • Concluding remarks
Current status • Concurrency architectures are done • Gets, puts, reads and writes perform well • Virtual protocol class interface is built • NeST speak is fully implemented • GridFTP coming soon • Simple first implementation of storage reservations and remote quota management is done • Venkateshwaran Venkataramani
Future work • Discovery process of client storage requirements • Quality of service guarantees for bandwidth and storage • Support for transient and opportunistic users
Concluding remarks • Return storage to the commodity curve by creating software-only storage appliances • Allow greater storage flexibility for a wide range of application needs