High Performance Storage Service Virtualization

Scott Baker, University of Arizona

What is Virtualization? Consider an existing client/server scenario. A virtualizer is inserted between the client and the server to provide a better service.

Why Virtualization? • Why not create a newer/better service? • Requires modified clients/servers • Lengthy standardization process • Slow to integrate into existing infrastructure • Virtualization offers • Unmodified clients/servers • No standardization requirements • Rapid integration

Types of Virtualizations • Mutation • Change a service into something different • Aggregation • 2+ servers → 1 big server • Replication • 2+ servers → 1 more reliable server • Fortification • Vulnerable server → more secure server

Mutation (Gecko) • Web uses HTTP protocol • Traditional programs use file system semantics • open, close, read, write, etc • Inconvenient to modify existing body of applications to use the web • WWW8, SP&E papers

Aggregation (Mirage) • Combine 2+ NFS file systems to create one big file system • Clients are unaware multiple servers exist • IEEE LCN paper

Replication (Mirage) • Unmodified primary server • Specialized backup server • Asymmetric design • Commercial primary, commodity backup • Logging

Fortification (Mirage) • Fortify a server against DoS attacks • Several ideas • Prevent faulty requests from reaching servers • Use scheduling to ensure fairness • Push authentication to the border of the network • Currently a work in progress

Mirage in the Network • Mirage could be located in: • Client (patch OS, or user-mode daemon) • Server (patch app, or user-mode daemon) • Router (unmodified clients, servers) • Mirage is a router, not an application • Rewrite packets on the fly & forward • Benefits: stateless, low overhead

NFS Basics • NFS uses “handles” to identify objects • Lookup (par_handle, “name”) → chi_handle • Read (chi_handle) → data • Handles are opaque to clients • Handles are 32 bytes (NFSv2) or up to 64 bytes (NFSv3)
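The handle-based interface above can be illustrated with a toy in-memory server. This is only a sketch: `ToyNfsServer` and its methods are hypothetical names, the handles are random bytes, and nothing here speaks the real NFS wire protocol.

```python
import os


class ToyNfsServer:
    """Toy in-memory server illustrating opaque handles (not real NFS)."""

    HANDLE_SIZE = 32  # NFSv2 handle size in bytes

    def __init__(self):
        self._children = {}  # (parent handle, name) -> child handle
        self._data = {}      # handle -> file contents
        self.root = self._new_handle()

    def _new_handle(self):
        # Clients never interpret a handle, so random bytes suffice here.
        return os.urandom(self.HANDLE_SIZE)

    def create(self, par_handle, name, data=b""):
        chi_handle = self._new_handle()
        self._children[(par_handle, name)] = chi_handle
        self._data[chi_handle] = data
        return chi_handle

    def lookup(self, par_handle, name):
        # Lookup(par_handle, "name") -> chi_handle
        return self._children[(par_handle, name)]

    def read(self, chi_handle):
        # Read(chi_handle) -> data
        return self._data[chi_handle]


server = ToyNfsServer()
server.create(server.root, "notes.txt", b"hello")
handle = server.lookup(server.root, "notes.txt")
contents = server.read(handle)
```

The client only stores and returns `handle`; it never looks inside it, which is what later allows a router to substitute handles of its own.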

Aggregation Issues • Make 2+ servers look like 1 big server • Key problem • 2 servers may generate the same handle for different objects • Client will be confused • Solution • Virtual handles
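The collision problem can be reproduced in a few lines. The handle layout below (file system id plus inode number, zero-padded) is an assumption made for illustration; real servers choose their own layouts.

```python
import struct

HANDLE_SIZE = 32  # NFSv2 handle size in bytes


def make_physical_handle(fs_id, inode):
    # Hypothetical server-side layout: fs id and inode number, zero-padded.
    return struct.pack(">II", fs_id, inode).ljust(HANDLE_SIZE, b"\0")


# Two independent servers, each with its own object at (fs 1, inode 2).
server_a = {make_physical_handle(1, 2): b"report on server A"}
server_b = {make_physical_handle(1, 2): b"photo on server B"}

# The same handle bytes name different objects on the two servers.
colliding = set(server_a) & set(server_b)

# Keying by (server id, physical handle) separates the objects again.
aggregated = {}
for sid, server in enumerate((server_a, server_b)):
    for pfh, data in server.items():
        aggregated[(sid, pfh)] = data
```

A client cannot carry the `(sid, pfh)` pair itself, because a handle has a fixed maximum size; this is why the router hands out its own virtual handles instead.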

Virtual and Physical Handles • Virtual handles exist between clients and Mirage • Physical handles exist between Mirage and servers

VFH Contents • Mirage decides what to put in VFH • VFH is composed of • PIN (Physical Inode Number) • PFS (Physical File System Number) • SID (Server ID) • VIN (Virtual Inode Number) • HVC (Handle Verification Checksum) • MCH (Mount Checksum) • (PIN, PFS, SID) uniquely identifies a file
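One way to picture the VFH is as a fixed-size packed record. The slide names the fields but not their widths, order, or checksum algorithm, so everything below is an assumption: 32-bit fields, big-endian order, CRC-32 for the HVC, and zero padding up to the 32-byte NFSv2 handle size.

```python
import struct
import zlib

VFH_SIZE = 32                      # NFSv2 handle size in bytes
_BODY = struct.Struct(">IIIII")    # PIN, PFS, SID, VIN, MCH (assumed 32-bit each)
_HVC = struct.Struct(">I")         # assumed: CRC-32 over the body


def pack_vfh(pin, pfs, sid, vin, mch):
    """Build a 32-byte virtual handle from its fields."""
    body = _BODY.pack(pin, pfs, sid, vin, mch)
    hvc = zlib.crc32(body)
    return (body + _HVC.pack(hvc)).ljust(VFH_SIZE, b"\0")


def unpack_vfh(vfh):
    """Split a virtual handle into fields, rejecting corrupted handles."""
    if len(vfh) != VFH_SIZE:
        raise ValueError("unexpected handle size")
    body = vfh[:_BODY.size]
    (hvc,) = _HVC.unpack_from(vfh, _BODY.size)
    if zlib.crc32(body) != hvc:
        raise ValueError("handle verification checksum mismatch")
    pin, pfs, sid, vin, mch = _BODY.unpack(body)
    return {"pin": pin, "pfs": pfs, "sid": sid, "vin": vin, "mch": mch}


vfh = pack_vfh(pin=1234, pfs=7, sid=2, vin=99, mch=0xBEEF)
fields = unpack_vfh(vfh)
```

Because (PIN, PFS, SID) travel inside the handle, the router can learn which server and which file a handle refers to even when its own tables are empty.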

Data Structures • Transaction Table (TT) • Entry created during request • Entry deleted during reply • Remembers NFS Proc Number, Client ID • Handle Table (HT) • VFH ↔ PFH mappings • Tables are Soft State
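The two tables can be sketched as small dictionary-backed classes. The class and method names are hypothetical, and keying the transaction table by the RPC transaction id (`xid`) is an assumption; the slide states only what each entry remembers.

```python
class TransactionTable:
    """Per-request state: created on request, deleted on reply."""

    def __init__(self):
        self._entries = {}  # xid -> (client id, NFS procedure number)

    def on_request(self, xid, client_id, proc_number):
        self._entries[xid] = (client_id, proc_number)

    def on_reply(self, xid):
        # Returns None if the entry was lost, e.g. after a router restart.
        return self._entries.pop(xid, None)

    def __len__(self):
        return len(self._entries)


class HandleTable:
    """Two-way mapping between virtual and physical handles."""

    def __init__(self):
        self._v2p = {}  # VFH -> (SID, PFH)
        self._p2v = {}  # (SID, PFH) -> VFH

    def add(self, vfh, sid, pfh):
        self._v2p[vfh] = (sid, pfh)
        self._p2v[(sid, pfh)] = vfh

    def to_physical(self, vfh):
        return self._v2p.get(vfh)

    def to_virtual(self, sid, pfh):
        return self._p2v.get((sid, pfh))


tt = TransactionTable()
tt.on_request(xid=42, client_id="10.0.0.5", proc_number=4)
entry = tt.on_reply(xid=42)

ht = HandleTable()
ht.add(b"vfh-1", 0, b"pfh-1")
```

Both lookups return `None` on a miss instead of raising, which fits soft state: a missing entry is a condition to recover from, not an error.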

Request / Reply Processing • On requests: • Look up VFH in HT → yields PFH • Rewrite VFH in request with PFH • Forward to server (SID tells which one) • On replies: • Look up PFH in HT → yields VFH • Create new mapping if necessary • Rewrite PFH in reply with VFH • Forward to client
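The request and reply paths can be sketched as two rewriting functions over dictionary "packets". This is a simplified model under stated assumptions: `MirageSketch` is a hypothetical name, messages carry a single handle, and the virtual handle is a padded counter, not the VFH layout described earlier.

```python
import itertools

HANDLE_SIZE = 32  # NFSv2 handle size in bytes


class MirageSketch:
    def __init__(self):
        self.v2p = {}  # VFH -> (SID, PFH)
        self.p2v = {}  # (SID, PFH) -> VFH
        self._counter = itertools.count(1)

    def _new_vfh(self):
        # Placeholder virtual handle: a counter padded to handle size.
        return next(self._counter).to_bytes(4, "big").ljust(HANDLE_SIZE, b"\0")

    def on_request(self, request):
        # Look up VFH -> PFH, rewrite, and report which server to forward to.
        sid, pfh = self.v2p[request["handle"]]
        return sid, dict(request, handle=pfh)

    def on_reply(self, sid, reply):
        # Look up PFH -> VFH, creating a new mapping if necessary.
        key = (sid, reply["handle"])
        vfh = self.p2v.get(key)
        if vfh is None:
            vfh = self._new_vfh()
            self.p2v[key] = vfh
            self.v2p[vfh] = key
        return dict(reply, handle=vfh)


router = MirageSketch()
same_pfh = b"\x01" * HANDLE_SIZE  # both servers return identical handle bytes
reply_a = router.on_reply(0, {"proc": "LOOKUP", "handle": same_pfh})
reply_b = router.on_reply(1, {"proc": "LOOKUP", "handle": same_pfh})
sid, forwarded = router.on_request({"proc": "READ", "handle": reply_b["handle"]})
```

In the usage lines, two servers return the same physical handle, the client sees two different virtual handles, and a later read on the second one is routed back to server 1 with the original physical handle restored.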

Router Failure / Recovery • If router fails, TT and HT are lost • Clients will retry any ops in progress • TT state regenerated automatically • Recover HT state from fields in VFH • Extract (PIN, PFS, SID) • Search servers for (PIN, PFS, SID) to get PFH • Similar to BASE • Periodically checkpoint HT to servers
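Handle-table recovery can be sketched as a search keyed by the fields carried in the VFH. The slide does not say how servers are searched, so the per-server lists of `(PIN, PFS, PFH)` tuples below are a stand-in for whatever namespace walk the router performs, and `recover_mapping` is a hypothetical helper.

```python
def recover_mapping(vfh_fields, servers):
    """Find the (SID, PFH) for a VFH whose handle-table entry was lost.

    vfh_fields: dict with the "pin", "pfs" and "sid" values extracted from the VFH.
    servers: dict mapping SID -> iterable of (PIN, PFS, PFH) tuples, standing in
             for a search of that server's file system.
    """
    sid = vfh_fields["sid"]
    wanted = (vfh_fields["pin"], vfh_fields["pfs"])
    for pin, pfs, pfh in servers.get(sid, []):
        if (pin, pfs) == wanted:
            return sid, pfh
    return None  # object no longer exists on that server


servers = {
    0: [(10, 1, b"pfh-a10"), (11, 1, b"pfh-a11")],
    1: [(10, 1, b"pfh-b10")],
}

handle_table = {}  # empty after a router restart
vfh = b"vfh-from-client"  # placeholder for the handle bytes the client sent
vfh_fields = {"pin": 10, "pfs": 1, "sid": 1}

found = recover_mapping(vfh_fields, servers)
if found is not None:
    handle_table[vfh] = found
```

Only the server named by the SID is searched, so the same (PIN, PFS) pair on another server is never mistaken for the lost object.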

Prototypes • User Mode Process • Linux operating system / commodity HW • Proof of concept • Demonstrates aggregation & replication • UDP Sockets • IXP2400 Network Processor • High performance • Possible production system • Subject of ongoing/future work

IXP2400 Overview • 1 StrongArm CPU (general purpose, Linux OS) • 8 Microengine CPUs (packet processing)

Microengine CPU Properties • Lots of registers • 256 GPR, 128 NN, 512 memory-i/o • Special packet-processing instruction set • Multithreading support • 8 threads per microengine • Zero context-switch overhead • Asynchronous memory I/O • Fast-path processing

Memory • DRAM: 64 MB / 300 cycles • Direct I/O to and from network interface • SRAM: 8 MB / 150 cycles • Supports atomic operations • Built-in “queues” with atomic dequeue, get, put • Scratchpad: 16 KB / 60 cycles • Supports atomic operations • Built-in “rings” with atomic get/put • Local per-microengine: 2560 B / 3 cycles

IXP Issues • Divide Mirage functionality across Microengines • Control interface between StrongArm and Microengines • Optimize Microengine code

Benchmark Configuration • Two IXP boards: Benchmark and Mirage • Offer a target throughput and measure the throughput actually achieved (note: transmit, receive, classifier microengines not shown)

Loopback Configuration • Simulates a router without Mirage

IXP Performance

Analysis • User-mode Mirage • 40,000 packets/second • Read/write bandwidth at 320 Mbps • IXP Mirage • 290,000 packets/second • Read/write bandwidth exceeds gigabit line speed (in theory, approx. 2.4 Gbps)
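The bandwidth figures can be related to the packet rates with a short calculation. The slides do not state the payload size per packet; the 1 KB value below is an assumption chosen because it roughly reproduces both quoted numbers.

```python
PAYLOAD_BYTES = 1024  # assumed read/write payload per packet; not given in the slides


def payload_mbps(packets_per_second, payload_bytes=PAYLOAD_BYTES):
    """Payload bandwidth in megabits per second for a given packet rate."""
    return packets_per_second * payload_bytes * 8 / 1e6


user_mode_mbps = payload_mbps(40_000)   # about 328 Mbps
ixp_mbps = payload_mbps(290_000)        # about 2376 Mbps, i.e. roughly 2.4 Gbps
speedup = 290_000 / 40_000              # 7.25x more packets per second
```

With a 1000-byte payload the same formula gives exactly 320 Mbps and 2.32 Gbps, so either reading is consistent with the rounded figures on the slide.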

Status • Completed • User-mode Mutation (Gecko), Aggregation, Replication, Fortification • IXP Aggregation • To-do • IXP performance tuning • Finish IXP benchmarks • IXP Replication? • IXP Gecko? • SOSP Paper

Publications • Scott Baker, John Hartman, “The Gecko NFS Web Proxy,” Proceedings of the Eighth International Conference on World Wide Web, 1999. • Scott Baker, Bongki Moon, “Distributed Cooperative Web Servers,” Proceedings of the Eighth International Conference on World Wide Web, 1999. • Scott Baker, John Hartman, “The Design and Implementation of the Gecko NFS Web Proxy,” Software: Practice and Experience, June 2001. • Scott Baker, John Hartman, Ian Murdock, “Swarm: Agent-Based Storage,” The 2004 International Conference on Software Engineering Research and Practice, Las Vegas, NV, June 2004. • Scott Baker, John Hartman, “The Mirage NFS Router,” The 29th IEEE Conference on Local Computer Networks, Tampa, FL, November 2004.
