Virtual Machine Disk Images Introspection and a bit more... Vasily Tarasov (SBU) Dean Hildebrand (IBM Almaden) Renu Tewari (IBM Almaden) Erez Zadok (SBU) File system and Storage Lab (FSL)
Outline • How it all started • The idea of introspection • A couple of results from a first prototype • Future work • Benchmarking, Filebench
Two important technologies • Virtual Machines (VMs) - Consolidation of computational resources - Flexible, efficient and scalable - Hardware support - Multiple solutions: VMware, KVM, Xen, ... - The cloud way of delivering services • Network Attached Storage (NAS) - Storage consolidation - Scalable, manageable and efficient - NFS/CIFS available on the majority of operating systems - NAS sales jumped from $540M in 1998 to $5.1B in 2003 - IBM SONAS
Two technologies… [Figure: VM and NAS]
…and they grow [Figure: VM and NAS]
How do VM & NAS work together? Can we make them work better? [Figure: VM and IBM SONAS]
Typical Setup [Diagram: three virtual machine hosts (VMware, KVM, Xen, ...), each running two VMs (VM 1-1 through VM 3-2) and an NFS client; two NFS servers (NFS SERVER1, NFS SERVER2); GPFS nodes 1-4, each with two attached storage devices (Storage 1-1 through Storage 4-2)]
Datapath Decomposed. Legend: CA = Caching, RA = Read-Ahead, RM = Request Mangling and Scheduling. [Diagram: the I/O path from top to bottom, with layers annotated CA, RA and/or RM. VM Guest: Applications, Virtual File System, On-Disk File System, Block Layer, Controller Driver. Host: Controller Emulator, NFS Client. NETWORK. NAS: NFS Server, Virtual File System, On-Disk File System, Block Layer, Controller Driver]
Collecting traces: setup • Rand/Seq Read • Rand/Seq Write • Various I/O sizes • Multi-file workloads • Multi-process workloads • Meta-data intensive [Figure: VMware ESX4 host connected to an NFS server over a 1Gbps link; trace points: within-VM trace, VSCSI layer trace, network trace, block layer trace]
Collecting traces: setup. User-Space Workload • Rand/Seq Read • Rand/Seq Write • Various I/O sizes • Multi-file workloads • Multi-process workloads • Meta-data intensive [Diagram: the VM Guest / Host / NETWORK / NAS layer stack with trace points marked: VSCSI layer trace, network trace, block layer trace]
Some interesting results • I/O sizes change [Diagram: the VM Guest / Host / NETWORK / NAS layer stack labeled with observed I/O sizes: 4KB, 32KB, 128KB, 256KB, 1MB, 4MB] WIOV’11 - Revisiting the Storage Stack in Virtualized NAS Environments
Meta-data Ops vs. Data Ops • Meta-data ops: • Update attributes • List directories • Creation/deletion • Lookup • Access permissions • Link/Symlink operations • Non-VM case: # stat /foo/bar -> sys_stat(/foo/bar) -> NFS_GETATTR(foobar_fh) (a meta-data op) • VM case: # stat /foo/bar -> sys_stat(/foo/bar) -> NFS_READ(dskimg_fh), NFS_WRITE(dskimg_fh) (data ops)
Coming up with an idea [Figure: a disk image file (Ext, NTFS, UFS, ...) with a region marked by offset and size] The NFS server receives READ(dskfh, offset, len) and asks: what is located in this region? • READ from: • Inode • Directory entry • Data of a specific file • ... Do intelligent things!
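The question "what is located in this region?" can be sketched as a lookup in a layout map of the disk image. This is a minimal illustration only, not the prototype's implementation, which the slides do not describe. The names `Region` and `ImageMap`, the example offsets, and the assumption that regions are non-overlapping and already extracted by parsing the guest file system are all hypothetical.

```python
from bisect import bisect_right
from dataclasses import dataclass


@dataclass(frozen=True)
class Region:
    start: int   # first byte offset of the region inside the disk image
    length: int  # region size in bytes
    kind: str    # e.g. "inode", "dirent", "data"
    owner: str   # guest path the region belongs to ("" if not applicable)


class ImageMap:
    """Hypothetical layout map of one disk image (regions must not overlap)."""

    def __init__(self, regions):
        self.regions = sorted(regions, key=lambda r: r.start)
        self.starts = [r.start for r in self.regions]

    def classify(self, offset, length):
        """Return every known region overlapping READ(dskfh, offset, length)."""
        end = offset + length
        # Last region starting at or before `offset`; it may still overlap.
        i = max(bisect_right(self.starts, offset) - 1, 0)
        hits = []
        while i < len(self.regions) and self.regions[i].start < end:
            region = self.regions[i]
            if region.start + region.length > offset:
                hits.append(region)
            i += 1
        return hits


# Made-up layout: an inode block, a directory block, and file data.
layout = ImageMap([
    Region(4096, 4096, "inode", ""),
    Region(8192, 4096, "dirent", "/foo"),
    Region(12288, 8192, "data", "/foo/bar"),
])

dirent_read = [r.kind for r in layout.classify(8192, 4096)]
spanning_read = [r.kind for r in layout.classify(10000, 4096)]
unknown_read = layout.classify(0, 512)
```

With such a map, a server could choose a policy per request, for example keeping inode and directory regions cached longer than file data. That policy choice is likewise an assumption, not something the slides specify.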
Prototype Results: find • 80% improvement
Prototype Results: Startup • 2.6x faster (130 sec vs. 50 sec)
Future work • Solid implementation • More efficient cache policies • Optimizations on the write path • Analysis of more complex workloads
Virtual Machine Disk Images Introspection and a bit more...
A Recent Study Concluded that… • Much of what researchers conclude in their studies is misleading, exaggerated, or flat-out wrong • A new claim about a research finding is more likely to be false than true • Researchers tend to publish positive results more often than negative findings • Chances of being accepted to a conference are higher if the results are “more exciting” 2005-2008 study by J. Ioannidis [Figure: answer choices A-E over the fields Sociology, Medicine, Computer Science, Biology, Physics] HotOS’11: Benchmarking FS Benchmarking: It is Rocket Science
Filebench • Originally created by Sun Microsystems (RIP) • Maintained by FSL • Used in many papers • Flexible: Workload Model Language (WML) • Portable: Linux, FreeBSD, Solaris, MacOS, Windows*
Filebench WML

    define fileset name=myfileset,size=16kb,entries=1000
    define process name=reader,instances=1 {
      thread name=readerthread,memsize=10m,instances=10 {
        flowop read name=myread,filesetname=myfileset,iosize=2kb
      }
    }
Filebench for Cloud Services • flowops: • Reads • Writes • Creates • Deletes • +20 more sophisticated [Figure: flowops mapped to POSIX, NFS RPC, AFS RPC, Cloud]
Filebench for Virtualized Environments

    define hypervisor name=hpv,type=esx3.1,instances=1 {
      define process name=reader,instances=1 {
        thread name=readerthread,memsize=10m,instances=10 {
          flowop read name=myread1,filesetname=myfileset,…
        }
      }
      define vm name=hpv,type=windows,instances=5 {
      }
    }
Virtual Machine Disk Images Introspection and a bit more... Vasily Tarasov (SBU) Dean Hildebrand (IBM Almaden) Renu Tewari (IBM Almaden) Erez Zadok (SBU) Thank you! File system and Storage Lab (FSL)