Differential Quality of Service for Online Storage
Chuck Silvers, VxFS Engineer Extraordinaire
Paul Massiglia, Technical Director without Portfolio
Session 192

Premises
• There is an online storage cost hierarchy
[Diagram labels. Volume layouts: Mirror (3x); Mirror (2x); RAID; Concatenated, Striped, Simple. Hardware: SATA JBOD; Fibre Channel JBOD; SATA array; Mid-range array; Enterprise array]

Premises
• There is an online data importance hierarchy
[Diagram labels. Performance need: High performance is critical; Performance has measurable impact; Performance secondary to cost; Occasional access. Importance: Reproducible; Important (individual); Important (project); Important (department); Enterprise-critical]

Conclusions
• “Better” storage costs more
• “More important” files have higher value
Duuuh . . . Shouldn’t we be trying to match the two up?

What’s The Big Deal?
As files’ importance changes, you move them to the “right” type of storage
Ten million of them???
If you could analyze and move ten files a second, it would take about 11.6 days to process 10,000,000 files… and then you could start over again

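The estimate on this slide is plain rate arithmetic, and a short sketch makes it easy to rerun with other file counts or rates. The function name `sweep_days` is an illustrative assumption, not something from the presentation.

```python
SECONDS_PER_DAY = 24 * 60 * 60


def sweep_days(file_count: int, files_per_second: float) -> float:
    """Days needed for one full pass over file_count files at a fixed rate."""
    return file_count / files_per_second / SECONDS_PER_DAY


# The slide's scenario: ten million files, ten files analyzed and moved per second.
days = sweep_days(10_000_000, 10)
print(f"{days:.2f} days per pass")
```

Doubling the file count or halving the rate doubles the time per pass, which is why a per-file manual or scripted approach does not scale.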
It’s Even Worse Than That…
• One storage device = one file system
• If you move files from one file system to another, either:
    • Files have to be restored to their original locations to be used… e.g., via HSM
  or:
    • Backup procedures have to be adjusted
    • Applications have to be redirected
    • Run scripts have to be modified
• Either way…
    • Administrative cost
    • Resources
    • Potential for errors and downtime

But Still, Tiered Online Storage Is Pretty Compelling…
20 Terabytes
• 100% premium storage @ $25/GB: $500,000
• 25% premium storage @ $25/GB, 75% low-cost storage @ $10/GB: $275,000

. . . Until You Consider Administrative Cost
20 Terabytes
• 100% premium storage @ $25/GB: $500,000
• 25% premium storage @ $25/GB, 75% low-cost storage @ $10/GB: $275,000
How many administrators does it take to consume $225,000 in hardware savings?

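The three dollar figures on these two slides can be reproduced with a few lines of arithmetic. The sketch below assumes decimal terabytes (1 TB = 1,000 GB), which is what the slide totals imply; the function name `storage_cost` is illustrative.

```python
GB_PER_TB = 1000  # assumption: decimal terabytes, consistent with the slide totals


def storage_cost(total_tb, tiers):
    """Hardware cost in dollars.

    tiers is a list of (fraction_of_capacity, dollars_per_gb) pairs
    whose fractions are expected to add up to 1.
    """
    total_gb = total_tb * GB_PER_TB
    return sum(total_gb * fraction * price for fraction, price in tiers)


all_premium = storage_cost(20, [(1.00, 25)])
tiered = storage_cost(20, [(0.25, 25), (0.75, 10)])
savings = all_premium - tiered

print(f"all premium: ${all_premium:,.0f}")
print(f"tiered:      ${tiered:,.0f}")
print(f"savings:     ${savings:,.0f}")
```

The savings figure is the budget the slide's closing question refers to: any added administrative cost of running tiered storage has to stay below it for tiering to pay off.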
The Technical Problem
• Basic property of conventional file systems:
    • 1 file system = 1 virtual device
• In other words
    • one “namespace” = one quality of storage service… for all files ever stored in the file system
Conventional file systems are designed for homogeneous storage

The Solution: VxFS Multi-Volume File System
Distributes a single namespace across multiple volumes
[Diagram labels. File system: /one_big_fs. Directories: /sales, /financial, /development, /current, /forecast, /2005, /new_app, /2004, /history. Volumes: Simple (JBOD); 2x mirrored; 3x mirrored; Wide striped; Striped]

More Than Support for Volume Expansion…
• A file system that only supports volume expansion
    • One homogeneous logical block space
    • Expands when its volume grows
    • One class of service
• A multi-volume file system
    • Multiple block spaces (volumes) of different types
    • Places files in specific block spaces
    • More volumes = more service classes

Multi-Volume File System Value Proposition 1: Reduce Average Online Storage Cost
• Suppose 80% of data is seldom-accessed
• 50% savings with SATA RAID-5 for seldom-accessed data

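One way to read these two bullets is that the seldom-accessed 80% of data moves to storage that costs half as much per gigabyte. Under that reading, which is an assumption and not something the slide states outright, the reduction in average storage cost works out as follows; `blended_saving` is an illustrative name.

```python
def blended_saving(cold_fraction: float, cold_discount: float) -> float:
    """Fractional reduction in average cost per GB.

    cold_fraction: share of data that is seldom accessed (0..1)
    cold_discount: per-GB saving on the storage holding that data (0..1)
    Assumes the remaining data stays on storage at the original price.
    """
    return cold_fraction * cold_discount


saving = blended_saving(cold_fraction=0.80, cold_discount=0.50)
print(f"average cost reduction: {saving:.0%}")
```

If the slide instead means a 50% saving overall, the per-GB discount on the low-cost tier would have to be larger than 50%; the same function can be used to explore either reading.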
What Does a Multi-Volume File System Have to Do?
• Allocate new files on the “right” type of storage
    • Automatically
    • Based on “set-and-forget” policies
[Diagram labels. Files: Temporary file; Important file; Critical file; Aged file; A/V file. Volumes: Simple (JBOD); 2x mirrored; 3x mirrored; Wide striped; Striped]

How do people define allocation policies for VxFS multi-volume file systems, Chuck?

New Concept #1: Volume Sets
• A means of grouping volumes
    • Up to 256 volumes of any type
    • All volumes made from disks in a single disk group
    • Individual volumes don’t appear in “/dev/vx/…”
• Example
    # vxvset -g homedg make HomedirSet MirrorVol
    # vxvset -g homedg addvol HomedirSet RAIDVol1
    # vxvset -g homedg list HomedirSet
    VOLUME      INDEX   LENGTH     STATE    CONTEXT
    MirrorVol   0       10240      ACTIVE   -
    RAIDVol1    1       10240000   ACTIVE   -
• Multi-volume file systems are created on volume sets
    # mkfs -F vxfs /dev/vx/rdsk/homedg/HomedirSet
    # mount -F vxfs /dev/vx/dsk/homedg/HomedirSet /home

New Concept #2: Allocation Policies
• Rules to control allocation of storage capacity
• A rule consists of:
    • A name
    • A list of volumes
    • An order in which to attempt allocation
    • Some control flags
• Example
    # fsapadm define -o as-given /home DataPolicy RAIDVol1 RAIDVol2

New Concept #2: Allocation Policies (continued)
• Allocation policies are assigned to objects
    • File system
    • Storage checkpoints
    • Individual files
• A VxFS object may have
    • A data policy
    • A metadata policy
• Example
    # fsapadm assignfs /home DataPolicy MetadataPolicy

Using Allocation Policies
…to separate user data from metadata
    # fsapadm define /home DataPolicy RAIDVol1
    # fsapadm define /home MetadataPolicy MirrorVol
    # fsapadm assignfs /home DataPolicy MetadataPolicy
[Diagram labels. File system: /home. Volumes: RAIDVol1 (DataPolicy); MirrorVol (MetadataPolicy)]

Using Allocation Policies
…to define volume fill order
    # vxvset addvol HomedirSet RAIDVol2
    # fsvoladm add /home RAIDVol2 500m
    # fsapadm define -o as-given /home DataPolicy RAIDVol1 RAIDVol2
  or
    # fsapadm define -o round-robin /home DataPolicy RAIDVol1 RAIDVol2
  or
    # fsapadm define -o least-full /home DataPolicy RAIDVol1 RAIDVol2
[Diagram labels. File system: /home. Volumes: MirrorVol (MetadataPolicy); RAIDVol1 and RAIDVol2 (DataPolicy)]

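The slide names three allocation orders but does not define them. The sketch below is a toy model, not VxFS code, of what the names suggest: `as-given` fills volumes in listed order, `round-robin` rotates between them, and `least-full` is taken here to mean "the volume with the most free space". That last interpretation, the block-count units, and the names `Volume`, `pick_volume`, and `simulate` are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass
class Volume:
    name: str
    size: int       # capacity in blocks (toy units)
    used: int = 0

    @property
    def free(self) -> int:
        return self.size - self.used


def pick_volume(volumes, order, request, last_index=-1):
    """Return the index of the volume to allocate from, or None if none fits."""
    candidates = [i for i, v in enumerate(volumes) if v.free >= request]
    if not candidates:
        return None
    if order == "as-given":
        # First listed volume that still has room.
        return candidates[0]
    if order == "least-full":
        # Assumed meaning: the volume with the most free space.
        return max(candidates, key=lambda i: volumes[i].free)
    if order == "round-robin":
        # Next volume after the one used last, skipping volumes that are full.
        n = len(volumes)
        for step in range(1, n + 1):
            i = (last_index + step) % n
            if i in candidates:
                return i
    raise ValueError(f"unknown order: {order}")


def simulate(order, requests):
    """Place each request and return the list of chosen volume names."""
    # RAIDVol1 starts partly used so the three orders behave differently.
    volumes = [Volume("RAIDVol1", 100, used=40), Volume("RAIDVol2", 100)]
    placed, last = [], -1
    for request in requests:
        i = pick_volume(volumes, order, request, last)
        if i is None:
            placed.append(None)
            continue
        volumes[i].used += request
        last = i
        placed.append(volumes[i].name)
    return placed


for order in ("as-given", "round-robin", "least-full"):
    print(order, simulate(order, [30, 30, 30, 30]))
```

With four 30-block requests, `as-given` exhausts RAIDVol1 before touching RAIDVol2, `round-robin` alternates, and `least-full` favors the emptier RAIDVol2 until the two volumes even out.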
Using Allocation Policies
…to separate user data from the log
    # vxvset addvol HomedirSet LogVol
    # fsvoladm add /home LogVol 50m
    # fsadm -o logdev=LogVol,logsize=48m /home
[Diagram labels. File system: /home. Volumes: MirrorVol (MetadataPolicy); LogVol (Log); RAIDVol1 and RAIDVol2 (DataPolicy)]

Using Allocation Policies
…to direct files to specific volume types
    # fsapadm define /home StreamPolicy RAIDVol1
    # fsapadm define /home TxnPolicy RAIDVol2
    # fsapadm define /home MetadataPolicy MirrorVol
    # fsapadm assignfile -f inherit /home/jpg/ StreamPolicy
    # fsapadm assignfile -f inherit /home/db/ TxnPolicy
[Diagram labels. File system: /home. Volumes: MirrorVol (MetadataPolicy); LogVol (Log); RAIDVol1 (StreamPolicy); RAIDVol2 (TxnPolicy)]

Using Allocation Policies
…to control storage checkpoint locations
    # fsapadm define /home CkptPolicy CkptVol
    # fsapadm define /home MetadataPolicy MirrorVol
    # fsapadm define -o least-full /home DataPolicy RAIDVol1 RAIDVol2
    # fsapadm assignfs /home DataPolicy MetadataPolicy
    # fsapadm assignckpt /home Monday_7PM CkptPolicy CkptPolicy
[Diagram labels. File system: /home. Volumes: MirrorVol (MetadataPolicy); LogVol (Log); CkptVol (CkptPolicy); RAIDVol1 and RAIDVol2 (DataPolicy)]

Multi-Volume File System Value Proposition 2: Improve I/O Performance
…by moving seldom-used data “out of the way”
• Compact often-accessed small files for shorter average seeks
• Place seldom-used files on “commodity” storage
[Diagram labels. File system: /home. Volumes: MirrorVol; LogVol; EnterpriseVol; CheapVol. Policies: MetadataPolicy; ActivePolicy; InactivePolicy]

Multi-Volume File System Value Proposition 2: Improve I/O Performance
…by segregating different types of user I/O
• e.g., database indexes vs. data tables vs. BLOBs
• e.g., A/V streams vs. electronic mail
• e.g.,…
[Diagram labels. File system: /home. Volumes: MirrorVol; LogVol; XfrRateVol; BulkVol; IORateVol. Policies: MetadataPolicy; TblspacePolicy; BLOBPolicy; IndexPolicy]

Multi-Volume File System Value Proposition 2: Improve I/O Performance
…by exploiting different hardware capabilities, e.g.,
• Database index tablespaces:
    • Solid-state disk
• Database data tablespaces:
    • Striped volume with small stride for large records
    • Striped volume with larger stride for small records
[Diagram labels. File system: /home. Volumes: MirrorVol; LogVol; XfrRateVol; IORateVol; SuperFastVol. Policies: MetadataPolicy; LargeRecPolicy; SmallRecPolicy; IndexPolicy]

Multi-Volume File System Value Proposition 3: Make More Frequent Checkpoints Affordable
…by placing them on inexpensive storage
Less costly storage → more frequent checkpoints → better recovery point objectives (RPOs)

What Else Does a Multi-Volume File System Have to Do?
• Move files to the new “right” type of storage when conditions change
• Automatically based on policies
[Diagram labels. File relocation daemon, driven by policies. Example triggers: a file is dormant for 30 days; goes into production; grows to >20MB. Volumes: Simple (JBOD); 2x mirrored; 3x mirrored; Wide striped; Striped]

How do VxFS users define and execute file relocation policies, Chuck?

New Concept #3: File Relocation Policies
• Rules by which administrators specify criteria for relocating files between volumes
• Based on allocation policies
• Relocation criteria
    • Access time
    • Modification time
    • File size
    • Naming pattern
  …and combinations of these

Relocation Policy Implementation: File Relocation Utilities
• fssweep
    • Selects files eligible for relocation based on policies
• fsmove
    • Relocates files selected by fssweep
• fsrpadm
    • Defines file relocation policies

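To make the selection step concrete, here is a minimal sketch of how a sweep might test files against the relocation criteria listed earlier (access age, modification age, size, naming pattern). It is not the fssweep implementation: `RelocationRule`, its field names, and `sweep` are hypothetical, and the sketch only lists candidate paths without moving anything.

```python
import fnmatch
import os
import time
from dataclasses import dataclass
from typing import Iterator, Optional

SECONDS_PER_DAY = 86400


@dataclass
class RelocationRule:
    """Hypothetical rule record; every unset criterion is ignored."""
    pattern: str = "*"
    min_access_age_days: Optional[float] = None
    min_modify_age_days: Optional[float] = None
    min_size_bytes: Optional[int] = None   # file must be larger than this
    max_size_bytes: Optional[int] = None   # file must be smaller than this

    def matches(self, name: str, st: os.stat_result, now: float) -> bool:
        if not fnmatch.fnmatch(name, self.pattern):
            return False
        if (self.min_access_age_days is not None
                and now - st.st_atime < self.min_access_age_days * SECONDS_PER_DAY):
            return False
        if (self.min_modify_age_days is not None
                and now - st.st_mtime < self.min_modify_age_days * SECONDS_PER_DAY):
            return False
        if self.min_size_bytes is not None and st.st_size <= self.min_size_bytes:
            return False
        if self.max_size_bytes is not None and st.st_size >= self.max_size_bytes:
            return False
        return True


def sweep(root: str, rule: RelocationRule, now: Optional[float] = None) -> Iterator[str]:
    """Walk root recursively and yield paths of files that satisfy the rule."""
    now = time.time() if now is None else now
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            if rule.matches(name, os.stat(path), now):
                yield path


if __name__ == "__main__":
    # Example: files neither accessed nor modified in the last 30 days.
    inactive = RelocationRule(min_access_age_days=30, min_modify_age_days=30)
    for candidate in sweep(".", inactive):
        print(candidate)
```

Keeping selection separate from the move mirrors the split the slide describes between a utility that selects files and one that relocates them, so the candidate list can be inspected before anything is moved.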
Applying File Relocation Policies
• Execute fssweep and fsmove
    • Typical: cron job
    • Exceptional: ad hoc “back-end” commands

Using File Relocation: Inactive Data
• Relocate files not accessed in 30 days from other volumes to low-cost storage
• High performance & frequent backup no longer required
    # fsrpadm addpolicy /home \
        relocpol1 RelocateInactiveData \
        from=MainDataVol1,MainDataVol2 \
        to=LowCostVol \
        accage=30 \
        modage=30 \
        pattern=* recursive /home
[Diagram labels. File system: /home. Volumes: MirrorVol; MainDataVol1; MainDataVol2; LowCostVol. Policies: MetadataPolicy; SmlRcrdPolicy; LrgRcrdPolicy; InactivePolicy]

Using File Relocation: Files of Certain Sizes
• Relocate files larger than 10MB but smaller than 1GB to low-cost bulk storage
• Reduce the cost of keeping large files online
    # fsrpadm addpolicy /home \
        relocpol2 RelocateLargeFiles \
        from=MainDataVol1 \
        to=LowCostVol \
        min=10000 \
        max=1000000 \
        pattern=* \
        recursive /home
[Diagram labels. File system: /home. Volumes: MirrorVol; MainDataVol1; LowCostVol. Policies: MetadataPolicy; ActivePolicy; BigFilePolicy]

Using File Relocation: Change in Logical Location
• Relocate files in /production directory to an enterprise array
• High availability and performance
• Files that have been renamed into /production are relocated when this policy is applied (typically ad hoc)
    # fsrpadm addpolicy /home \
        dir=/home/production \
        relocpol2 RelocateProduction \
        from=TestVol \
        to=MainDataVol1 \
        pattern=* \
        recursive /home
[Diagram labels. File system: /home. Volumes: MirrorVol; MainDataVol1; TestVol. Policies: MetadataPolicy; ProductionPolicy; TestDataPolicy]

What other value propositions do multi-volume file systems offer, Paul?

Multi-Volume File System Value Proposition 4: Simpler Online Data Administration
• Fewer objects to manage
    • File systems
    • Administrative jobs
    • Operational procedures
• Less inter-application cross-talk
    • One application’s I/O doesn’t interfere with others
• De-coupling of data needs from storage characteristics
    • File systems larger than the largest available volume
    • Single namespace to administer

Multi-Volume File System Value Proposition 5: Simpler Encapsulation
• Adding raw volumes that contain data to a file system
    • Most important application: converting databases from raw storage to files
• Advantages
    • No need for separate file systems
    • Database I/O has its own physical resources
    • Administrative procedures apply to all data

The Bottom Line: Tiered Storage Is Used Effectively
• Online data “goes where it belongs”… transparently to applications and utilities
• Data relocation problems disappear
    • No time & resource consumption for “un-migration”
    • No loss of file attributes
    • No change in backup procedures
    • No application changes
    • No run script changes
Today, most commercial file systems lag the capabilities of the storage they manage

Multi-Volume VxFS Exploits Tiered Storage
• Volume-aware
• Policy-based automatic data allocation and relocation
• Complete application transparency
[Diagram labels. File system: /VERITAS_FS (one namespace). Data types: Temp; Sales; Finance; Bulk; Streams. Storage types: Simple (JBOD); 2x mirrored; 3x mirrored; Wide striped; Striped]

The savings of tiered storage without the increased management cost