Grid provisioning from cloned golden boot images

Alan G. Yoder, Ph.D., Network Appliance Inc.

Presentation Transcript


  1. Grid provisioning from cloned golden boot images
     Alan G. Yoder, Ph.D.
     Network Appliance Inc.

  2. Outline
     • Types of grids
     • Storage provisioning in various grid types
     • Case study
         • performance
         • stability

  3. Grid types
     • Cycle scavenging
     • Clusters
     • Data center grids

  4. Cycle scavenging grids
     • Widely distributed as a rule
         • campus or department wide
         • global grids
     • Typically for
         • collaborative science
         • resource scavenging
     • Main focus to date of GGF, OGF, Globus, et al.
     • Category includes "grid of grids"

  5. Clusters
     • Grid-like systems
         • good scaleout
         • cluster-wide namespace
     • Especially attractive in HPC settings
     • Many concepts in common with cycle-scavenging systems
         • but proprietary infrastructure
         • no management standards yet

  6. Data Center Grids
     • Focus of this talk
     • Typically fairly homogeneous
         • standard compute node hardware
         • two or three OS possibilities
     • Two variants
         • Nodes have disks
             • Topologically homomorphic to cycle scavenging grids
             • May use cycle scavenging grid technology
         • Nodes are diskless
             • Storage becomes much more important → storage grids

  7. Storage technology adoption curves
     [Figure: market adoption cycles for direct-attached storage, networked storage, storage grids, and a global storage network. Markers: "Enterprise Storage Market Today", "Grid Frameworks Today ?", "Focus of this talk".]

  8. Diskless compute farms
     • Connected to storage grids
     • Boot over iSCSI or FCP
     • OS is provisioned in a boot LUN on a storage array
         • Applications can be provisioned as well
     • Key benefit: nodes can be repurposed at any time from a different boot image
     • Key benefit: smart storage and provisioning technology can use LUN cloning to deliver storage efficiencies through block sharing
     • Key benefit: no rotating rust in compute nodes
         • reduced power and cooling requirements
         • no OS/applications to provision on new nodes

  9. Local fabric technologies
     • Servers boot over iSCSI or FCP SAN
     • Storage server(s) maintain golden image + clones
     • Example products: ShadowImage, FlexClone
     [Diagram: servers attached to storage over an iSCSI or FC SAN]

  10. Global deployment technologies
     • Long-haul replication from central data center to local centers
     • Example products: SnapMirror, TrueCopy
     [Diagram: four iSANs connected over a WAN]

  11. Diskless booting
     • Node shuts down
     • Storage maps desired image to LUN 0 for the zone (FCP) or initiator group (iSCSI) the node is in
     • Node restarts
     • Node boots from LUN 0
         • mounts scratch storage space if also provided
         • starts up grid-enabled application
     • Node proceeds to compute until done or repurposed
     Glossary:
         LU: Logical Unit
         LUN: Logical Unit Number
         Mapping: LUNs :: initiator ports
         Masking: Initiators :: LUNs (“views”)
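The boot sequence above comes down to one lookup: whatever LU the storage has mapped at LUN 0 for a node's initiator group is what the node boots. A minimal Python sketch of that bookkeeping follows. The class and method names are hypothetical, the igroup is assumed to be named after the node, and the paths are borrowed from the example slide; this is a toy model, not the behavior of any real SGME or storage array.

```python
class LunMap:
    """Toy model of per-igroup LUN mapping (hypothetical, for illustration)."""

    def __init__(self):
        self._igroup_of = {}   # initiator name -> igroup name
        self._maps = {}        # igroup name -> {LUN id: LU path}

    def add_initiator(self, igroup, initiator):
        self._igroup_of[initiator] = igroup
        self._maps.setdefault(igroup, {})

    def map_lun(self, igroup, lun_id, path):
        self._maps[igroup][lun_id] = path

    def unmap_lun(self, igroup, lun_id):
        self._maps[igroup].pop(lun_id, None)

    def boot_image(self, initiator):
        """Return the LU path the node would boot from (LUN 0), or None."""
        igroup = self._igroup_of[initiator]
        return self._maps[igroup].get(0)


lun_map = LunMap()
lun_map.add_initiator("gridsrv1", "gridsrv1")  # assumption: igroup named after node
lun_map.map_lun("gridsrv1", 0, "/vol/vol1/geotherm2")
before = lun_map.boot_image("gridsrv1")

# Repurpose: with the node shut down, swap what LUN 0 points at, then restart it.
lun_map.unmap_lun("gridsrv1", 0)
lun_map.map_lun("gridsrv1", 0, "/vol/vol1/mysql_on_linux")
after = lun_map.boot_image("gridsrv1")
print(before, "->", after)
```

The node itself never changes configuration in this model; only the storage-side mapping does, which is why repurposing needs nothing more than a reboot.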

  12. Example
     Compute grid: gridsrv1, gridsrv2, gridsrv3
     Storage grid:
         /vol/vol1/mysql_on_linux, LUN 0, mapped to gridsrv2
         /vol/vol1/geotherm2, LUN 0, mapped to gridsrv1
         /vol/vol1/mysql_on_linux, LUN 0, mapped to gridsrv3

  13. What makes this magic happen?
     Compute grid: gridsrv1, gridsrv2, gridsrv3
     Storage grid:
         /vol/vol1/mysql_on_linux, LUN 0, mapped to gridsrv2
         /vol/vol1/geotherm2, LUN 0, mapped to gridsrv1
         /vol/vol1/mysql_on_linux, LUN 0, mapped to gridsrv3
     SGME

  14. SGME
     • Storage Grid Management Entity
     • Component of overall GME in OGF Reference model
     • GME is the collection of software that assembles the components of a grid into a grid
         • Provisioning, monitoring etc.
         • Many GME products: Condor et al.
     • Current storage grid incarnations are often home-rolled scripts
         • Also Stork, Lustre, qlusters

  15. Provisioning a diskless node
     • Add HBAs to white box if necessary
     • Fiddle with CMOS to boot from SAN
     • For iSCSI:
         • DHCP supplies address, node name
         • SGME provisions igroup for node address
         • SGME creates LU for node
         • SGME maps LU to igroup
     • For FC:
         • zone, mask, map, etc.
     Glossary:
         SGME: Grid Storage Management software
         HBA: Host Bus Adapter
         CMOS: BIOS settings
         DHCP: IP boot management

  16. Provisioning a diskless node
     • Add HBAs to white box if necessary
         • We used QLogic 4052 adapters
     • Fiddle with CMOS to boot from SAN
         • Get your white box vendor to do this
         • Blade server racks generally easily configurable for this as well

  17. Preparing a gold image
     • On a client: this is manual one-time work
     • Install Windows server (e.g.)
     • Set up HBA
         • e.g. QLogic needs iscli.exe and commands in startup.bat:
             C:\iscli.exe -n 0 KeepAliveTO 180 IP_ARP_Redirect on
     • Software initiators must be prevented from paging out
         • HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Control\Session Manager\Memory Management:DisablePagingExecutive => 1
     • Run Microsoft sysprep setup mgr and seal image

  18. Preparing a gold LUN
     • On storage server: manual one-time work
     • Copy the golden image to a new base LUN (over CIFS)
         des-3050-2> lun show
             /vol/vol1/gold/win2k3_hs  10g ...
     • Create a snapshot of the volume with the gold LUN
         des-3050-2> snap create vol1 windows_lun
     • Create an igroup for each initiator
         des-3050-2> igroup create -i -t windows kc65b1 \
             iqn.2000-04.com.qlogic:qmc4052.zj1ksw5c9072.1
     Note: the commands shown (in blue type on the original slide) are NetApp-specific, for purposes of illustration only

  19. Preparing cloned LUNs
     • SGME: for each client
         • Create a qtree
             des-3050-2> qtree create /vol/vol1/iscsi/
         • Create a LUN clone from the gold LUN
             des-3050-2> lun create -b \
                 /vol/vol1/.snapshot/windows_lun/gold/win2k3_hs \
                 /vol/vol1/iscsi/kc65b1
         • Map the LUN to the igroup
             des-3050-2> lun map /vol/vol1/iscsi/kc65b1 kc65b1 0
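Since current storage grid incarnations are often home-rolled scripts, the per-client steps above lend themselves to a small command generator. The Python sketch below only builds command strings that mirror the slide; it does not talk to a storage system. The function name is hypothetical, and two choices are assumptions rather than anything the slides state: each igroup is named after its client (as with kc65b1), and the qtree is created once for the whole batch instead of once per client.

```python
GOLD_SNAPSHOT_LUN = "/vol/vol1/.snapshot/windows_lun/gold/win2k3_hs"


def plan_clones(clients, clone_dir="/vol/vol1/iscsi",
                gold_lun=GOLD_SNAPSHOT_LUN, lun_id=0):
    """Return the storage-console commands to clone and map one LUN per client.

    Command text is copied from the slide; it is not validated against any
    particular storage OS release.
    """
    commands = [f"qtree create {clone_dir}"]  # assumption: issued once per batch
    for client in clients:
        clone_path = f"{clone_dir}/{client}"
        # Clone backed by the snapshot copy of the gold LUN (block sharing).
        commands.append(f"lun create -b {gold_lun} {clone_path}")
        # Assumption: the igroup carries the same name as the client.
        commands.append(f"lun map {clone_path} {client} {lun_id}")
    return commands


plan = plan_clones(["kc65b1"])
for line in plan:
    print(line)
```

Keeping the plan as plain strings makes it easy to review before anything is sent to the console, which matters when a single storage server backs a few hundred boot LUNs.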

  20. Getting clients to switch horses
     • SGME: for each client
         • Notify client to clean up
         • Bring down client
             • remote power strips/blade controllers
         • Remap client LUN on storage
             des-3050-2> lun offline /vol/vol1/iscsi/kc65b1
             des-3050-2> lun unmap /vol/vol1/iscsi/kc65b1 kc65b1 0
             des-3050-2> lun online /vol/vol1/iscsi2/kc65b1
             des-3050-2> lun map /vol/vol1/iscsi2/kc65b1 kc65b1 0
         • Bring up client
             • DHCP
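The ordering of the remap step matters: the old LUN is taken offline and unmapped before the new one is brought online and mapped at the same LUN id. A hedged Python sketch of that ordering follows. The function name is hypothetical, the command strings mirror the slide's text (including its argument order) without being checked against any release, and the client is assumed to be powered down already.

```python
def repurpose_commands(igroup, old_lun, new_lun, lun_id=0):
    """Return the ordered storage commands that swap a client's boot LUN.

    Assumes the client has been notified and brought down beforehand.
    """
    if old_lun == new_lun:
        raise ValueError("old and new LUN paths are identical; nothing to remap")
    return [
        f"lun offline {old_lun}",                  # stop I/O to the old image
        f"lun unmap {old_lun} {igroup} {lun_id}",  # free the LUN id for reuse
        f"lun online {new_lun}",                   # make the new image available
        f"lun map {new_lun} {igroup} {lun_id}",    # node will boot this next
    ]


steps = repurpose_commands(
    igroup="kc65b1",
    old_lun="/vol/vol1/iscsi/kc65b1",
    new_lun="/vol/vol1/iscsi2/kc65b1",
)
for line in steps:
    print(line)
```

Rejecting identical paths up front is a design choice of this sketch: it catches the easy mistake of remapping a client to the image it already has.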

  21. Lab results
     • Experiment conducted at Network Appliance
         • FAS 3050 clustered system
         • 224 clients (112 per cluster node)
         • dual core 3.2GHz/2GB Intel Xeon IBM H20 Blades
         • QLogic QMC 4052 adapters
         • Windows Server 2003 SE SP1
     • Objectives
         • determine robustness and performance characteristics of configuration, under conditions of storage failover and giveback
         • determine viability of keeping paging file on central storage

  22. Network configuration
     [Figure: network diagram]
     Not your daddy’s network

  23. Client load
     • Program to generate heavy CPU and paging activity (2 GB memory area, lots of reads and writes)
     • Several instances per client

  24. Client load, cont.
     • ~400 pages/sec

  25. Load on storage
     Near 100% disk utilization on storage system in takeover mode
     des-3050-1(takeover)> sysstat -u 1
      CPU Total     Net kB/s      Disk kB/s   Tape kB/s Cache Cache    CP  CP  Disk
          ops/s     in   out    read  write  read write   age   hit  time  ty  util
      18%  1318   3129  4413  167328     48     0     0    13   98%    0%   -  100%
      42%  2708  67637  6165  166210      8     0     0    13   99%    0%   -  100%
      53%  2035  71519  5258  155134  52419     0     0    13   99%   45%   D  100%
      54%  1852  62163  4488  124647  99591     0     0    13   99%  100%   :   79%
      49%  2021  70115  5083  123828  58347     0     0    13   99%  100%   D   73%
      83%  1005  24380  2414  110473  54491     0     0    13   99%  100%   :   83%
      42%  2892  65357  7878  211645  56495     0     0    13   99%  100%   :  128%
      39%  2250  29027  7839  155554  19597     0     0    13   99%   35%   D   93%
      74%  1671  39249  4393  184457  57014     0     0    15  100%  100%   :  112%
      38%  2323  57148  6777  161911  69163     0     0    15   99%  100%   :  100%
      51%  2105  52591  5354  147766  95826     0     0    12   99%   90%   D   86%
      29%   382    957   988  163609  60946     0     0    12   98%  100%   :  100%
      19%  1331   2232  4305  163301   6938     0     0    12   98%   49%   :  100%
      18%  1247   1547  4390  164802     24     0     0    13   98%    0%   -  100%
      30%  2037  31462  5717  167336      0     0     0    13   99%    0%   -  100%
      33%  2000   4047  5909  169060     24     0     0    13   98%    0%   -  100%
      67%  1580   2177  5471  167101     32     0     0    13   99%    0%   -  100%

  26. Observations
     • Failover and giveback transparent
     • No BSOD as long as failover times stay within the timeout windows
         • recall: KeepAliveTO = 180
         • some tuning opportunities here
         • actual failover was < 60 seconds
         • iscsi stop+start used to increase “failover time” for testing
     • Slower client access during takeover
         • expected behavior
     • Heavy paging activity not an issue
     • Higher number of clients / storage server an option, depending on application behavior

  27. Economic analysis
     • Assume
         • 256 clients / storage server
         • 20 W / drive
         • $80 / client-side drive
         • 80 GB client-side drive, 10 GB used per application
         • $3000 / server-side drive
         • 300 GB server-side drive
     • Calculate
         • server-side actual usage
         • cost of client-side drives vs. cost for server space
         • cost of power+cooling for client-side drives and server space

  28. Results
     • Server side usage
         • 512 clients x 10 GB per application = 5 TB
         • Assume
             • 50% usable space on server
             • 20 W typical per drive
             • 2.3 x multiplier to account for cooling
         • 5000 GB * 2 / 300 GB/drive * 20 W/drive * 2.3 → 1.53 kW
         • 10 TB raw @ $10/GB → $100,000
     • Workstation side drives
         • Same assumptions (note: power supply issue)
         • 512 drives * 20 W/drive * 2.3 → 23.5 kW
         • 512 drives * $80/drive → $40,960
     • At $0.10/kWh, cost curves cross over in three years
         • in some scenarios, it’s less than two years
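The crossover claim can be reproduced from the figures on the two economics slides. The Python sketch below is one way to set up that arithmetic, assuming continuous operation (8,760 hours per year) and that the only costs compared are drive purchase price and power plus cooling; the function and parameter names are hypothetical.

```python
HOURS_PER_YEAR = 24 * 365


def breakeven_years(clients=512, used_gb=5000, usable_fraction=0.5,
                    server_drive_gb=300, server_cost_per_gb=10.0,
                    client_drive_cost=80.0, watts_per_drive=20.0,
                    cooling_multiplier=2.3, dollars_per_kwh=0.10):
    """Years until central storage's higher purchase cost is repaid by power savings."""
    # Server side: raw capacity needed at the assumed usable fraction.
    raw_gb = used_gb / usable_fraction
    server_drives = raw_gb / server_drive_gb
    server_kw = server_drives * watts_per_drive * cooling_multiplier / 1000
    server_capex = raw_gb * server_cost_per_gb

    # Client side: one local drive per node.
    client_kw = clients * watts_per_drive * cooling_multiplier / 1000
    client_capex = clients * client_drive_cost

    yearly_saving = (client_kw - server_kw) * HOURS_PER_YEAR * dollars_per_kwh
    return (server_capex - client_capex) / yearly_saving


years = breakeven_years()
print(f"break-even after about {years:.2f} years")
```

With the slide's rounded inputs this lands just over three years, consistent with the stated crossover. Lowering `server_cost_per_gb` or raising `dollars_per_kwh` is one way to explore the "less than two years" scenarios, though the slides do not say which inputs those scenarios changed.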

  29. Conclusion
     • Dynamic provisioning from golden images is here
     • Incredibly useful technology in diskless workstation farms
         • Fast turnaround
         • Central control
         • Simple administration
         • Nearly effortless client replacement
         • Green!

  30. Questions?