Lab Manager Best Practices Breakout Session # AD 3002 Steven Kishi Director of Product Management, VMware, Inc. September, 2008
Disclaimer This session may contain product features that are currently under development. This session/overview of the new technology represents no commitment from VMware to deliver these features in any generally available product. Features are subject to change, and must not be included in contracts, purchase orders, or sales agreements of any kind. Technical feasibility and market demand will affect final delivery. Pricing and packaging for any new technologies or features discussed or presented have not been determined. “These features are representative of feature areas under development. Feature commitments are subject to change, and must not be included in contracts, purchase orders, or sales agreements of any kind. Technical feasibility and market demand will affect final delivery.”
Agenda • Architecting • Minimum Installation • Sizing and Scaling Lab Manager • Availability, Backup and Disaster Recovery • Multi-Site Implementation Options • Configuring • Using Organizations to Manage Resources • Usage Model • Automation • Rollout and Training • Maintaining • Storage Leases • Gold Master Trees • Cleanup Strategies
Minimum Installation • Lab Manager can be run on a single ESX Server • Install VirtualCenter (VC) Server in a VM on that server • Use internal disk as VMFS datastore • Install Lab Manager Server in a VM on that server and set it up to manage the server • Add more servers to show scalability and distribution routines • Ensure VMotion networking requirements are in place • Cluster in VC and enable DRS and HA
Sizing and Scaling: Part 1 (It’s all about the bottleneck…) • ESX sizing: same as traditional, dependent on the workload • Datastore sizing: initially 10x the base disk sizes • Shared storage is the first bottleneck for most. Size to 30-100 concurrently running VMs • Consider the VMFS (not NFS) limit of 8 hosts per tree; sets DRS cluster limit [Diagram: LM Server, VC Server, ESX hosts, VM]
Sizing and Scaling: Part 2 (It’s all about the bottleneck…) • Scale in the same ratio of ESX:datastore • Distribute base disk content amongst datastores • Can cross-connect datastores • VC is the next bottleneck, solved by multiple installations • 2000 concurrently registered (not stored) VMs • VC can only handle a certain number of actions per unit time. Watch the tasks in VIC. [Diagram: LM Server, VC Server, ESX hosts, VMs]
Sizing and Scaling: Horizontal Scaling • Lab Manager VMs run independently of the LM Server • Console feeds come straight from the ESX server • LM Server is rarely a bottleneck • As concurrent clients increase, so does the number of ESX hosts • Nice, horizontal scaling • Limits • 2000 VMs/installation (VC) • 8 hosts/DRS cluster (DRS) • 32 hosts/shared datastore • Unlimited hosts, users • Pay attention to storage array maximums [Diagram: Clients, LM Server, VC Server, ESX hosts, VMs]
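The limits on this slide can be turned into a rough capacity estimate. The sketch below is illustrative arithmetic only: the function name, the `vms_per_host` input, and the reading of the 30-100 concurrently running VMs figure as a per-datastore number (with 30 as a conservative default) are assumptions, not Lab Manager settings.

```python
import math

# Limits quoted in the slides (2008-era figures).
MAX_VMS_PER_INSTALLATION = 2000   # VC limit on concurrently registered VMs
MAX_HOSTS_PER_DRS_CLUSTER = 8

def estimate_capacity(concurrent_vms, vms_per_host, vms_per_datastore=30):
    """Return a rough count of hosts, clusters, datastores and installations."""
    hosts = math.ceil(concurrent_vms / vms_per_host)
    return {
        "esx_hosts": hosts,
        "drs_clusters": math.ceil(hosts / MAX_HOSTS_PER_DRS_CLUSTER),
        "datastores": math.ceil(concurrent_vms / vms_per_datastore),
        "installations": math.ceil(concurrent_vms / MAX_VMS_PER_INSTALLATION),
    }

print(estimate_capacity(concurrent_vms=600, vms_per_host=20))
```

With 600 concurrent VMs at an assumed 20 VMs per host, this gives 30 hosts, 4 DRS clusters, 20 datastores, and 1 installation. Storage array maximums still need to be checked separately.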
Availability • HA on the LM Server • HA on the VMs created in Lab Manager • Use multipathing and RAID on storage • Use HA and DRS
Lab Manager Backup • Individual VMs can be backed up at the guest level using traditional techniques, but this consolidates the disk • Use this in conjunction with an installation-level backup to be able to restore important VMs • Back up the installation at the storage array using snapshots, if possible • Run the Lab Manager server as a VM on the same storage array as your datastores • Snapshot the datastore on a regular basis based on desired RTO (full + delta). Archive a full snapshot for DR. • Recovery will look as if the plug was pulled on all running VMs and the Lab Manager server. • VMs will need to be force-undeployed and redeployed • You will lose whatever guest OS data you would lose if the plug was pulled on the VM.
Lab Manager Backup • You can back up the system using existing file-level techniques for backing up (1) the database on the LM Server and (2) its datastores • You can perform this backup hot with the following considerations: • If the backups of (1) and (2) are not simultaneous (and it is hard to make them so), you will lose VMs where linked clone actions occurred between backups. • The only VMs at risk are deployed Workspace configurations. Templates, Library configurations, undeployed Workspace configurations, and Media are not at risk. • Use the SOAP API to determine deployed Workspace configurations and back them up when there are no linked clone actions. Everything else can be backed up over a longer period of time with no risk of data loss.
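The split between at-risk and safe items can be sketched as a simple filter. The inventory format below (dictionaries with `name`, `kind`, and `deployed` keys) is hypothetical; in practice the list of deployed Workspace configurations would come from the SOAP API, which is not modeled here.

```python
def split_backup_sets(items):
    """Split an inventory into (at_risk, safe) lists of names.

    Only deployed Workspace configurations are treated as at risk,
    following the rule stated on the slide.
    """
    at_risk, safe = [], []
    for item in items:
        if item["kind"] == "workspace" and item["deployed"]:
            at_risk.append(item["name"])
        else:
            safe.append(item["name"])
    return at_risk, safe

# Hypothetical inventory for illustration.
inventory = [
    {"name": "xp-template", "kind": "template", "deployed": False},
    {"name": "erp-baseline", "kind": "library", "deployed": False},
    {"name": "erp-test-1", "kind": "workspace", "deployed": True},
    {"name": "erp-test-2", "kind": "workspace", "deployed": False},
]
at_risk, safe = split_backup_sets(inventory)
print("back up when no linked clone actions are running:", at_risk)
print("back up over a longer period:", safe)
```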
Multi-Site Implementation • Many Options: • Multiple installations • Remote users VPN into installation • Multiple resource pools attached to one LM server
Remote Usage: VPN Connectivity to Installation • Full functionality • WAN latency effects on console [Diagram: LM Client and console connect over VPN to LM Server, VC Server, and ESX hosts running VMs; Site 1, Site 2]
Remote Usage: Local and Remote Resource Pools • LM user can log in to remote LM Server and deploy resources either locally or remotely • No WAN latency effects on console if VMs are on resources local to client • Must be routable with no NAT between networks • Templates and configurations will deploy onto ESX servers associated with the datastores they are on [Diagram: LM Client and console, LM Server, VC Server, ESX hosts, VM; Site 1, Site 2]
Organizations and Resources • Organizations allow resources to be divided up in an installation. • Bunkering users from each other • Separating resources for security, chargeback, or maintenance • Control scalability of organization • Divide up different types of workloads • Load Profile • Production vs. Non-Production
Typical Use Cases • IT • Library of production apps for patch testing • Library of standard desktops and servers for compatibility testing • Machines for onboarding new employees • Application installs for transient usage • Development of the next ERP system, etc. • Training • Deploying labs between classes • SEs • Giving demos • Saving customized demos for later reuse
Typical Use Cases • Test/Dev • Library of reference platforms and application installs (templates or configurations) • Library of customer configurations • LiveLinks to communicate tests or bug state • Extra development environments for past and current releases • Automation: • Build system, smoke test, library generation, LiveLink errors and finished state • Test case automation • One-click test matrix testing
Lab Manager Automation in Test/Dev: Smoke Tests • Daily build produced by build system • Lab Manager SOAP API used to check out reference platforms, install build, and download and run latest smoke test scripts • Bugs are sent to dev team and filed in bug tracking system using LiveLinks • All results are checked into the Lab Manager Library. Leases used to keep library clean. • Fixes checked into build system for next day’s build [Diagram: Build System, SQA System Tests, Lab Manager, QA/Dev, Fixes]
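The smoke-test loop above can be sketched as a small driver. Everything here is a stand-in: `FakeLab` and its `checkout` and `capture` methods are hypothetical and do not reproduce the Lab Manager SOAP API, and `smoke_test` is whatever callable installs the build and runs the scripts.

```python
class FakeLab:
    """In-memory stand-in for a Lab Manager client; all methods are hypothetical."""

    def __init__(self):
        self.library = []  # names of results captured to the Library

    def checkout(self, platform, workspace_name):
        # Pretend to check out a reference platform into the Workspace.
        return {"name": workspace_name, "platform": platform}

    def capture(self, config, library_name):
        # Pretend to check the result into the Library.
        self.library.append(library_name)


def run_smoke_tests(lab, build_id, platforms, smoke_test):
    """Check out each platform, run the smoke test, capture the result."""
    failures = []
    for platform in platforms:
        config = lab.checkout(platform, f"{platform}-{build_id}")
        passed = smoke_test(config, build_id)
        status = "pass" if passed else "fail"
        lab.capture(config, f"{platform}-{build_id}-{status}")
        if not passed:
            failures.append(platform)  # these would be filed as bugs with a LiveLink
    return failures


lab = FakeLab()
failures = run_smoke_tests(
    lab, "b1024", ["winxp", "rhel5"],
    smoke_test=lambda config, build: config["platform"] != "rhel5",
)
print(failures, lab.library)
```

Capturing both passing and failing runs mirrors the slide's point that all results go into the Library, with leases keeping it clean afterwards.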
User Training and Rollout • Tips on training your userbase • Videos on Overview page are geared towards getting users up to speed • Some set up a wiki for communication • Some force users to view training materials before allowing them to log in to the system • VMware Education has a user training course starting Q4’08 • Rollout • Initially roll out to a subset of the userbase to understand system configuration and resource needs • VMware PSO can help with custom SOW
Standard Maintenance • COW disk chain length limit is 30 • Consolidate a disk onto the same or a different datastore before it hits the limit. Consider deleting the original tree of disks when appropriate • Don’t bother consolidating early; it just creates more work for little benefit • Understand and use Gold Masters for maintainability • Lab Manager keeps chains of disks on the same datastore. Use copy and delete commands to move nodes between datastores as necessary. [Diagram: Gold Master “Tree”]
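A periodic check against the chain length limit could look like the sketch below. The input dictionary (disk name to current chain length) and the `margin` parameter are assumptions for illustration; how the chain lengths are collected is not shown.

```python
CHAIN_LIMIT = 30  # COW disk chain length limit from the slide

def disks_needing_consolidation(chain_lengths, margin=3):
    """Return names of disks within `margin` links of the limit, longest first."""
    near = [(name, length) for name, length in chain_lengths.items()
            if length >= CHAIN_LIMIT - margin]
    near.sort(key=lambda pair: pair[1], reverse=True)
    return [name for name, _ in near]

# Hypothetical chain lengths for illustration.
print(disks_needing_consolidation({"a": 5, "b": 28, "c": 30, "d": 27}))
```

A small margin fits the advice not to consolidate early: only disks close to the limit are flagged.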
Controlling Disk Chain Length • Do this to minimize maintenance, not for performance • Instruct users on preferred usage patterns • Clone from the Library, not the Workspace • Don’t use snapshots excessively • Go back to original Library entry or Gold Master where possible • Set a policy for cleanup of disk files • Use Storage Leases • Use Gold Masters and control who has rights to set them • Have users identify which clones can be deleted, and when • Naming strategies • Datastore strategies • Combination of both
Storage Leases and Gold Masters • Storage Leases are very powerful • Users can extend leases • Automatically prunes trees in a socially-acceptable way • Gold Masters control tree depth • By having users clone gold masters, tree depth is controlled. Everybody clones from minimum length version; only gold masters need to be consolidated
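The lease idea can be modeled in a few lines. This is a toy model, not Lab Manager's implementation: the `leases` dictionary and both function names are hypothetical.

```python
from datetime import date, timedelta

def extend_lease(leases, name, days):
    """Model a user extending the storage lease on one configuration."""
    leases[name] += timedelta(days=days)

def expired(leases, today):
    """Return the names whose lease ended before `today`, i.e. prune candidates."""
    return sorted(name for name, end in leases.items() if end < today)

# Hypothetical leases for illustration.
leases = {"erp-3.0-test": date(2008, 9, 1), "erp-3.1-test": date(2008, 9, 10)}
extend_lease(leases, "erp-3.0-test", 30)
print(expired(leases, today=date(2008, 9, 15)))
```

Only the configuration nobody bothered to extend shows up as a prune candidate, which is what makes lease-based cleanup socially acceptable.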
Datastore-based Cleanup Strategy • Organize data by datastore based on retention policies. For example: • Gold Master datastore collects gold versions of the supported platforms • User datastore is organized by software version, and deleted when version is shipped or EOL • Organizations can also be used in conjunction with this strategy to manage user access to the datastore
Naming-Based Cleanup Strategy • Have users name configurations so that things that can be deleted can be identified • Delete after hotfix is released • Delete after version is released • Delete after version is EOL • Set up policies for naming. Recommendations are to keep names: • Short • Uniquely identifiable • Easily distinguishable • Sortable and Filterable in LM
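One way to make names sortable and filterable is a fixed field order. The convention below (`product-version-deleteafter-owner`, with `hotfix`, `release`, or `eol` as the delete-after tag) is a made-up example, not a Lab Manager requirement.

```python
def parse_name(name):
    """Split a configuration name that follows the hypothetical convention."""
    product, version, delete_after, owner = name.split("-")
    return {"product": product, "version": version,
            "delete_after": delete_after, "owner": owner}

def deletable(names, version, event):
    """Return names for `version` whose delete-after tag matches `event`."""
    selected = []
    for name in names:
        fields = parse_name(name)
        if fields["version"] == version and fields["delete_after"] == event:
            selected.append(name)
    return selected

names = ["erp-3.1-release-amy", "erp-3.1-eol-bob", "erp-3.0-release-amy"]
print(deletable(names, version="3.1", event="release"))
```

Putting the version and the delete-after tag in fixed positions keeps names short while still letting a plain sort or filter group everything that can be removed together.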
For more information: Lab Manager 3 Training • Module 1: Lab Manager Overview • Module 2: Lab Manager 3 New Features • Module 3: Lab Manager Under the Covers • Module 4: Architecting a Lab Manager Installation • Module 5: Lab Manager Use Cases • Module 6: Installing Lab Manager • Module 7: Lab Manager Settings • Module 8: Managing Lab Manager • Module 9: Lab Manager Maintenance • Module 10: Troubleshooting Lab Manager
Limited Time Offer for VMworld Attendees • Get 38% off when you purchase the VMware Management and Automation Bundle by Dec 15, 2008, including: • VMware Lifecycle Manager: Systematic management of virtual machines across the datacenter • VMware Lab Manager: Automated application development and test lab infrastructure • VMware Stage Manager: Automated change, control and release management of business applications • VMware Site Recovery Manager: Rapid, reliable and automated disaster recovery management • Contact a VMware representative or channel partner for more details
Q&A
COW Disks • Lab Manager creates copies of disks as COW disks unless explicitly directed otherwise • Saves time and storage resources • Performance is generally not an issue • Maintenance requirements should be considered • “Copy On Write” (COW) Disks • Initial COW disk is 16MB • A write copies block to sparse disk, then writes data • Reads made through COW data structure mapping blocks to files in VM overhead memory • Only leaf node is R/W. All internal nodes are read-only. • Lab Manager keeps all COW disks in a tree on the same datastore
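The read and write rules above can be illustrated with a toy model. This is a deliberate simplification: it keeps whole blocks in a dictionary per disk and ignores the 16MB initial size, sparse-file growth, and the in-memory mapping structure. The class and method names are invented for the example.

```python
class CowDisk:
    """Toy copy-on-write disk: a sparse block map plus an optional read-only parent."""

    def __init__(self, parent=None):
        self.parent = parent
        self.blocks = {}  # block number -> data written at this level

    def write(self, block, data):
        # Writes always land in this disk; it is meant to be the leaf node.
        self.blocks[block] = data

    def read(self, block):
        # Walk from the leaf towards the base until some level holds the block.
        node = self
        while node is not None:
            if block in node.blocks:
                return node.blocks[block]
            node = node.parent
        return None  # block was never written anywhere in the chain

    def depth(self):
        # Number of parents above this disk; the base disk has depth 0.
        return 0 if self.parent is None else 1 + self.parent.depth()


base = CowDisk()
base.write(0, "os image")
clone = CowDisk(parent=base)
clone.write(1, "app install")
print(clone.read(0), "|", clone.read(1), "|", base.read(1), "|", clone.depth())
```

The clone sees the parent's data without copying it, while the parent never sees the clone's writes, which is why internal nodes can stay read-only.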
Performance of COW Disks • COW disks can perform better or worse than monolithic disk depending on the usage profile. Performance is not strongly affected by chain length. • Read performance hit • Startup delayed as COW data structure read into overhead memory • Cache miss caused by diverse reads will result in multiple physical reads per virtual read • Write performance hits • Copy process causes extra physical read and write when writing to new blocks • Extending sparse file causes SCSI lock on VMFS • Read Performance gain (can be big) • Storage Arrays cache small read-only COW disks in memory so reads happen from RAM instead of spindle
How Lab Manager uses COW Disks: Cloning [Figure: disk trees labeled with chain lengths for “Clone Library Config 4 times” and “Clone Workspace Config 4 times”] Chain length grows much faster when cloning from Workspace because parent can still change, requiring a “double-headed” clone. Take clones from the Library to control chain length and reduce maintenance.
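The difference in growth can be sketched by tracking chain depths only. This is one reading of the “double-headed” clone, not a description of Lab Manager internals: it assumes a Library clone adds a single new leaf under a read-only parent, while a Workspace clone freezes the parent's current leaf and gives both the parent and the clone a new leaf. Function names and the starting depth are hypothetical.

```python
def clone_from_library(depth, clones):
    """Library entries are read-only, so each clone is exactly one disk deeper."""
    return [depth + 1] * clones

def clone_from_workspace(depth, clones):
    """Each clone freezes the parent's leaf; parent and clone both get a new leaf.

    Returns the parent's final depth and the depth of each clone.
    """
    clone_depths = []
    for _ in range(clones):
        depth += 1                  # parent's new writable leaf
        clone_depths.append(depth)  # clone's leaf hangs off the same frozen node
    return depth, clone_depths

print(clone_from_library(depth=2, clones=4))
print(clone_from_workspace(depth=2, clones=4))
```

Under these assumptions, four Library clones all sit at depth 3, while four Workspace clones push the parent to depth 6 and the last clone with it.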
How Lab Manager uses COW Disks: Snapshots [Figure: disk chains labeled with chain lengths for the sequence Start, Snapshot, Snapshot, Revert; one node is marked “To be garbage collected”] Chain length grows for each snapshot. When reverting, chain length does not decrease. Use snapshots sparingly and capture to Library instead to reduce maintenance.
Network Fencing: a separate IP space • One elegant use of the fence is to contain a separate IP space that can represent a production network or a customer’s network, with all machines in the configuration working together and still having connectivity to the lab network • This is a separate IP space from the lab network • Set GW/NM/DNS on the configuration, and the IP addresses on the network using a virtual network IP pool or manual IP addressing [Diagram: Fenced Config with VMs and a VR, connected to the Lab Network]
Lab Manager and Domains • Basic Rule: Domains don’t cross fences. Options: • Option 1: Put the DC in the fence. You can then deploy as many fenced copies as you wish; each will be a small copy of the same domain. Extreme scalability for groups that develop or test domain applications. • Option 2: Use a central DC and deploy unfenced. Use Lab Manager to save resources, have a central library, and provide user self-help. • Next level of detail: You can authenticate once over a fence, and some things work and some don’t based on protocol. Multiple copies of the same machine will not work, however.
Lab Manager and External Resources • NATable Resources (like DBs): • You can access the resource from fenced machines. Whether this works well is dependent on the target app. • LM opens the option of putting resources you would normally not think of copying into the fence • Non-NATable Resources (like those that embed source IPs in the packets, e.g. DCOM): • Proxy the communication into the fence • Consider putting the resource into the fence
Saving IP Addresses • Lab Manager uses IP addresses in many ways: • External IP Addresses • Virtual Routers take 2 IP addresses • Internal IP addresses for machines that use the IP pool • Minimize IP address usage using the following technique: • Use virtual networks inside configurations. Everything can use the same IP addresses over and over • Deploy connecting virtual-to-physical if connectivity between lab network and machines is desired • Use DHCP for unfenced machines • Static IP pool is used for fenced machines at this time, but only when they are deployed. IP addresses are reused when undeployed
Test Matrices of Static IP Machines • Use Case: Some accounts already have a library of static VMs they want to combine in different combinations • Solution: Use Lab Manager’s selective deployment feature. Configuration combine/split can also help. • Put all static VMs intended to work together in a single Library configuration • Use selective deployment to deploy just the ones you want in a new Workspace configuration • Use combine/split to incrementally add or save these VMs
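Choosing which VMs to deploy for each cell of a test matrix can be enumerated up front. The sketch below only generates the selections; the role names and VM names are hypothetical, and the actual selective deployment would be done in Lab Manager.

```python
from itertools import product

def deployment_matrix(roles):
    """Yield one VM selection per combination, picking one VM for each role."""
    role_names = list(roles)
    for combo in product(*(roles[name] for name in role_names)):
        yield dict(zip(role_names, combo))

# Hypothetical static VMs kept together in one Library configuration.
roles = {
    "client": ["xp", "vista"],
    "server": ["w2k3"],
    "db": ["sql2005", "oracle10"],
}
matrix = list(deployment_matrix(roles))
for selection in matrix:
    print(selection)
```

Each printed dictionary names the subset of VMs to select when deploying one new Workspace configuration from the shared Library configuration.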
AD 3002: Lab Manager Best Practices • Abstract: You know what Lab Manager is and what it does. Now you want to understand best practices when it comes to architecting, maintaining, automating, and rolling out Lab Manager. If you are interested in these topics, this session is for you. • Level: Intermediate • Type: Breakout • Length: 60 minutes • Speaker: Steven Kishi, Director of Product Management, VMware, Inc., steven@vmware.com • Steve Kishi is product manager of VMware Lab Manager. Steve joined VMware through acquisition of Akimbi Systems where he was VP of product management, services, and support. Previously, Steve was a principal at the early-stage software venture capital firm Hummer Winblad Venture Partners, built the marketing and services teams at web-based enterprise software company Extensity, was VP engineering at a technology consulting firm, and a design engineer at McDonnell Douglas. Steve received a BS in aerospace engineering from MIT and an MBA from the University of California, Berkeley.