
Clustering Technology In Windows NT Server, Enterprise Edition
Jim Gray, Microsoft Research, Gray@Microsoft.com, research.Microsoft.com/~gray


Presentation Transcript


  1. Clustering Technology In Windows NT Server, Enterprise Edition • Jim Gray • Microsoft Research • Gray@Microsoft.com • research.Microsoft.com/~gray

  2. Today’s Agenda • Windows NT® clustering • MSCS (Microsoft Cluster Server) Demo • MSCS background • Design goals • Terminology • Architectural details • Setting up a MSCS cluster • Hardware considerations • Cluster application issues • Q&A

  3. Extra Credit • Included in your presentation materials but not covered in this session • Reference materials • SCSI primer • Speakers notes included • Hardware Certification

  4. MSCS In Action

  5. High Availability Versus Fault Tolerance • High Availability: masks outages through service restoration • Fault-Tolerance: masks local faults • RAID disks • Uninterruptible Power Supplies • Cluster Failover • Disaster Tolerance: masks site failures • Protects against fire, flood, sabotage, ... • Redundant system and service at remote site

  6. Windows NT Clusters: What is clustering to Microsoft? • Group of independent systems that appear as a single system • Managed as a single system • Common namespace • Services are “cluster-wide” • Ability to tolerate component failures • Components can be added transparently to users • Existing client connectivity is not affected by clustered applications

  7. Microsoft Cluster Server • 2-node available 97Q3 • Commoditize fault-tolerance (high availability) • Commodity hardware (no special hardware) • Easy to set up and manage • Lots of applications work out of the box. • Multi-node Scalability in NT5 timeframe

  8. MSCS Initial Goals • Manageability • Manage nodes as a single system • Perform server maintenance without affecting users • Mask faults, so repair is non-disruptive • Availability • Restart failed applications and servers • Unavailability ≈ MTTR / MTBF, so quick repair matters • Detect/warn administrators of failures • Reliability • Accommodate hardware and software failures • Redundant system without mandating a dedicated “stand by” solution
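The relation on slide 8 (unavailability is roughly MTTR divided by MTBF) can be made concrete with a small calculation. This is a sketch only: the failure and repair figures are illustrative assumptions, not MSCS measurements, and the function names are hypothetical.

```python
def unavailability(mttr_hours: float, mtbf_hours: float) -> float:
    """Approximate fraction of time the service is down (valid when MTTR << MTBF)."""
    return mttr_hours / mtbf_hours


def downtime_minutes_per_year(mttr_hours: float, mtbf_hours: float) -> float:
    """Expected downtime per year implied by the MTTR / MTBF approximation."""
    return unavailability(mttr_hours, mtbf_hours) * 365 * 24 * 60


# Assumed for illustration: one failure every 1000 hours in both cases.
manual_repair = downtime_minutes_per_year(mttr_hours=2.0, mtbf_hours=1000.0)
quick_failover = downtime_minutes_per_year(mttr_hours=0.05, mtbf_hours=1000.0)

print(f"2-hour repair:     {manual_repair:.1f} minutes of downtime per year")
print(f"3-minute failover: {quick_failover:.1f} minutes of downtime per year")
```

With the failure rate held fixed, shortening the repair time is what reduces downtime, which is the point of the "quick repair" bullet.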

  9. MSCS Cluster (diagram labels: Client PCs; Server A; Server B; Heartbeat; Cluster management; Disk cabinet A; Disk cabinet B)

  10. Failover Example (diagram labels: Browser; Server 1; Server 2; Web site; Database; Web site files; Database files)

  11. Basic MSCS Terms • Resource - basic unit of failover • Group - collection of resources • Node - Windows NT® Server running cluster software • Cluster - one or more closely-coupled nodes, managed as a single entity

  12. MSCS Namespace: Cluster view • Cluster name • Node name • Node name • Virtual server name • Virtual server name • Virtual server name • Virtual server name

  13. MSCS Namespace: Outside world view • Cluster: IP address 1.1.1.1, network name WHECCLUS • Node 1: IP address 1.1.1.2, network name WHECNode1 • Node 2: IP address 1.1.1.3, network name WHECNode2 • Virtual server 1: IP address 1.1.1.4, network name WHEC-VS1 • Virtual server 2: IP address 1.1.1.5, network name WHEC-VS2 • Virtual server 3: IP address 1.1.1.6, network name WHEC-VS3 • Applications shown: Internet Information Server, SQL, MTS, “Falcon”, Microsoft Exchange

  14. Windows NT Clusters: Target applications • Application & Database servers • E-mail, groupware, productivity applications server • Transaction processing servers • Internet Web servers • File and print servers

  15. MSCS Design Philosophy • Shared nothing • Simplified hardware configuration • Remoteable tools • Windows NT manageability enhancements • Never take a “cluster” down: shell game rolling upgrade • Microsoft® BackOffice™ product support • Provide clustering solutions for all levels of customer requirements • Eliminate cost and complexity barriers

  16. MSCS Design Philosophy • Availability is core for all releases • Single server image for administration, client interaction • Failover provided for unmodified server applications, unmodified clients (cluster-aware server applications get richer features) • Failover for file and print are default • Scalability is phase 2 focus

  17. Non-Features Of MSCS • Not lock-step/fault-tolerant • Not able to “move” running applications • MSCS restarts applications that are failed over to other cluster members • Not able to recover shared state between client and server (e.g., file position) • All client/server transactions should be atomic • Standard client/server development rules still apply • ACID always wins

  18. Setting Up MSCS Applications

  19. Attributes Of Cluster-Aware Applications • A persistence model that supports orderly state transition • Database example • ACID transactions • Database log recovery • Client application support • IP clients only • How are retries supported? • No name service location dependencies • Custom resource DLL is a good thing
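The "How are retries supported?" question on slide 19 can be illustrated with a client-side retry loop. This is a minimal sketch under stated assumptions: `call_with_retry` and `FlakyService` are hypothetical names, and the stand-in service simply refuses the first two calls to imitate a virtual server that is restarting on the other node.

```python
import time


def call_with_retry(operation, attempts=5, first_delay_s=0.01):
    """Run operation(); on ConnectionError, wait and retry with a doubling delay."""
    last_error = None
    for attempt in range(attempts):
        try:
            return operation()
        except ConnectionError as exc:
            last_error = exc
            time.sleep(first_delay_s * (2 ** attempt))
    raise last_error


class FlakyService:
    """Stand-in for a virtual server that is briefly unreachable during failover."""

    def __init__(self, failures_before_success):
        self.remaining_failures = failures_before_success
        self.calls = 0

    def commit(self):
        self.calls += 1
        if self.remaining_failures > 0:
            self.remaining_failures -= 1
            raise ConnectionError("virtual server restarting on the other node")
        return "committed"


service = FlakyService(failures_before_success=2)
result = call_with_retry(service.commit)
print(result, "after", service.calls, "calls")
```

Blind retry of this kind is only safe when each request is atomic, as slide 17 requires; otherwise a request that was half-applied before the failure could be applied twice.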

  20. MSCS Services For Application Support • Name service mapper • GetComputerName resolves to virtual server name • Registry replication • Key and underlying keys and values are replicated to the other node • Atomic • Logged to ensure partitions in time are handled

  21. Application Deployment Planning • System configuration is crucial • Adequate hardware configuration • You can’t run Microsoft BackOffice on a 32-MB, 75-MHz Pentium • Planning of preferred group owners • Good understanding of single-server performance is critical • See Windows NT Resource Kit performance planning section • Understand working set size • What is acceptable performance to the business units?

  22. Evolution Of Cluster-Aware Applications • Active/passive - general out-of-the-box applications • Active/active - applications that can run simultaneously on multiple nodes • Highly scalable - extending the active/active through I/O shipping, process groups, and other techniques

  23. Application Evolution Application Node 1 Node 2 Microsoft SQL Server ✓ Microsoft Transaction Server (MTS) Internet Information Server (IIS) Microsoft Exchange Server

  24. Evolution Of Cluster-Aware Applications Application Node 1 Node 2 Node 3 Node 4 Microsoft SQL Server ✓ ✓ ✓ ✓ Microsoft Transaction Server (MTS) Internet Information Server (IIS) Microsoft Exchange Server

  25. Resources: What are they? • Resources are basic system components such as physical disks, processes, databases, IP addresses, etc., that provide a service to clients in a client/server environment • They are online in only one place in the cluster at a time • They can fail over from one system in the cluster to another system in the cluster
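The two properties on slide 25 (a resource is online in only one place at a time, and it can fail over to another system) can be sketched as a toy model. Everything here is a hypothetical illustration, not the MSCS API: the `ClusterSim` class, its method names, and the rule for picking a target node are all assumptions.

```python
class ClusterSim:
    """Toy model: each resource has exactly one owning node at any time."""

    def __init__(self, nodes):
        self.up = set(nodes)
        # resource name -> owning node. A dict key maps to a single value,
        # so a resource cannot be online on two nodes at once.
        self.owner = {}

    def bring_online(self, resource, node):
        if node not in self.up:
            raise ValueError(f"{node} is not running")
        self.owner[resource] = node

    def node_failed(self, node):
        """Move every resource owned by the failed node to a surviving node."""
        self.up.discard(node)
        if not self.up:
            raise RuntimeError("no surviving node to fail over to")
        target = sorted(self.up)[0]  # assumed policy: first surviving node by name
        moved = []
        for resource, current in self.owner.items():
            if current == node:
                self.owner[resource] = target
                moved.append(resource)
        return moved


cluster = ClusterSim(["Server 1", "Server 2"])
cluster.bring_online("Web site", "Server 1")
cluster.bring_online("Database", "Server 2")
moved = cluster.node_failed("Server 1")
print("failed over:", moved, "->", cluster.owner)
```

In MSCS the resource is restarted on the surviving node rather than moved while running (slide 17); the model above only tracks ownership.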

  26. Resources • MSCS includes resource DLL support for: • Physical and logical disk • IP address and network name • Generic service or application • File share • Print queue • Internet Information Server virtual roots • Distributed Transaction Coordinator (DTC) • Microsoft Message Queue (MSMQ) • Supports resource dependencies • Controlled via well-defined interface • Group: offers a “virtual server”

  27. Cluster Service To Resource (diagram labels: Windows NT cluster service; Resource monitor; Initiate changes; Resource events; Physical disk resource DLL; IP address resource DLL; Generic app resource DLL; Database resource DLL; Disk; Network; App; Database)

  28. Cluster Abstractions (diagram labels: Resource; Cluster; Resource Group) • Resource: program or device managed by a cluster • e.g., file service, print service, database server • can depend on other resources (startup ordering) • can be online, offline, paused, failed • Resource Group: a collection of related resources • hosts resources; belongs to a cluster • unit of co-location; involved in naming resources • Cluster: a collection of nodes, resources, and groups • cooperation for authentication, administration, naming

  29. Resources • Resources have... • Type: what it does (file, DB, print, Web…) • An operational state (online/offline/failed) • Current and possible nodes • Containing Resource Group • Dependencies on other resources • Restart parameters (in case of resource failure) Resource Cluster Group

  30. Resource • Fails over (moves) from one machine to another • Logical disk • IP address • Server application • Database • May depend on another resource • Well-defined properties controlling its behavior

  31. Resource Dependencies • A resource may depend on other resources • A resource is brought online after any resources it depends on • A resource is taken offline before any resources it depends on • All dependent resources must fail over together
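The ordering rules on slide 31 amount to a topological sort of the dependency graph: bring resources online in dependency order and take them offline in the reverse order. The sketch below uses Python's standard `graphlib` module (Python 3.9 or later). The resource names come from slide 32, but the specific dependency edges are assumptions made for illustration, since the diagram's arrows are not preserved in this transcript.

```python
from graphlib import TopologicalSorter

# Maps each resource to the set of resources it depends on.
# These edges are assumed for illustration only.
depends_on = {
    "Drive E:": set(),
    "IP address": set(),
    "Database": {"Drive E:", "IP address"},
    "Generic application": {"Database"},
}

# static_order() yields a resource only after everything it depends on.
online_order = list(TopologicalSorter(depends_on).static_order())

# Taking resources offline uses the reverse order, so a resource is
# stopped before the resources it depends on.
offline_order = list(reversed(online_order))

print("online: ", online_order)
print("offline:", offline_order)
```

`TopologicalSorter` raises `graphlib.CycleError` if the dependencies are circular, so a configuration that could never be brought online is rejected up front.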

  32. Dependency Example Database resource DLL Generic application resource DLL IP address resource DLL Drive E: resource DLL Drive F: resource DLL

  33. Group Example Payroll group Database resource DLL Generic application resource DLL IP address resource DLL Drive E: resource DLL Drive F: resource DLL

  34. MSCS Architecture (diagram labels: Cluster administrator; Cluster API DLL; Cluster API stub; Cluster.Exe; Cluster API DLL; Cluster API; Global Update Manager; Log Manager; Database Manager; Membership Manager; Event Processor; Checkpoint Manager; Object Manager; Node Manager; Failover Manager; Resource Manager; Resource monitors; Resource API; Application resource DLL; Physical resource DLL; Logical resource DLL; Application resource DLL; Reliable Cluster Transport + Heartbeat; Network)

  35. MSCS Architecture • Cluster service is composed of the following objects • Failover Manager (FM) • Resource Manager (RM) • Node Manager (NM) • Membership Manager (MM) • Event Processor (EP) • Database Manager (DM) • Object Manager (OM) • Global Update Manager (GUM) • Log Manager (LM) • Checkpoint Manager (CM) • More about these in the next session

  36. Setting Up An MSCS Cluster

  37. MSCS Key Components • Two servers • Multi versus uniprocessor • Heterogeneous servers • Shared SCSI bus • SCSI HBAs, SCSI RAID HBAs, HW RAID boxes • Interconnect • Many types can be supported • Remember, two NICs per node • PCI for cluster interconnect • Complete MSCS HCL configuration

  38. MSCS Setup • Most common problems • Duplicate SCSI IDs on adapters • Incorrect SCSI cabling • SCSI Card order on PCI bus • Configuration of SCSI Firmware • Let’s walk through getting a cluster operational

  39. Test Before You Build • Bring each system up independently • Network adapters • Cluster interconnect • Organization interconnect • SCSI and disk function • NTFS volume(s)

  40. Top Ten Setup “Concerns” • 10. SCSI is not well known. Please use the MSCS and IHV setup documentation. Consider the SCSI book reference for this session • 9. Build a support model that will support clustering requirements. For example, in clustering, components are paired exactly (e.g., SCSI BIOS revision levels); include this in your plans • 8. Build extra time into your deployment planning to accommodate cluster setup, both for hardware and software. Hardware examples include SCSI setup. Software issues would include installation across cluster nodes • 7. Know the certification process and its support implications

  41. Top Ten Setup “Concerns” • 6. Applications will become more cluster-aware over time. This will include better setup, diagnostics, and documentation. In the meantime, plan and test accordingly • 5. Clustering will impact your server maintenance and upgrade methodologies. Plan accordingly • 4. Use multiple network adapters and hubs to eliminate single points of failure (everywhere possible) • 3. Today’s clustering solutions are more complex to install and configure than single servers. Plan your deployments accordingly • 2. Make sure that your cabinet solutions and peripherals both fit and function well. Consider the serviceability implications • 1. Cabling is a nightmare. Color-coded, heavily documented, Y-cable-inclusive, maintenance-designed products are highly desirable

  42. Cluster Management Tools • Cluster administrator • Monitor and manage cluster • Cluster CLI/COM • Command line and COM interface • Minor modifications to existing tools • Performance monitor • Add ability to watch entire cluster • Disk administrator • Add understanding of shared disks • Event logger • Broadcast events to all nodes

  43. MSCS Reference Materials • In Search of Clusters: The Coming Battle in Lowly Parallel Computing, Gregory F. Pfister, ISBN 0-13-437625-0 • The Book of SCSI, Peter M. Ridge, ISBN 1-886411-02-6

  44. The Basics Of SCSI • Why SCSI? • Types of interfaces? • Caching and performance… • RAID • The future…

  45. Why SCSI? • Faster than IDE - intelligent card/drive • Uses less processor time • Can transfer data up to 100 MB/sec. • More devices on a single chain - up to 15 • Wider variety of devices • DASD • Scanners • CD-ROM writers and optical drives • Tape drives

  46. Types Of Interfaces • SCSI and SCSI II • 50-pin, 8-bit, max transfer = 10 MB/s (early 1.5 to 5 MB/s ) • Internal transfer rate = 4 to 8 MB/s • Wide SCSI • 68-pin, 16-bit, max transfer = 20 MB/s • Internal transfer rate = 7 to 15.5 MB/s • Ultra SCSI • 50-pin, 8-bit, higher transfer rate, max transfer = 20 MB/s • Internal transfer rate = 7 to 15.5 MB/s • Ultra wide • 68-pin, 16-bit, max transfer rate = 40 MB/s • Internal transfer rate = 7 to 30 MB/s
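The maximum transfer figures on slide 46 follow from bus width multiplied by transfer clock. The clock rates used below (10 MHz for the 8-bit SCSI-2 and Wide SCSI figures, 20 MHz for the Ultra variants) are not stated on the slide and are supplied here as assumptions; the function name is hypothetical.

```python
def max_transfer_mb_s(bus_width_bits: int, clock_mhz: int) -> int:
    """Peak bus rate: bytes moved per transfer times million transfers per second."""
    return (bus_width_bits // 8) * clock_mhz


# (bus width in bits, assumed transfer clock in MHz)
interfaces = {
    "SCSI II (8-bit)": (8, 10),
    "Wide SCSI": (16, 10),
    "Ultra SCSI": (8, 20),
    "Ultra wide": (16, 20),
}

rates = {name: max_transfer_mb_s(*spec) for name, spec in interfaces.items()}

for name, rate in rates.items():
    print(f"{name:16s} max transfer = {rate} MB/s")
```

These are peak bus rates. The "internal transfer rate" rows on the slide are lower because they describe what the drive itself can sustain.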

  47. Performance Factors • Cache on the drive or controller • Caching in the OS • Different variables • Seek time • Transfer rates

  48. Redundant Array Of Inexpensive Disks (RAID) • Developed from a paper published in 1987 at the University of California, Berkeley • The idea is to combine multiple inexpensive drives (eliminate SLED - single large expensive drive) • Provides redundancy by storing parity information
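The parity bullet on slide 48 can be illustrated with XOR parity, the scheme used by the parity-based RAID levels. This is a minimal sketch: the block contents are made up, and real arrays work on fixed-size stripes rather than short byte strings.

```python
from functools import reduce


def xor_blocks(blocks):
    """XOR equal-length byte blocks together, column by column."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


# Three data "disks" holding equal-length blocks (illustrative contents).
data = [b"disk", b"fail", b"over"]

# The parity disk stores the XOR of all data blocks.
parity = xor_blocks(data)

# If one data disk is lost, XOR of the survivors and the parity rebuilds it.
recovered = xor_blocks([data[0], data[2], parity])

print("lost block:     ", data[1])
print("recovered block:", recovered)
```

Recovery works because XOR is its own inverse: XORing the parity with every surviving block cancels them out and leaves the missing block.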

  49. The Future For SCSI • Faster interfaces - why? • Fibre Channel • Optical standard • Proposed as part of SCSI III (not final) • Up to 100 MB/s transfer • Still using ultra-wide SCSI inside enclosures • Drives with optical interfaces not available yet in quantity, higher cost than SCSI

  50. The Future Of SCSI • Fibre Channel Arbitrated Loop • Ring instead of bus architecture • Can support up to 126 devices/hosts • Hot pluggable through the use of a port bypass circuit • No disruption of the loop as devices are added/removed • Generally implemented using a backplane design
