Explore NeoCLUSTER, a software solution for building highly available server clusters on Windows NT platform. With features like fault recovery and intuitive GUI, it ensures continuous and reliable services. Learn how it works and why it's the ideal choice for mission-critical applications.
High Availability Software for Windows NT: NeoCLUSTER • WHY : Demand for Availability • WHAT : Technology and Product • HOW : Configuration
Demand for Availability • Information is a capital asset of an organization. • The server systems for archiving, processing, and conveying information must be constantly monitored and carefully managed to provide reliable, timely, and continuous services. • Downtime is inevitable • Scheduled and unscheduled
Trends • Distributed processing and multi-tier client/server applications • Multiple servers collaborate to improve • Load sharing • Performance • Availability • Windows NT is becoming a major server platform for mission-critical applications.
Factors of System Availability • CPU, memory, I/O cards 24% • Disk 27% • Application software 22% • Common hardware & system software 21% • Human error 6% • Source : Strategic Research Division of Find SPV
System Availability Hierarchy • Diagram layers: Applications, Hosts, I/O Paths, Storage Subsystem, Disk & Tape • Diagram axes: Cost, Technology
Ask Your Customers • Do you use Windows NT as the platform for your mission-critical applications? • Does system downtime mean losses to you? • Do you need a technically and economically affordable solution to make your NT servers fault resilient? • Do you need a guardian angel to watch over your NT servers around the clock so that you can sleep better at night?
Level of System Availability • Non-stop systems: Stratus, Tandem, Netware SFT III • Tightly coupled, fully duplicated configuration • Proprietary OS • Non-redundant systems • Hot-plug and self-diagnostic hardware components • Auto-retry and pro-active software
Level of System Availability • High Availability systems • A cluster of loosely coupled servers • Software based implementation • Provide better availability/price ratio than non-stop systems
Cluster • Server farm : Single Network Identity • Database Cluster : Cluster Manager, Distributed Lock Manager • Computing Cluster : Parallel Computing
NeoCLUSTER • A pure software solution for building highly available server clusters • Microsoft Windows NT Server 4.0, Standard Edition • i386 and Alpha platforms • Functions • Cluster configuration and administration • Failure detection, logging, notification, isolation, and recovery
Features • Technically and economically affordable • Fully compatible with Windows NT • Requires no software modification or proprietary hardware • No single point of failure • Reliable and efficient mechanism for error detection and fault recovery
Features • Intuitive and user friendly Windows GUI • Fully user configurable • Support automatic and manual switch back • Negligible impact on resource consumption and server performance. • Minimum human intervention • No intrusion to routine workflow
Servers • Active Server is a pre-designated computer responsible for providing critical services that will be guarded by NeoCLUSTER. • Backup Server is a pre-designated computer that will take over from the active server under the administration of NeoCLUSTER. • Neither identically configured servers nor a dedicated backup server is required
Private Network • Dedicated interconnect for inter-server communication. • Three types of interconnect for redundancy • TCP/IP : back-to-back or LAN connection of two network interface cards • RS-232 : serial cable with null modem support to connect two COM ports • Disk volume : two dedicated partitions on the shared disks
Private Network • When all instances of the private net are unavailable • A server can still rely on the public net to detect the availability of the peer server. • If the peer server is still available, no takeover action is triggered. • If the peer server is unavailable, a takeover action is activated immediately.
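The fallback rule on this slide can be sketched as a small decision function. This is only an illustration of the stated logic, not NeoCLUSTER's implementation; the names `Action`, `decide_action`, and both parameters are hypothetical.

```python
from enum import Enum


class Action(Enum):
    NONE = "none"
    TAKEOVER = "takeover"


def decide_action(heartbeat_seen_on_private, peer_reachable_on_public):
    """Decide what a server does after checking on its peer.

    heartbeat_seen_on_private: one boolean per private interconnect
        (for example TCP/IP, RS-232, disk volume); True means a
        heartbeat recently arrived over that link.
    peer_reachable_on_public: True if the peer answered over the
        public net.
    """
    if any(heartbeat_seen_on_private):
        # The private net still works, so the peer is known to be alive.
        return Action.NONE
    if peer_reachable_on_public:
        # Every private link is down, but the peer is still serving.
        return Action.NONE
    # The peer is silent on every path: take over immediately.
    return Action.TAKEOVER
```

Checking the public net before acting keeps a failed interconnect from being mistaken for a failed server.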
Public Network • Dedicated network for clients to access servers. • TCP/IP and NetBEUI protocols • Each active server carries a switchable network ID (i.e., IP address or computer name) • The original network IDs of both servers can remain intact. • Clients connect to the switchable network ID. • If the active server becomes unavailable, the backup server takes over the switchable network ID.
Public Network • NeoCLUSTER provides a built-in mechanism to identify network failures. • Self-diagnosis of network availability • Supported NICs : Intel EtherExpress PRO/100B, 3Com 3C905B, DEC 21x4x. • Supported NIC add-on software : NIC Express from IPMetrics (load balancing and fault tolerance).
Private Drives and Public Drives • Private drives are disk volumes for storing the OS and data that does not need to be accessible by the backup server. • Public drives are disk volumes on the shared disks for storing the application software and related data that must be accessible by the backup server. • Shared SCSI bus or independent host channels • Mirroring or RAID subsystems.
Clients • Computer systems that access the active servers via TCP/IP or NetBEUI protocols.
Operation Scenario: Software Perspective • Block diagram (components shown): Resource Object, Administration Tool, Cluster Monitor Service, Agent, Script, Cluster Service, Windows NT
Operation Scenario: Software Perspective • Module interaction of NeoCLUSTER (diagram labels): Active Server, Backup Server, Resource Object, Cluster Service (shown twice), Server Heartbeat, Resource Monitoring, Agent Heartbeat, Cluster Monitor Service, Agent
Cluster Service and Cluster Monitor Service • The core processes of NeoCLUSTER • Two NT services that guard each other • User-transparent auto-restart • Functions • Resource object management • Event logging and notification • Fault isolation and recovery
Server Heartbeat • Periodic messages • Servers exchange heartbeats with each other over the private net • Inform the receiving server of the availability of the sending server
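A minimal sketch of the receiving side of a heartbeat exchange follows. The slides do not state the heartbeat interval or how many missed messages count as a failure, so `interval_s` and `missed_limit` are assumed values, and `HeartbeatMonitor` is a hypothetical name.

```python
class HeartbeatMonitor:
    """Tracks the last heartbeat received from the peer server."""

    def __init__(self, interval_s=1.0, missed_limit=3):
        self.interval_s = interval_s      # assumed send period in seconds
        self.missed_limit = missed_limit  # assumed tolerance for missed beats
        self.last_seen = None             # time of the latest heartbeat

    def record(self, now):
        """Call whenever a heartbeat arrives; `now` is a time in seconds."""
        self.last_seen = now

    def peer_available(self, now):
        """The peer counts as available until too many beats are overdue."""
        if self.last_seen is None:
            return False
        return (now - self.last_seen) <= self.interval_s * self.missed_limit
```

Passing the current time in explicitly, for example from `time.monotonic()`, keeps the logic easy to test without waiting for real timeouts.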
Resource Object • Components of mission critical services • Repository of service related files : Volume • Switchable network identity for clients to access the services : IP Address or Computer Alias Name • The service itself : File Share, NT Services, or User Defined
Resource Object • Volume • Disk partitions on the public drives. • The drive letter mapping and partition information of a volume must be identical when viewed from both servers. This ensures that no matter which server is the active server, the volume can be accessed with the same drive letter. • NeoCLUSTER provides “volume locking” to ensure exclusive volume access.
Resource Object • IP Address • A switchable network identity for TCP/IP. • Computer Alias Name • A switchable network identity for NetBEUI. • File Share • Shared directories that are accessible by clients. • Both servers must use the same share name.
Resource Object • NT Services • Most application software for Windows NT is implemented as NT services. • User Defined • For configuring application software that is not implemented as NT services. • For grouping related resource objects into a resource hierarchy.
Resource Hierarchy • Each mission critical service is formulated and manipulated as a resource hierarchy
Resource Hierarchy • A resource hierarchy is an integrated entity. • A resource hierarchy identifies the required resource objects and the proper sequence to activate those resource objects. • A single resource object is a generic resource hierarchy.
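One way to picture "the proper sequence to activate those resource objects" is a dependency-ordered list. The sketch below is an assumption about how such ordering could be computed, not a description of NeoCLUSTER internals; `activation_order` and the example dependencies are hypothetical, and stopping in reverse order is likewise an assumption.

```python
def activation_order(depends_on):
    """Return resource objects so that each follows its dependencies.

    depends_on maps a resource object name to the names it requires.
    """
    order = []
    state = {}  # name -> "visiting" or "done"

    def visit(name):
        if state.get(name) == "done":
            return
        if state.get(name) == "visiting":
            raise ValueError(f"circular dependency at {name!r}")
        state[name] = "visiting"
        for dep in depends_on.get(name, ()):
            visit(dep)
        state[name] = "done"
        order.append(name)

    for name in depends_on:
        visit(name)
    return order


# Hypothetical hierarchy: the service needs its volume and network ID first.
hierarchy = {
    "Volume": [],
    "IP Address": [],
    "NT Service": ["Volume", "IP Address"],
}
start_sequence = activation_order(hierarchy)
stop_sequence = list(reversed(start_sequence))  # assumed: deactivate in reverse
```

A hierarchy containing a single resource object yields a one-element sequence, which matches the slide's remark that a single resource object is itself a resource hierarchy.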
Agents • Windows NT executable files • Availability monitoring and error detection • Intelligent and lightweight • Least system resource consumption • Minimum impact on system performance • Efficient and reliable • No critical failure will be neglected • Real-time response to failures to reduce downtime • No false alarms
Agents • Built-in agents • Server, public net, public drives • Resource objects • Agent API and template • Custom agent development • An open interface to communicate and interact with other programmable third party hardware and software management tools
Agent Heartbeat • Periodic messages • Agents send heartbeats to the Cluster Service to inform it of the availability of the resource objects they monitor
Scripts • Windows NT executable files • Auto-initiated • Start a series of programs • Terminate a series of programs • Monitor a series of programs • Trigger event notification programs
Administration Tool • Intuitive and user friendly • Interactive point-and-click Windows GUI • Menu-driven and form-based interface • Icon-based real-time status monitoring • Support dynamic configuration and real-time synchronization • Remote administration using Web browser is freely available from third parties
Availability Recovery • Critical factors of failover/takeover : Volume, NT Service, User Defined • Mechanisms • Failover is initiated by the active server • Takeover is initiated by the backup server • Failover/Takeover • The active server deactivates the corresponding resource hierarchy • The backup server reactivates the resource hierarchy
Availability Recovery • Switch back/Fail back • Switch a resource hierarchy back to the original active server from the backup server • The original active server has recovered • The backup server detects that the active server has recovered • Retain the original load distribution • Asymmetric configuration : active/backup servers with different capacity • Symmetric configuration : two active, mutual takeover
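The failover and switch-back flow can be sketched with two in-memory server objects. This is a toy model under stated assumptions: `Server`, `failover`, and `switch_back` are hypothetical names, and a real cluster would also have to handle an active server that can no longer deactivate anything.

```python
class Server:
    def __init__(self, name):
        self.name = name
        self.active = set()  # resource hierarchies currently running here

    def activate(self, hierarchy):
        self.active.add(hierarchy)

    def deactivate(self, hierarchy):
        self.active.discard(hierarchy)


def failover(hierarchy, active, backup):
    """The active server deactivates the hierarchy; the backup reactivates it."""
    active.deactivate(hierarchy)
    backup.activate(hierarchy)


def switch_back(hierarchy, original, backup, original_recovered, automatic=True):
    """Return the hierarchy to its original server once that server is back.

    With automatic=False nothing moves; an administrator would trigger
    the switch manually instead.
    """
    if original_recovered and automatic and hierarchy in backup.active:
        backup.deactivate(hierarchy)
        original.activate(hierarchy)
        return True
    return False
```

In a symmetric configuration each server would play the `backup` role for the other's hierarchies, so the same two functions cover mutual takeover.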
Clients • Client-end applications will connect to switchable network IDs • No need to reconfigure or modify the client-end applications • Reconnection after a failover operation is application dependent
Clients • Stateless applications • NFS service or UDP-based applications • User transparent • Stateful applications • Client/server RDBMS applications or TCP-based applications • The client applications will lose their connection to the server • Manual reconnection to the server is required
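Since reconnection is application dependent, a stateful client that chooses to automate it could wrap its connect call in a retry loop like the one below. This is a generic sketch, not part of NeoCLUSTER; `connect_with_retry` and its default retry count and delay are assumptions.

```python
import time


def connect_with_retry(connect, attempts=5, delay_s=1.0, sleep=time.sleep):
    """Call connect() until it succeeds or the attempts run out.

    connect: a zero-argument callable that returns a connection or
        raises OSError while the switchable network ID is not yet
        served by the backup server.
    """
    last_error = None
    for attempt in range(attempts):
        try:
            return connect()
        except OSError as exc:
            last_error = exc
            if attempt < attempts - 1:
                sleep(delay_s)  # give the takeover time to complete
    raise ConnectionError(f"gave up after {attempts} attempts") from last_error
```

For a TCP client, `connect` might be `lambda: socket.create_connection((address, port), timeout=3)`, where `address` is the switchable IP address; because that address moves with the resource hierarchy, the client needs no reconfiguration after a failover.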
Supported Applications • File Sharing • Printer Spooler • Internet Servers (FTP, WWW, etc.) • RDBMS (Microsoft, Oracle, Sybase, Informix) • Microsoft Exchange Server, Lotus Notes Server • NT Service-based application software • TCP/IP or NetBEUI-based client/server applications
Future Improvements • Multiple error notification facilities • Server side visual and audio alarm • Message broadcasting • E-mail • Pager • SNMP agent • Simplified GUI • N to 1 cluster configuration
Supported Configurations • Active/Backup
Supported Configurations • Active/Active