High Availability Software for Windows NT NeoCLUSTER

micahs
Presentation Transcript


  1. High Availability Software for Windows NT: NeoCLUSTER • WHY : Demand for Availability • WHAT : Technology and Product • HOW : Configuration

  2. Demand for Availability • Information is a capital asset of an organization. • The server systems for archiving, processing, and conveying information must be constantly monitored and carefully managed to provide reliable, timely, and continuous services. • Downtime is inevitable, both scheduled and unscheduled.

  3. Trends • Distributed processing and multi-tier client/server applications • Multiple servers collaborate to improve • Load sharing • Performance • Availability • Windows NT is becoming a major server platform for mission-critical applications.

  4. Factors of System Availability • CPU, memory, I/O cards: 24% • Disk: 27% • Application software: 22% • Common hardware & system software: 21% • Human error: 6% • Source : Strategic Research Division of Find SPV

  5. System Availability Hierarchy • Diagram layers: Applications, Hosts, I/O Paths, Storage Subsystem, Disk & Tape • Diagram axes: Cost, Technology

  6. Ask Your Customers • Do you use Windows NT as the platform for your mission-critical applications? • Does system downtime mean losses to you? • Do you need a technically and economically affordable solution to make your NT servers fault resilient? • Do you need a guardian angel to watch over your NT servers around the clock so that you can sleep better at night?

  7. YES!!

  8. Level of System Availability • Non-stop systems: Stratus, Tandem, Netware SFT III • Tightly coupled, fully duplicated configuration • Proprietary OS • Non-redundant systems • Hot-plug and self-diagnostic hardware components • Auto-retry and pro-active software

  9. Level of System Availability • High Availability systems • A cluster of loosely coupled servers • Software based implementation • Provide better availability/price ratio than non-stop systems

  10. Cluster • Server farm : Single Network Identity • Database Cluster : Cluster Manager, Distributed Lock Manager • Computing Cluster : Parallel Computing

  11. Realtime Data Replication

  12. NeoCLUSTER • A pure software solution for building highly available server clusters • Microsoft Windows NT Server 4.0, Standard Edition • i386 and Alpha platforms • Functions • Cluster configuration and administration • Failure detection, logging, notification, isolation, and recovery

  13. Features • Technically and economically affordable • Fully compatible with Windows NT • Requires no software modification or proprietary hardware • No single point of failure • Reliable and efficient mechanism for error detection and fault recovery

  14. Features • Intuitive and user friendly Windows GUI • Fully user configurable • Support automatic and manual switch back • Negligible impact on resource consumption and server performance. • Minimum human intervention • No intrusion to routine workflow

  15. Operation Scenario: Hardware Perspective

  16. Servers • The Active Server is a pre-designated computer responsible for providing the critical services guarded by NeoCLUSTER. • The Backup Server is a pre-designated computer that takes over from the active server under the administration of NeoCLUSTER. • Neither identically configured servers nor a dedicated backup server is required.

  17. Private Network • Dedicated interconnect for inter-server communication. • Three types of interconnect for redundancy • TCP/IP : back-to-back or LAN connection of two network interface cards • RS-232 : serial cable with null modem support to connect two COM ports • Disk volume : two dedicated partitions on the shared disks

  18. Private Network • If all instances of the private net are unavailable • A server can still rely on the public net to detect the availability of the peer server. • If the peer server is still available, no takeover action is triggered. • If the peer server is unavailable, a takeover action is activated immediately.
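The decision rule on this slide can be written as a small function. This is a minimal sketch, not NeoCLUSTER code: the names `Action`, `decide`, `heartbeat_seen_on_private`, and `peer_seen_on_public` are hypothetical, and the sketch assumes that a heartbeat arriving on any one private interconnect is enough to consider the peer alive.

```python
from enum import Enum


class Action(Enum):
    NONE = "none"
    TAKEOVER = "takeover"


def decide(heartbeat_seen_on_private, peer_seen_on_public):
    """Return the action the backup server should take.

    heartbeat_seen_on_private: one bool per private interconnect
        (for example TCP/IP, RS-232, disk volume); hypothetical input.
    peer_seen_on_public: True if the peer still answers on the public net.
    """
    if any(heartbeat_seen_on_private):
        # At least one private interconnect still carries heartbeats.
        return Action.NONE
    # All private interconnects are silent: fall back to the public net.
    if peer_seen_on_public:
        return Action.NONE
    return Action.TAKEOVER


if __name__ == "__main__":
    print(decide([False, False, False], peer_seen_on_public=False))
```

Checking the public net before acting is what keeps a broken interconnect cable from being mistaken for a dead server.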

  19. Public Network • Dedicated network for clients to access servers. • TCP/IP and NetBEUI protocols • Each active server carries a switchable network ID (i.e., IP address or computer name). • The original network IDs of both servers can remain intact. • Clients connect to the switchable network ID. • If the active server is unavailable, the backup server takes over the switchable network ID.

  20. Public Network • NeoCLUSTER provides a built-in mechanism to identify network failures. • Self-diagnosis of network availability • Supported NICs : Intel EtherExpress PRO/100B, 3Com 3C905B, DEC 21x4x. • Supported NIC add-on software : NIC Express from IPMetrics (load balancing and fault tolerance).

  21. Private Drives and Public Drives • Private drives are disk volumes for storing OS and the data that is not required to be accessible by the backup server. • Public drives are disk volumes on the shared disks for storing the application software and related data that must be accessible by the backup server. • Shared SCSI bus or independent host channels • Mirroring or RAID subsystems.

  22. Clients • Computer systems that access the active servers via TCP/IP or NetBEUI protocols.

  23. Operation Scenario: Software Perspective • Block diagram (components: Administration Tool, Resource Object, Cluster Service, Cluster Monitor Service, Agent, Script, Windows NT)

  24. Operation Scenario: Software Perspective • Module interaction of NeoCLUSTER (diagram: Active Server and Backup Server, each running a Cluster Service; Cluster Monitor Service, Agent, and Resource Object; links labeled Server Heartbeat, Agent Heartbeat, and Resource Monitoring)

  25. Cluster Service and Cluster Monitor Service • The core processes of NeoCLUSTER • Two NT services that guard each other • User-transparent auto-restart • Functions • Resource object management • Event logging and notification • Fault isolation and recovery

  26. Server Heartbeat • Periodic messages • Servers exchange heartbeats with each other over the private net • Each heartbeat informs the receiving server of the availability of the sending server
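A receiver can turn periodic heartbeats into an availability verdict by remembering when the last one arrived. The sketch below is illustrative only: `HeartbeatTracker`, the one-second interval, and the limit of three missed heartbeats are assumptions, not values taken from the product.

```python
class HeartbeatTracker:
    """Tracks heartbeats received from the peer server (hypothetical helper)."""

    def __init__(self, interval_s=1.0, missed_limit=3):
        # The peer is declared unavailable after this much silence.
        self.timeout_s = interval_s * missed_limit
        self.last_seen = None

    def record(self, now):
        """Call whenever a heartbeat arrives; `now` is a timestamp in seconds."""
        self.last_seen = now

    def peer_available(self, now):
        """True if a heartbeat arrived within the timeout window."""
        if self.last_seen is None:
            return False
        return (now - self.last_seen) <= self.timeout_s


if __name__ == "__main__":
    tracker = HeartbeatTracker()
    tracker.record(now=10.0)
    print(tracker.peer_available(now=12.5))
    print(tracker.peer_available(now=13.5))
```

Passing the timestamp in explicitly, rather than reading a clock inside the class, keeps the logic easy to test.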

  27. Resource Object • Components of mission critical services • Repository of service related files : Volume • Switchable network identity for clients to access the services : IP Address or Computer Alias Name • The service itself : File Share, NT Services, or User Defined

  28. Resource Object • Volume • Disk partitions on the public drives. • The drive letter mapping and partition information of a volume must be identical when viewed from both servers. This ensures that no matter which server is the active server, the volume can be accessed with the same drive letter. • NeoCLUSTER provides “volume locking” to ensure exclusive volume access.
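The slide does not say how "volume locking" is implemented. As a purely conceptual sketch of exclusive access, the hypothetical `VolumeLock` class below lets only one server own a volume at a time; the real mechanism works at the disk level and is not shown here.

```python
class VolumeLock:
    """In-memory illustration of exclusive volume ownership (hypothetical)."""

    def __init__(self):
        self.owner = None  # name of the server that currently holds the volume

    def acquire(self, server):
        """Grant the volume if it is free or already held by `server`."""
        if self.owner in (None, server):
            self.owner = server
            return True
        return False

    def release(self, server):
        """Only the current owner may release the volume."""
        if self.owner == server:
            self.owner = None
            return True
        return False


if __name__ == "__main__":
    lock = VolumeLock()
    print(lock.acquire("active"))   # the active server takes the volume
    print(lock.acquire("backup"))   # refused while the active server holds it
```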

  29. Resource Object • IP Address • A switchable network identity for TCP/IP. • Computer Alias Name • A switchable network identity for NetBEUI. • File Share • Shared directories that are accessible by clients. • Both servers must use the same share name.

  30. Resource Object • NT Services • Most application software for Windows NT is implemented as NT services. • User Defined • For configuring application software that is not implemented as NT services. • For grouping related resource objects into a resource hierarchy.

  31. Resource Hierarchy • Each mission critical service is formulated and manipulated as a resource hierarchy

  32. Resource Hierarchy • A resource hierarchy is an integrated entity. • A resource hierarchy identifies the required resource objects and the proper sequence to activate those resource objects. • A single resource object is a generic resource hierarchy.
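One way to picture "the proper sequence to activate those resource objects" is a dependency graph whose topological order gives the activation sequence. The example below is a sketch under assumptions: the resource names and their dependencies are made up, deactivating in reverse order is an assumption rather than something the slides state, and `graphlib` requires Python 3.9 or later.

```python
from graphlib import TopologicalSorter

# Hypothetical hierarchy: each resource object maps to the objects
# that must be active before it can be activated.
hierarchy = {
    "volume_E": set(),
    "ip_address": set(),
    "file_share": {"volume_E"},
    "nt_service": {"volume_E", "ip_address"},
}


def activation_order(h):
    """Dependencies first, dependents last."""
    return list(TopologicalSorter(h).static_order())


def deactivation_order(h):
    """Assumed to be the reverse of activation: dependents stop first."""
    return list(reversed(activation_order(h)))


if __name__ == "__main__":
    print(activation_order(hierarchy))
    print(deactivation_order(hierarchy))
```

A hierarchy containing a single resource object yields a one-element order, which matches the slide's remark that a single resource object is itself a hierarchy.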

  33. Agents • Windows NT executable files • Availability monitoring and error detection • Intelligent and lightweight • Least system resource consumption • Minimum impact on system performance • Efficient and reliable • No critical failure will be neglected • Real-time response to failure to reduce downtime • No false alarms

  34. Agents • Built-in agents • Server, public net, public drives • Resource objects • Agent API and template • Custom agent development • An open interface to communicate and interact with other programmable third party hardware and software management tools

  35. Agent Heartbeat • Periodic messages • Agents send heartbeats to the Cluster Service to inform it of the availability of the resource object each agent monitors
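From the agent's side, the heartbeat can be pictured as a probe followed by a message. This is a hedged sketch, not the Agent API mentioned earlier: the `Agent` class, the `probe` and `send` callables, and the message fields are hypothetical, and it assumes an agent simply stays silent when its resource object is unhealthy.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class Agent:
    resource: str                  # name of the monitored resource object
    probe: Callable[[], bool]      # returns True if the resource is healthy
    send: Callable[[dict], None]   # delivers a heartbeat to the Cluster Service
    seq: int = 0                   # sequence number of the last heartbeat sent

    def tick(self) -> bool:
        """Run one monitoring cycle; send a heartbeat only if healthy."""
        healthy = self.probe()
        if healthy:
            self.seq += 1
            self.send({"resource": self.resource, "seq": self.seq})
        return healthy


if __name__ == "__main__":
    inbox = []  # stands in for the Cluster Service's receive queue
    agent = Agent("nt_service", probe=lambda: True, send=inbox.append)
    agent.tick()
    print(inbox)
```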

  36. Scripts • Windows NT executable files • Auto-initiated • Start a series of programs • Terminate a series of programs • Monitor a series of programs • Trigger event notification programs

  37. Administration Tool • Intuitive and user friendly • Interactive point-and-click Windows GUI • Menu-driven and form-based interface • Icon-based real-time status monitoring • Supports dynamic configuration and real-time synchronization • Remote administration using a Web browser is freely available from third parties

  38. Administration Tool

  39. Availability Recovery • Critical factors of failover/takeover : Volume, NT Service, User Defined • Mechanisms • Failover is initiated by the active server • Takeover is initiated by the backup server • Failover/Takeover • The active server deactivates the corresponding resource hierarchy • The backup server reactivates the resource hierarchy
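The difference between the two mechanisms can be sketched with a toy model. Everything here is illustrative: the `Server` class, the `failover` and `takeover` functions, and the resource names are hypothetical, and the sketch assumes that in a takeover the failed active server cannot deactivate anything itself.

```python
class Server:
    """Toy server that records which resource objects it has active."""

    def __init__(self, name):
        self.name = name
        self.active = []  # resource objects in activation order

    def activate(self, resources):
        for r in resources:
            self.active.append(r)

    def deactivate(self, resources):
        # Stop in reverse order so dependents go down before dependencies.
        for r in reversed(resources):
            self.active.remove(r)


def failover(active, backup, resources):
    """Initiated by the active server: hand the hierarchy over cleanly."""
    active.deactivate(resources)
    backup.activate(resources)


def takeover(backup, resources):
    """Initiated by the backup server when the active server is unavailable."""
    backup.activate(resources)


if __name__ == "__main__":
    a, b = Server("A"), Server("B")
    hierarchy = ["volume", "ip_address", "nt_service"]
    a.activate(hierarchy)
    failover(a, b, hierarchy)
    print(a.active, b.active)
```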

  40. Availability Recovery • Switch back/Fail back • Switch a resource hierarchy back to the original active server from the backup server • The original active server has recovered • The backup server detects that the active server has recovered • Retain the original load distribution • Asymmetric configuration : active/backup servers with different capacity • Symmetric configuration : two active, mutual takeover
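The conditions for switch back can be collected into one predicate. This is a sketch under assumptions: the function name, its parameters, and the `"automatic"` and `"manual"` mode strings are hypothetical, chosen to mirror the automatic and manual switch back listed among the features.

```python
def switch_back_due(current_owner, home_server, home_recovered,
                    mode, operator_confirmed=False):
    """Decide whether a resource hierarchy should return to its home server.

    current_owner: server currently running the hierarchy
    home_server: the original active server for this hierarchy
    home_recovered: True once the backup detects the home server is back
    mode: "automatic" or "manual" (hypothetical setting names)
    operator_confirmed: in manual mode, True once an administrator approves
    """
    if current_owner == home_server:
        return False  # already on its home server
    if not home_recovered:
        return False  # nothing to switch back to yet
    if mode == "automatic":
        return True
    if mode == "manual":
        return operator_confirmed
    raise ValueError(f"unknown switch-back mode: {mode!r}")


if __name__ == "__main__":
    print(switch_back_due("B", "A", True, "automatic"))
    print(switch_back_due("B", "A", True, "manual"))
```

In a symmetric configuration each server is the home server for its own hierarchies, so the same predicate applies in both directions.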

  41. Clients • Client-end applications will connect to switchable network IDs • No need to reconfigure or modify the client-end applications • Reconnection after a failover operation is application dependent

  42. Clients • Stateless applications • NFS service or UDP-based applications • User transparent • Stateful applications • Client/server RDBMS applications or TCP-based applications • The client applications will lose their connection to the server • Manual reconnection to the server is required
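Because the switchable network ID stays the same after a failover, a stateful client only needs to reconnect to the same address. The slides describe this as a manual step; the sketch below shows how a client application could automate it, which is an assumption and not a NeoCLUSTER feature. The `reconnect` helper and its parameters are hypothetical, and `connect` stands for whatever call the application uses to open its session.

```python
import time


def reconnect(connect, attempts=5, delay_s=1.0, sleep=time.sleep):
    """Retry `connect()` until it succeeds or the attempts run out.

    connect: zero-argument callable that returns a session or raises OSError
    sleep: injectable so the wait can be skipped in tests
    """
    last_error = None
    for attempt in range(1, attempts + 1):
        try:
            return connect()
        except OSError as exc:
            last_error = exc
            if attempt < attempts:
                sleep(delay_s)  # give the backup server time to take over
    raise ConnectionError(f"no connection after {attempts} attempts") from last_error


if __name__ == "__main__":
    calls = []

    def flaky():
        # Fails twice, as if the failover were still in progress.
        calls.append(1)
        if len(calls) < 3:
            raise ConnectionRefusedError("server unavailable")
        return "session"

    print(reconnect(flaky, delay_s=0))
```

Any in-flight transaction is still lost in this model; only the connection itself is re-established.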

  43. Supported Applications • File Sharing • Printer Spooler • Internet Servers (FTP, WWW, etc.) • RDBMS (Microsoft, Oracle, Sybase, Informix) • Microsoft Exchange Server, Lotus Notes Server • NT Service-based application software • TCP/IP or NetBEUI-based client/server applications

  44. Future Improvements • Multiple error notification facilities • Server side visual and audio alarm • Message broadcasting • E-mail • Pager • SNMP agent • Simplified GUI • N to 1 cluster configuration

  45. Supported Configurations • Active/Backup

  46. Supported Configuration • Active/Active

  47. Supported Configuration
