CIT 470: Advanced Network and System Administration Servers and Services
Topics SERVERS • Servers vs Desktops • Server Hardware • Different Approaches to Servers SERVICES • Service Requirements • Open Architecture • Service Design Principles
How are Servers different? • 100s or 1000s of clients depend on server. • Requires high reliability. • Requires tighter security. • Often expected to last longer. • Investment amortized over many clients, longer lifetime.
Vendor Product Lines Home • Cheapest purchase price. • Components change regularly based on cost. Business • Focuses on Total Cost of Ownership (TCO). • Slower hardware changes, longer lifetime. Server • Lowest cost per performance metric (NFS, web). • Easy-to-service rack-mountable chassis. • Higher quality (MIL-SPEC) components.
Server Hardware • More internal space. • More CPU/Memory. • More / high-end CPUs. • More / faster memory. • High performance I/O. • PCIe vs PCI • SCSI/FC-AL vs. IDE • Rack mounted. • Redundancy • RAID • Hot-swappable hardware.
Rack Mounting Efficient space utilization. • Simple, rectangular shape measured in rack units (RUs). • Repair and upgrade while mounted in rack. • No side access required. Requirements • Cooling through back, not sides. • Drives in front, cables in back. • Remote management (serial console, hardware sensors).
Server Memory Servers need more memory than desktops. • x86 supports up to 64GB with PAE. • x86-64 allows up to 52 physical address bits (4 PB); shipping CPUs implement fewer. Servers need faster memory than desktops. • Higher memory speeds. • Multiple DIMMs accessed in parallel. • Larger CPU caches.
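The 64GB figure for PAE follows from the width of the physical address, which PAE widens from 32 to 36 bits. A minimal sketch of that arithmetic; the function name is illustrative and not from the slides.

```python
def addressable_gib(address_bits: int) -> int:
    """Memory reachable with the given number of physical address bits, in GiB."""
    return (1 << address_bits) // (1 << 30)


print(addressable_gib(32))  # plain 32-bit x86 physical addressing
print(addressable_gib(36))  # x86 with PAE: 36-bit physical addresses
```

The same function can be applied to whatever physical address width a given CPU actually implements.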
Server CPUs Enterprise Processors • Intel Xeon (x86) • AMD Opteron (x86) • Itanium 2 • Sun UltraSPARC T1 • 4, 6, or 8 cores. • Each with 4 threads. • IBM POWER 5 • dual-core • Each with 2 threads. Figure: POWER 5 MCM with 4 dual-core HT CPUs + 4 36MB L3 cache chips.
Xeon vs Pentium Xeon improvements • Faster L2 cache (Pentium-II/III) • Multiprocessing support (or >2 MP support) • Hyperthreading (before Pentium-4 could) • x86-64 support (before Pentium-4 could) • Larger L2 cache (Pentium-4) • Faster FSB (Pentium-4)
System Buses Servers need high I/O throughput. • Fast peripherals: SCSI-3, Gigabit Ethernet • Often use multiple and/or faster buses. PCI • Desktop: 32-bit 33 MHz, 133 MB/s • Server: 64-bit 66 MHz, 533 MB/s PCI-X (backward compatible) • v1.0: 64-bit 133 MHz, 1.06 GB/s • v2.0: 64-bit 533 MHz, 4.3 GB/s PCI Express (PCIe) • Serial architecture, v2.0 x16 up to 16 GB/s (both directions combined)
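The PCI and PCI-X figures above are peak rates for a parallel bus: bus width in bytes times the transfer rate. A small sketch that reproduces them, assuming one transfer per listed MHz (for PCI-X 2.0 the 533 figure is treated as the effective transfer rate); the function name is illustrative.

```python
def peak_mb_per_s(width_bits: int, clock_mhz: float) -> float:
    """Peak throughput of a parallel bus moving one word per transfer, in MB/s."""
    return width_bits / 8 * clock_mhz


buses = [
    ("PCI desktop", 32, 33.33),
    ("PCI server", 64, 66.66),
    ("PCI-X 1.0", 64, 133.0),
    ("PCI-X 2.0", 64, 533.0),
]
for name, width, clock in buses:
    print(f"{name}: {peak_mb_per_s(width, clock):.0f} MB/s")
```

These are theoretical peaks; sustained throughput on a shared bus is lower.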
Hardware Redundancy Disks are the most likely component to fail. • Use RAID for disk redundancy. • Covered in detail in the Disks lecture. Power supplies are the second most likely to fail. • Use redundant power supplies. • Many servers need 2 power supplies normally. • Need 3 power supplies for redundancy. • Use separate power cord and UPS for each power supply.
Full and n+1 Redundancy n+1 Redundancy: One component can fail, but the system is still functional. • Ex: RAID 5, dual NICs with failover Full Redundancy: Two complete sets of hardware configured with failover mechanism. • Manual: SA switches to the second system on noticing a failure. • Automatic: The second system monitors the first and switches over automatically on failure. • Load-sharing: Both systems serve users, sharing load, but each has capacity to handle entire load on its own. When one fails, other automatically handles entire load.
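Both n+1 redundancy and load-sharing come down to one check: after a component fails, can the remaining ones still carry the whole load? A minimal sketch of that check; the function name and the wattage and request figures are hypothetical.

```python
def survives_failures(units: int, unit_capacity: float, load: float,
                      failures: int = 1) -> bool:
    """True if the units left after `failures` failures can still carry `load`."""
    remaining = max(units - failures, 0)
    return remaining * unit_capacity >= load


# A server drawing 1000 W from 500 W supplies (hypothetical numbers):
print(survives_failures(units=2, unit_capacity=500, load=1000))  # no spare supply
print(survives_failures(units=3, unit_capacity=500, load=1000))  # n+1

# Load-sharing pair: each system must be able to carry the entire load alone.
print(survives_failures(units=2, unit_capacity=1000, load=1000))
```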
Hot-swap Components Hot-swap components • Components can be replaced while running. • Need n+1 redundancy for this to be useful. • Don’t need to schedule a downtime. Issues • Which parts are hot-swappable? • May require a few seconds to reconfigure. • Be sure components are hot-swap, not hot-plug.
Hot Plug and Hot Spare Hot Plug • Electrically safe to replace component. • Part may not be recognized until next reboot. • Requires downtime, unlike hot swap. Hot Spare • Spare component already plugged into system. • System automatically uses hot spare when disk/CPU board etc. fails. • Provides n+2 redundancy.
Separate Administrative Network Reliability • Allows access to machines even when network is down. Performance • Backups require so much bandwidth that they’re often done over their own network. Security • Network security monitoring data and logs sent across network should be secured.
Maintenance Contracts • All machines eventually break. • Vendors offer a variety of maintenance contracts. • Non-critical: Next-day or 2-day contract. • Clusters: If you have many similar hosts (CPU or web farm), then on-site spares may be cheaper than a maintenance contract. • Controlled Model: Use a small number of machine types for all servers, so you can afford a spares kit. • Critical Host: Same-day response or on-site spares. • Highly Critical: On-site technician + duplicate machine.
Data Protection • Avoid desktop backups by storing data on servers. Easy on UNIX, harder on Windows. • Use RAID for server hardware failures. • Mirror root disk, higher RAID levels for data. • Some servers use 16GB Flash drives for root disk. • Doesn’t protect against software mistakes. • Server backups • Use specialized admin network to keep load off main network. • Use specialized tape jukeboxes to fully automate backups of large data servers (DBs, fileservers).
Keep Servers in Data Center Data center necessary for server reliability. • Power (enough power, UPS) • Climate control (temperature, humidity) • Fire protection • High-speed network • Physical security
Server OS Need greater reliability, security than desktop. • Remove unnecessary OS components. • Configure for best security & performance. Install and configure specialized server software. • Server software: web, db, nfs, dns, ldap, etc. • May need monitoring software too. • Configuration: disk space, networking Server OS install should be automated too.
Remote Administration Servers must be accessible remotely. • Allows SA to fix problems quickly at 3am. • Allows SA to work outside machine room. Remote Administration • Serial console and concentrator (UNIX) • Networked KVM (Windows) • Remote power control. • Important to secure remote admin facilities.
Server Appliances Dedicated hardware + software • Fileserver (NetApp, Auspex) • Print servers • Routers Advantages • Performance • Reliability • Easy to set up • Extra capabilities Disadvantages • Cost
Many Inexpensive Workstations Why buy server hardware? • Buy two cheap rack-mount PCs + failover software. • Works if two PCs are cheaper than one server. • Google’s approach with ~450,000 servers.
Blade Servers • High-density servers on a board. • CPU • Memory • Disk • Each blade lives in a blade chassis.
Blade Chassis • Blade chassis provides power, network, and remote management. • Typically hot-swappable, hot-spare. • Conventional rack-mount servers fit at most 1 server per RU. • Blades are higher density, but also require more power and cooling.
Servers vs Services A server is a piece of hardware. A service is the function that is provided by one or more servers.
Services • Services distinguish a structured computing environment from a collection of standalone PCs. • Large orgs are linked through shared services to ease communication and optimize resources. • Typical environments have many services. • Fundamental: net, DNS, email, auth, printing. • Typical: DHCP, backup, directory, file, license. • Services often depend on other services. • Almost everything depends on DNS.
Providing a Service • A service is more than hardware+software. • A service must be • Reliable. • Scalable. • Monitored. • Maintained. • Supported.
Servers and Services For a service to be reliable, servers • Should be as simple as possible. • Should have minimum software to run service. • Should depend on as few other services as possible. • Should depend only on services that are at least as reliable as the service running on the server. • Should have access restricted to SAs. • Should be as few as needed for performance and reliability.
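The advice about dependencies can be made concrete with a rough availability estimate. The sketch below assumes that failures are independent and that the service is down whenever any dependency is down; under those assumptions the availabilities multiply. The function name and the figures are hypothetical.

```python
from math import prod


def combined_availability(own: float, dependencies: list[float]) -> float:
    """Availability of a service that needs itself and every dependency to be up."""
    return own * prod(dependencies)


# A 99.9% service on its own, then depending on a 99% service:
print(combined_availability(0.999, []))
print(combined_availability(0.999, [0.99]))
```

Each added dependency can only lower the result, and a dependency less reliable than the service itself caps the service below its own figure.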
Customer Requirements Customers are the reason for the service. • How do they intend to use it? • What features do they need? • What features would they like to have? • How critical is the service? • What levels of availability and support are needed? Service Level Agreement (SLA) • Enumerates services. • Defines level of support. • Commits to response times for problem types.
Operational Requirements Essential to designing a reliable service • What services does it depend upon? • What other services will depend upon it? • How does it interoperate with other services? • How can it be integrated with auth/dir services? • How does the service scale? • How can the service be upgraded? • Downtime requirements. • What systems are affected?
Open Architecture Service should be built around open standards • Check IETF RFCs to see if it’s an open protocol. • Example service: SMTP • Example products: exim, postfix, qmail, sendmail. • Open standards don’t require open source. Allows vendors to make interoperable products. • Avoids vendor lock-in. • Allows vendor competition (lower prices for you). • Decouples client selection from server selection. • Avoids need for protocol gateways.
Requests for Comments (RFCs) • Documentation for Internet protocols, technologies, and methodologies. • Standards track RFCs describe Internet standards (TCP, IP, SMTP) and must be approved by IETF. • Experimental RFCs may become standards. • Best Current Practice RFCs describe how to run services or use protocols. • Informational RFCs are a catch-all including proprietary protocols, April Fool’s jokes, etc. • Available from http://www.rfc-editor.org/
Principles for Designing a Reliable Service Simplicity • The more features, the more bugs. • Simplicity increases reliability, ease of maintenance. Vendor Relations • Can be helpful about configuring service. • Let vendors compete for your business. • Stick to vendors who develop for your platform.
Machine Independence Will eventually move service to new host. • Want to avoid having a downtime. • Want to avoid reconfiguring every desktop. Use generic DNS alias for machine • Mail server has name romero • DNS alias is smtp Use virtual IP addresses for services not accessed by name • Machine has usual IP address: 192.168.1.54 • Virtual: ifconfig eth0:0 192.168.1.5
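The DNS alias idea can be illustrated with a toy resolver. This is a simulation only, not a real DNS query: the zone is a plain dictionary, and the second host (fulci) and its address are hypothetical additions for the example. Clients ask for smtp, so moving the service means changing one alias record.

```python
# Toy zone data: name -> (record type, value).
# "romero" and "smtp" come from the slide; "fulci" and its address are made up.
zone = {
    "smtp": ("CNAME", "romero"),
    "romero": ("A", "192.168.1.54"),
    "fulci": ("A", "192.168.1.60"),
}


def resolve(name: str, zone: dict, max_hops: int = 8) -> str:
    """Follow CNAME records until an A record is reached."""
    for _ in range(max_hops):
        record_type, value = zone[name]
        if record_type == "A":
            return value
        name = value  # CNAME: continue with the alias target
    raise RuntimeError("CNAME chain too long")


print(resolve("smtp", zone))        # clients only ever ask for "smtp"
zone["smtp"] = ("CNAME", "fulci")   # migrate the mail service to a new host
print(resolve("smtp", zone))        # same name, new machine, no client changes
```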
Dedicated Machines Put each service on its own machine(s). • If a server crashes, only impacts one service. • Easier to debug if only one service running. • Performance tuning easier with one service. • If you can’t afford a new machine, use a VM.
Environment Safe environment • Improves reliability: AC, UPS, physical security. • Data center usually provides faster network too. • Only rely on services provided by data center. Restricted access • Customers should not need to log in to servers. • More logins decrease stability, performance. • Even Windows can be stable without user logins.
Principles for Designing a Reliable Service Service components should be tightly coupled. • Other than redundant components. • Share same power source, network. • Reduces service dependencies (single points of failure). Centralize management of service • Managed by one set of SAs. • Support for service by single helpdesk. • Document service.
Performance Latency vs throughput • Latency is the delay before data is received. • Throughput is how much data is sent per second. • A performance problem typically affects only one of the two. • Improving the other will not solve your problem. Remote sites • May have high latency to main site. • Do you need secondary servers at remote sites?
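A simple model shows why improving the wrong one does not help: the time to deliver a response is roughly the latency plus the size divided by the throughput. The sketch below uses that simplified model with hypothetical link figures; real protocols add round trips and other overhead.

```python
def transfer_time_s(size_bytes: float, latency_s: float,
                    throughput_bytes_per_s: float) -> float:
    """Simplified delivery time: fixed delay plus time to push the bytes."""
    return latency_s + size_bytes / throughput_bytes_per_s


# Small 1 KB reply over a remote link with 100 ms latency (hypothetical numbers).
slow_link = transfer_time_s(1_000, 0.100, 1_000_000)
fast_link = transfer_time_s(1_000, 0.100, 10_000_000)  # 10x the throughput
near_link = transfer_time_s(1_000, 0.010, 1_000_000)   # 10x lower latency

print(f"baseline:       {slow_link * 1000:.1f} ms")
print(f"10x throughput: {fast_link * 1000:.1f} ms")
print(f"10x less delay: {near_link * 1000:.1f} ms")
```

For small requests the latency term dominates, which is the case for placing secondary servers at remote sites.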
Capacity Planning Estimate capacity from testing. • Test server at 100 qps, 200 qps, and so on until it slows down. • Identify resources used by each query: • RAM • Disk • Network • CPU Can service be split onto multiple servers? • Can it be done without users noticing?
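Once the per-query resource use has been measured, a rough capacity estimate is the smallest ratio of available capacity to per-query cost across the resources. A minimal sketch of that calculation; the resource names and all numbers are hypothetical, and it covers only rate-type resources (RAM would need a separate check against concurrent queries).

```python
def max_qps(capacity: dict[str, float],
            per_query: dict[str, float]) -> tuple[str, float]:
    """Return (bottleneck resource, sustainable queries per second)."""
    limits = {
        resource: capacity[resource] / cost
        for resource, cost in per_query.items()
        if cost > 0
    }
    bottleneck = min(limits, key=limits.get)
    return bottleneck, limits[bottleneck]


# Per-second capacity of one server and measured cost of one query (made up).
capacity = {"cpu_ms": 4000, "disk_iops": 500, "net_kb": 100_000}
per_query = {"cpu_ms": 5, "disk_iops": 2, "net_kb": 20}

resource, qps = max_qps(capacity, per_query)
print(f"bottleneck: {resource}, about {qps:.0f} qps")
```

An estimate like this should be checked against the stepped load test, since it ignores contention and queueing effects.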
Principles for Designing a Reliable Service Monitoring • Availability, problems, performance. • Auto-alert front line support. • Customers shouldn’t discover problems before SA. • Capacity planning: CPU, mem, disk, network, licenses. Service Rollout • First impressions are difficult to change. • Be ready for support: docs, trained helpdesk. • Use one, some, many technique.