CIT 470: Advanced Network and System Administration
Administration Fundamentals

Topics
• The Nature of System Administration
• Organizations and Certifications
• Change Management
• Remote Administration

What is a system?
System: An organized collection of computers interacting with a group of users.
[Diagram labels: Servers, PCs, Network; "run on"; Services; Users; "help to accomplish work"]

System State
• System policy: specification of a system’s configuration and its acceptable usage.
• System state S(t): the current configuration (files, kernel, memory or CPU usage) of a system.
• Ideal states S*(t): states of the system that match the system policy.
• Over time, the system state shifts away from the ideal state.
• System administration: modifying the system to bring it closer to S*(t).

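The gap between S(t) and S*(t) can be made concrete with a small sketch. It assumes, purely for illustration, that both the policy and the observed state are flat dictionaries of settings; the setting names and values below are hypothetical and do not come from the slides.

```python
from typing import Any, Dict, Tuple


def drift(state: Dict[str, Any], ideal: Dict[str, Any]) -> Dict[str, Tuple[Any, Any]]:
    """Return {setting: (actual, expected)} for each setting that differs from policy."""
    differences = {}
    for setting, expected in ideal.items():
        actual = state.get(setting)  # None if the setting is absent from the state
        if actual != expected:
            differences[setting] = (actual, expected)
    return differences


# Hypothetical policy S*(t) and observed state S(t).
ideal = {"sshd_running": True, "ntp_server": "time.example.org", "root_login": False}
state = {"sshd_running": True, "ntp_server": "time.example.org", "root_login": True}

for setting, (actual, expected) in drift(state, ideal).items():
    print(f"{setting}: is {actual!r}, policy says {expected!r}")
```

In this view, administration is whatever work makes `drift(state, ideal)` empty again.
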
What do sysadmins do?
Small org: sysadmin can be entire IT staff
• Phone support
• Order and install software and hardware
• Fix anything that breaks from phones to servers
• Develop software
Large org: sysadmin is one of many IT staff
• Specialists instead of “jack of all trades”
• Database admin, Network admin, Fileserver admin, Help desk worker, Programmers, Logistics

Common Activities
• Add and remove users.
• Add and remove hardware.
• Perform backups.
• Install new software systems.
• Troubleshooting.
• System monitoring.
• Auditing security.
• Help users.
• Communicate.

User Management
Creating user accounts
• Consistency requires automation
• Startup (dot) files
Namespace management
• Usernames and UIDs
• Multiple namespaces or SSI?
Removing user accounts
• Consistency requires automation
• Many accounts across different systems

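One reason consistency across systems requires automation is that the same username can end up with different UIDs on different hosts. The sketch below is a minimal illustration, assuming the account data has already been collected into a dictionary keyed by host; the host names, usernames, and UIDs are hypothetical.

```python
from collections import defaultdict
from typing import Dict, List


def uid_conflicts(hosts: Dict[str, Dict[str, int]]) -> Dict[str, List[int]]:
    """Map each username that has more than one UID to its sorted list of UIDs."""
    seen = defaultdict(set)
    for users in hosts.values():
        for name, uid in users.items():
            seen[name].add(uid)
    return {name: sorted(uids) for name, uids in seen.items() if len(uids) > 1}


# Hypothetical data: username -> UID as found on each host.
accounts = {
    "web1": {"alice": 1001, "bob": 1002},
    "db1": {"alice": 1001, "bob": 1005},
}
print(uid_conflicts(accounts))
```
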
Hardware Management
Adding and removing hardware
• Configuration, cabling, etc.
Purchase
• Evaluate and purchase servers + other hardware
Capacity planning
• How many servers? How much bandwidth, storage?
Data Center management
• Power, racks, environment (cooling, fire alarm)
Virtualization
• When can virtual servers be used vs. physical?

Backups
Backup strategy and policies
• Scheduling: when and how often?
• Capacity planning
• Location: on-site vs. off-site.
Monitoring backups
• Checking logs
• Verifying media
Performing restores when requested

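Scheduling and capacity planning interact: the schedule determines how much backup storage is needed. The slides do not specify a scheme, so the sketch below assumes one for illustration: a weekly full backup plus six daily incrementals, with whole weeks retained. All sizes are hypothetical.

```python
def backup_storage_gb(full_gb: float, daily_change_gb: float, weeks_retained: int) -> float:
    """Estimate storage for weekly fulls plus six daily incrementals per week.

    Assumes each incremental is roughly the size of one day's changed data
    and ignores compression and deduplication.
    """
    per_week = full_gb + 6 * daily_change_gb
    return weeks_retained * per_week


# Hypothetical site: 500 GB of data, 20 GB changes per day, 4 weeks kept.
print(backup_storage_gb(full_gb=500, daily_change_gb=20, weeks_retained=4))
```
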
Software Installation
Automated consistent OS installs
• Desktop vs. server OS image needs.
Installation of software
• Purchase, find, or build custom software.
Managing software installations
• Distributing software to multiple hosts.
• Managing multiple versions of a software package.
Patching and updating software

Troubleshooting
Problem identification
• By user notification
• By log files or monitoring programs
Tracking and visibility
• Ensure users know you’re working on the problem
• Provide an ETA if possible
Finding the root cause of problems
• Provide a temporary solution if necessary
• Solve the root problem to eliminate it permanently

System Monitoring
Automatically monitor systems for
• Problems (disk full, error logs, security)
• Performance (CPU, mem, disk, network)
Provides data for capacity planning
• Determine need for resources
• Establish case to bring to management

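A "disk full" check is one of the simplest monitors to automate. The sketch below uses Python's standard `shutil.disk_usage`; the 90% threshold and the function names are assumptions chosen for illustration, not values from the slides.

```python
import shutil


def percent_used(used: int, total: int) -> float:
    """Percentage of capacity in use; 0.0 for a zero-size filesystem."""
    if total == 0:
        return 0.0
    return 100.0 * used / total


def disk_is_nearly_full(path: str, threshold_pct: float = 90.0) -> bool:
    """True if the filesystem holding `path` is at or above the threshold."""
    usage = shutil.disk_usage(path)  # named tuple: total, used, free (in bytes)
    return percent_used(usage.used, usage.total) >= threshold_pct


if disk_is_nearly_full("."):
    print("WARNING: filesystem is nearly full")
```

Logging `percent_used` over time, rather than only alerting, is what produces the data for capacity planning.
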
Helping Users
Request tracking system
• Ensures that you don’t forget problems.
• Ensures users know you’re working on their problem; reduces interruptions, status queries.
• Lets management know what you’ve done.
User documentation and training
• Policies and procedures
Schedule and communicate downtimes

Communicate
Customers
• Keep the customer apprised of progress.
  • When you’ve started working on a request, with an ETA.
  • When you make progress or need feedback.
  • When you’re finished.
• Communicate system status.
  • Uptime, scheduled downtimes, failures.
• Meet regularly with customer managers.
Managers
• Meet regularly with your manager.
• Write weekly status reports.

Specialized Skills
• Heterogeneous Environments: Integrating multiple OSes, hardware types, or network protocols; distributed sites.
• Databases: SQL RDBMS.
• Networking: Complex routing, high speed networks, voice.
• Security: Firewalls, authentication, NIDS, cryptography.
• Storage: NAS, SANs, cloud storage.
• Virtualization and Cloud Computing: VMware, cloud architectures.

Qualities of a Successful Sysadmin
Customer oriented
• Ability to deal with interrupts, time pressure
• Communication skills
• Service provider, not system police
Technical knowledge
• Hardware, network, and software knowledge
• Proficiency with at least one scripting language
• Debugging and troubleshooting skills
Time management
• Automate everything possible.
• Ability to prioritize tasks: urgency and importance.

First Steps to Better SA
Use a request system.
• Customers know what you’re doing.
• You know what you’re doing.
Manage quick requests right
• Handle emergencies quickly.
• Use request system to avoid interruptions.
Policies
• How do people get help?
• What is the scope of responsibility for SA team?
• What is our definition of emergency?
Start every host in a known state.

Principles of SA
Simplicity
• Choose the simplest solution that solves the entire problem.
• Work towards a predictable system.
Clarity
• Choose a straightforward solution that’s easy to change, maintain, debug, and explain to other SAs.
Generality
• Choose reusable solutions that scale up; use open protocols.
Automation
• Use software to replace human effort.
Communication
• Be sure that you’re solving the right problems and that people know what you’re doing.
Basics First
• Solve basic infrastructure problems before advanced ones.

Organizations
• USENIX: Advanced Computing Systems Association
• LISA: Large Installation System Administration
• SAGE: System Administrators Guild
• LOPSA: League of Professional System Administrators

Types of Sites
Small
• 2-10 computers, 1 OS, 2-20 users.
• Small staff size requires outsourcing to obtain most specialized skills.
Midsized
• 11-100 computers, 1-3 OSes, 21-100 users.
Large
• 100+ computers, multiple OSes, 100+ users.
• Outsources to reduce costs, some specializations.

Certifications
• CCNA, CCNP, CCIE (Cisco)
• cSAGE (SAGE)
• MCSA (Microsoft)
• RHCE (Red Hat)
• SCSA (Sun)
• VCP (VMware)

SAGE Job Descriptions
• Novice: OS familiarity, help desk skills.
• Junior: Can use OS system administration tools (370).
• Intermediate: Understanding of distributed computing, common servers, automate small tasks, independent action.
• Senior: Understanding of scaling issues, including capacity planning, solve problems by addressing root cause, higher level programming abilities, write proposals for purchasing, data center planning, etc.

SA Maturity Model (SAMM)
• Ad Hoc: Ad-hoc non-repeatable solutions, firefighting.
• Repeatable: Some repeatable processes.
• Defined: Documented standard processes.
• Managed: Process effectiveness measured, adapted.
• Optimized

Maturity and Complexity
[Figure: chart of maturity against complexity (increasing numbers of systems and/or services). Region labels:]
• Low downtime, high efficiency
• Scalable but time lost in process
• Constant firefighting, high downtime
• Works, but hard to scale up

Tool Maturity Levels
• Ad Hoc: OS GUI, CLI, or web administration interfaces.
• Repeatable: Version control (RCS, SVN, Git), request tracker.
• Defined: Automatic monitoring (Nagios, monit, god).
• Managed: Configuration management (AutomateIt, cfengine).
• Optimized

SAGE Code of Ethics
• Professionalism
• Personal Integrity
• Privacy
• Laws and Policies
• Communication
• System Integrity
• Education
• Social Responsibility
http://www.sage.org/ethics/

Terry Childs Case
Network administrator for San Francisco
• CCIE who built city’s FiberWAN network
Terry was only person with router passwords
• IT department acknowledges knowing that
• He was on-call 24x7x365 to resolve issues
Terry refused to give passwords to boss
• Cited fears that they would be misused by management, outside contractors.
What was the right thing for Terry to do?

Change Management
Effective planning and implementation of changes to systems.
Changes should
• Be well documented.
• Have a backout plan.
• Be reproducible.

Why do we need Change Management?
March 26-29, 2006: BART trains halted to avoid running into each other when computer systems crashed.
• Crashes on Monday/Tuesday resulted from software maintenance upgrades.
• Crash on Wednesday resulted from installing a backup system to avoid future crashes.
• Thousands of passengers stranded for several hours each time.

Change Management
• Plan change.
• Test change on single system.
• Test change on multiple systems.
• File a change request.
• Change committee approves request.
• Schedule change.
• Communication with users/admins.
• Change systems at scheduled time.
• Post-event analysis.

Testing Changes
• Automated checks.
• Sanity checks like Samba testparm.
• Reboot system.
• Test on one system first.
• Then test on set of systems.
• Dedicated test systems.
• System admin workstations.
• Virtual machines.

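"Test on one system first, then on a set of systems" can be sketched as a staged rollout that stops at the first unhealthy stage. This is only an outline under stated assumptions: `apply_change` and `healthy` are hypothetical callables supplied by the caller, and the stage sizes are arbitrary.

```python
from typing import Callable, List, Sequence, Tuple


def staged_rollout(
    hosts: Sequence[str],
    apply_change: Callable[[str], None],
    healthy: Callable[[str], bool],
    first_stage: int = 1,
    batch_size: int = 5,
) -> Tuple[List[str], List[str]]:
    """Change a first small stage, then the rest in batches.

    Returns (hosts changed and verified, hosts in the stage that failed).
    Hosts in later stages are left untouched once a stage fails.
    """
    stages = [list(hosts[:first_stage])]
    for start in range(first_stage, len(hosts), batch_size):
        stages.append(list(hosts[start:start + batch_size]))

    verified: List[str] = []
    for stage in stages:
        for host in stage:
            apply_change(host)
        if not all(healthy(host) for host in stage):
            return verified, stage  # stop here; this stage needs backing out
        verified.extend(stage)
    return verified, []
```

A failed stage returns early, so the damage is limited to that stage rather than the whole site.
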
When do you need a Change Proposal?
Does the change impact critical services?
Critical machines/services
• Business critical: e-commerce server, etc.
• Essential services: routers, DNS, NFS, auth.
Non-critical machines/services
• Individual desktops
• Internal news web server

Change Proposal
• Description of the change.
• Systems impacted by change.
• Why the change is being made.
• Risks presented by the change.
• Test procedure.
• Backout plans.
• How long the change will require.

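The proposal sections listed above map naturally onto a record with a completeness check. The following is a minimal sketch; the class and field names are illustrative choices, not part of any standard change-management tool.

```python
from dataclasses import dataclass, fields
from typing import List


@dataclass
class ChangeProposal:
    """One field per section of the change proposal."""
    description: str = ""
    systems_impacted: str = ""
    reason: str = ""
    risks: str = ""
    test_procedure: str = ""
    backout_plan: str = ""
    duration: str = ""


def missing_sections(proposal: ChangeProposal) -> List[str]:
    """Names of sections that are still empty, in declaration order."""
    return [f.name for f in fields(proposal) if not getattr(proposal, f.name).strip()]


draft = ChangeProposal(description="Upgrade DNS server", reason="Security patch")
print(missing_sections(draft))
```

A change committee could refuse to review any proposal for which `missing_sections` is non-empty.
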
Communication
Communicate change to impacted people.
• What change is being made (nontechnical).
• Which services will be unavailable.
• When and how long will they be unavailable.
• What actions do they need to take (if any).
Communication issues
• If you send too many notes, they’ll be ignored.
• Send notices only to those impacted.
• Push critical notices; use pull for non-critical.

Scheduling

Change Freezes
Time when only minor updates can be done.
• End of quarter or year.
• “Crunch time” for projects.

Backing Out
Decide back-out conditions before downtime
• Avoid the “just 5 more minutes” problem.
• Be sure that someone is keeping track of time.
Questions:
• How much time is required for back out?
• When is the latest time you can successfully back out?
• Will backing out this change prevent other changes from being committed?

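The "latest time you can successfully back out" is simple arithmetic: the end of the maintenance window minus the time a back-out takes, minus any safety margin. The sketch below works that out; the window, durations, and margin are hypothetical values.

```python
from datetime import datetime, timedelta


def latest_backout_start(
    window_end: datetime,
    backout_duration: timedelta,
    safety_margin: timedelta = timedelta(0),
) -> datetime:
    """Last moment at which a back-out can begin and still finish in the window."""
    return window_end - backout_duration - safety_margin


def must_back_out(now: datetime, window_end: datetime,
                  backout_duration: timedelta,
                  safety_margin: timedelta = timedelta(0)) -> bool:
    """True once the point of no return has been reached."""
    return now >= latest_backout_start(window_end, backout_duration, safety_margin)


# Hypothetical window ending 06:00, 45-minute back-out, 15-minute margin.
window_end = datetime(2024, 1, 1, 6, 0)
deadline = latest_backout_start(window_end, timedelta(minutes=45), timedelta(minutes=15))
print(deadline)
```

Computing the deadline before the downtime starts is what removes the temptation of "just 5 more minutes".
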
Backing Out: How to do it?
Service-level changes
• Use revision control system to revert config.
Machine-level changes
• Soft cutover: Old service is still running.
• Hard cutover: Power up old server or restore from backups.
Snapshots
• Snapshot VM before making change.
• Revert to snapshot if need to backout.
Issues
• Data migration and compatibility

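For a service-level change, the back-out pattern is: save the known-good config, apply the change, validate, and revert on failure. The sketch below uses a plain file copy as a stand-in for a revision control system, which would normally hold the history; the `validate` callable and the `.bak` suffix are assumptions for illustration.

```python
import shutil
from pathlib import Path
from typing import Callable


def change_with_backout(config: Path, new_text: str,
                        validate: Callable[[Path], bool]) -> bool:
    """Write new_text to config; restore the previous contents if validation fails.

    Returns True if the change was kept, False if it was backed out.
    """
    backup = config.with_name(config.name + ".bak")
    shutil.copy2(config, backup)      # save the known-good version first
    config.write_text(new_text)       # apply the change
    if validate(config):
        return True
    shutil.copy2(backup, config)      # back out to the saved version
    return False
```

With real revision control, the copy steps would become a commit before the change and a checkout of the previous revision on failure.
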
Automatic Checks
Check integrity of critical files before use.
• Some services provide checks: LDAP, SMB.
• Check startup files by rebooting machine.
• Write your own checks for other files.
• Most people only do this after they have a problem.

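Where a service provides no checker of its own, one simple home-grown check is to compare a file's digest with a recorded known-good value before using it. This is a sketch of that one approach, using SHA-256 from Python's standard `hashlib`; how and where the known-good digest is stored is left open.

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path) -> str:
    """Hex SHA-256 digest of a file, read in chunks to handle large files."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def is_intact(path: Path, expected_digest: str) -> bool:
    """True if the file's current digest matches the recorded known-good digest."""
    return sha256_of(path) == expected_digest
```

A digest check only detects that a file changed; a syntax check such as Samba's testparm is still needed to tell whether an intended edit is valid.
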
Remote Administration
• Network Access
• SSH
• Key-based Authentication
• Console Access
• X Window System (X11)
• VNC and NX
• SSH tunneling

Network Access
Most tasks can be done from the shell.
• File management.
• Disk/volume management.
• Troubleshooting and viewing logs.
• Installing/removing software.
• Start/stop network services.
• Reboot/shutdown.
All we need is a way to invoke a shell across the network.

telnet
Ubiquitous network terminal protocol
  telnet hostname
Similar protocols
  rlogin -l user hostname
  rsh -l user hostname command
Insecure
• Data, including passwords, sent in the clear.
• rlogin/rsh use ~/.rhosts for access without passwords.

ssh
Secure Shell
Replaces
• telnet
• ftp
• rlogin
• rsh
• rcp

SSH Security Features

SSH: Protocols and Products
Products
• OpenSSH
• SSH Tectia
• F-Secure SSH
• PuTTY
• WinSCP
Protocols
• SSH v1: Insecure, obsolete. Do not use.
• SSH v2: Current version.

SSH Features
Secure login
  ssh -l user host
Secure remote command execution
  ssh -l user host command
Secure file transfer
  sftp user@host
  scp file user@host:/tmp/myfile
Port forwarding
  ssh -L 110:localhost:110 mailhost

The Problem of Passwords
• Good passwords are hard to remember.
• Password transferred to remote system.
• Automating remote access with passwords is difficult.

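The automation problem shows up as soon as a script tries to run a remote command: an interactive password prompt would stall it. The sketch below builds an `ssh` invocation with OpenSSH's `BatchMode=yes` option, which makes ssh fail instead of prompting, so a script can only succeed with non-password authentication such as keys. The user, host, and helper names are hypothetical.

```python
import subprocess
from typing import List


def ssh_command(user: str, host: str, remote_cmd: str) -> List[str]:
    """Argument list for a non-interactive ssh call (no password prompts)."""
    return ["ssh", "-o", "BatchMode=yes", "-l", user, host, remote_cmd]


def run_remote(user: str, host: str, remote_cmd: str,
               timeout: int = 30) -> subprocess.CompletedProcess:
    """Run remote_cmd over ssh; a non-zero returncode means it failed."""
    return subprocess.run(
        ssh_command(user, host, remote_cmd),
        capture_output=True, text=True, timeout=timeout,
    )


# Hypothetical user and host; only the command line is built here.
print(" ".join(ssh_command("alice", "web1.example.org", "uptime")))
```

Passing the arguments as a list rather than a single string avoids local shell quoting problems.
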