CIT 470: Advanced Network and System Administration
Change and Configuration Management
Topics
• Change Management
• Change Processes
• Revision Control
• Configuration Management
• cfengine
Images from Pro Git
Change Management
Effective planning and implementation of changes to systems.
Changes should
• Be well documented.
• Have a backout plan.
• Be reproducible.
Why do we need Change Management?
March 26-29, 2006: BART trains halted to avoid running into each other when computer systems crashed.
• Crashes on Monday/Tuesday resulted from software maintenance upgrades.
• Crash on Wednesday resulted from installing a backup system to avoid future crashes.
• Thousands of passengers stranded for several hours each time.
Change Management
• Plan change.
• Test change on single system.
• Test change on multiple systems.
• File a change request.
• Change committee approves request.
• Schedule change.
• Communicate with users and admins.
• Change systems at scheduled time.
• Post-event analysis.
Testing Changes
• Automated checks.
• Sanity checks like Samba testparm.
• Reboot system.
• Test on one system first.
• Then test on set of systems.
• Dedicated test systems.
• System admin workstations.
• Virtual machines.
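
The "one system first, then a set of systems" idea can be sketched in a few lines of Python. This is only an illustration under stated assumptions: `passes_check`, `staged_rollout`, and the host names are hypothetical, and the Samba command in the comment assumes `testparm` is installed and that the configuration lives at the path shown.

```python
import subprocess
import sys


def passes_check(command):
    """Run a sanity-check command; an exit status of 0 counts as a pass."""
    result = subprocess.run(command, capture_output=True, text=True)
    return result.returncode == 0


def staged_rollout(stages, apply_change, verify):
    """Apply a change stage by stage, stopping at the first failed check.

    Returns (succeeded, hosts_changed) so a failure shows how far it got.
    """
    changed = []
    for stage in stages:
        for host in stage:
            apply_change(host)
            changed.append(host)
            if not verify(host):
                return False, changed
    return True, changed


# Demo check that always passes, using the current Python interpreter.
# On a Samba server the command might instead be
# ["testparm", "-s", "/etc/samba/smb.conf"] (assumed path).
demo_ok = passes_check([sys.executable, "-c", "pass"])

# One test system first, then a small set of workstations (hypothetical names).
stages = [["test1"], ["ws1", "ws2", "ws3"]]
applied = []
# The verify function here pretends that the change breaks ws2.
result = staged_rollout(stages, applied.append, lambda host: host != "ws2")
```

In this sketch the rollout stops as soon as a host fails verification, so `ws3` is never touched and the list of changed hosts tells you what has to be backed out.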
When do you need a Change Proposal?
Does the change impact critical services?
Critical machines/services
• Business critical: e-commerce server, etc.
• Essential services: routers, DNS, NFS, auth.
Non-critical machines/services
• Individual desktops
• Internal news web server
Change Proposal
• Description of the change.
• Systems impacted by change.
• Why the change is being made.
• Risks presented by the change.
• Test procedure.
• Backout plans.
• How long the change will require.
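
One way to make sure a proposal covers every item above is to treat it as a record with one field per item. The sketch below is an assumption-laden illustration, not a standard form: the class name, field names, and sample values are all hypothetical.

```python
from dataclasses import dataclass, fields


@dataclass
class ChangeProposal:
    """One field per section of the change proposal (names are illustrative)."""
    description: str = ""
    systems_impacted: str = ""
    reason: str = ""
    risks: str = ""
    test_procedure: str = ""
    backout_plan: str = ""
    duration: str = ""

    def missing_sections(self):
        """Return the names of sections that are still empty."""
        return [f.name for f in fields(self) if not getattr(self, f.name).strip()]


# A partially completed proposal (sample values are made up).
proposal = ChangeProposal(
    description="Upgrade Samba on the file server",
    systems_impacted="fs1 (hypothetical host)",
    reason="Security fixes",
)
incomplete = proposal.missing_sections()
```

A change committee could refuse to review a proposal until `missing_sections()` returns an empty list.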
Communication
Communicate change to impacted people.
• What change is being made (in nontechnical terms).
• Which services will be unavailable.
• When and for how long they will be unavailable.
• What actions they need to take (if any).
Communication issues
• If you send too many notices, they’ll be ignored.
• Send notices only to those impacted.
• Push critical notices; use pull for non-critical.
Scheduling
Change Freezes
A period during which only minor updates are permitted.
• End of quarter or year.
• “Crunch time” for projects.
Backing Out
Decide back-out conditions before downtime.
• Avoid the “just 5 more minutes” problem.
• Be sure that someone is keeping track of time.
Questions:
• How much time is required for back out?
• When is the latest time you can successfully back out?
• Will backing out this change prevent other changes from being committed?
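
The "latest time you can successfully back out" is simple arithmetic: the end of the maintenance window minus the time a back-out takes, minus any safety margin. A minimal sketch, with hypothetical function names and made-up times:

```python
from datetime import datetime, timedelta


def latest_backout_start(window_end, backout_duration, safety_margin=timedelta(0)):
    """Latest moment a back-out can begin and still finish inside the window."""
    return window_end - backout_duration - safety_margin


def must_back_out(now, window_end, backout_duration, safety_margin=timedelta(0)):
    """True once the clock has reached the back-out deadline."""
    return now >= latest_backout_start(window_end, backout_duration, safety_margin)


# Maintenance window ends at 06:00; backing out takes 45 minutes;
# keep 15 minutes in reserve (all values are illustrative).
window_end = datetime(2024, 1, 6, 6, 0)
backout = timedelta(minutes=45)
margin = timedelta(minutes=15)
deadline = latest_backout_start(window_end, backout, margin)
```

Fixing `deadline` before the downtime starts is what prevents the "just 5 more minutes" problem: whoever is tracking time only has to compare the clock against one agreed value.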
Backing Out: How to do it?
Service-level changes
• Use revision control system to revert config.
• Restart service.
Machine-level changes
• Soft cutover: Old service is still running.
• Hard cutover: Power up old server or restore from backups.
Issues
• Data migration.
• Compatibility.
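
A service-level back-out can be sketched as two commands: restore the last known-good config from revision control, then restart the service. The example below assumes the config directory is a git repository and that the host uses systemd; the directory, file name, revision, and service name are hypothetical, and by default nothing is executed.

```python
import subprocess


def backout_commands(repo_dir, config_path, good_revision, service):
    """Build the commands for a service-level back-out without running them."""
    return [
        # Restore one file from a known-good revision of the repository.
        ["git", "-C", repo_dir, "checkout", good_revision, "--", config_path],
        # Restart the service so it rereads the restored config.
        ["systemctl", "restart", service],
    ]


def run_backout(commands, dry_run=True):
    """Print the commands, or run them in order and stop on the first failure."""
    for command in commands:
        if dry_run:
            print(" ".join(command))
        else:
            subprocess.run(command, check=True)


# Hypothetical values: /etc/samba under git, previous commit is known good.
plan = backout_commands("/etc/samba", "smb.conf", "HEAD~1", "smb")
run_backout(plan)  # dry run: only prints the two commands
```

Keeping the dry run as the default lets the plan be reviewed as part of the change proposal before anyone runs it for real.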
Automatic Checks
Check integrity of critical files before use.
• Some services provide checks: LDAP, SMB.
• Check startup files by rebooting machine.
• Write your own checks for other files.
• Most people only do this after they have a problem.
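
For files with no vendor-supplied checker, "write your own checks" can be as small as the sketch below. It assumes a simple `key = value` file format with `#` comments, which is an illustrative format rather than that of any particular service; the function name and sample keys are hypothetical.

```python
def check_config(text, required_keys):
    """Return a list of problems found in simple 'key = value' text."""
    problems = []
    seen = set()
    for number, line in enumerate(text.splitlines(), start=1):
        stripped = line.strip()
        if not stripped or stripped.startswith("#"):
            continue  # skip blank lines and comments
        key, sep, value = stripped.partition("=")
        if not sep or not key.strip() or not value.strip():
            problems.append(f"line {number}: expected 'key = value'")
            continue
        seen.add(key.strip())
    for key in required_keys:
        if key not in seen:
            problems.append(f"missing required key: {key}")
    return problems


good = "# demo config\nport = 8080\nroot = /srv/www\n"
bad = "port 8080\nroot = \n"

good_problems = check_config(good, ["port", "root"])
bad_problems = check_config(bad, ["port", "root"])
```

Running a check like this before installing a new config file, and refusing to install when the list is non-empty, catches the mistake before the service does.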
Revision Control
Revision control systems provide
• Conflict management: prevents multiple people from modifying file at once and corrupting it.
• Change history: records who modified the file, when, and why the change was made.
Revision control paradigms
• Lock-Modify-Unlock: rcs
• Copy-Modify-Merge: cvs, subversion, etc.
• Distributed: darcs, git, mercurial
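
The Lock-Modify-Unlock paradigm can be illustrated with a toy lock file. This is not how rcs is implemented; it is a minimal sketch in which the class name, lock file name, and user names are hypothetical, showing only that a second editor is refused while the first holds the lock.

```python
import os


class LockFile:
    """Toy exclusive lock: whoever creates the lock file first holds the lock."""

    def __init__(self, path):
        self.path = path

    def acquire(self, owner):
        """Try to take the lock; return False if someone else already holds it."""
        try:
            # O_EXCL makes creation fail if the lock file already exists.
            fd = os.open(self.path, os.O_CREAT | os.O_EXCL | os.O_WRONLY)
        except FileExistsError:
            return False
        with os.fdopen(fd, "w") as handle:
            handle.write(owner)
        return True

    def holder(self):
        """Return the name recorded by the current lock holder."""
        with open(self.path) as handle:
            return handle.read()

    def release(self):
        """Give up the lock by removing the lock file."""
        os.remove(self.path)


lock = LockFile("smb.conf.lock")
first = lock.acquire("alice")   # alice locks the file and edits it
second = lock.acquire("bob")    # bob is refused while alice holds the lock
holder = lock.holder()
lock.release()                  # alice unlocks
third = lock.acquire("bob")     # now bob can lock and edit
lock.release()
```

Copy-Modify-Merge systems drop the lock entirely: everyone edits a private copy and conflicts are resolved when the copies are merged.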
Local Version Control
Centralized Version Control
Distributed Version Control
Local Git Operations
Git File Lifecycle
Gitk history visualizer
References
• Mark Burgess, Principles of Network and System Administration, 2nd edition, Wiley, 2004.
• Aeleen Frisch, Essential System Administration, 3rd edition, O’Reilly, 2002.
• Thomas A. Limoncelli and Christine Hogan, The Practice of System and Network Administration, Addison-Wesley, 2002.
• Evi Nemeth et al., UNIX System Administration Handbook, 3rd edition, Prentice Hall, 2001.
• Todd R. Weiss, “IT upgrades slow BART trains in San Francisco,” ComputerWorld, March 31, 2006, http://www.computerworld.com/printthis/2006/0,4814,110107,00.html.