Recovery-Oriented Computing User Study • Training Materials • October 2003
Overview • Informed consent & Introduction • User study scenario & your role • Training (20 minutes) • Two study sessions (30 minutes each) • Wrap-up and questionnaire
Informed Consent • Please read the overview of the study and the informed consent form • please feel free to ask any questions you have about the experiment, its goals, its procedures, etc. • If you agree to participate in the experiment, please sign the informed consent form
Introduction • This study is evaluating new recovery tools • the tools are designed to help system administrators recover from problems affecting server systems • You will be playing the role of a system administrator • in each of two sessions, you will be trying to recover an e-mail server system from a pre-existing problem
Introduction (2) • In each session, you may (or may not) be given an experimental recovery tool to use • We are trying to understand when the tool is useful for you and when it is not • so if you are given the tool, please think carefully about whether or not to use it when you are attempting to recover from a problem • at the end of the session, you will be asked to explain why you chose to use (or not use) the tool
User Study Scenario • You are one of several system administrators of an electronic mail (e-mail) service • the administrators work in shifts • the study starts when you arrive for your shift • You arrive to find users complaining that the e-mail service is not working • you will be provided with details of the complaint • the e-mail failure may be caused by: • failure of the e-mail software, or • an error made by the administrator on the previous shift
User Study Scenario: Your Role • Your responsibilities and goals: • restore the e-mail service to normal operation as quickly as possible • minimize the amount of lost e-mail and user work • Note: • you should prioritize restoring service over preserving changes made by other administrators
User Study Scenario: Resources • Resources you will have: • a log of all actions performed by administrators in previous shifts • a day-old backup of the server’s file systems • the Internet • a test e-mail account • a guru • during each session, you may make up to one request for help to the guru • Plus any experimental recovery tool that we provide (described later)
E-mail Overview • This study concerns e-mail store servers • e-mail stores receive and store e-mail for their users • users’ mailboxes live on the e-mail store • e-mail stores do not handle sending or routing of outgoing mail • E-mail stores use two protocols • SMTP: used to deliver incoming e-mail to a mailbox • SMTP is spoken between a remote server that sends the message and the local recipient e-mail store server • IMAP: used to retrieve & manipulate mail in a mailbox • IMAP is spoken between a user’s e-mail client and their local e-mail store server
E-mail Server Configuration • Mailboxes are text files in /var/mail, e.g. /var/mail/user173 • sendmail: process that receives and delivers incoming e-mail • imapd: process that provides remote access to mailboxes • Mail store configuration files can be found in /etc/mail
[Diagram: E-mail Server (Linux), undovmN.cs.berkeley.edu, N={1,2,3}. Incoming e-mail from the Internet arrives over SMTP at the SMTP server process (sendmail); users read e-mail over IMAP through the IMAP server process (imapd); both processes work on the mailboxes in /var/mail/userNNN.]
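Since the mailboxes are plain text files, they can be inspected directly when diagnosing a problem. The sketch below lists the sender and subject of each message in one mailbox using Python's standard `mailbox` module. It assumes the files are in the traditional mbox format, which the slides do not state explicitly, and the function name `summarize_mailbox` is made up for illustration.

```python
import mailbox


def summarize_mailbox(path):
    """Return a (sender, subject) pair for each message in an mbox file.

    Assumption: the mailbox (e.g. /var/mail/user173) is in mbox format.
    """
    # create=False makes a missing file an error instead of creating it.
    box = mailbox.mbox(path, create=False)
    try:
        return [(msg.get("From", ""), msg.get("Subject", "")) for msg in box]
    finally:
        box.close()
```

For example, `summarize_mailbox("/var/mail/user173")` would return one pair per stored message. The function only reads the mailbox; it never writes to it.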
Simple Familiarization Task • Take some time to get familiar with the console and the e-mail system • by performing a basic task as described below • Goals: • ensure sendmail is running • reconfigure server to recognize mail sent to user@roc.cs.berkeley.edu • restart sendmail to activate reconfiguration • First step: • connect to undovm3.cs.berkeley.edu with ssh (continues...)
Simple Familiarization Task (2) • Next, check if sendmail is running: • execute the command: ps ax | grep sendmail • Reconfigure server to accept new host name: • edit /etc/mail/local-host-names to add the line: roc.cs.berkeley.edu • Finally, restart sendmail: • run /etc/init.d/sendmail restart • Try this task now!
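The two checks in this task can be sketched as small Python helpers that work on text rather than on the live server. This is an illustration only: the function names are hypothetical, and treating lines that start with `#` as comments in local-host-names is an assumption. Note that `ps ax | grep sendmail` usually lists the `grep` process itself, so the first helper ignores that line.

```python
def sendmail_in_ps_output(ps_output):
    """Return True if the output of `ps ax` shows a sendmail process.

    Lines that mention grep are skipped, because `ps ax | grep sendmail`
    typically matches its own grep process.
    """
    for line in ps_output.splitlines():
        if "sendmail" in line and "grep" not in line:
            return True
    return False


def add_local_host_name(contents, host):
    """Return local-host-names contents with `host` present exactly once.

    Assumption: one host name per line; lines starting with '#' are comments.
    """
    names = {
        line.strip().lower()
        for line in contents.splitlines()
        if line.strip() and not line.lstrip().startswith("#")
    }
    if host.lower() in names:
        return contents  # already configured, nothing to change
    if contents and not contents.endswith("\n"):
        contents += "\n"
    return contents + host + "\n"
```

Because `add_local_host_name` leaves the text unchanged when the host is already listed, applying it twice is harmless. Sendmail still has to be restarted afterwards for the new name to take effect, as the slide says.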
Recovery Tool: an Undo System • The undo system can undo administrative changes to the e-mail store, including: • changes to configuration files • software upgrades • deleted or altered files • It can be used to restore the e-mail server to a previously known-good state • by “rewinding” to a date when the system worked OK • The undo system preserves incoming e-mail and user mailbox changes
When Can the Undo System Help? • The undo system is useful: • when you cannot tell what is causing a problem • but you know that the system was working at some point in the past • when a problem affects system state • typically, the same cases where restoring a backup would fix the problem • It does not help when the problem does not affect state • for example, if a server process (e.g., sendmail) has crashed cleanly without corrupting state
Why Use the Undo System? • Unlike using a backup, the undo system also repairs the side effects of problems • example: if a problem caused e-mail to be lost, using undo to fix the problem will restore the lost e-mail • the undo system does this by recording incoming e-mail and users’ mailbox edits, then restoring them during recovery • Undo is also useful when you cannot diagnose a problem • simply undo the system to a point in time when it was known to be working
Undo System Operation • An undo cycle has two stages: • rewind: the e-mail system’s state is reverted to the way it appeared at a past time (the “rewind point”) • all changes to the system made since the rewind point are undone, including: • changes made by administrators • changes due to software bugs • incoming e-mail delivery and user mailbox edits • commit: makes the rewind permanent but restores incoming e-mail & user mailbox edits to present time • Net effect: undo cycle undoes all changes except incoming e-mail and mailbox edits
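The net effect of an undo cycle can be illustrated with a toy model in Python. This is not how the undo system is implemented; it only treats the system's history as a list of timestamped events and shows which ones survive each stage. The `Event` type, the kind labels ("user", "admin", "bug"), and the function names are all invented for this sketch.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Event:
    time: int   # minutes since some arbitrary start
    kind: str   # "user" (incoming e-mail, mailbox edit), "admin", or "bug"
    label: str


def rewind(events, rewind_point):
    """After rewind: every change made after the rewind point is undone."""
    return [e for e in events if e.time <= rewind_point]


def commit(events, rewind_point):
    """After commit: the rewind is permanent, but user events made after
    the rewind point are restored up to the present time."""
    kept = rewind(events, rewind_point)
    restored = [e for e in events
                if e.time > rewind_point and e.kind == "user"]
    return kept + restored
```

With a history of mail delivered at minute 1, an admin change at minute 5, and more mail at minute 7, rewinding to minute 3 leaves only the first delivery, and committing brings back the minute-7 mail while the admin change stays undone.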
Illustration of Undo Cycle • Before undo: the timeline shows user events (incoming e-mail, mailbox edits) interleaved with admin changes • After rewind: all changes after the rewind point, both user events and admin changes, are undone • After commit: the user events are restored • note that admin changes remain undone
Controls for the Undo System • Rewind: begins an undo cycle • defines a rewind point and undoes all later changes • may cause e-mail server to automatically reboot • takes 4 to 5 minutes to execute • Commit: completes the undo cycle • makes the rewind permanent • restores incoming e-mail & mailbox edits to present time • takes about 5 minutes to execute • Cancel: aborts the undo cycle • restores e-mail server to the state it was in before rewinding
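The three controls can be read as a small state machine with a "normal" state and a "rewound" state. The Python sketch below encodes that reading. It assumes Rewind is only available in the normal state and that Commit and Cancel are only available in the rewound state; the slides imply this but do not say it outright, and the names here are hypothetical.

```python
# (current state, control) -> next state
TRANSITIONS = {
    ("normal", "rewind"): "rewound",   # begins an undo cycle
    ("rewound", "commit"): "normal",   # completes the undo cycle
    ("rewound", "cancel"): "normal",   # aborts the undo cycle
}


def apply_controls(actions, state="normal"):
    """Apply a sequence of controls and return the final state.

    Raises ValueError for a control that is not valid in the current state.
    """
    for action in actions:
        key = (state, action)
        if key not in TRANSITIONS:
            raise ValueError(f"cannot {action} while {state}")
        state = TRANSITIONS[key]
    return state
```

A full cycle, `apply_controls(["rewind", "commit"])`, ends back in the normal state. Going by the timings on the slide, that sequence takes roughly 9 to 10 minutes.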
Undo System Interface • Main window: normal state • time is divided into 5-minute intervals • each interval contains user events like incoming mail • it’s fastest to rewind to a checkpoint
[Screenshot labels: intervals; intervals containing checkpoints; timeline (color indicates relative load); checkpoints; current time; current undo status]
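To make the interval and checkpoint idea concrete, the Python sketch below maps a timestamp to the start of its 5-minute interval and picks the most recent checkpoint at or before a chosen time. It assumes intervals are aligned to the clock (:00, :05, :10, ...), which the slide does not specify, and both function names are invented here.

```python
from datetime import datetime


def interval_start(t):
    """Return the start of the 5-minute interval containing time t.

    Assumption: intervals are aligned to the clock (:00, :05, :10, ...).
    """
    return t.replace(minute=t.minute - t.minute % 5, second=0, microsecond=0)


def latest_checkpoint_before(checkpoints, t):
    """Return the most recent checkpoint at or before t, or None if none."""
    earlier = [c for c in checkpoints if c <= t]
    return max(earlier) if earlier else None
```

If the system is known to have worked at 09:47, `latest_checkpoint_before` returns the newest checkpoint that is no later than that time, which is a natural candidate for a fast rewind.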
Undo System Interface (2) • Main window: rewound state
[Screenshot labels: current time (in the past) indicates undo point; current undo status; history of undo operations; Commit and Cancel buttons]
Undo System Interface (3) • Event window • used to initiate rewind • to view, double-click on an interval in main window
[Screenshot labels: click to invoke undo cycle; selected event (rewind point); current time; description of event (here, user170 is examining their mailbox); event sequence #]
Familiarization, Part II • Try out the undo system interface • note: actually performing an undo cycle may take 10 or more minutes to complete • Familiarize yourself with the various resources available to you during the study • Outlook Express e-mail client • the test e-mail account: user250@undovmN.cs.berkeley.edu N={1,2,3} • the system backup: /backup • books, documentation, the Internet • guru advice: at most one question per session
Resources for More Information • E-mail in general • About Internet email protocols http://perl.about.com/library/weekly/aa020600a.htm • E-mail references: http://www.newt.com/email/references.html • Sendmail • O’Reilly Sendmail book (next to your workstation) • Sendmail home page: http://www.sendmail.org • SMTP RFC: http://www.isi.edu/in-notes/rfc2821.txt • IMAPd • IMAP general info: http://www.imap.org/ • UW-IMAP home page: http://www.washington.edu/imap/ • IMAP RFC: http://www.isi.edu/in-notes/rfc3501.txt