
The Effects of Interface Design on Telephone Dialing Performance

Master’s thesis in Computer Science, Andrew R. Freed, 4/30/2003.


Presentation Transcript


  1. The Effects of Interface Design on Telephone Dialing Performance Master’s thesis in Computer Science Andrew R. Freed 4/30/2003

  2. The Effects of Interface Design on Telephone Dialing Performance • Towards automatic interface evaluation • Methods of evaluation • Experiment design • Three analyses • Comparison of analyses • Further work

  3. Towards automatic interface evaluation • Why not test with actual users instead? • It takes too much time and money! • Automatic evaluation has been useful in the past (Project Ernestine - Gray et al. 1992), with savings of $2.4M per year • Several proposed tools will make this type of evaluation easier

  4. Towards automatic interface evaluation • Motivation: • Eye-tracking studies by Byrne (1999, 2001) and Hornof (1997) • Cognitive models as surrogate users (Ritter 2001)

  5. Towards automatic interface evaluation • 100 phones to choose from • Selected 10 for analysis

  6. Towards automatic interface evaluation • 10 tasks (Ritter 2000) • 1. Call home (*) • 2. Call work (*) • 3. Redial last number (*) • 4. Call directory inquiries • 5. Call mother (*) • 6. Conference call work and home (*) • 7. Conference call work (flash) then home • 8. Forward call to another number (*) • 9. Forward call (flash) to another number • 10. Hang up

  7. Towards automatic interface evaluation • 10 telephone numbers • 814-866-5000 215-654-5785 • 123-654-7890 814-234-9657 • 814-863-5000 740-611-9273 • 412-268-3000 101-010-1010 • 606-193-3012 103-273-1029 • and 3 other tasks • Forward, redial, conference call

  8. Methods of evaluation • Possible tools • Cognitive architectures • ACT-R/PM • Generic Simulated Eyes and Hands • Focused analysis methods

  9. Possible tools • Ivory’s tools to evaluate websites (2001) • Apex (M. Freed 1998) and iGen (Emmerson 2000) model complex tasks • Glean (Kieras et al 1995) evaluates Lisp interfaces • Shortcomings: no learning, no visual search, tied to a specific interface format, no cognitive theory

  10. Cognitive architectures • Unified theory of cognition (Newell 1990) • Simulate human behavior • Perceptual and motor capability (simulated eyes and hands) • Can do visual search, click buttons, sometimes learn

  11. Cognitive architectures (examples) • EPIC (Kieras and Meyer 1997) - has visual search and perceptual/motor skills… but only evaluates Common Lisp interfaces • Soar (Newell 1990) - also has visual search and perceptual/motor skills, plus learning… but only evaluates Tcl/Tk interfaces (or requires a socket connection) • ACT-R/PM (Anderson & Lebiere 1998, Byrne 2001) - nearly the same benefits and limitations as EPIC, plus learning

  12. ACT-R/PM • Why did we choose ACT-R/PM? • Well-accepted cognitive architecture • Used in past to evaluate interfaces • Can overcome the “Lisp interface-only” problem with generic eyes and hands

  13. Generic Simulated Eyes and Hands • Segman (St. Amant & Riedl 2001) can parse a Windows screen capture and determine the interface components • Can use interfaces written in Lisp, Tcl/Tk, HTML, Visual C++, ... • Segman can be connected to ACT-R/PM

  14. Focus of analysis • A - Analytical model (Fitts’ Law) • B - Cognitive model (ACT-R/PM) • C - Human data

  15. General experiment design • Analytical model, cognitive model, and human users interact with the same interfaces • Analytical model: dials each number once on each phone, does not do the other tasks • Cognitive model: dials each phone number 50 times on each phone, performs the other phone tasks 50 times on each phone • Human users (N=9): dial each phone number on each phone, perform the other phone tasks once on each phone

  16. General experiment design • Experimental software

  17. General experiment design • Cognitive model and users • Timing and mouse-click logging • Eye-tracking • Users can control pace of trials, model does not “care” • Analytical model • Does not need to “see” telephones • Mathematical formula with pixel-level input yields “reaction times”

  18. A. Fitts’ Law analysis • What is Fitts’ Law? • Numerical analysis • Simple conclusions and problems

  19. What is Fitts’ Law? • Fitts’ Law (two possible forms): • MT = a + b * LOG2(2 * D/W) (Fitts 1954) • MT = max(tm, k * LOG2[0.5 + D/W]) (Card et al, 1983) • MT is mouse movement time • D is distance to target, W is target width • a, b, k are constants • tm is minimum movement time
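The two forms on this slide can be written out directly. The following is a minimal sketch, not the thesis code: the function names are made up for illustration, and the constant values (A, B, K, T_MIN) are placeholders, since the slides do not give the values used in the analysis.

```python
import math


def fitts_1954(distance, width, a, b):
    """MT = a + b * log2(2 * D / W), the Fitts (1954) form. Requires D > 0."""
    return a + b * math.log2(2 * distance / width)


def fitts_card(distance, width, k, t_min):
    """MT = max(tm, k * log2(0.5 + D / W)), the Card et al. (1983) form."""
    return max(t_min, k * math.log2(0.5 + distance / width))


# Placeholder constants (assumptions, not values from the thesis).
A, B = 0.0, 0.1       # seconds, seconds per bit
K, T_MIN = 0.1, 0.1   # seconds per bit, minimum movement time in seconds

if __name__ == "__main__":
    # Distance and width in pixels; only the ratio D/W matters.
    for d, w in [(20, 40), (80, 40), (320, 40)]:
        print(d, w, fitts_1954(d, w, A, B), fitts_card(d, w, K, T_MIN))
```

Note that the second form never drops below tm, so very short movements (including a repeated press of the same button, where D = 0) still cost the minimum movement time, whereas the first form is undefined at D = 0.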

  20. Numerical analysis • Collected pixel-level input about telephones (size and location of buttons) • Dialing a phone requires 10 movements • Total the times from the 10 movements and a base dialing time is established (with no visual search!)
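The totalling step described on this slide can be sketched as follows. Everything concrete here is an assumption for illustration: the keypad geometry (PITCH, BUTTON_WIDTH), the starting mouse position, the constants K and T_MIN, and the choice of the Card et al. form of Fitts’ Law. The thesis used measured pixel data from the actual telephone interfaces.

```python
import math

# Hypothetical keypad: 3 columns x 4 rows, all buttons the same width.
BUTTON_WIDTH = 40   # pixels (assumed)
PITCH = 50          # centre-to-centre spacing in pixels (assumed)
LAYOUT = ["123", "456", "789", "*0#"]
CENTRES = {
    ch: (col * PITCH, row * PITCH)
    for row, line in enumerate(LAYOUT)
    for col, ch in enumerate(line)
}

# Placeholder Fitts' Law constants (not from the thesis).
K, T_MIN = 0.1, 0.1


def movement_time(src, dst):
    """Card et al. form: max(tm, k * log2(0.5 + D / W))."""
    distance = math.dist(src, dst)
    return max(T_MIN, K * math.log2(0.5 + distance / BUTTON_WIDTH))


def base_dialing_time(number, start=(0, 0)):
    """Sum one movement per digit, ignoring visual search entirely."""
    position = start
    total = 0.0
    for digit in (c for c in number if c.isdigit()):
        target = CENTRES[digit]
        total += movement_time(position, target)
        position = target
    return total


if __name__ == "__main__":
    print(base_dialing_time("814-866-5000"))
```

A ten-digit number yields ten movements, counting the first movement from the starting position to the first digit.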

  21. Numerical analysis • Validating our choice of sample telephone numbers (R² = 0.96)

  22. Simple conclusions and problems • Fitts’ Law analysis is fast (it is just an equation!) • Does not consider many factors • Not affected by any aspect of interface design other than button sizing and spacing

  23. B. ACT-R/PM model analysis • Description of model • Visual search predictions • ACT-R/PM makes different reaction time conclusions

  24. Description of ACT-R/PM model • Model has three main components that can operate in parallel: • retrieve a phone digit from memory • visually search for the digit • move the mouse/click on a digit (governed by Fitts’ Law) • Composed of 71 production rules (mostly for visual search)

  25. Description of ACT-R/PM model • Visual search strategy: random or systematic • One production for random search • Find-random-target IF the goal is to find a phone target THEN find a visual object of type text which has not been attended lately

  26. Description of ACT-R/PM model • Sixty productions for systematic search • Systematic-search-from-target IF a digit x is in the visual buffer AND the goal is to find a target y AND y is in direction z from x THEN find a visual object of type text in direction z from target x which is within the bounds of the keypad
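The systematic-search rule above can be illustrated with a toy grid walk. This is only a sketch of the idea ("step from the attended digit in the direction where the target is expected, staying inside the keypad"), not the sixty ACT-R/PM productions from the thesis. The grid representation, the assumption that the expected arrangement is the standard 3x4 keypad, and the fixation cap are all made up for illustration.

```python
# Arrangement the searcher expects (assumed: a standard telephone keypad).
STANDARD = ["123", "456", "789", "*0#"]


def positions(layout):
    """Map each key label to its (row, column) in the given layout."""
    return {ch: (r, c) for r, line in enumerate(layout) for c, ch in enumerate(line)}


def sign(value):
    return (value > 0) - (value < 0)


def expected_direction(current, target):
    """Unit step (drow, dcol) from current toward target on the expected keypad."""
    expected = positions(STANDARD)
    (r0, c0), (r1, c1) = expected[current], expected[target]
    return sign(r1 - r0), sign(c1 - c0)


def systematic_search(layout, start, target, max_fixations=20):
    """Walk key by key toward the target; return (fixations, found)."""
    actual = positions(layout)
    label_at = {pos: ch for ch, pos in actual.items()}
    current, fixations = start, 1
    while current != target and fixations < max_fixations:
        drow, dcol = expected_direction(current, target)
        row, col = actual[current]
        # Stay within the bounds of the keypad.
        row = min(max(row + drow, 0), len(layout) - 1)
        col = min(max(col + dcol, 0), len(layout[0]) - 1)
        current = label_at[(row, col)]
        fixations += 1
    return fixations, current == target


if __name__ == "__main__":
    print(systematic_search(STANDARD, "1", "9"))
```

If the actual layout does not match the expected arrangement, the expected direction can point the wrong way and this toy version stalls at the keypad edge, which is why the loop is capped at an arbitrary number of fixations.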

  27. Visual search predictions • Count fixations and note fixation locations • Search for the keypad is random • Search within the keypad is systematic • The telephones generally do not differ significantly in the number of fixations needed to dial (about 16) • (The telephone numbers do differ significantly)

  28. Visual search predictions • Model trace

  29. Visual search predictions • Phone 4 • Phone 9 • What’s wrong with this picture?

  30. Visual search predictions • Two phones are predicted to have abnormally long visual searches • These phones require approximately sixty fixations (average on others was sixteen) • Phone 4 has an upside-down keypad -- the systematic search fails! • Phone 9 contains extra information on the buttons… distracts the visual search • We will see the model takes much longer than humans to dial these phones

  31. ACT-R/PM makes different reaction time conclusions • This is no surprise - more factors are being considered • Phones 4 and 9 pay a large visual search penalty • Fitts’ Law still a factor - phones with “Fitts’ Law violations” still perform worse

  32. ACT-R/PM makes different reaction time conclusions

  33. ACT-R/PM makes different reaction time conclusions • The phones are often shown to have different dialing times (t-test, p < .05) • The significance level of the differences depends on the telephone number being dialed • On average, approximately 8.7 seconds to dial a telephone • Never faster than six seconds • No errors!

  34. ACT-R/PM makes different reaction time conclusions • Model is able to perform additional tasks (redial, forward, conference) with a random search • Model does not always succeed but never gives up • Will attend the same visual target several times

  35. C. User data analysis • Where and how users look (eye-tracking) • Humans make errors • Summary of user reaction times

  36. Where and how users look • Fast random search for keypad • Systematic search within keypad

  37. Where and how users look • User trace

  38. Where and how users look • Users require approximately the same number of fixations per telephone as the model did (also true for telephone numbers) • User able to cope with phones 4 and 9 by changing search strategy • Phone 4: “Up is down, down is up” • Phone 9: Ignore ABCs on the keypad

  39. Where and how users look • Fixation comparison across numbers (R² = 0.11)

  40. Where and how users look • Fixation comparison across 8 phones (R² = 0.34)

  41. Humans make errors • Errors not predicted by the automatic analyses • Depend on several factors • Number being dialed • Dialing speed (weak correlation) • Interface being used

  42. Errors dependent on interface • Most errors on “Fitts’ Law violators” • Fewest errors when buttons are large and adjacent • Users will move the mouse while clicking (ACT-R/PM will not); this can cause errors • Possible to estimate the number of errors with Fitts’ “index of difficulty”?

  43. Summary of reaction times • Users are on average more than one second faster than the model • This is probably due to efficient pipelining of motor tasks (room for ACT-R/PM improvement) • Users can dial as fast as 3.5 seconds (average is seven seconds)

  44. Summary of reaction times • Model (R² = 0.41), Fitts’ (R² = 0.85), user dial time across phones
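Fit values like those above compare a series of predicted times with the corresponding observed times. The slides do not say how they were computed; the sketch below assumes the squared Pearson correlation, and the sample numbers in the demo are placeholders, not data from the thesis.

```python
from statistics import mean


def r_squared(predicted, observed):
    """Squared Pearson correlation between two equal-length series."""
    mean_p, mean_o = mean(predicted), mean(observed)
    covariance = sum((p - mean_p) * (o - mean_o) for p, o in zip(predicted, observed))
    spread_p = sum((p - mean_p) ** 2 for p in predicted)
    spread_o = sum((o - mean_o) ** 2 for o in observed)
    return covariance ** 2 / (spread_p * spread_o)


if __name__ == "__main__":
    # Placeholder per-phone dial times in seconds (illustrative only).
    model_times = [8.1, 8.4, 8.9, 9.6, 8.7]
    user_times = [6.8, 7.3, 7.1, 7.9, 7.0]
    print(r_squared(model_times, user_times))
```

Under this definition a value near 1 means the prediction tracks the ordering and spacing of the observed times closely, even if the absolute times differ by a constant offset or scale.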

  45. Summary of reaction times • Users can do other phone tasks faster than ACT-R/PM • Users can find the target under varied conditions • Users try more strategies to find target • Users will give up if they can’t succeed!

  46. Summary of reaction times • Model vs user on extra tasks (R² = 0.60, 0.26, 0.11)

  47. Summary of reaction times • User data also shows that the interfaces are often significantly different (p <.05), though less often than the model says • User time differences also depend on the number being dialed • Theory: users less affected by additional interface objects than ACT-R/PM

  48. Comparison of analyses • Analytical model is not enough • Visual search differences between ACT-R/PM and users • ACT-R/PM and Segman need better representation of interfaces • Cognitive models can make more complicated predictions • ACT-R/PM model is generally slower than users

  49. Further work • Cellular phones • This analysis does not work “out of the box” for cellular phones • These phones have different tasks! (Golightly 2003) • Hutchison 3G UK phone task (Golightly 2003) • Analysis of menu controls for cellular phone menus, included analytical model • Interface became easier to use when more directional controls were provided

  50. Further work • Analyzing ten additional designs • Easy if you use existing automatic models! • Fifteen minutes for Fitts’ Law analysis • Forty-five minutes for 500 model runs • Hard if you test with actual users! • Can take weeks to get scheduled • Humans miss appointments
