
Evaluation

Explore formative and summative evaluation in software development, considering types of evaluation, reasons for evaluation, and classification of evaluation methods. Learn about observation, monitoring, experimentation, and collecting user opinions to enhance design processes.


Presentation Transcript


  1. Evaluation

  2. Evaluation • There are many times throughout the lifecycle of a software development project when a designer needs answers to questions that check whether his or her ideas match those of the user(s). Such evaluation is known as formative evaluation because it (hopefully) helps shape the product. User-centred design places a premium on formative evaluation methods. • Summative evaluation, in contrast, takes place after the product has been developed.

  3. Context of Formative Evaluation • Evaluation is concerned with gathering data about the usability of a design or product by a specific group of users for a particular activity within a definite environment or work context. • Regardless of the type of evaluation, it is important to consider • the characteristics of the users • the types of activities they will carry out • the environment of the study (controlled laboratory? field study?) • the nature of the artefact or system being evaluated (sketches? prototype? full system?)

  4. Reasons for Evaluation • Understanding the real world • particularly important during requirements gathering • Comparing designs • rarely is there a design without alternatives • valuable throughout the development process • Engineering towards a target • often expressed in the form of a metric (a minimal check is sketched below) • Checking conformance to a standard
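A minimal sketch (not from the slides) of what "engineering towards a target" can look like in practice: the target is expressed as a metric and measurements are checked against it. The threshold and timings below are made-up values for illustration.

```python
# Hypothetical example: check a mean-task-time metric against a usability target.
from statistics import mean

TARGET_MEAN_TASK_TIME = 30.0                 # seconds; assumed target, not from the slides
observed_times = [27.4, 31.0, 29.2, 28.8]    # made-up measurements

observed_mean = mean(observed_times)
print(f"mean task time = {observed_mean:.1f}s, "
      f"target met: {observed_mean <= TARGET_MEAN_TASK_TIME}")
```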

  5. Classification of Evaluation Methods • Observation and Monitoring • data collection by note-taking, keyboard logging, video capture • Experimentation and Benchmarking • statement of hypothesis, control of variables • Collecting users’ opinions • surveys, questionnaires, interviews • Interpreting situated events • Predicting usability

  6. Observation and Monitoring - Direct Observation Protocol • Usually informal in field studies, more formal in controlled laboratories • data collection by direct observation and note-taking • users in “natural” surroundings • “objectivity” may be compromised by the observer’s point of view • users may behave differently while being watched (the Hawthorne effect) • an ethnographic, participatory approach is an alternative

  7. Observation and Monitoring - Indirect Observation Protocol • data collection by remote note-taking, keyboard logging, video capture • Users need to be briefed fully; a policy must be decided upon and agreed about what to do if they get “stuck”; tasks must be justified and prioritised (easiest first) • Video capture permits post-event “debriefing” and avoids the Hawthorne effect (however, users may behave differently in an unnatural environment) • with data-logging, vast amounts of low-level data are collected (a minimal logger is sketched below), which is difficult and expensive to analyse • the interaction of variables may be more relevant than any single one (lack of context)
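As a purely illustrative companion to the data-logging point above, the sketch below shows the kind of low-level interaction logger the slide alludes to: every user action becomes a timestamped record, which is why the volume of data grows quickly and why individual records carry little context. The file name, event names and participant IDs are all hypothetical.

```python
# A minimal, assumed interaction logger: appends timestamped events to a CSV file.
import csv
import time

class InteractionLogger:
    def __init__(self, path):
        self.path = path

    def log(self, user_id, event, detail=""):
        # One row per user action: timestamp, participant, event type, detail.
        with open(self.path, "a", newline="") as f:
            csv.writer(f).writerow([time.time(), user_id, event, detail])

logger = InteractionLogger("session_log.csv")   # hypothetical file name
logger.log("P01", "key_press", "Ctrl+S")
logger.log("P01", "menu_open", "File")
```

Even a short session produces many such rows, and each row says little on its own - hence the slide's point about the cost of analysis and the loss of context.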

  8. Experimentation and Benchmarking • A “scientific” and “engineering” approach • utilises standard scientific investigation techniques (see the sketch below) • Selection of benchmarking criteria is critical… and sometimes difficult (e.g., for OODBMS) • Control of variables, especially user groups, may lead to “artificial” experimental conditions
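The sketch below illustrates the "scientific" approach in miniature: a stated hypothesis (design B is faster than design A), a controlled variable of interest (task completion time), and a standard significance test. The timings are invented, and the use of SciPy's ttest_ind is simply one convenient choice, not something prescribed by the slides.

```python
# Assumed benchmark comparison of two designs on task completion time (seconds).
from statistics import mean
from scipy.stats import ttest_ind   # requires SciPy

times_a = [41.2, 38.5, 44.0, 39.8, 42.7]   # design A, made-up data
times_b = [35.1, 33.9, 36.4, 34.8, 37.2]   # design B, made-up data

t_stat, p_value = ttest_ind(times_a, times_b)
print(f"mean A = {mean(times_a):.1f}s, mean B = {mean(times_b):.1f}s, p = {p_value:.3f}")
```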

  9. Collecting Users’ Opinions • Surveys • critical mass and breadth of the survey are critical for statistical reliability (a sample-size sketch follows) • Sampling techniques need to be well grounded in theory and practice • Questions must be consistently formulated, clear, and must not “lead” respondents to specific answers
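One way to make the "critical mass" point concrete is the standard sample-size formula for estimating a proportion, n = z² · p(1 − p) / e². The sketch below assumes a 95% confidence level and a ±5% margin of error; neither figure comes from the slides.

```python
# Assumed sample-size calculation for a survey estimating a proportion.
import math

z = 1.96        # z-value for 95% confidence (assumed)
p = 0.5         # worst-case proportion
margin = 0.05   # +/- 5% margin of error (assumed)

n = math.ceil(z**2 * p * (1 - p) / margin**2)
print(f"minimum respondents needed: {n}")   # about 385
```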

  10. Collecting Users’ Opinions - Verbal Protocol • (Individual) Interviews • can be conducted during or after user interaction • during: immediate impressions are recorded • during: may be distracting during complex tasks • after: no distraction from the task at hand • after: may lead to misleading results (short-term memory loss, “history rewritten”, etc.) • can be “structured” or unstructured • a structured interview is like a personally administered questionnaire - prepared questions

  11. Collecting Users’ Opinions • Questionnaires • “open” (free-form reply) or “closed” (answers are “yes/no” or chosen from a wider range of possible answers) • the latter is better for quantitative analysis (see the tally sketch below) • important to use clear, comprehensive and unambiguous terminology, quantified where possible • e.g., “daily?”, “weekly?”, “monthly?” rather than “seldom” or “often”, and there should always be a “never” option • Needs to allow for “negative” feedback • All form fill-in guidelines apply!
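A small, purely illustrative sketch of why closed questions lend themselves to quantitative analysis: responses on a fixed frequency scale can be tallied directly. The scale labels follow the slide's example; the answers are made up.

```python
# Tally closed-question responses on a fixed frequency scale.
from collections import Counter

scale = ["daily", "weekly", "monthly", "never"]
answers = ["weekly", "daily", "never", "weekly", "monthly", "weekly"]  # made-up responses

counts = Counter(answers)
for label in scale:
    print(f"{label:>8}: {counts.get(label, 0)}")
```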

  12. Relationship between Types of Evaluation and Reasons for Evaluation • A matrix relating the classes of evaluation method (Observing and Monitoring; Users’ Opinions; Experiments and Benchmarking; Interpretive; Predictive) to the reasons for evaluation (Understanding the real world; Comparing designs; Engineering towards a target; Checking conformance to a standard), with “Y” marking the combinations of method and reason that suit each other.
