Comprehensibility of Model Representations Designing an Empirical Evaluation Framework Jorge Aranda in collaboration with Neil Ernst, Jennifer Horkoff, and Steve Easterbrook August, 2006
Related Ideas (Monday) • Ed Brinksma – (1st definition) Models for communication. Qualities of models • Frits Vaandrager – Used Uppaal in part because he wanted something “easy to understand by engineers” • Andreas Bauer – Pain of developing error-free, human-readable temporal specs • Jan Friso Groote, Egon Börger – Models act as communication mechanisms between designers and domain experts
If there are so many languages… • Literally hundreds of conceptual modelling languages • Static • Dynamic • Intentional • Social • Argumentation • Beliefs • Meta • …
…why is nobody using them? • Fine, ‘nobody’ is too strong • But from what I see, when faced with real projects • My colleagues do not use them • My professors do not use them • My tutored students do not use them • (if their grade doesn’t depend on it!) • Most software houses I know of do not use them • Most big software corporations I know of do not use them • Frankly, I don’t use them either • At least we could agree that supply and demand are mismatched
Why is nobody using my typewriter? • Looks complicated • Not intuitive • I’m used to handwriting • Training • All it takes is experts showing off what they can do for everyone to want to join the party • Put up with the learning curve to reach the expert’s level
Why is nobody using my bicycle? • Very uncomfortable! • Painful landings • Roads are not appropriate • Evolution • Careful refinement • Trial and error • Context (roads) evolves along with artifact (bike)
Kial neniu uzi mia lingvo? • Esperanto for “why does nobody use my language?” • I think… • Why bother? • English is good enough • Klingon is geekier • Only useful in Esperanto conventions! • Mostly hopeless • Though I guess you have to roll it out to see what happens
Why is nobody using my modelling language? • Lack of training? • Adapt the user • Our favourite excuse • “Dumb software engineers, they don’t know what they’re missing!” • “In time it will spread like wildfire” • Needs refinement? • Adapt the tool • Identify, eliminate weaknesses • Does it do what it should do? • What is its problem? • Useless proposal? • After all, architects, electrical engineers and mathematicians are not asking these questions!
Why use models? • Why would we use models? • Communication • Exploration and reflection (learning) • Other uses? • Storage of information • Model-Driven Development • Model Checking • Getting process-heavy managers off our backs • Models / diagrams are used basically as communication artifacts
Models as Communication Artifacts • Qualities of communication artifacts • Effort to encode / decode • Obsolescence • Learning curve • … • For our study, we’re interested in the comprehensibility quality of models • There’s no point in using a communication artifact if it is incomprehensible…
Pinning down “comprehensibility” • Intuitive questions of comprehensibility • Can I make sense of the document? • Can I make sense of it quickly? • Can I make sense of it accurately? • Can I make sense of it better than with other approaches?
Comprehensibility Issues: Defining Comprehensibility • Very tricky construct! • Several dimensions • “Right responses” in a quiz • Confidence in those responses • Time taken to read document • Retention of information • Memorizing information vs Using information • Not so interested in comprehensibility of the whole as in comprehensibility of the fragments, ‘idioms’, and elements • Find out what works and what doesn’t • Quantitative / qualitative dilemma • Nuanced concept, richer results available only qualitatively… • …but we would love a straight ‘better or worse’ answer!
Comprehensibility Issues: Relevance of Context • Relevance of reader context • Como hablo español, puedo entender esta frase perfectamente (Spanish: “Since I speak Spanish, I can understand this sentence perfectly”) • Is the reader a language expert? • Is the reader a domain expert? • Relevance of fit between problem and language • Good luck modelling stakeholders’ goals with class diagrams • Relevance of problem size and complexity • We get bogged down at different rates depending on our language choice
Explicit vs. Implicit: “models” and “diagrams” • I’ve been using the terms interchangeably! • Academic view • Models are not essentially diagrammatic • Diagrams are only used to express the information in the model • No need to restrict ourselves to a particular visual representation • You’re here if you think syntax is sugar • Practitioner’s view • Diagram and model are intertwined • Information conveyed through proximity, symmetry, and other perceptual patterns • Diagrammatic form is the one the team shares, discusses, and understands • The medium is (also) the message
Example • Both diagrams represent the same model, a directed graph • In Larkin & Simon’s terms, they are informationally equivalent • However, a human reader will interpret them differently • The left diagram represents centrality, the right diagram hierarchy • It may be impossible to make explicit all the information in a diagram • It may even be impossible to agree on what information is represented in a diagram!
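The distinction between the model and its diagrams can be sketched in a few lines of Python. This is only an illustration: the slide's actual graph is not reproduced here, so the nodes, edges, and coordinates below are hypothetical, as are the names `model_of`, `radial_layout`, and `tree_layout`.

```python
# Hypothetical directed graph (the model); the slide's real graph is not known.
edges = {("a", "b"), ("a", "c"), ("a", "d")}

# Two hypothetical layouts (the diagrams) of that same graph.
# Node "a" in the middle of its neighbours suggests centrality.
radial_layout = {"a": (0, 0), "b": (1, 0), "c": (-1, 0), "d": (0, 1)}
# Node "a" above its neighbours suggests hierarchy.
tree_layout = {"a": (0, 2), "b": (-1, 0), "c": (0, 0), "d": (1, 0)}


def model_of(diagram):
    """Strip the layout, keeping only the node set and the directed edge set."""
    diagram_edges, layout = diagram
    return frozenset(layout), frozenset(diagram_edges)


left = (edges, radial_layout)
right = (edges, tree_layout)

# Same nodes and edges, so the underlying models are equal.
same_model = model_of(left) == model_of(right)
# Different coordinates, so the diagrams themselves are not equal.
same_diagram = left == right

print(same_model, same_diagram)
```

The comparison captures only the explicit part of each diagram. What a reader infers from placement (centrality, hierarchy) is exactly the part that `model_of` throws away.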
Evaluating Comprehensibility: State of the Art? • Three problems with comprehensibility evaluations: • Scarcity • Validity • Theoretical background
Evaluating Comprehensibility: Scarcity of Evaluations • Most modelling languages (and extensions) are never evaluated… • Third party evaluation almost non-existent • More popular languages do get their share of studies • ER diagrams • DFDs • (Some) UML diagrams • …which means that, for most languages, we’re stuck with version 1.0
Evaluating Comprehensibility: Validity Problems • Contrived examples • “Unfair” comparisons • One model providing more essential information than the other • Languages taken out of their fields • Disregard for context • Problem size, complexity, expertise of participants • Missing information • Replications become difficult / impossible • CAVEAT: I’m generalizing very broadly!
Evaluating Comprehensibility: Theoretical background issues • Comparing just for the sake of it • No satisfactory explanations for study results! • Comparing apples and oranges • E.g., a comparison between a Z specification and its implementation (in code!) • Importing theories of questionable portability • Most popular offender: Larkin & Simon’s “Why a Diagram is (Sometimes) Better Than Ten Thousand Words” • Examples of essentially spatial nature • Geometry and classical physics • Benefits for Software Engineering constructs far less clear
Designing an empirical study • Strategy: • Choose modelling language(s) • Base evaluation on the language’s theory (if any) • Language claims become study hypotheses • Inform hypotheses with Cognitive Science theory • Design and execute study • Improve these guidelines (iterate)
Designing an empirical study • Base evaluation on language’s theory • What is this language supposed to be good at? • What should it not be used for? • Who is supposed to use it? • Does it claim to have specific benefits? • Is it supposed to be used with a particular tool? • The benefits claimed become study hypotheses • But many language proponents never articulate these things! • We’ll have to fill in the blanks at some points
Designing an empirical study • Analysis based on relevant theories informs hypotheses • External cognition (Scaife & Rogers) • Computational offloading • Reducing cognitive effort by putting knowledge in the world rather than in the head • Re-representation • Some representations make problem solving easier (43 x 10) than others (XLIII x X) • Graphical constraining • If a diagram reduces the number of inferences I can make, it enhances my power on the remaining inferences • Cognitive dimensions framework • Visibility • Ability to view components easily • Hidden dependencies • Important links between entities are not visible • Abstraction • Types and availability of abstraction mechanisms • …
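The re-representation example (43 x 10 versus XLIII x X) can be made concrete with a small Python sketch. The helper names `roman_to_int` and `int_to_roman` are illustrative and not part of the original slides. The point it tries to show: in positional notation, multiplying by ten is a matter of appending a zero, while with Roman numerals the easiest route is to convert to another representation first.

```python
ROMAN_VALUES = {"I": 1, "V": 5, "X": 10, "L": 50, "C": 100, "D": 500, "M": 1000}

ROMAN_PAIRS = [
    (1000, "M"), (900, "CM"), (500, "D"), (400, "CD"),
    (100, "C"), (90, "XC"), (50, "L"), (40, "XL"),
    (10, "X"), (9, "IX"), (5, "V"), (4, "IV"), (1, "I"),
]


def roman_to_int(numeral: str) -> int:
    """Convert a Roman numeral to an integer (standard subtractive notation)."""
    total = 0
    for i, symbol in enumerate(numeral):
        value = ROMAN_VALUES[symbol]
        # A smaller symbol placed before a larger one is subtracted (the X in XL).
        if i + 1 < len(numeral) and value < ROMAN_VALUES[numeral[i + 1]]:
            total -= value
        else:
            total += value
    return total


def int_to_roman(number: int) -> str:
    """Convert a positive integer back to a Roman numeral."""
    parts = []
    for value, symbol in ROMAN_PAIRS:
        count, number = divmod(number, value)
        parts.append(symbol * count)
    return "".join(parts)


# Positional notation: multiplying by ten just appends a zero.
decimal_product = int("43" + "0")

# Roman notation: re-represent, multiply, then convert back.
roman_product = int_to_roman(roman_to_int("XLIII") * roman_to_int("X"))

print(decimal_product, roman_product)
```

Both routes reach the same number; what differs is how much work the notation leaves to the reader, which is the sense in which one representation makes the problem easier.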
Designing an empirical study • Experimental design • Choose domains: • Natural • For which we can get a lot of participants • Differentiate participants (developers, analysts, users) • Consider domain and language experience • Models created by language experts • Gets the ‘flawed model’ threat out of the way
Designing an empirical study • Experimental design (cont) • Quantitative data through tests • Right answers • Certainty • Time taken to respond • Functional use of information (e.g., fixing errors) • Results compared to control group • Qualitative data through observation • Vocalizations • Notes in study materials, scratch paper
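The quantitative measures listed above (right answers, certainty, time taken to respond) could be recorded and summarized per group roughly as follows. This is a sketch under assumptions, not the study's actual instrument: the `Response` fields, the 1 to 5 certainty scale, the group labels, and every value in `placeholder_responses` are made up purely to exercise the code.

```python
from dataclasses import dataclass
from statistics import mean


@dataclass
class Response:
    participant: str
    group: str       # "treatment" or "control" (assumed labels)
    correct: bool    # right answer on a test question
    certainty: int   # self-reported, 1 (guessing) to 5 (sure); assumed scale
    seconds: float   # time taken to respond


def summarize(responses, group):
    """Average the three quantitative measures for one group."""
    rows = [r for r in responses if r.group == group]
    return {
        "accuracy": mean(1.0 if r.correct else 0.0 for r in rows),
        "certainty": mean(r.certainty for r in rows),
        "seconds": mean(r.seconds for r in rows),
    }


# Placeholder values only: NOT study data, just enough to run the sketch.
placeholder_responses = [
    Response("p1", "treatment", True, 4, 30.0),
    Response("p2", "treatment", False, 2, 50.0),
    Response("p3", "control", True, 3, 60.0),
    Response("p4", "control", True, 5, 40.0),
]

treatment = summarize(placeholder_responses, "treatment")
control = summarize(placeholder_responses, "control")
print(treatment)
print(control)
```

Keeping accuracy, certainty, and time as separate measures instead of folding them into one score follows the earlier point that comprehensibility has several dimensions. A real analysis would also need a significance test and the qualitative observations, neither of which is sketched here.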
Designing an empirical study • Example: Evaluating i* • What is i*? • Represents social-, dependency-, and goal-related information • Based on actors’ goals, tasks, and dependencies on other actors
Designing an empirical study • Example: Evaluating i* • i* information is rarely explicitly stated in software projects • Closest alternative: problem statement • i* is not supposed to be used instead of problem statement, but as a complement • Domains: New features for a webmail system, new features for a web browser • Advantage: Many available participants (students, not necessarily CS!) are domain experts • Participants: Only domain experts for now • Leave developers and analysts for later studies
Back to the framework • Big question: Can we extract the essence of these evaluations and apply it systematically? • That is, make them more benchmarks than ad hoc tests? • I don’t think so • Too many subtle distinctions between languages • But we can develop a framework that… • …guides our theoretical analyses • …guides our empirical study designs • …helps to make sense of the available literature
Back to the framework • Framework should evolve (just as modelling languages!) • Roughly outlined in previous slides • First test is our i* evaluation • Move on to other languages • UML is a tempting target • Not constrained to diagrammatic representations • Could evaluate spec templates, “user stories”, any other communication artifact
Any results yet? • No, sorry! • We expect to run our first study in the fall of this year • More studies to come early next year • Final goal is a stable framework, a couple of years down the road