CDT403 Research Methodology in Natural Sciences and Engineering. Research Methods and Knowledge Generation in Software Engineering. Gordana Dodig Crnkovic, School of Innovation, Design and Engineering, Mälardalen University.
Defining Software Engineering: Computing According to Curricula ‘05
MDH Master's Programme in Software Engineering, 120 credits
http://www.mdh.se/utbildning/program/master-software-engineering?programCode=ZCS24
Computer Science (CS)
Computer science (CS or CompSci) is the scientific and practical approach to computation and its applications. It is the systematic study of the feasibility, structure, expression, and mechanization of the methodical processes (or algorithms) that underlie the acquisition, representation, processing, storage, communication of, and access to information, whether such information is encoded in bits and bytes in a computer memory or transcribed in genes and protein structures in a human cell. A computer scientist specializes in the theory of computation and the design of computational systems.
http://en.wikipedia.org/wiki/Computer_science
Software Engineering (SE)
Software engineering (SE) is the application of a systematic, disciplined, quantifiable approach to the design, development, operation, and maintenance of software, and the study of these approaches; that is, the application of engineering to software.
http://en.wikipedia.org/wiki/Software_engineering
Example: Microsoft's Research in SE
MS research in SE in Redmond, USA is organized in five working groups:
• Compilers and Runtimes
• Empirical Software Engineering
• Formal Methods
• Program Analysis
• Programming Languages
As you can see, not all of these fall within the typical SE research domain.
http://research.microsoft.com/en-us/groups/rise/default.aspx
Characteristics of Software Engineering
• A new research field compared with classical sciences like physics, chemistry and biology, and even compared with other scholarly disciplines.
• Separated from computer science in the 1980s.
• Immature as a scientific field and even as an engineering field (compared to construction engineering and mechanical engineering, which already existed in antiquity, and even to electrical engineering, which traces back to Volta in 1775). http://en.wikipedia.org/wiki/Electrical_engineering
• Bioinformatics is a similarly young and immature research and engineering field.
Research in Software Engineering: Paradigms and Methods
• Engineering problems (building new software). Focus on artificial/construction aspects (how things should be).
• Scientific problems (studying the theoretical basis of SE). Focus on natural aspects of existing phenomena (how things are).
http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.107.1576&rep=rep1&type=pdf María Lázaro and Esperanza Marcos
Science, Research, Development and Technology
Observe: Not all research is done within science. Some is part of technology (engineering) and some is part of development.
[Diagram labels: Software Research, Software Technology, Software Development, Software Science]
The relation between Software Science and Software Engineering
The relation between software science and software engineering can be seen in analogy to the relationships between:
• Chemistry as science and Chemical engineering
• Electrodynamics as science and Electrical engineering
• Biology as science and Bio-engineering
• Geology as science and Geo-engineering
etc.
Research Paradigms
• Positivist paradigms (interested in what exists and how). In empirical research, a typical example is software testing.
• Interpretive paradigms (interested in why something is the case). Used for social and cultural problems and applied to organisational processes in the implementation of software.
• Constructive paradigms (interested in what can exist and how it can be constructed). Used in the engineering of artifacts.
Research Methods
Empirical
• Case studies (interviews, observations)
• Controlled experiments (isolate the phenomenon of interest, control parameters, vary parameters, reproduce)
• Surveys (ask questions of a large number of representative subjects, statistical analysis of results; also: systematic literature reviews)
Analytical
• Proof-based (on abstract objects), formal. (This typically works under narrow assumptions, “toy models”.)
Empirical derives from the Greek empiricos (empirical, experienced; εμπειρικός), from empiria (experience; εμπειρία), from en- (in, with) + pira (experience, trial; πείρα), from the verb pirao (make an attempt, try, test, get experience, endeavour).
Analytical derives from the Greek analytikos "analytical", from analytos "dissolved"; analysis (n.): "resolution of something complex into simple elements" (opposite of synthesis).
Quantitative Methods in Empirical SE
https://sites.google.com/site/atulkg/courses/quantitive-methods-in-software-engineering
• Assessment in Software Engineering
• Software Measurement
• Surveys
• Case Studies
• Controlled Experiments
• Design of Experiments
• Simulation Methods
• Data Collection and Analysis, Statistical Data Analysis, Validity and Interpretation
• Planning, Designing, Conducting Empirical Studies
• Replication, Documentation, Examples
Quantitative: "having quantity", from Latin quantus "of what size?", "how much?", "how great?", "what amount?". Also meaning "measurable".
Qualitative Methods in Empirical SE
http://userpages.umbc.edu/~cseaman/papers/tse99.pdf Carolyn B. Seaman, IEEE
Empirical studies in SE are beginning to address human aspects, which adds a new layer of complexity. New research methods are needed to study non-technical, human aspects. Qualitative research methods from other research fields are used to handle the complexity of issues involving human behavior. The focus is on qualitative methods for data collection and analysis, and on how they might be incorporated into empirical studies of software engineering, in particular how they might be combined with quantitative methods.
Qualitative: "having quality", from Latin qualitas "a quality, property; nature, state, condition, characteristic, feature".
Empirical Studies, a Comparison
Quantitative | Qualitative
• Objective | Subjective
• Research questions: How many? | Research questions: What? Why?
• "Hard" science | "Soft" science
• Tests theory | Develops theory
• Measurable (in numeric terms) | Interpretive
• Reports statistical analysis | Reports rich narrative, individual interpretation
• Subjects | Participants
• Establishes relationships, causation | Describes meaning, discovery
• Generalizations leading to prediction | Patterns and theories developed for understanding
• Highly controlled experimental setting | Flexible approach: natural setting (process oriented)
• Sample size: n | Sample size is not a concern; "informal rich" sample
• Counts the bugs | Searches which bugs are worth counting
https://sites.google.com/site/atulkg/courses/quantitive-methods-in-software-engineering
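To make the quantitative side of the comparison concrete ("counts the bugs", "reports statistical analysis"), the following is a minimal sketch of a two-sided permutation test on defect counts from two groups of modules. The data, the group names and the function `permutation_test` are hypothetical and chosen only for illustration; a real study would also need a proper experimental design and validity analysis.

```python
import random
from statistics import mean


def permutation_test(group_a, group_b, n_resamples=10_000, seed=0):
    """Estimate a two-sided p-value for a difference in means.

    The group labels are shuffled repeatedly; the p-value is the share of
    shuffles whose mean difference is at least as large as the observed one.
    """
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    observed = abs(mean(group_a) - mean(group_b))
    pooled = list(group_a) + list(group_b)
    n_a = len(group_a)
    extreme = 0
    for _ in range(n_resamples):
        rng.shuffle(pooled)
        diff = abs(mean(pooled[:n_a]) - mean(pooled[n_a:]))
        if diff >= observed:
            extreme += 1
    # Add-one correction keeps the estimate strictly positive.
    return (extreme + 1) / (n_resamples + 1)


# Hypothetical defect counts per module (invented data, not from a real study).
with_review = [3, 5, 2, 4, 3, 6, 2, 4]
without_review = [7, 9, 6, 8, 10, 7, 9, 8]

p_value = permutation_test(with_review, without_review)
print(f"mean difference: {mean(without_review) - mean(with_review):.2f}")
print(f"estimated p-value: {p_value:.4f}")
```

A small p-value only says that the observed difference would be unlikely if the grouping did not matter. Whether the bugs counted are the ones worth counting remains a qualitative question.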
In the Context of the Research Methodology Course
• How do we place research in software science and software engineering into the framework of our course?
• How do we relate it to what we learned about knowledge generation, information and computation, evolution and complexity?
• What is the connection to observer-dependent scientific theory?
• How does it relate to trans-disciplinarity (integrative approaches)?
• How about the problem of induction/generalisation (Popper) and the uncertainty of our knowledge?
• Given that software runs on computers, can we relate software to computationalism and computational models?
• What new research methods are there, or can be anticipated, that might be used in the future?
Science/Engineering as Networks of Knowledge-Generating Agents
When we make classifications such as science/engineering, analytical/empirical, quantitative/qualitative, or divisions into different sub-disciplines, we should keep in mind the networked nature of human knowledge. SE is to a high degree integrative research (trans-disciplinary and at times inter-disciplinary, cross-disciplinary, multi-disciplinary).
http://www-personal.umich.edu/~ladamic/courses/ Lada Adamic: Agent based modeling course
Complexity of Software Systems
• Software typically addresses complex systems, and software systems themselves are typically complex systems.
• Software science and engineering aim to solve, by computational means, problems that humans have in relation to the physical world, including other human beings; in this case at the level of abstraction defined by software engineering.
• This is similar to problems in biology, which involve a huge number of variables. For example, there are about 20 000 different proteins that react in different ways.
http://lev.ccny.cuny.edu/~hmakse/NETWORK/networks.html The self-similarity of complex networks http://lev.ccny.cuny.edu/~hmakse/NETWORK/shm-nature.pdf
Multiscale Models of Complex Systems
Observer-centric model: enhanced resolution where the observation is made, that is, where the chemical reaction takes place.
The Nobel Prize in Chemistry 2013 “for the development of multiscale models for complex chemical systems”
... The work of Karplus, Levitt and Warshel is ground-breaking in that they managed to make Newton's classical physics work side-by-side with the fundamentally different quantum physics. Previously, chemists had to choose to use either or. The strength of classical physics was that calculations were simple and could be used to model really large molecules. Its weakness, it offered no way to simulate chemical reactions. For that purpose, chemists instead had to use quantum physics. But such calculations required enormous computing power and could therefore only be carried out for small molecules. This year's Nobel Laureates in chemistry devised methods that use both classical and quantum physics. For instance, in simulations of how a drug couples to its target protein in the body, the computer performs quantum theoretical calculations on those atoms in the target protein that interact with the drug. The rest of the large protein is simulated using less demanding classical physics. Today the computer is just as important a tool for chemists as the test tube. Simulations are so realistic that they predict the outcome of traditional experiments.
http://www.nobelprize.org/nobel_prizes/chemistry/laureates/2013/advanced-chemistryprize2013.pdf
Modeling Complex Systems: Multiscale Simulations http://www.wag.caltech.edu/multiscale
Analysis of Big Data Complex systems can be modelled as networks of elements connected with different kinds of relations. http://imdevsoftware.wordpress.com/tag/biochemical-network
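As a small illustration of the network view, the sketch below models a software system as a directed graph of modules and dependencies and computes two simple measures. The module names and the helper functions `in_degree` and `reachable_from` are hypothetical, invented for this example; real analyses work on far larger graphs and richer relation types.

```python
from collections import deque

# Hypothetical module dependency graph: an edge A -> B means "A depends on B".
dependencies = {
    "ui": ["services", "utils"],
    "services": ["database", "utils"],
    "database": ["utils"],
    "reports": ["services"],
    "utils": [],
}


def in_degree(graph):
    """Count how many modules depend directly on each module."""
    counts = {node: 0 for node in graph}
    for targets in graph.values():
        for target in targets:
            counts[target] += 1
    return counts


def reachable_from(graph, start):
    """Return every module that `start` depends on, directly or indirectly."""
    seen = {start}
    queue = deque([start])
    while queue:  # breadth-first traversal along dependency edges
        node = queue.popleft()
        for neighbour in graph[node]:
            if neighbour not in seen:
                seen.add(neighbour)
                queue.append(neighbour)
    seen.discard(start)
    return seen


degrees = in_degree(dependencies)
most_depended_on = max(degrees, key=degrees.get)
print("most depended-on module:", most_depended_on)
print("reports transitively depends on:", sorted(reachable_from(dependencies, "reports")))
```

Even on this toy graph, the in-degree points at the module whose change would affect the most other modules, which is the kind of structural question network models are meant to answer.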
Using Software to Improve Software – Automated Discovery
“Biology is the area where the gap between theory and data is growing the most rapidly. So it is the area in greatest need of automation.”
Software science and software engineering form a similarly complex research area.
“Generally, the way that scientists design experiments is to vary one factor at a time while keeping the other factors constant, but, in many cases, the most effective way to test a biological system may be to tweak a large number of different factors at the same time and see what happens. ABE (program) will let us do that.”
Most examples of automated discovery are found in biology.
http://news.vanderbilt.edu/2011/10/robot-biologist/
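The contrast in the quotation between varying one factor at a time and varying many factors together can be sketched as two ways of generating experimental runs. The factors and levels below are hypothetical software configuration options, and the functions `one_factor_at_a_time` and `full_factorial` are illustrative names only. This is not how the ABE program works, which the source does not describe.

```python
from itertools import product

# Hypothetical factors of a software performance experiment.
factors = {
    "cache": ["off", "on"],
    "threads": [1, 4],
    "compression": ["none", "gzip"],
}


def one_factor_at_a_time(factors):
    """Baseline run plus runs that change exactly one factor from the baseline."""
    baseline = {name: levels[0] for name, levels in factors.items()}
    runs = [dict(baseline)]
    for name, levels in factors.items():
        for level in levels[1:]:
            run = dict(baseline)
            run[name] = level
            runs.append(run)
    return runs


def full_factorial(factors):
    """Every combination of factor levels, so interactions can be observed."""
    names = list(factors)
    return [dict(zip(names, combo)) for combo in product(*factors.values())]


ofat_runs = one_factor_at_a_time(factors)
factorial_runs = full_factorial(factors)
print(len(ofat_runs), "runs when varying one factor at a time")
print(len(factorial_runs), "runs in the full factorial design")
```

The one-factor-at-a-time plan never runs a configuration in which two factors differ from the baseline at once, so it cannot reveal interactions between factors. The full factorial plan can, at the cost of a number of runs that grows multiplicatively with the number of factors.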
Knowledge Generation: Certainty vs. PAC (Probably Approximately Correct) Knowledge
Leslie Valiant distinguishes between theoryful and theoryless knowledge. For most things, from finding a mate to managing an economy, our “theories” are inadequate or nonexistent. Valiant proposes a computational learning theory based on “ecorithms”: algorithms for adaptive phenomena, based on machine learning, as an explanation of evolution and cognition.
The goal of PAC learning is that, with high probability (the "probably" part), the learned behavior will have low generalization error (the "approximately correct" part).
http://en.wikipedia.org/wiki/Probably_approximately_correct_learning
http://www.amazon.com/Probably-Approximately-Correct-Algorithms-Prospering/dp/0465032710
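The "probably" and "approximately correct" parts can be made quantitative in a simple textbook setting. The sketch below assumes a finite hypothesis class H and a learner that outputs a hypothesis consistent with all training examples (the realizable case). Under those assumptions, m ≥ (1/ε)(ln|H| + ln(1/δ)) examples suffice for the learned hypothesis to have error at most ε with probability at least 1 − δ. The function name `pac_sample_bound` is hypothetical, and this standard bound is only an illustration of the PAC idea, not a summary of Valiant's ecorithm theory.

```python
import math


def pac_sample_bound(hypothesis_count, epsilon, delta):
    """Sufficient number of examples for a consistent learner over a finite class.

    epsilon: tolerated generalization error (the "approximately correct" part).
    delta:   tolerated failure probability (the "probably" part).
    """
    if hypothesis_count < 1:
        raise ValueError("hypothesis_count must be at least 1")
    if not (0 < epsilon < 1 and 0 < delta < 1):
        raise ValueError("epsilon and delta must lie strictly between 0 and 1")
    bound = (math.log(hypothesis_count) + math.log(1 / delta)) / epsilon
    return math.ceil(bound)


# Hypothetical setting: 2**10 candidate hypotheses, 10% error, 95% confidence.
examples_needed = pac_sample_bound(2**10, epsilon=0.1, delta=0.05)
print("examples sufficient under these assumptions:", examples_needed)
```

The bound grows only logarithmically with the number of hypotheses and with 1/δ, but linearly with 1/ε, so demanding higher accuracy is much more expensive than demanding higher confidence.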
PAC Theory of Learning of the Theoryless
In Probably Approximately Correct, Valiant presents a theory of the theoryless. He argues that in biology, evolution has produced natural computational “machine learning” mechanisms, or ecorithms. PAC learning is Valiant’s model of how agents can act without the need to understand, by recognizing and memorizing patterns. The study of PAC algorithms shows the computational nature of evolution and cognition, and suggests that generally intelligent computers can be constructed.
http://www.amazon.com/Probably-Approximately-Correct-Algorithms-Prospering/dp/0465032710
What kinds of SE challenges can we expect in the future? 21st Century Challenges for SE
• Topic 1: Introduction and fundaments
  • Theoretical fundaments of software engineering (SE)
  • Organizational fundaments of SE
• Topic 2: Challenges rooted in the requirements stage
  • Introduction to requirements engineering (RE) and commercial software problems
  • RE and critical software problems
  • RE, inconsistencies and viewpoints
  • Conceptual modelling and problem frames (PFs)
  • PF-based solutions or conceptual modelling
• Topic 3: Safety and accident challenges
  • Advanced safety concepts and accident models
  • Solutions: lessons learned systems and interaction-based analysis model
• Topic 4: The challenge of critical infrastructures
  • The problem of critical infrastructures
  • Problem modelling and analysis in critical infrastructures
http://muss.fi.upm.es/en/asigRIS.html
Top 8 Software Engineering Challenges
From Top 10 SE challenges by A. Finkelstein:
• Relating requirements and architectures
• Moving to evidence-based practice
• Engineering scalability
• Addressing semantic divergence
• Accurate estimation of resources
• Engineering the cloud
• Developing adaptive systems
• Rethinking software production
http://blog.prof.so/2012/05/top-10-software-engineering-challenges.html
P.S. Just a reminder: Verification and Validation (V&V)
VERIFICATION: ARE WE BUILDING THE SOFTWARE PRODUCT RIGHT? (According to the specification)
VALIDATION: ARE WE BUILDING THE RIGHT SOFTWARE PRODUCT? (What the user requires)