360 likes | 385 Views
Introduction (Outline). The Software Development Process Performance Analysis: the Big Oh. Abstract Data Types Introduction to Data Structures. The Software Development Process. Software Development. Requirement analysis, leading to a specification of the problem Design of a solution
E N D
Introduction(Outline) • The Software Development Process • Performance Analysis: the Big Oh. • Abstract Data Types • Introduction to Data Structures CS 103
Software Development • Requirement analysis, leading to a specification of the problem • Design of a solution • Implementation of the solution (coding) • Analysis of the solution • Testing, debugging and integration • Maintenance and evolution of the system. CS 103
Specification of a problem • A precise statement/description of the problem. • It involves describing the input, the expected output, and the relationship between the input and output. • This is often done through preconditions and postconditions. CS 103
Design • Formulation of a method, that is, of a sequence of steps, to solve the problem. • The design “language” can be pseudo-code, flowcharts, natural language, any combinations of those, etc. • A design so expressed is called an algorithm(s). • A good design approach is a top-down design where the problem is decomposed into smaller, simpler pieces, where each piece is designed into a module. CS 103
Implementation • Development of actual C++ code that will carry out the design and solve the problem. • The design and implementation of data structures, abstract data types, and classes, are often a major part of design implementation. CS 103
Implementation(Good Principles) • Code Re-use • Re-use of other people’s software • Write your software in a way that makes it (re)usable by others • Hiding of implementation details: emphasis on the interface. • Hiding is also called data encapsulation • Data structures are a prime instance of data encapsulation and code re-use CS 103
Analysis of the Solution • Estimation of how much time and memory an algorithm takes. • The purpose is twofold: • to get a ballpark figure of the speed and memory requirements to see if they meet the target • to compare competing designs and thus choose the best before any further investment in the application (implementation, testing, etc.) CS 103
Testing and Debugging • Testing a program for syntactical correctness (no compiler errors) • Testing a program for semantic correctness, that is, checking if the program gives the correct output. • This is done by • having sample input data and corresponding, known output data • running the programs against the sample input • comparing the program output to the known output • in case there is no match, modify the code to achieve a perfect match. • One important tip for thorough testing: Fully exercise the code, that is, make sure each line of your code is executed. CS 103
Integration • Gluing all the pieces (modules) together to create a cohesive whole system. CS 103
Maintenance and Evolution of a System • Ongoing, on-the-job modifications and updates of the programs. CS 103
Preconditions and Postconditions • A semi-formal, precise way of specifying what a function/program does, and under what conditions it is expected to perform correctly • Purpose: Good documentation, and better communications, over time and space, to other programmers and user of your code CS 103
Precondition • It is a statement of what must be true before function is called. • This often means describing the input: • the input type • the conditions that the input values must satisfy. • A function may take data from the environment • Then, the preconditions describe the state of that environment • that is, the conditions that must be satisfied, in order to guarantee the correctness of the function. • The programmer is responsible for ensuring that the precondition is valid when the function is called. CS 103
Postcondition • It is a statement of what will be true when the function finishes its work. • This is often a description of the function output, and the relationship between output and input. • A function may modify data from the environment (such as global variables, or files) • the postconditions describe the new values of those data after the function call is completed, in relation to what the values were before the function was called. CS 103
Example of Pre/Post-Conditions void get_sqrt( double x) // Precondition: x >= 0. // Postcondition: The output is a number // y = the square root of x. CS 103
C++ Way of Asserting Preconditions • Use the library call assert (condition) • You have to include #include <cassert> • It makes sure that condition is satisfied (= true), in which case the execution of the program continues. • If condition is false, the program execution terminates, and an error message is printed out, describing the cause of the termination. CS 103
Performance Analysis • Determining an estimate of the time and memory requirement of the algorithm. • Time estimation is called time complexity analysis • Memory size estimation is called space complexity analysis. • Because memory is cheap and abundant, we rarely do space complexity analysis • Since time is “expensive” , analysis now defaults to time complexity analysis CS 103
Big-O Notation • Let n be a non-negative integer representing the size of the input to an algorithm • Let f(n) and g(n) be two positive functions, representing the number of basic calculations (operations, instructions) that an algorithm takes (or the number of memory words an algorithm needs). CS 103
Big-O Notation (contd.) • f(n)=O(g(n)) iff there exist a positive constant C and non-negative integer n0 such that f(n) Cg(n) for all nn0. • g(n) is said to be an upper bound of f(n). CS 103
Big-O Notation(Examples) • f(n) = 5n+2 = O(n) // g(n) = n • f(n) 6n, for n 3 (C=6, n0=3) • f(n)=n/2 –3 = O(n) • f(n) 0.5 n for n 0 (C=0.5, n0=0) • n2-n = O(n2) // g(n) = n2 • n2-n n2 for n 0 (C=1, n0=0) • n(n+1)/2 = O(n2) • n(n+1)/2 n2 for n 0 (C=1, n0=0) CS 103
Big-O Notation (In Practice) • When computing the complexity, • f(n) is the actual time formula • g(n) is the simplified version of f • Since f(n) stands often for time, we use T(n) instead of f(n) • In practice, the simplification of T(n) occurs while it is being computed by the designer CS 103
Simplification Methods • If T(n) is the sum of a constant number of terms, drop all the terms except for the most dominant (biggest) term; • Drop any multiplicative factor of that term • What remains is the simplified g(n). • amnm + am-1nm-1+...+ a1n+ a0=O(nm). • n2-n+log n = O(n2) CS 103
Big-O Notation (Common Complexities) • T(n)=O(1) // constant time • T(n)=O(log n) // logarithmic • T(n)=O(n) // linear • T(n)=O(n2) //quadratic • T(n)=O(n3) //cubic • T(n)=O(nc), c 1 // polynomial • T(n)=O(logc n), c 1 // polylogarithmic • T(n)=O(nlog n) CS 103
Big-O Notation(Characteristics) • The big-O notation is a simplification mechanism of time/memory estimates. • It loses precision, trading precision for simplicity • Retains enough information to give a ballpark idea of speed/cost of an algorithm, and to be able to compare competing algorithms. CS 103
Common Formulas • 1+2+3+…+n= n(n+1)/2 = O(n2). • 12+22+32+…+n2= n(n+1)(2n+1)/6 = O(n3) • 1+x+x2+x3+…+xn=(x n+1 – 1)/(x-1) = O(xn). CS 103
Example of Time Complexity Analysis and Big-O • Pseudo-code of finding a maximum of x[n]: • double M=x[0]; • for i=1 to n-1 do • if (x[i] > M) • M=x[i]; • endif • endfor • return M; CS 103
Complexity of the algorithm • T(n) = a+(n-1)(b+a) = O(n) • Where “a” is the time of one assignment, and “b” is the time of one comparison • Both “a” and “b” are constants that depend on the hardware • Observe that the big O spares us from • Relatively unimportant arithmetic details • Hardware dependency CS 103
Abstract Data Types CS 103
Abstract Data Types • An abstract data type is a mathematical set of data, along with operations defined on that kind of data • Examples: • int: it is the set of integers (up to a certain magnitude), with operations +, -, /, *, % • double: it’s the set of decimal numbers (up to a certain magnitude), with operations +, -, /, * CS 103
Abstract Data Types (Contd.) • The previous examples belong to what is called built-in data types • That is, they are provided by the programming language • But new abstract data types can be defined by users, using arrays, enum, structs, classes (if object oriented programming), etc. CS 103
Data Structures • A data structure is a user-defined abstract data type • Examples: • Complex numbers: with operations +, -, /, *, magnitude, angle, etc. • Stack: with operations push, pop, peek, isempty • Queue: enqueue, dequeue, isempty … • Binary Search Tree: insert, delete, search. • Heap: insert, min, delete-min. CS 103
Data Structure Design • Specification • A set of data • Specifications for a number of operations to be performed on the data • Design • A lay-out organization of the data • Algorithms for the operations • Goals of Design: fast operations CS 103
Implementation of a Data Structure • Representation of the data using built-in data types of the programming language (such as int, double, char, strings, arrays, structs, classes, pointers, etc.) • Language implementation (code) of the algorithms for the operations CS 103
Object-Oriented Programming (OOP) And Data Structures • When implementing a data structure in non-OOP languages such as C, the data representation and the operations are separate • In OOP languages such as C++, both the data representation and the operations are aggregated together into what is called objects • The data type of such objects are called classes. • Classes are blue prints, objects are instances. CS 103