An Introduction to Programming Concepts and OI-programming

An Introduction to Programming Concepts and OI-programming …from abstract theory to dirty tricks…

Objectives Today • Introduction to the concept of “Algorithms” • Introduction to complexity • “Philosophy” of OI competitions • “OI-style” programming

What is an Algorithm? • From Wikipedia: An algorithm is a finite set of well-defined instructions for accomplishing some task which, given an initial state, will terminate in a corresponding recognizable end-state. • (what does that mean?) • Usually, an algorithm solves a “problem”. • Examples • Insertion sort • Binary Search • An algorithm does not have to be a computer program! Think about other possible algorithms in real life

“Problems” • Usually a set of well-defined inputs and corresponding outputs • Example: the sorting problem: • Input: a list of numbers • Output: a sorted list of numbers • There can be multiple algorithms that solve the same problem • e.g. Bubble Sort, Bogosort

Examples of algorithms • Sorting algorithms • Graph algorithms – Dijkstra, Floyd-Warshall, Bellman-Ford, Prim, Kruskal • Tree-Search algorithms – BFS, DFS • Linear Searching Algorithms

Examples of Techniques in Designing Algorithms • Recursion • Dynamic programming • Greedy • Divide and conquer • Branch and bound • (the above may have overlaps)

Using and Creating Algorithms “It is science. You can derive them.” “It is art. We have no way to teach you!” • Why study algorithms? • To solve problems that can be directly solved by existing algorithms • To solve problems that can be solved by combining algorithms • To get a feel for, and inspiration on, how to design new algorithms

Related Issues • Proving correctness of algorithms • Can be very difficult • Disproving is easier • All you need is just one counterexample

Complexity • An approximation to the runtime and memory requirement of a program. • We don’t really care about the exact numbers (why?) • In most cases, we are concerned with runtime only • Note that there are “best-case”, “average-case”, and “worst-case” complexity • Usually we look at worst case only • We want to know how well an algorithm “scales up” (i.e. when there is a large input). Why?

Complexity (cont’d) • Here’s why:
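One way to see why scaling matters is to tabulate rough operation counts for a few common complexity classes. The sketch below is illustrative only: the names `Growth`, `growth_row` and `print_growth_table` and the chosen input sizes are assumptions, not part of the slides.

```cpp
#include <cassert>
#include <cmath>
#include <cstdio>

// Rough operation counts for common complexity classes at input size n.
// All names here are illustrative, not taken from the slides.
struct Growth {
    double log_n;    // lg n
    double n;        // n
    double n_log_n;  // n lg n
    double n_sq;     // n^2
    double n_cube;   // n^3
};

Growth growth_row(double n) {
    assert(n > 0);  // lg n is only defined for positive n
    double lg = std::log2(n);
    return {lg, n, n * lg, n * n, n * n * n};
}

// Prints one table row per input size.
void print_growth_table() {
    const double sizes[] = {10, 1000, 1000000};
    std::printf("%10s %8s %14s %14s %22s\n", "n", "lg n", "n lg n", "n^2", "n^3");
    for (double n : sizes) {
        Growth g = growth_row(n);
        std::printf("%10.0f %8.1f %14.0f %14.0f %22.0f\n",
                    g.n, g.log_n, g.n_log_n, g.n_sq, g.n_cube);
    }
}
```

At n = 1,000,000 the n^2 column is already 10^12, which no constant-factor tuning will rescue; the lower-order classes stay manageable.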

Quasi-Formal Definition of Big-O • (you need not remember these) We say f(x) is in O(g(x)) if and only if there exist a number x0 and a positive number M such that |f(x)| ≤ M |g(x)| for all x > x0
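As a worked instance of the definition (the functions are chosen only for illustration), take f(x) = 3x^2 + 5x and g(x) = x^2. The constants M = 4 and x0 = 5 satisfy the condition:

```latex
\begin{aligned}
x > 5 &\implies 5x < x \cdot x = x^2 \\
      &\implies |3x^2 + 5x| = 3x^2 + 5x < 3x^2 + x^2 = 4\,|x^2|
\end{aligned}
```

So 3x^2 + 5x is in O(x^2). Many other choices of M and x0 would work equally well; the definition only asks that some pair exists.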

Example 1 – Bubble sort • For i := 1 to n do For j := i downto 2 do if a[j] > a[j-1] then swap(a[j], a[j-1]); • Time Complexity? O(n^2) • How about memory?
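The loop above can be sketched in C++ with a comparison counter, to make the O(n^2) count visible. This is a translation under assumptions: indices are 0-based, the function name `bubble_sort_desc` is made up, and the comparison is kept exactly as on the slide, which means larger elements move toward the front and the result is in descending order.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Mirrors the slide's double loop with 0-based indices.
// Element i is compared with each predecessor in turn and swapped forward
// whenever it is larger, so the array ends up in descending order.
// Returns the number of comparisons performed.
long long bubble_sort_desc(std::vector<int>& a) {
    long long comparisons = 0;
    for (std::size_t i = 0; i < a.size(); ++i) {
        for (std::size_t j = i; j >= 1; --j) {
            ++comparisons;
            if (a[j] > a[j - 1]) std::swap(a[j], a[j - 1]);
        }
    }
    return comparisons;
}
```

The inner loop runs i times for each i, so the counter always ends at 0 + 1 + … + (n-1) = n(n-1)/2 regardless of the input. The sort works in place and needs only a constant amount of extra memory.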

Example 2 – Insertion Sort • Quick introduction to insertion sort (you will learn more in the searching and sorting training): • [] 4 3 1 5 2 • [4] 3 1 5 2 • [3 4] 1 5 2 • [1 3 4] 5 2 • [1 3 4 5] 2 • [1 2 3 4 5] • Time Complexity = ?
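A minimal C++ sketch of the trace above, assuming ascending order and 0-based indices (the function name `insertion_sort` is illustrative). The sorted prefix a[0..i-1] plays the role of the bracketed part on the slide.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Insertion sort, ascending. Before each step a[0..i-1] is already sorted;
// the step inserts a[i] into its place within that prefix.
void insertion_sort(std::vector<int>& a) {
    for (std::size_t i = 1; i < a.size(); ++i) {
        int key = a[i];
        std::size_t j = i;
        // Shift larger elements one slot to the right to open a gap for key.
        while (j > 0 && a[j - 1] > key) {
            a[j] = a[j - 1];
            --j;
        }
        a[j] = key;
    }
}
```

When thinking about the time complexity, compare two inputs: an already sorted array, where the while loop never runs, and a reversed array, where it runs i times for every i.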

Applications • Usually, the time complexity of the algorithm gives us a rough estimate of the actual run time. • O(n) for very large n • O(n^2) for n ~ 1000-3000 • O(n^3) for n ~ 100-200 • O(n^4) for n ~ 50 • O(k^n) or O(n!) for very small n, usually < 20 • Keep in mind • The constant factor of the algorithm (including the implementation) • Computers vary in speed, so the time needed will differ • Therefore remember to test the program/computer before making assumptions!
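To test the machine rather than assume, one option is to time a loop of known size. The sketch below is only an assumption about how one might do it: the function name `time_quadratic_ms` and the loop body are invented for illustration, and the measured time will vary with the compiler, optimisation flags and hardware.

```cpp
#include <cassert>
#include <chrono>

// Runs a simple O(n^2) double loop and returns the elapsed wall-clock time
// in milliseconds. The loop result is written to `sink` so the compiler
// cannot discard the work as unused.
double time_quadratic_ms(int n, long long& sink) {
    auto start = std::chrono::steady_clock::now();
    long long total = 0;
    for (int i = 0; i < n; ++i)
        for (int j = 0; j < n; ++j)
            total += (i ^ j) & 1;  // 1 when i and j have different parity
    sink = total;
    auto stop = std::chrono::steady_clock::now();
    return std::chrono::duration<double, std::milli>(stop - start).count();
}
```

Calling it with a few values of n near the task's limit, on the actual competition machine, gives a better feel for what fits in the time limit than any table of rules of thumb.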

Problem • I have implemented bubble sort for an Array A[N] and applied binary search on it. • Time complexity of bubble sort? • O(N^2). No doubt. • Time complexity of binary search? • O(lg N) • Well, what is the time complexity of my algorithm?
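For reference, the binary search step of this setup might look like the following sketch. It assumes the array has already been sorted in ascending order; the function name `binary_search_index` and the convention of returning -1 for a missing value are choices made here, not part of the slides.

```cpp
#include <cassert>
#include <vector>

// Binary search on an ascending-sorted vector.
// Returns the index of target, or -1 if it is not present.
int binary_search_index(const std::vector<int>& a, int target) {
    int lo = 0, hi = static_cast<int>(a.size()) - 1;
    while (lo <= hi) {
        int mid = lo + (hi - lo) / 2;  // avoids overflow of lo + hi
        if (a[mid] == target) return mid;
        if (a[mid] < target) lo = mid + 1;  // target is in the right half
        else hi = mid - 1;                  // target is in the left half
    }
    return -1;
}
```

Each iteration halves the range [lo, hi], which is where the O(lg N) bound comes from.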

Properties • O(f) + O(g) = O(max(f, g)) • O(f) * O(g) = O(fg) • So, what is the answer to the previous question?
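Applied to the previous question, where a bubble sort is followed by one binary search, the sum rule works out as follows (N^2 grows faster than lg N, so it dominates):

```latex
O(N^2) + O(\lg N) = O\big(\max(N^2, \lg N)\big) = O(N^2)
```

In other words, the sorting step decides the overall running time; speeding up the search alone would not change the order.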

Some other notations (optional) • (Again no need to remember them) • f(N) is Θ(g(N)) • iff f(N) is O(g(N)) and g(N) is O(f(N)) • f(N) is o(g(N)) • For all C > 0, there exists N0 such that |f(N)| < C|g(N)| for all N > N0 • f(N) is Ω(g(N)) • iff g(N) is O(f(N))

Difficulty of Problem • You only need to have a rough idea about this… • Definitions (not so correct) • A problem that can be solved by an algorithm whose running time is polynomial in the input size is called polynomial-time solvable (P) • A problem whose proposed solution can be verified in polynomial time is called polynomial-time verifiable (NP) • A problem that is at least as hard as every problem in NP is called NP-hard; no polynomial-time solution is known for any such problem to date • Difficulty of problems is roughly classified as: • Easy: in P (of course all P problems are also in NP) • Hard: NP-complete (in NP and NP-hard; believed, but not proven, to be outside P) • Very Hard: not even in NP
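The gap between verifying and solving can be illustrated with subset sum, a well-known NP-complete problem: given some numbers, is there a subset adding up to a target? The sketch below is illustrative; the function names and the choice of representing a proposed answer as a strictly increasing list of indices are assumptions made here.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Verification: checking a proposed subset takes time linear in its size.
// `chosen` must list distinct indices into nums in strictly increasing order.
bool verify_subset(const std::vector<int>& nums,
                   const std::vector<std::size_t>& chosen, long long target) {
    long long sum = 0;
    for (std::size_t k = 0; k < chosen.size(); ++k) {
        if (chosen[k] >= nums.size()) return false;             // out of range
        if (k > 0 && chosen[k] <= chosen[k - 1]) return false;  // repeated or unordered
        sum += nums[chosen[k]];
    }
    return sum == target;
}

// Search: with no proposed answer in hand, the obvious method tries all
// 2^n subsets, one bitmask per subset.
bool exists_subset(const std::vector<int>& nums, long long target) {
    const std::size_t n = nums.size();
    assert(n <= 25);  // 2^n subsets: only practical for small n
    for (unsigned long long mask = 0; mask < (1ULL << n); ++mask) {
        long long sum = 0;
        for (std::size_t i = 0; i < n; ++i)
            if (mask & (1ULL << i)) sum += nums[i];
        if (sum == target) return true;
    }
    return false;
}
```

The verifier is cheap for any input size, while the exhaustive search doubles its work with every extra number. Cleverer algorithms exist for special cases, but no polynomial-time algorithm for the general problem is known.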

“Philosophy” of OI Competitions • Objective of Competition… • The winner is determined by: • Fastest Program? • Amount of time used in coding? • Number of Tasks Solved? • Use of the most difficult algorithm? • Highest Score? • Therefore, during a competition, aim to get the highest score, at all costs – “All is fair in love and war.”

Scoring • A “black box” judging system • Test data is fed into the program • Output is checked for correctness • No source code is manually inspected • How to take advantage (without cheating of course!) of the system?

The OI Programming Process • Reading the problems • Choosing a problem • Reading the problem • Thinking • Coding • Testing • Finalizing the program

Reading the Problem • Usually, a task consists of • Title • Problem Description • Constraints • Input/Output Specification • Sample Input/Output • Scoring

Reading the Problem • Constraints • Range of variables • Execution Time • NEVER make assumptions yourself • Ask whenever you are not sure • (Do not be afraid to ask questions!) • Read every word carefully • Make sure you understand before going on

Thinking • Classify the problem • Graph? Mathematics? Data Processing? Dynamic Programming? etc. • Some complicated problems may be a combination of the above • Draw diagrams, use rough work, scribble… • Consider special cases (smallest, largest, etc.) • Is the problem too simple? • Usually the problem setters have something they want to test the contestants on, maybe an algorithm, some specific observations, carefulness etc. • Still no idea? Give up. Time is precious.

Designing the Solution • Remember, before coding, you MUST have an idea what you are doing. If you don’t know what you are doing, do not begin coding. • Some points to consider: • Execution time (Time complexity) • Memory usage (Space complexity) • Difficulty in coding • Remember, during competition, use the algorithm that gains you the most marks, not the fastest/hardest algorithm!

Coding • Optimized for ease of coding, not for reading • Ignore all the “coding practices” outside, unless you find them particularly useful in OI competitions • No comments needed • Short variable names • Use fewer functions • NEVER use 16-bit integers (unless memory is limited) • 16-bit integers may be slower! (PCs are usually 32-bit, and even 64-bit architectures should be somewhat optimized for 32-bit)

Coding • Feel free to use goto, break, etc. in the appropriate situations • Never mind what Dijkstra has to say • Avoid using floating point variables if possible (e.g. real, double, etc.) • Do not do small (aka useless) “optimizations” to your code • Save and compile frequently • See example program code…
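One reason to avoid floating point is that exact comparisons can fail in surprising ways. A common workaround, sketched below under the assumption that the values are fractions with positive denominators, is to compare by cross-multiplying integers. The function names are invented for illustration.

```cpp
#include <cassert>

// Compares fractions a/b and c/d without any division:
// for b, d > 0,  a/b < c/d  is equivalent to  a*d < c*b.
// long long keeps the products exact for inputs that fit in 32 bits.
bool fraction_less(long long a, long long b, long long c, long long d) {
    assert(b > 0 && d > 0);
    return a * d < c * b;
}

// The same kind of check done with doubles, shown only for contrast.
bool sums_to_three_tenths() {
    double x = 0.1, y = 0.2;
    return x + y == 0.3;  // false with IEEE 754 doubles: the sum is slightly above 0.3
}
```

With integers, 1/3 and 2/6 compare as exactly equal (neither is less than the other), whereas the floating-point check fails on a sum that looks obviously right on paper.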

Testing • To make sure our program works as expected • This is a very important step, yet mostly overlooked by contestants

Why Testing? • Which of the following is more frustrating? • You have absolutely no idea how to solve a difficult problem • You know the solution to a difficult problem and spend hours coding it, but a stupid bug that you fail to notice leaves you with 0 marks in the end • Well, the second case is pretty common

Why Testing? • In all OI competitions, you submit a program before the competition ends. • Submissions are not judged until the end of the competition • There is no “take two”, no chance to correct any mistakes

Testing • Sample Input/Output: “A problem has sample output for two reasons: • To make you understand what the correct output format is • To make you believe that your incorrect solution has solved the problem correctly” • Manual Test Data • Generated Test Data (if time allows) • Boundary Cases (0, 1, other smallest cases) • Large Cases (to check for TLE, overflows, etc.) • Tricky Cases
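Generated test data is most useful when paired with a slow but obviously correct solution to compare against. The sketch below assumes a hypothetical task (largest sum of a non-empty contiguous subarray) purely for illustration; the function names, array sizes and value ranges are all choices made here.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <random>
#include <vector>

// Hypothetical task: largest sum of a non-empty contiguous subarray.

// Obviously correct O(n^2) reference: try every (start, end) pair.
long long brute_max_subarray(const std::vector<int>& a) {
    long long best = a[0];
    for (std::size_t i = 0; i < a.size(); ++i) {
        long long sum = 0;
        for (std::size_t j = i; j < a.size(); ++j) {
            sum += a[j];
            best = std::max(best, sum);
        }
    }
    return best;
}

// The O(n) solution under test (Kadane's algorithm).
long long fast_max_subarray(const std::vector<int>& a) {
    long long best = a[0], cur = a[0];
    for (std::size_t i = 1; i < a.size(); ++i) {
        cur = std::max<long long>(a[i], cur + a[i]);
        best = std::max(best, cur);
    }
    return best;
}

// Generates many small random arrays and compares the two solutions.
// Returns true if they agree on every generated case.
bool stress_test(int rounds, unsigned seed) {
    std::mt19937 rng(seed);
    std::uniform_int_distribution<int> len(1, 8), val(-10, 10);
    for (int r = 0; r < rounds; ++r) {
        std::vector<int> a(len(rng));
        for (int& x : a) x = val(rng);
        if (brute_max_subarray(a) != fast_max_subarray(a)) return false;
    }
    return true;
}
```

Small arrays with a narrow value range are deliberate: they hit boundary cases often, and a failing case is short enough to trace by hand. Large cases for time limits and overflow still need to be generated separately.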

Debugging • Debugging – find the bug, and remove it • Easiest method: writeln/printf/cout • These are so-called “debug messages” • Use of debuggers: • FreePascal IDE debugger • gdb debugger
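One possible way to keep debug messages from leaking into a submission is to route them through a macro that can be switched off at compile time. This is a sketch, not a required practice: the macro name `DBG`, the `DEBUG` flag and the example function `sum_to` are all assumptions made here.

```cpp
#include <cassert>
#include <cstdio>

// Debug messages go to stderr so they never mix with the real answer on
// stdout. Compile with -DDEBUG to enable them; without the flag the macro
// expands to nothing.
#ifdef DEBUG
#define DBG(...) std::fprintf(stderr, __VA_ARGS__)
#else
#define DBG(...) ((void)0)
#endif

// Example: sum 1..n, tracing the running total after each step.
long long sum_to(int n) {
    long long total = 0;
    for (int i = 1; i <= n; ++i) {
        total += i;
        DBG("i=%d total=%lld\n", i, total);
    }
    return total;
}
```

Whether a judge ignores stderr depends on the contest, so building the final version without the flag is the safer habit.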

Finalizing • Check output format • Any trailing spaces? Missing end-of-lines? (for printf users, this is quite common) • Better to test once more against the sample output • Remember to clear those debug messages • Check I/O – filename? stdio? • Check exe/source file name • Is the executable updated? (If exe has to be submitted) • Method of submission? • Try to allocate ~5 mins at the end of competition for finalizing

Interactive Tasks • Traditional Tasks • Give input in one go • Give output in one go • Interactive Tasks • Your program is given some input • Your program gives some output • Your program is given some more input • Your program gives more output • …etc

Example • “Guess the number” • Sample Run: • Judge: I have a number between 1 and 5, can you guess? • Program: is it 1? • J: Too small • P: 3? • J: Too small • P: 5? • J: Too big • P: 4? • J: Correct • P: Your number is 4!
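The program's side of such a dialogue can be sketched as follows. Instead of the guesses in the sample run, this version halves the remaining range after every reply. The callback-based judge is an assumption made so the sketch is self-contained; a real interactive task defines its own protocol, often over standard input and output with a flush after each line.

```cpp
#include <cassert>
#include <functional>

// Judge reply for one guess: -1 = "Too small", +1 = "Too big", 0 = "Correct".
// This callback protocol is an assumption made for the sketch.
using Judge = std::function<int(int)>;

// Finds the hidden number in [lo, hi] by halving the range on each reply.
// `guesses` reports how many questions were asked.
int guess_number(int lo, int hi, const Judge& ask, int& guesses) {
    guesses = 0;
    while (lo < hi) {
        int mid = lo + (hi - lo) / 2;
        ++guesses;
        int reply = ask(mid);
        if (reply == 0) return mid;
        if (reply < 0) lo = mid + 1;  // guess was too small
        else hi = mid - 1;            // guess was too big
    }
    return lo;  // one candidate left, so no further question is needed
}
```

For the range 1 to 5 with hidden number 4, the guesses are 3 and then 4. In general the number of questions grows with lg of the range size, which matters when a task limits how many queries a program may make.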

Open Test Data • Test data is known • Usually quite difficult to solve • Some need time-consuming algorithms, therefore you are given a few hours (i.e. competition time) to run the program • Tricks: • ALWAYS look at all the test data first • Solve by hand, manually • Solve partially by program, partially by hand • Some with different programs • Solve all with one program (sometimes impossible!) • Make good use of existing tools – you do not have to write all the programs if some are already available! (e.g. sort, other languages, etc.)

Tricks • Sometimes, we really have no idea how to solve a problem • Rather than giving up, we may try to squeeze some marks from it • IMPORTANT: Don’t expect too much from this. You don’t deserve to get any marks • Keep in mind that those who know the solution deserve their rewards • Don’t waste time on refining your tricks. Spending more time on other topics is often more rewarding

Some common tricks… • “No solution” • Solve for simple cases • “In 50% of test cases, N < 20” • Special cases (smallest, largest, etc.) • Incorrect greedy algorithms • Hard Code • Stupid hardcode: begin writeln(random(100)); end. • Naïve hardcode: “if input is x, output hc(x)” • More “intelligent” hardcode (sometimes not possible): pre-compute the values, and only save some of them • Brute force • Other Weird Tricks (not always useful…) • Do nothing (e.g. Toggle)

Competition Environment • Programming Language: Pascal, C, C++ • IDE/Editor: FreePascal IDE, emacs, vi • OS: Windows(?), Linux • What should we use in competitions? • No definite answer, it depends…

Pitfalls / Common Mistakes • Misunderstanding the problem • Not familiar with competition environment • Output format • Using complex algorithms unnecessarily • Choosing the hardest problem first

The End • Note: most of the contents are introductions only. You may want to find more in-depth materials • Books – Introduction to Algorithms • Online – Google, Wikipedia • HKOI – Newsgroup, training websites of previous years, discuss with trainers/trainees. • Training – Many topics are further covered in later trainings • Experience!

Questions?
