CS 236501 Introduction to AI

CS 236501 Introduction to AI, Tutorial 3: Empirical evaluation in AI

Agenda
• Importance of empirical evaluation in AI
• Empirical evaluation guidelines
• Designing and running an experiment
• Experimental results presentation
• Discussing the experimental results
Intro. to AI – Tutorial 3 – By Nela Gurevich

Empirical evaluation in AI
• Empirical = Exploratory + Experimental
• We wish to explain the behavior of an algorithm
• We conduct experiments to support our explanation
• Sometimes the opposite occurs: we draw conclusions about an algorithm's behavior from the results of an experiment
• Empirical evaluation is an important part of AI research
• Many algorithms have no sound theoretical proof of their properties

Empirical evaluation in AI
• We use experiments to support our theories
• It is important to design experiments properly
• It is important to present experimental results clearly
• Clear presentation and explanation of experimental results are a major part of any AI research, and of this course's home assignments
• We will guide you in the home assignments

Designing an experiment: first step
• What would you like to do?
  • Examine the effect of a certain parameter on an algorithm's performance
    • Example: Beam Search with beam width = k. We would like to examine the effect of k on the algorithm's performance
  • Compare the performance of two (or more) algorithms
    • Example: DFS vs. BFS on the Tic-Tac-Toe puzzle

Parameter effect on algorithm performance
• Fix all algorithm parameters except the one being inspected
  • Example: to study the effect of beam width (k) on Beam Search, fix the problem difficulty
• Run the experiment for different values of the inspected parameter
  • Example: run Beam Search with k = {1, 5, 10, 50, 100, 500, 1000}, covering small, medium and large values
• Important: choose the range of parameter values correctly
  • k = {1, 5, 7} or k = {1001, 1005, 1007} is a bad choice: each set covers only a narrow part of the range
  • The appropriate range of values may vary between problems
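A minimal sketch of how a spread of parameter values like the one in the example could be generated. The function name make_k_grid and the 1-5-10 spacing are assumptions made for illustration; any spacing that covers small, medium and large values serves the same purpose.

```c
#include <assert.h>
#include <stddef.h>

/* Fill `out` with 1, 5, 10, 50, 100, ... up to and including max_k.
 * Returns the number of values written (never more than `cap`).
 * Hypothetical helper: the spacing mirrors the slide's example only. */
size_t make_k_grid(long max_k, long *out, size_t cap)
{
    assert(out != NULL);
    size_t n = 0;
    long decade = 1;                    /* 1, 10, 100, ... */
    while (n < cap && decade <= max_k) {
        out[n++] = decade;
        if (decade > max_k / 5)
            break;                      /* 5 * decade would exceed max_k */
        if (n < cap)
            out[n++] = 5 * decade;      /* 5, 50, 500, ... */
        if (decade > max_k / 10)
            break;                      /* 10 * decade would exceed max_k */
        decade *= 10;
    }
    return n;
}
```

Calling make_k_grid(1000, ks, 16) fills ks with the seven values from the example; each value would then be passed to the search with every other parameter held fixed.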

Comparing two (or more) algorithms
• Run the algorithms with fixed parameters
• For a proper comparison, avoid all other possible differences
• Example: DFS vs. BFS on Tic-Tac-Toe
  • Run the algorithms on problems of the same difficulty, or even on the same problems
• Example: comparing two heuristics
  • Use the same search algorithm with the same parameters
  • Use problems of the same difficulty (or the same problems)
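One way to keep a comparison fair is to pass the very same problem list to both algorithms. The sketch below assumes a hypothetical interface in which a solver takes a problem id and returns a cost, such as the number of expanded nodes. The types solver_fn and comparison and the two toy solvers are stand-ins for illustration, not real search code.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical interface: solve one problem, return its cost. */
typedef long (*solver_fn)(int problem_id);

typedef struct {
    double mean_a;
    double mean_b;
    size_t a_better;   /* problems where A's cost is strictly lower */
    size_t b_better;   /* problems where B's cost is strictly lower */
} comparison;

/* Run both solvers on exactly the same problems and summarize. */
comparison compare_on_same_problems(solver_fn a, solver_fn b,
                                    const int *problems, size_t n)
{
    assert(a != NULL && b != NULL && problems != NULL && n > 0);
    comparison c = {0.0, 0.0, 0, 0};
    for (size_t i = 0; i < n; i++) {
        long cost_a = a(problems[i]);
        long cost_b = b(problems[i]);
        c.mean_a += (double)cost_a;
        c.mean_b += (double)cost_b;
        if (cost_a < cost_b)
            c.a_better++;
        else if (cost_b < cost_a)
            c.b_better++;
    }
    c.mean_a /= (double)n;
    c.mean_b /= (double)n;
    return c;
}

/* Toy stand-ins so the sketch can be exercised without a real solver. */
long toy_a(int problem_id) { return 2L * problem_id; }
long toy_b(int problem_id) { return problem_id + 2L; }
```

On problems {1, 2, 3} the toy solvers have the same mean cost, yet each is better on one problem, which is why per-problem counts are reported alongside the averages.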

Designing an experiment: random elements
• Sometimes random elements affect the performance of an algorithm, for example:
  • Random initial problems
  • In Hill-Climbing search, when two successors have the same heuristic value, one is chosen at random
• In such cases, the experiment should be repeated several times and the results averaged
• Important: initialize the random seed in your program, to avoid repeating the same “random” experiment over and over
  • srand( (unsigned)time( NULL ) ); // in C
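A minimal sketch of repeating a randomized experiment and averaging the results, with the generator seeded once per program run. The names seed_rng, mean_over_runs and toy_trial are assumptions made for illustration. Returning the seed so that it can be logged is an optional extra that makes a particular run repeatable on demand.

```c
#include <assert.h>
#include <stdlib.h>
#include <time.h>

/* Hypothetical interface: one randomized run returning one measurement. */
typedef double (*trial_fn)(void);

/* Seed the generator once, at program start. A seed of 0 means
 * "use the clock"; the seed actually used is returned for logging. */
unsigned seed_rng(unsigned seed)
{
    if (seed == 0)
        seed = (unsigned)time(NULL);
    srand(seed);
    return seed;
}

/* Repeat the trial `runs` times and return the average measurement. */
double mean_over_runs(trial_fn trial, int runs)
{
    assert(trial != NULL && runs > 0);
    double sum = 0.0;
    for (int i = 0; i < runs; i++)
        sum += trial();
    return sum / runs;
}

/* Toy stand-in for a randomized search run: a value in 0..9. */
double toy_trial(void)
{
    return (double)(rand() % 10);
}
```

Calling srand inside the loop would restart the sequence on every repetition, so the seeding is kept outside mean_over_runs.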

Running an experiment
• Trace the experiment
  • For debugging purposes
• Record all data that seems important
  • Saves time later (no need to re-run experiments to get the needed data)
  • Example: for DFS, record the number of expanded nodes, execution time, whether a solution was found, and the solution length
• Use batch (script) files
  • Results are easy to reproduce
  • Saves time at the computer
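The DFS example above can be recorded as one comma-separated line per run, which a script can later collect into tables or graphs. This is a sketch: the run_record struct, its field names and the choice of CSV are assumptions, not a required format.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical record of a single run, following the DFS example. */
typedef struct {
    long   expanded_nodes;
    double seconds;          /* execution time */
    int    solution_found;   /* 1 if a solution was found, else 0 */
    int    solution_length;  /* meaningful only when solution_found is 1 */
} run_record;

/* Write the record into buf as "nodes,seconds,found,length".
 * Returns the length snprintf reports for the formatted line. */
int format_record(const run_record *r, char *buf, size_t size)
{
    assert(r != NULL && buf != NULL);
    return snprintf(buf, size, "%ld,%.3f,%d,%d",
                    r->expanded_nodes, r->seconds,
                    r->solution_found, r->solution_length);
}
```

A record with 1500 expanded nodes, 0.25 seconds, a solution found and a solution length of 12 is formatted as 1500,0.250,1,12.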

Presenting experimental results
• Use tables or graphs: visual methods are easy to understand
• First, decide on the purpose of the table or graph you want to present

Presenting experimental results
• Example: I run Beam Search on n random problems. Beam Search is not complete: sometimes it finds a solution to a problem, and sometimes it does not. I would like to present a graph that shows the effect of the beam width (k) on the percentage of problems for which a solution is found.
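Each point on such a graph is the percentage of the n problems solved at one value of k. A minimal sketch of that calculation, assuming the outcomes for a single k are stored as an array of 0/1 flags (the name percent_solved is hypothetical):

```c
#include <assert.h>
#include <stddef.h>

/* Percentage of problems solved, given one 0/1 flag per problem. */
double percent_solved(const int *solved, size_t n)
{
    assert(solved != NULL && n > 0);
    size_t count = 0;
    for (size_t i = 0; i < n; i++) {
        if (solved[i])
            count++;
    }
    return 100.0 * (double)count / (double)n;
}
```

Computing this once per value of k gives the y-values of the graph, with k on the x-axis.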

Presenting experimental results: common guidelines
• Graphs/tables should be very easy to understand
• No additional mathematical calculations should be required to understand the graph/table
[Example table marked X; caption: Table 1: Solution length found by Alg1 and Alg2]

Presenting experimental results: common guidelines
• Avoid combining a large amount of data in one graph/table

Presenting experimental results: graphs
• Label the axes
• Scale the axes properly
• Use graphs for continuous values
• Use graphs for discrete values

Presenting experimental results: tables
• Name the data columns properly
• Specify the measurement units used in the table
• Use short tables
• When long, detailed tables need to be attached, attach them as appendices
• Summarize with averages, when needed

Alg \ Problem    1     2     3     4     5     Avg
Alg1             100   140   200   90    20    110
Alg2             30    40    210   95    60    87

Table 1: Solution length found by Alg1 and Alg2
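The Avg column of Table 1 can be recomputed from the five per-problem values as a quick check before a table is published. A minimal sketch, with row_mean as a hypothetical helper name:

```c
#include <assert.h>
#include <stddef.h>

/* Arithmetic mean of one table row. */
double row_mean(const int *values, size_t n)
{
    assert(values != NULL && n > 0);
    long sum = 0;
    for (size_t i = 0; i < n; i++)
        sum += values[i];
    return (double)sum / (double)n;
}
```

Applied to the Alg1 row {100, 140, 200, 90, 20} this gives 110, and applied to the Alg2 row {30, 40, 210, 95, 60} it gives 87, matching the table.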

Discussing the experimental results
• It is essential to discuss the obtained experimental results in words!
• Verbal discussion with visual presentation of the results is the main part of the empirical evaluation of algorithms

Discussing the experimental results: guidelines
• Support every conclusion you make with data that backs it up
• Insert graphs/tables in appropriate places in the text for easy reference
• Use short and clear sentences
• Avoid using many adjectives
• Avoid combining many conclusions in one sentence

Discussing the experimental results: guidelines
• The conclusion is not always that Algorithm1 is clearly better than Algorithm2 on a given problem
  • Discuss the advantages and disadvantages of each algorithm
  • Compare the different performance elements of the algorithms
  • Example: on average, Alg1 expands twice as many nodes as Alg2, yet both take the same time to run
• Nor is the conclusion always that Algorithm1 is better than Algorithm2 in general
  • Discuss how the algorithms are affected by the problem on which they are tested

Summary
• Designing and running an experiment should be done carefully
• Present experimental results with graphs and tables. It is important that the visual presentation of the results is clear to the reader
• Findings should be summarized with a verbal discussion of the experimental results, backed up with visual presentation of results

Final Tip
• Print homework on both sides of the paper: save trees
