
Cross-Entropy Based Testing

Cross-Entropy Based Testing. Hana Chockler, Benny Godlin, Eitan Farchi, Sergey Novikov. Research. Haifa, Israel. The problem: how to test for rare problems in large programs? Testing involves running the program many times, hoping to find the problem.


Presentation Transcript


  1. Cross-Entropy Based Testing. Hana Chockler, Benny Godlin, Eitan Farchi, Sergey Novikov. Research. Haifa, Israel.

  2. The problem: How to test for rare problems in large programs? Testing involves running the program many times, hoping to find the problem. • If a problem appears only in a small fraction of the runs, it is unlikely to be found during random executions. It is like searching for a needle in a haystack.

  3. The main idea: Use the cross-entropy method! The cross-entropy method is a widely used approach to estimating probabilities of rare events (Rubinstein).

  4. The cross-entropy method - motivation • The problem: • There is a probability space S with probability distribution f and a performance function P defined on it. • A rare event e is that P(s) > r for some s ∈ S and some r, and this happens very rarely under f. • How can we estimate the probability of e? [Figure: the space S, with a point s marking an input in which the rare event e occurs]

  5. The naïve idea: Generate a big enough sample (a huge sample from the probability space) and compute the probability of the rare event from the inputs in the sample. This won't work, because for very rare events even a very large sample does not reflect the probability correctly.
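The failure of naive sampling can be illustrated with a minimal sketch. The toy event below (30 fair coin flips all coming up heads) and the function name `naive_estimate` are assumptions made for illustration; they do not come from the slides.

```python
import random

def naive_estimate(n_flips, n_samples, seed=0):
    """Estimate P(all n_flips fair coins are heads) by plain sampling."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(n_samples):
        # One sample: flip n_flips coins and check whether all are heads.
        if all(rng.random() < 0.5 for _ in range(n_flips)):
            hits += 1
    return hits / n_samples

true_p = 0.5 ** 30                      # about 9.3e-10
estimate = naive_estimate(30, 10_000)   # 10,000 samples is far too few here
print(true_p, estimate)
```

With a true probability near one in a billion, a sample of ten thousand runs will almost always contain no occurrence at all, so the estimate collapses to zero and says nothing useful about the event.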

  6. The cross-entropy method • Wishful thinking: if we had a distribution that gives the good inputs probability 1, then we would be all set … But we don't have such a distribution w.r.t. the performance function. • So we try to approximate it in iterations, every time trying to come a little closer: • In each iteration, we generate a sample of some (large) size. • We update the parameters (the probability distribution) so that we get a better sample in the next iteration.

  7. Formal definition of cross-entropy • In information theory, the cross entropy (closely related to the Kullback-Leibler "distance") between two probability distributions p and q measures the average number of bits needed to identify an event from a set of possibilities, if a coding scheme is used based on a given probability distribution q, rather than the "true" distribution p. • The cross entropy for two distributions p and q over the same discrete probability space is defined as follows: $H(p,q) = -\sum_x p(x) \log q(x)$. It is not really a distance, because it is not symmetric.
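A small worked example of the definition, as a sketch. It assumes base-2 logarithms (so the result is in bits) and two made-up distributions over four outcomes; the function name `cross_entropy` is illustrative.

```python
import math

def cross_entropy(p, q):
    """H(p, q) = -sum_x p(x) * log2(q(x)); terms with p(x) == 0 contribute 0."""
    return -sum(px * math.log2(qx) for px, qx in zip(p, q) if px > 0)

p = [0.5, 0.25, 0.125, 0.125]   # hypothetical "true" distribution
q = [0.25, 0.25, 0.25, 0.25]    # hypothetical coding distribution (uniform)

print(cross_entropy(p, p))  # 1.75 (the entropy of p)
print(cross_entropy(p, q))  # 2.0
print(cross_entropy(q, p))  # 2.25
```

The last two values differ, which is the asymmetry noted on the slide: swapping the roles of p and q changes the result.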

  8. The cross-entropy method for optimization problems [Rubinstein] • In optimization problems, we are looking for inputs that maximize the performance function. • The main problem is that this maximum is unknown beforehand. • The stopping point is when the sample has a small relative standard deviation. • The method was successfully applied to a variety of graph optimization problems: • MAX-CUT • Traveling salesman • …
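The iterate-and-update loop for optimization can be sketched on a toy problem. Everything specific here is an assumption for illustration: the search space (bit vectors with independent per-bit probabilities), the performance function (number of ones), the elite fraction, the smoothed update rule, and the 1% stopping threshold. It is not the authors' implementation.

```python
import random
import statistics

def cross_entropy_maximize(perf, n_bits, sample_size=200, elite_frac=0.1,
                           smoothing=0.7, rel_std_stop=0.01, max_iters=100,
                           seed=0):
    rng = random.Random(seed)
    probs = [0.5] * n_bits            # start from the uniform distribution
    best = None
    for _ in range(max_iters):
        # Draw a sample from the current distribution, best candidates first.
        sample = [[1 if rng.random() < p else 0 for p in probs]
                  for _ in range(sample_size)]
        sample.sort(key=perf, reverse=True)
        if best is None or perf(sample[0]) > perf(best):
            best = sample[0]
        # Stop once the sample's relative standard deviation is small.
        scores = [perf(s) for s in sample]
        mean = statistics.mean(scores)
        if mean != 0 and statistics.pstdev(scores) / abs(mean) < rel_std_stop:
            break
        # Move each parameter toward its frequency in the best part of the sample.
        elite = sample[:max(1, int(elite_frac * sample_size))]
        for i in range(n_bits):
            freq = sum(s[i] for s in elite) / len(elite)
            probs[i] = smoothing * freq + (1 - smoothing) * probs[i]
    return best, probs

best, probs = cross_entropy_maximize(sum, n_bits=20)
print(sum(best), [round(p, 2) for p in probs])
```

The maximum value never appears in the loop: the stopping test looks only at how concentrated the sample has become, which is why the method works when the maximum is unknown beforehand.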

  9. Illustration [Figure: the starting point, with a uniform distribution over the performance function, followed by the updated distribution over the same performance function]

  10. The setting in graphs. In graph problems, we have the following: the space is all paths in the graph G; a performance function f gives each path a value; we are looking for a path that maximizes f. In each iteration, we choose the best part Q of the sample. The probability update formula for an edge e=(v,w) is $f'(e) = \frac{\#\text{paths in } Q \text{ that use } e}{\#\text{paths in } Q \text{ that go via } v}$.
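The edge update formula can be sketched as follows. The representation of paths as lists of node names, the function name `update_edge_probabilities`, and the sample Q are assumptions for illustration. The sketch also assumes that "paths that go via v" means paths that leave v along some edge, so that the outgoing probabilities of each node sum to 1.

```python
from collections import Counter

def update_edge_probabilities(elite_paths):
    """f'(e) for e=(v,w): (#paths in Q that use e) / (#paths in Q via v)."""
    edge_counts = Counter()
    node_counts = Counter()
    for path in elite_paths:
        # Walk consecutive node pairs; in a DAG each path visits v at most once.
        for v, w in zip(path, path[1:]):
            edge_counts[(v, w)] += 1
            node_counts[v] += 1
    return {edge: count / node_counts[edge[0]]
            for edge, count in edge_counts.items()}

# Hypothetical best part Q of a sample: four paths through a small DAG.
Q = [["s", "a", "t"], ["s", "a", "t"], ["s", "b", "t"], ["s", "a", "u"]]
probs = update_edge_probabilities(Q)
print(probs[("s", "a")])  # 0.75: three of the four paths via s use (s, a)
print(probs[("a", "t")])  # two of the three paths via a use (a, t)
```

Edges that no path in Q uses are absent from the result; a real implementation would need a policy for them, such as smoothing against the previous distribution.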

  11. Cross-entropy for testing • A program is viewed as a graph. • Each decision point is a node in the graph. • Decision points can result from any non-deterministic or other not predetermined decisions: concurrency, inputs, coin tossing. • The performance function is defined according to the bug that we want to find. • More on that later …

  12. Our implementation • We focus on concurrent programs. • A program under test is represented as a graph, with nodes being the synchronization points. • Edges are possible transitions between nodes. • The graph is assumed to be a DAG: all loops are unwound. • The graph is constructed on-the-fly during the executions. • The initial probability distribution is uniform among edges. • We collect a sample of several hundred executions. • We adjust the probabilities of edges according to the formula. • We repeat the process until the sample has a very small relative standard deviation (1-5%). Note: this works only if there is a correct locking policy.

  13. Dealing with loops • Unwinding all loops creates a huge graph. • Problems with huge graphs: • Takes more space to represent • Takes more time to converge • We assume that most of the time, we are doing the same thing on subsequent iterations of the loop. • We introduce a modulo parameter. For instance, modulo 2 creates two nodes for each location inside the loop, one for even and one for odd iterations. • It reduces the size of the graph dramatically, but also loses information. • There is a balance between a too-small and a too-large modulo parameter that is found empirically. Example from the slide: for the loop "for i=1 to 100 do sync node; end for", taking i mod 2 yields two nodes, "sync node odd" and "sync node even".
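A minimal sketch of how a modulo parameter folds loop iterations into fewer graph nodes. The node representation (a pair of location name and iteration residue) and the helper name `node_id` are assumptions for illustration, not ConCEnter's actual data structures.

```python
def node_id(location, iteration, modulo):
    """Map a loop location and iteration counter to a graph node identifier."""
    return (location, iteration % modulo)

# Fully unwound: one node per iteration of "for i=1 to 100 do sync node".
nodes_unwound = {("sync", i) for i in range(1, 101)}

# Modulo 2: iterations collapse into an "odd" node and an "even" node.
nodes_mod2 = {node_id("sync", i, 2) for i in range(1, 101)}

print(len(nodes_unwound), len(nodes_mod2))  # 100 2
```

The information loss is visible here: after folding, iteration 3 and iteration 99 share a node, so the distribution can no longer treat them differently.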

  14. Bugs and performance functions. Note that we can also test for patterns, not necessarily bugs.

  15. Implementation – in Java for Java [Diagram labels: program under test, Instrumentation, Stopper, Decider, probability distribution table, ConCEnter, Updater, Evaluator, disk]

  16. Experimental results • We ran ConCEnter on several examples with buffer overflow and with deadlocks. • The bugs were very rare and did not manifest themselves in random testing. • ConCEnter found the bugs successfully. • The method requires significant tuning: the modulo parameter, the smoothing parameter, correct definition of the performance function, etc. Example: A-B-push-pop. myName=A // or B; there are two types. loop: if (top_of_stack=myName) pop; else push(myName); end loop; The probability of stack overflow is exponentially small. [Figure labels: thread A, thread B, x10, 36, stack entries A, B, A]
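The A-B-push-pop example can be simulated with a short sketch. The scheduler model (each step runs a thread of type A or B chosen uniformly at random), the use of maximum stack depth as the measured quantity, and the function names are all assumptions for illustration; the slides do not specify them.

```python
import random

def run_schedule(schedule):
    """Run A-B-push-pop under a given schedule; return the maximum stack depth."""
    stack = []
    max_depth = 0
    for name in schedule:
        if stack and stack[-1] == name:
            stack.pop()            # top_of_stack == myName: pop
        else:
            stack.append(name)     # otherwise: push(myName)
        max_depth = max(max_depth, len(stack))
    return max_depth

def random_schedule(steps, seed=None):
    """Assumed scheduler: each step runs a thread of type A or B uniformly."""
    rng = random.Random(seed)
    return [rng.choice("AB") for _ in range(steps)]

worst = run_schedule(["A", "B"] * 20)          # strictly alternating schedule
typical = run_schedule(random_schedule(40, seed=1))
print(worst, typical)                          # worst is 40
```

Under this model the stack keeps growing only while the two thread types strictly alternate, so reaching depth n within n steps has probability 2^-(n-1). That is the sense in which overflow is exponentially unlikely under random scheduling.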

  17. Future work • Automatic tuning. • Making ConCEnter plug-and-play for some predefined bugs. • Replay: can we use distance from a predefined execution as a performance function? • Second best: what if there are several areas in the graph where the maximum is reached? • What are the restrictions on the performance function in order for this method to work properly? It seems that the function should be smooth enough. Slide callout: works already.

  18. Related work • Testing (nothing specifically targeted to rare bugs): • Random testing • Stress testing • Noise makers • Coverage estimation • Bug-specific heuristics • Genetic algorithms • … • Cross-entropy applications (cross-entropy is useful in many areas): • Buffer allocation, neural computation, DNA sequence alignment, scheduling, graph problems, …
