Precise Memory Leak Detection for Java Software Using Container Profiling
Guoqing Xu, Atanas Rountev, Ohio State University
Oct 9th, 2008
Presented by Eun Jung Park
The Problem - Why are we unhappy with memory leaks?
• Example of a memory leak and a dangling pointer in C/C++ (code below)
• How about in Java?
• The GC (garbage collector) will handle this!
• Then what is the memory leak problem in Java?

    #include <stdlib.h>

    int *pi;

    void foo() {
        pi = (int*) malloc(8*sizeof(int)); // oops, memory leak of 4 ints
        // use pi
        free(pi); // foo() is done with pi
    }

    int main(void) {
        pi = (int*) malloc(4*sizeof(int));
        foo();
        pi[0] = 10; // oops, pi is now a dangling pointer
        return 0;
    }

The example above is adapted from http://www.ibm.com/developerworks/rational/library/05/0816_GuptaPalanki/index.html
The Problem - We are still unhappy with Java memory leaks
• What is a Java memory leak?
  - Object references that are no longer needed are unnecessarily maintained, so the GC cannot reclaim the objects they point to.
• Why is a Java memory leak bad?
  - It can degrade performance.
  - It can eventually exhaust memory and crash the program.
  - It is difficult to find.

    public void slowlyLeakingVector(int iter, int count) {
        for (int i=0; i<iter; i++) {
            for (int n=0; n<count; n++) {
                myVector.add(Integer.toString(n+i));
            }
            for (int n=count-1; n>0; n--) { // Oops, it should be n>=0
                myVector.removeElementAt(n);
            }
        }
    }

The example above is from http://www.ibm.com/developerworks/rational/library/05/0816_GuptaPalanki/index.html
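To make the off-by-one leak concrete, here is a minimal self-contained sketch that wraps the same loop in a class so the leftover elements can be counted. The class name `SlowLeakDemo`, the `myVector` field declaration, and the `size()` helper are assumptions added for illustration; only the loop body comes from the slide.

```java
import java.util.Vector;

class SlowLeakDemo {
    // Long-lived container: everything left in here stays reachable for the GC.
    private final Vector<String> myVector = new Vector<>();

    // Same logic as the slide: the removal loop stops at n > 0,
    // so the element at index 0 is never removed.
    void slowlyLeakingVector(int iter, int count) {
        for (int i = 0; i < iter; i++) {
            for (int n = 0; n < count; n++) {
                myVector.add(Integer.toString(n + i));
            }
            for (int n = count - 1; n > 0; n--) { // bug: should be n >= 0
                myVector.removeElementAt(n);
            }
        }
    }

    int size() {
        return myVector.size();
    }

    public static void main(String[] args) {
        SlowLeakDemo demo = new SlowLeakDemo();
        demo.slowlyLeakingVector(1000, 10);
        System.out.println("elements still referenced: " + demo.size());
    }
}
```

Each outer iteration adds `count` elements but removes only `count - 1`, so one string survives per iteration and the vector ends up holding `iter` elements that are never used again.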
State of the Art - How do people detect memory leaks?
• Static methods using compiler or code analysis
  - Not precise: they usually cannot precisely identify the unnecessary references.
  - Not scalable: they do not work well on large applications.
• Dynamic methods using fine-grained runtime information about individual objects, based on a single kind of information (memory contribution or staleness contribution)
  - Not precise: they follow a from-symptom-to-cause approach, which makes it difficult to locate the source of the leak and leads to imprecise leak reports (possible false positives).
  - Hard to interpret and insufficient for programmers: the output is complex and imprecise, and it does not give programmers enough information to locate the bug.
Proposed Method
• Dynamic method with container-based heap tracking
  - Instead of going from symptom to cause: track only containers, to directly identify the source of the leak.
  - Instead of using a single kind of information: compute a heuristic confidence value for each container based on the combination of
    - overall memory consumption
    - each container's memory consumption
    - each container's staleness contribution
• What is the definition of a container?
  - An abstract data type (ADT) with a set of data elements and three basic operations: ADD, GET, and REMOVE (e.g., hash table, graphical element).
• Why are containers suspicious?
  - Containers cause many memory leaks in Java!
Major Contributions
• Contribution 1: Computing a Confidence Value
• Contribution 2: Java Memory Leak Detection
• Contribution 3: Implementation
• Contribution 4: Empirical Evaluation
Contribution 1 - Computing a Confidence Value
1. Define the memory leak symptom
2. Define memory-leak-free
3. Choose the containers that are not memory-leak-free
4. Calculate the memory contribution
5. Calculate the staleness contribution
6. Put them together: calculate the leaking confidence
Computing a Confidence Value - Definition of Memory Leak Symptom
• A program written in a garbage-collected language has a memory leak symptom within a time region $[\tau_s, \tau_e]$ if
  (1) at the moments immediately after gc-events in the region, memory consumption exceeds a user-defined ratio of the total memory, and
  (2) there exists a subsequence of gc-events in the region such that memory consumption keeps growing from one gc-event to the next.
• How to define $\tau_s$ and $\tau_e$?
  - $\tau_e$, offline: the end of the program or the out-of-memory error.
  - $\tau_e$, online: user-defined; gc-events serve as checkpoints.
  - $\tau_s$: choose the smallest user-defined ratio to get the longest region and a more precise analysis.
• This helps to identify the appropriate time region to analyze.
Computing a Confidence Value - Definition of Memory Leak Free
• A container is memory-leak-free if
  (1) at the end of the leak region, the number of elements in it is 0, and
  (2) all elements added were removed and garbage collected within the leak region.
  This means that the number of ADD operations equals the number of REMOVE operations.
• Why do we need this definition?
  - Containers that are not memory-leak-free "possibly" contribute to the memory leak symptom and are considered for further evaluation.
• We choose the containers that are not memory-leak-free, and we are ready to go to the next step!
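The filtering step can be sketched as a simple per-container counter. This is only an illustration under stated assumptions: the class `LeakFreeFilter`, the string container IDs, and the method names are hypothetical, and the sketch checks only the ADD/REMOVE balance, not whether removed elements were actually garbage collected.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class LeakFreeFilter {
    // ADD and REMOVE counts observed for one container inside the leak region.
    static final class Counts {
        int adds;
        int removes;
    }

    private final Map<String, Counts> byContainer = new HashMap<>();

    private Counts counts(String containerId) {
        return byContainer.computeIfAbsent(containerId, k -> new Counts());
    }

    void recordAdd(String containerId) {
        counts(containerId).adds++;
    }

    void recordRemove(String containerId) {
        counts(containerId).removes++;
    }

    // Containers whose ADD count differs from their REMOVE count are not
    // memory-leak-free and are kept for further evaluation.
    List<String> suspects() {
        List<String> out = new ArrayList<>();
        for (Map.Entry<String, Counts> e : byContainer.entrySet()) {
            if (e.getValue().adds != e.getValue().removes) {
                out.add(e.getKey());
            }
        }
        Collections.sort(out); // stable, readable order for reporting
        return out;
    }
}
```

A container that balances its operations drops out of the analysis early, which keeps the later MC and SC computation focused on the few containers that could contribute to the symptom.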
Computing a Confidence Value - Definition of Memory Contribution
• A memory time graph is used to capture a container's memory footprint.
  - x-axis: the relative time of program execution, $\tau_i / \tau_e$
  - y-axis: the relative memory consumption of the container at $\tau_i$, i.e. the memory consumed by all objects reachable from the container, relative to the total memory consumption
  - Starting point: $\tau_0 / \tau_e$, where $\tau_0 = \max(\tau_s, \text{allocation time of the container})$
  - Ending point: $\tau_1 / \tau_e$, where $\tau_1 = \min(\tau_e, \text{deallocation time of the container})$
• The container's memory contribution (MC) is defined as the area covered by the memory consumption curve in the graph.
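Since MC is an area under a sampled curve, it can be approximated numerically. The sketch below uses the trapezoidal rule, which is an assumption chosen for illustration; the slides do not say which approximation the tool uses. The class and parameter names are hypothetical.

```java
class MemoryContribution {
    // relTime[i]: sample time divided by the region end time, in [0, 1], ascending.
    // relMem[i]:  the container's reachable memory divided by total memory
    //             consumption at that sample, in [0, 1].
    // Returns the area under the sampled curve (trapezoidal rule).
    static double area(double[] relTime, double[] relMem) {
        if (relTime.length != relMem.length) {
            throw new IllegalArgumentException("sample arrays must have equal length");
        }
        double area = 0.0;
        for (int i = 1; i < relTime.length; i++) {
            double dt = relTime[i] - relTime[i - 1];
            if (dt < 0) {
                throw new IllegalArgumentException("times must be ascending");
            }
            area += dt * (relMem[i] + relMem[i - 1]) / 2.0;
        }
        return area;
    }

    public static void main(String[] args) {
        double[] t = {0.0, 0.5, 1.0};
        double[] m = {0.0, 0.5, 1.0};
        System.out.println("MC = " + area(t, m));
    }
}
```

Because both axes are relative values in [0, 1], the resulting MC also lies in [0, 1]. A container that is allocated late or deallocated early covers a shorter span of the x-axis and therefore gets a smaller area.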
Computing a Confidence Value - Definition of Staleness Contribution
• Staleness: the time since the object's last use.
• How is staleness calculated? As the time difference $\tau_{rem} - \tau_{use}$, where
  - $\tau_{rem}$: the moment the element was removed from the container.
  - $\tau_{use}$: the moment the element was added into the container or retrieved from it.
  - Condition: no retrieval of the element between $\tau_{use}$ and $\tau_{rem}$.
• Edge cases to handle: $\tau_{use}$ or $\tau_{rem}$ may fall outside the leak region, and an element may never be removed from the container.
• How is the staleness contribution (SC) calculated? With $n$ elements in a container, the container's SC is computed from the staleness of its $n$ elements.
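One way to turn the staleness definition into a number is sketched below. Several choices here are assumptions, not confirmed by the slides: times outside the leak region are clamped to the region boundaries, an element that is never removed is treated as removed at the end of the region, each element's staleness is normalized by the region length, and the container's SC is the average over its elements. All names are hypothetical.

```java
class StalenessContribution {
    // lastUse: time the element was added or last retrieved.
    // removed: time the element was removed, or Double.NaN if never removed.
    // Returns the element's staleness as a fraction of the leak region length.
    static double elementSc(double lastUse, double removed,
                            double regionStart, double regionEnd) {
        // Assumption: clamp both endpoints to the leak region.
        double from = Math.max(lastUse, regionStart);
        double to = Double.isNaN(removed) ? regionEnd : Math.min(removed, regionEnd);
        if (to <= from) {
            return 0.0; // the stale interval lies outside the region
        }
        return (to - from) / (regionEnd - regionStart);
    }

    // elements[i] = {lastUse, removed}; assumption: the container's SC is the
    // average of its elements' normalized staleness.
    static double containerSc(double[][] elements, double regionStart, double regionEnd) {
        if (elements.length == 0) {
            return 0.0;
        }
        double sum = 0.0;
        for (double[] e : elements) {
            sum += elementSc(e[0], e[1], regionStart, regionEnd);
        }
        return sum / elements.length;
    }
}
```

Under these assumptions, an element that is added at the start of the region, never read, and never removed gets the maximum staleness of 1, while an element removed right after its last retrieval contributes close to 0.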
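Computing a Confidence Value - Now, Put Them Together!
• Combining MC and SC, we get the Leaking Confidence (LC), defined as $LC = SC \times MC^{1 - SC}$
• Why is LC an exponential function of SC?
  - SC is more important than MC in determining a memory leak.
• Desirable properties
  - $SC = 0 \Rightarrow LC = 0$ and $SC = 1 \Rightarrow LC = 1$
  - $MC = 1 \Rightarrow LC = SC$
  - LC increases with both MC and SC.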
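A small sketch of the combination step follows. It assumes the formula $LC = SC \times MC^{1 - SC}$ with both inputs in $[0, 1]$; this formula is an assumption of the sketch and should be checked against the paper. The class and method names are hypothetical.

```java
class LeakingConfidence {
    // Assumed formula: LC = SC * MC^(1 - SC), with MC and SC both in [0, 1].
    static double lc(double mc, double sc) {
        if (mc < 0 || mc > 1 || sc < 0 || sc > 1) {
            throw new IllegalArgumentException("MC and SC must lie in [0, 1]");
        }
        return sc * Math.pow(mc, 1.0 - sc);
    }

    public static void main(String[] args) {
        // A large but frequently used container vs. a small but very stale one.
        System.out.println("large, fresh: " + lc(0.9, 0.1));
        System.out.println("small, stale: " + lc(0.2, 0.9));
    }
}
```

With SC in the exponent, a very stale container gets a high confidence even when its memory share is modest, whereas a large container whose elements are still in use scores low. This matches the slide's remark that SC matters more than MC.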
Contribution 2 - Java Memory Leak Detection
Pipeline:
1. Container Modeling
2. Code Instrumentation (produces instrumented code with glue classes)
3. Profiling
4. Data Analysis (leak symptom, leak-free check, non-leak-free containers, MC, SC, LC; produces information about potentially leaking containers)
5. Leaking Call Sites
Java Memory Leak Detection - Container Modeling
• Each container type has a corresponding glue class.
  - Glue classes are provided for all types in the Java collections framework.
  - User annotations are required for user-defined containers.
• The glue methods call the profiling library and pass
  - the call site ID (assigned in the instrumentation step)
  - the data used for SC computation:
    - the container object
    - the element object
    - the number of elements in the container before the operation is performed
    - the operation type
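The shape of a glue class can be sketched as follows. Everything named here is hypothetical: `ProfilerStub` stands in for the tool's profiling library, `MapGlue` for a glue class over `java.util.Map`, and the recorded fields simply mirror the list above. The stub stores identity hash codes rather than object references so that the recorder itself does not keep elements alive; how the real tool identifies objects is not stated in the slides.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in for the profiling library that glue methods call.
class ProfilerStub {
    enum Op { ADD, GET, REMOVE }

    static final class Event {
        final int callSiteId;
        final int containerId;
        final int elementId;
        final int sizeBefore;
        final Op op;

        Event(int callSiteId, int containerId, int elementId, int sizeBefore, Op op) {
            this.callSiteId = callSiteId;
            this.containerId = containerId;
            this.elementId = elementId;
            this.sizeBefore = sizeBefore;
            this.op = op;
        }
    }

    static final List<Event> EVENTS = new ArrayList<>();

    static void record(int callSiteId, Object container, Object element,
                       int sizeBefore, Op op) {
        EVENTS.add(new Event(callSiteId, System.identityHashCode(container),
                System.identityHashCode(element), sizeBefore, op));
    }
}

// Hypothetical glue class: maps Map methods onto the ADT operations.
class MapGlue {
    static <K, V> V put(Map<K, V> map, K key, V value, int callSiteId) {
        int sizeBefore = map.size();
        V previous = map.put(key, value);
        ProfilerStub.record(callSiteId, map, value, sizeBefore, ProfilerStub.Op.ADD);
        return previous;
    }

    static <K, V> V get(Map<K, V> map, K key, int callSiteId) {
        int sizeBefore = map.size();
        V value = map.get(key);
        if (value != null) { // only a successful lookup counts as a use
            ProfilerStub.record(callSiteId, map, value, sizeBefore, ProfilerStub.Op.GET);
        }
        return value;
    }

    static <K, V> V remove(Map<K, V> map, K key, int callSiteId) {
        int sizeBefore = map.size();
        V value = map.remove(key);
        if (value != null) {
            ProfilerStub.record(callSiteId, map, value, sizeBefore, ProfilerStub.Op.REMOVE);
        }
        return value;
    }
}
```

In this sketch the instrumentation step would rewrite a call such as `orderTable.put(id, order)` into `MapGlue.put(orderTable, id, order, siteId)`, so the application's behavior is unchanged while every ADD, GET, and REMOVE is logged with its call site.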
Java Memory Leak Detection - Code Instrumentation
• The Soot analysis framework is used.
• Calls to the corresponding glue methods are inserted before and/or after each container call site.
• Code is inserted after a container is allocated in order to track its allocation time.
• Escape analysis: thread-local and method-local containers are not instrumented, since their lifetime is limited to their allocating methods.
Java Memory Leak Detection - Container Profiling
• Profiling is performed with JVMTI to collect
  - data for MC values
  - data for SC values
• How does JVMTI help with profiling?
  - It activates an object graph traversal thread.
  - It calculates the deallocation time of a tagged container.
  - It activates a dumping thread, so that profiling data does not pile up in memory and hurt performance.
Java Memory Leak Detection - Data Analysis
• When the end of the leak region is reached, the tool starts an offline analysis to
  - determine the leaking region
  - approximate the memory time graph and the MC value
  - compute SC
Java Memory Leak Detection - Leaking Call Sites
• From the elements in each container, the tool calculates the average staleness for each call site.
• The tool reports to programmers (testers)
  - potentially leaking containers, sorted by LC value
  - potentially leaking call sites in each container, sorted by average staleness
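The call-site report for one container can be sketched as a group-and-sort over per-element staleness values. The input layout (parallel arrays of call site IDs and staleness values) and the class name `CallSiteRanking` are assumptions made for this illustration.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

class CallSiteRanking {
    // callSiteIds[i]: the call site that added element i to the container.
    // staleness[i]:   the staleness measured for element i.
    // Returns (call site ID, average staleness) pairs, stalest call site first.
    static List<Map.Entry<Integer, Double>> rank(int[] callSiteIds, double[] staleness) {
        if (callSiteIds.length != staleness.length) {
            throw new IllegalArgumentException("arrays must have equal length");
        }
        Map<Integer, double[]> acc = new LinkedHashMap<>(); // id -> {sum, count}
        for (int i = 0; i < callSiteIds.length; i++) {
            double[] a = acc.computeIfAbsent(callSiteIds[i], k -> new double[2]);
            a[0] += staleness[i];
            a[1] += 1;
        }
        List<Map.Entry<Integer, Double>> out = new ArrayList<>();
        for (Map.Entry<Integer, double[]> e : acc.entrySet()) {
            out.add(Map.entry(e.getKey(), e.getValue()[0] / e.getValue()[1]));
        }
        out.sort(Map.Entry.<Integer, Double>comparingByValue().reversed());
        return out;
    }
}
```

The call site at the head of the list is the one whose elements sit unused the longest on average, which is where a programmer would look first within a suspicious container.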
Contribution 4: Empirical Evaluation - Experiment Setup
• Hardware platform: 2.4 GHz dual-core, 2 GB RAM
• Three memory leak bugs
  - Two are from the Sun bug repository
  - One is from SPECjbb
• Method
  - Check how successfully the tool locates a memory leak bug at three different sampling/dumping rates (1/15gc, 1/50gc, 1/85gc).
  - Check overhead and performance by measuring the instrumentation overhead, the runtime with different heap sizes at different sampling rates, and the overhead of using the tool.
• What do they want to show here?
  - The tool achieves high precision in reporting the causes of memory leak bugs, with overhead acceptable for practical use!
Empirical Evaluation - Detecting Memory Leak Bugs - JDK bugs
• JDK bug #6209673: Image = (VolatileImage)volatileMap.get(config)
• JDK bug #6559589: addElement(weakWindow)
• Observations
  1. The reports give programmers enough information to locate the bugs.
  2. Sampling rate: 1/15gc and 1/50gc are better than 1/85gc. 1/50gc is the best tradeoff between performance and precision.
Empirical Evaluation - Detecting Memory Leak Bugs - SPECjbb
• Requires a glue class for a user-defined container. Without it, the tool could not locate the memory leak bug.
• After modeling the container, the tool found the correct location of the memory leak bug: orderTable.put(anOrder.getId(), anOrder)
• 1/50gc showed the best tradeoff between performance and precision.
Empirical Evaluation - Performance & Overhead
• Dynamic overhead
  - Number of gc-events and runtime with the default vs. a large heap size at two different sampling rates
• Static overhead
  - Number of call sites instrumented
• Overhead of using this tool
  - Applying escape analysis reduces the number of instrumented call sites.
  - At the same sampling rate, a larger initial heap size gives a shorter running time.
  - Decreasing the sampling rate reduces the runtime overhead.
Conclusion
• How does this differ from existing dynamic methods?
  - Instead of focusing on arbitrary objects, it focuses only on containers, the main contributors to memory leak problems.
  - It considers the combination of MC and SC, not a single metric.
  - It locates bugs more precisely.
  - Programmers and testers only need to learn how to add a glue class and can then use the tool easily, instead of learning how to interpret the complex output of existing tools.
• Contributions
  - Contribution 1: Computing a Confidence Value
  - Contribution 2: Java Memory Leak Detection
  - Contribution 3: Implementation
  - Contribution 4: Empirical Evaluation
Future Work & Discussion
• How about overhead?
  - Optimization is needed to reduce overhead.
  - Using JikesRVM to avoid JVMTI
• Automated tool
  - Automate the mapping from container methods to the ADT operations.
• Alternative definitions of LC for more precise information
• More context information about containers and call sites that can be useful for programmers