Independence determines destiny, innovation builds the future (www.loongson.cn)
JVM virtual method invoking optimization based on CAM table
Songsong Cai
Institute of Computing Technology, Chinese Academy of Sciences
caisongsong@loongson.cn
28/7/2011
Outline
• Introduction
• Monomorphic Inline Caching in HotSpot
• Hardware Design of CAM Used in Virtual Method Call
• SW/HW Co-design Virtual Method Invoking Mechanism
• Experimental Results and Analysis
• Conclusions
• References
Methods in Java Programming Language (1)
• Class (static) method
  • A class method does not require an instance
  • A class method uses static binding
  • When the JVM calls a class method, it selects the method to call based on the declared type of the reference, which is usually known at compile time
Methods in Java Programming Language (2)
• Instance (virtual) method
  • An instance method requires an instance
  • An instance method uses dynamic binding
  • When calling an instance method, the JVM selects the method based on the actual class of the object, which is known only at run time
  • The type information becomes available only when execution reaches the call site
  • Dynamic resolution is generally translated into an indirect jump, which often stalls the pipeline
  • Instance method calls make up a large share of execution: in SPECjvm98, a virtual method call occurs every 12-40 bytecodes
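The difference between static and dynamic binding can be seen in a few lines of Java. The sketch below is only an illustration: the classes `Shape`, `Circle`, and `BindingDemo` are hypothetical names that do not come from the slides.

```java
// Hypothetical classes for illustration; not part of the original slides.
class Shape {
    static String kind() { return "Shape.kind"; }   // class (static) method
    String name() { return "Shape.name"; }          // instance (virtual) method
}

class Circle extends Shape {
    static String kind() { return "Circle.kind"; }  // hides Shape.kind, does not override it
    @Override
    String name() { return "Circle.name"; }         // overrides Shape.name
}

class BindingDemo {
    static String staticCall(Shape s) {
        return s.kind();   // bound at compile time through the declared type Shape
    }

    static String virtualCall(Shape s) {
        return s.name();   // bound at run time through the object's actual class
    }

    public static void main(String[] args) {
        Shape s = new Circle();
        System.out.println(staticCall(s));    // prints "Shape.kind"
        System.out.println(virtualCall(s));   // prints "Circle.name"
    }
}
```

Calling a static method through an instance reference compiles (usually with a warning) and is used here only to show that the declared type decides the target.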
Types of method invocation in Java
• invokestatic
  • Invokes a static method
  • The constant pool entry holds a symbolic reference to the target method; the JVM pops the arguments and executes the target method
• invokevirtual
  • Invokes a virtual method
  • The JVM pops the object reference and the arguments before executing the target method
• invokespecial
  • Invokes instance methods that need special handling, such as instance initialization methods and superclass methods; the target is selected without virtual dispatch
• invokeinterface
  • Invokes a method declared in an interface, dispatched on the run-time class of the object
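As a rough illustration of where each instruction appears, the sketch below annotates ordinary Java calls with the invocation bytecode that javac emits for them. The types `Greeter`, `Base`, `Derived`, and `InvokeDemo` are hypothetical; the annotations can be checked by disassembling the compiled classes with `javap -c`.

```java
// Hypothetical types for illustration only.
interface Greeter {
    String greet();
}

class Base implements Greeter {
    public String greet() { return "base"; }
    static int twice(int x) { return 2 * x; }
}

class Derived extends Base {
    @Override
    public String greet() {
        // invokespecial: superclass method, selected without virtual dispatch
        return "derived+" + super.greet();
    }
}

class InvokeDemo {
    static String run() {
        int n = Base.twice(21);            // invokestatic
        Base b = new Derived();            // invokespecial on the constructor Derived.<init>
        String viaClass = b.greet();       // invokevirtual: receiver typed as a class
        Greeter g = b;
        String viaInterface = g.greet();   // invokeinterface: receiver typed as an interface
        return n + " " + viaClass + " " + viaInterface;
    }

    public static void main(String[] args) {
        System.out.println(run());         // prints "42 derived+base derived+base"
    }
}
```

Both `b.greet()` and `g.greet()` reach `Derived.greet` because the target depends on the run-time class of the object, not on the declared type of the reference.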
The percentage of virtual method invocation
[Figure: percentage of virtual method invocations in the SPECjvm2008 benchmark]
Related Work: Inline Caching
• Origin
  • The receiver type seen at a given call site does not change frequently
  • Thanks to this locality, the receiver type can be cached at the call site
• Kinds
  • Monomorphic inline caching
    • Stores the target method and the corresponding type value inline at the call site
    • On each virtual method call, the current type value is compared with the cached one; on a match the call jumps directly to the cached target method instead of searching among the many candidate targets
  • Polymorphic inline caching
    • Records the target methods for several different types at the same call site
    • The type value of the current call is compared with the stored types in turn until a match is found, and the program then jumps to the corresponding target method
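The two schemes can be modelled in software to make the comparison step concrete. This is a minimal sketch under stated assumptions: real inline caches are patched machine code at the call site, whereas the hypothetical classes `MonomorphicCache` and `PolymorphicCache` below only count hits and misses for a sequence of receivers.

```java
// A software model of the two schemes; all names here are hypothetical.
class MonomorphicCache {
    private Class<?> cachedType;   // the single type value stored at the call site
    int hits, misses;

    // Returns true on a hit; on a miss the cache is rebound to the new type.
    boolean call(Object receiver) {
        Class<?> type = receiver.getClass();
        if (type == cachedType) { hits++; return true; }
        cachedType = type;         // slow path: full lookup, then re-cache
        misses++;
        return false;
    }
}

class PolymorphicCache {
    private final Class<?>[] types;   // several type values stored at the call site
    private int size;
    int hits, misses;

    PolymorphicCache(int capacity) { types = new Class<?>[capacity]; }

    boolean call(Object receiver) {
        Class<?> type = receiver.getClass();
        for (int i = 0; i < size; i++) {                  // compare stored types in turn
            if (types[i] == type) { hits++; return true; }
        }
        if (size < types.length) types[size++] = type;    // record the new type if room remains
        misses++;
        return false;
    }
}

class InlineCacheDemo {
    // Returns {monomorphic hits, polymorphic hits} for one mixed receiver sequence.
    static int[] run() {
        Object[] receivers = {1, 2, 3, "a", "b", 4};
        MonomorphicCache mono = new MonomorphicCache();
        PolymorphicCache poly = new PolymorphicCache(4);
        for (Object r : receivers) { mono.call(r); poly.call(r); }
        return new int[] {mono.hits, poly.hits};
    }

    public static void main(String[] args) {
        int[] hits = run();
        System.out.println(hits[0] + " " + hits[1]);      // prints "3 4"
    }
}
```

The monomorphic model misses again when the receiver type switches back to `Integer`, while the polymorphic model still holds both types, at the price of a sequential comparison on every call.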
Shortcomings of Inline Caching
• Monomorphic inline caching
  • It cannot handle a call site at which methods of several different types are called frequently
• Polymorphic inline caching
  • It handles that case, but its complex implementation adds extra cost
Solution: Software and Hardware Co-design
• Hardware (CAM table)
  • We design and implement a CAM (content-addressable memory) table to index and look up the target of a virtual method call
  • The CAM table is implemented in hardware and managed by software
• Software (efficient algorithm)
  • With the CAM table, virtual method invocation is optimized so that the target method is resolved easily
  • The program jumps to the target method directly, rather than resolving the virtual method dynamically at run time
Thesis Contributions
• System architecture
  • We present a Java Virtual Machine system with high-performance virtual method invocation. The JVM is simple but efficient
• CAM lookup table
  • We design and implement a hardware CAM lookup table that helps resolve virtual methods, so the target method can be found easily
• Efficient algorithm for virtual method invocation
  • Using the CAM table, we present a virtual method invocation algorithm based on software and hardware co-design. The algorithm achieves relatively high performance on virtual method invocation
The Virtual Method Invocation in HotSpot
• HotSpot
  • The core of the open-source OpenJDK 6 project
  • Standard and stable
  • Virtual method invocation uses optimized monomorphic inline caching
A bad case
    var values = [1, "a", 2, "b", 3, "c", 4, "d"];
    for (var i in values) {
        document.write(values[i].toString());
    }
• When a program keeps calling target methods of different types, the performance loss can be very large
• Although such an extreme case is rare, the receiver type at a virtual call site changes quite commonly, so the overhead of virtual method calls causes a serious performance loss
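The same pattern can be written in Java to show how often the receiver type changes at a single call site. The sketch below is an illustrative analogue with a hypothetical class `BadCaseDemo`; it counts how many calls see a receiver type different from the previous one, and each such call is one that a monomorphic cache could not serve.

```java
// Hypothetical Java analogue of the bad case above.
class BadCaseDemo {
    // Counts the calls whose receiver type differs from the previous call's type.
    static int typeChanges(Object[] values) {
        int changes = 0;
        Class<?> previous = null;
        for (Object v : values) {
            v.toString();                                  // one virtual call site, many receiver types
            if (previous != null && v.getClass() != previous) changes++;
            previous = v.getClass();
        }
        return changes;
    }

    public static void main(String[] args) {
        Object[] values = {1, "a", 2, "b", 3, "c", 4, "d"};
        System.out.println(typeChanges(values));           // prints 7
    }
}
```

With eight alternating receivers, every call after the first sees a new type, so a cache holding a single type never hits.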
Operating Instructions of CAM Table
• Instructions
  • CAMPI: look up the CAM table by index
  • CAMPV: look up the CAM table by value
  • CAMWI: write the CAM table by index
• Usage
  • All CAM entries can be written with CAMWI, and the RAM value can be read with CAMRI. CAMPI and CAMPV are used to look up the CAM
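A software model may help picture how these instructions could cooperate during a virtual call. Everything below is an assumption made for illustration: the slides name the instructions but do not give their operands, the entry layout, the table size, or the key format, so the class `CamTable`, its method names, and the key built from a class id and a method id are all hypothetical.

```java
// Software model of a small CAM table. The entry layout, table size, return
// conventions, and method names are assumptions, not the Loongson design.
class CamTable {
    static final long MISS = -1L;
    private final long[] keys;       // CAM part: searched by content
    private final long[] values;     // RAM part: payload, e.g. a target method address
    private final boolean[] valid;

    CamTable(int entries) {
        keys = new long[entries];
        values = new long[entries];
        valid = new boolean[entries];
    }

    // CAMWI analogue: write one entry at the given index.
    void writeByIndex(int index, long key, long value) {
        keys[index] = key;
        values[index] = value;
        valid[index] = true;
    }

    // CAMPV analogue: search for a key; hardware would compare all entries in parallel.
    int lookupByValue(long key) {
        for (int i = 0; i < keys.length; i++) {
            if (valid[i] && keys[i] == key) return i;
        }
        return -1;
    }

    // CAMPI analogue: return the key stored at an index, or MISS if the entry is empty.
    long lookupByIndex(int index) {
        return valid[index] ? keys[index] : MISS;
    }

    // CAMRI analogue: read the RAM value stored at an index.
    long readRam(int index) {
        return values[index];
    }
}

class CamDispatchDemo {
    // Hypothetical key: receiver class id in the high bits, method id in the low bits.
    static long key(int classId, int methodId) {
        return ((long) classId << 32) | (methodId & 0xFFFFFFFFL);
    }

    // Returns the cached target address, or MISS when normal resolution is needed.
    static long resolve(CamTable cam, int classId, int methodId) {
        int index = cam.lookupByValue(key(classId, methodId));
        return index >= 0 ? cam.readRam(index) : CamTable.MISS;
    }

    static long[] run() {
        CamTable cam = new CamTable(4);
        cam.writeByIndex(0, key(7, 1), 0x1000L);   // software fills the table
        cam.writeByIndex(1, key(9, 1), 0x2000L);
        return new long[] {resolve(cam, 7, 1), resolve(cam, 9, 1), resolve(cam, 5, 1)};
    }
}
```

In this model two receiver types at the same call site both hit, and only the unknown class id 5 falls back to the slow path, which is the situation a monomorphic inline cache handles poorly.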
Evaluation Platform
• Software
  • HotSpot
• Hardware
  • Loongson-3 processor
    • 4-core high-performance general-purpose processor
    • The CAM table is implemented in the processor
    • Adding the CAM table increases the total processor area by less than 5‰ and the power consumption by less than 1‰, so the cost is negligible
Conclusions
• Problem
  • The performance lost to dynamic method resolution in virtual method calls has long been an important cause of poor Java performance
• Solution
  • Design and implement a hardware CAM lookup table
  • Present a virtual method call mechanism based on hardware and software co-design
• Performance improvement
  • When multiple target method types occur frequently at the same call site
    • the hit rate of virtual method calls increases from 13.3% to 76.4%
    • the performance of the program improves by 16.2%
  • SPECjvm98 performance improves by 6.4% on average
Thanks!
No. 10 Kexueyuan South Road, Zhongguancun, Haidian District, Beijing 100190, China