Code recognition & CL modeling through AST   


  1. Code recognition & CL modeling through AST
     Xingzhong Xu, Hong Man

  2. Outline
     • Introduction of AST in SSP
     • AST for Code Recognition
     • AST for Cognitive Linguistic Modeling
     • Summary and Future Work
     Semantic Signal Processing Stevens

  3. Introduction of AST in SSP
     • Most language applications use an Abstract Syntax Tree (AST) as an Intermediate Representation (IR) to help the computer understand code semantically in the programming domain.*
     • Signal Processing Code
       • How can it be analyzed semantically?
       • How can it be modeled semantically?

         for (i = 0; i < n; i++) {
           acc0 += d_taps[i] * input[i];
         }

     *Terence Parr, The Definitive ANTLR Reference: Building Domain-Specific Languages (Pragmatic Programmers), 2007
     **ANTLR

  4. Code Recognition
     • To perform code re-hosting and other semantic code analysis, we first need to recognize the functionality of each code segment.
     • In Computer Science, there are two approaches to Code Recognition:
       • AST-based recognition [Gabel, 2008] [Roy, 2009]
         • Generate the AST
         • Perform tree matching
       • Random-test-based recognition [Jiang, 2009] [Bertran, 2005]
         • Segment the code
         • Test the I/O behavior
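The random-test approach listed on this slide can be illustrated with a minimal C++ sketch. It is not the method of the cited papers or of the demo; it only shows the idea of feeding the same random inputs to a candidate segment and to a reference behavior and comparing the outputs. All names (`Segment`, `referenceFir`, `reversedFir`, `plainSum`, `behavesLike`), the trial count, and the tolerance are assumptions made for illustration.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <functional>
#include <random>
#include <vector>

// A code segment is modeled (hypothetically) as a function of taps and input.
using Segment =
    std::function<float(const std::vector<float>&, const std::vector<float>&)>;

// Reference behavior: one FIR output sample, i.e. a dot product.
float referenceFir(const std::vector<float>& taps, const std::vector<float>& input) {
    float acc = 0.0f;
    for (std::size_t i = 0; i < taps.size(); ++i) acc += taps[i] * input[i];
    return acc;
}

// Candidate 1: same computation, written differently (walks the arrays backwards).
float reversedFir(const std::vector<float>& taps, const std::vector<float>& input) {
    float acc = 0.0f;
    for (std::size_t i = taps.size(); i > 0; --i) acc += input[i - 1] * taps[i - 1];
    return acc;
}

// Candidate 2: a loop with accumulation but no multiply, so not a filter.
float plainSum(const std::vector<float>& /*taps*/, const std::vector<float>& input) {
    float acc = 0.0f;
    for (float v : input) acc += v;
    return acc;
}

// Returns true if candidate and reference agree (within a tolerance) on every random trial.
bool behavesLike(const Segment& candidate, const Segment& reference,
                 int trials = 100, std::size_t len = 16, unsigned seed = 42) {
    std::mt19937 rng(seed);
    std::uniform_real_distribution<float> dist(-1.0f, 1.0f);
    for (int t = 0; t < trials; ++t) {
        std::vector<float> taps(len), input(len);
        for (auto& v : taps) v = dist(rng);
        for (auto& v : input) v = dist(rng);
        // The tolerance absorbs rounding differences caused by summation order.
        if (std::fabs(candidate(taps, input) - reference(taps, input)) > 1e-4f)
            return false;
    }
    return true;
}
```

Under these assumptions, `behavesLike(reversedFir, referenceFir)` reports a match and `behavesLike(plainSum, referenceFir)` does not. Agreement on random inputs is only evidence of equivalence, not a proof of it.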

  5. Code Recognition
     • The AST represents the source code in the programming domain.
     • Radio and computational primitives have characteristic features in the AST.
     • Filter ≈ LOOP + ACCUMULATION + MULTIPLY

         for (i = 0; i < n; i++) {
           acc0 += d_taps[i] * input[i];
         }
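The "Filter ≈ LOOP + ACCUMULATION + MULTIPLY" pattern can be sketched as a small tree matcher. This is a minimal C++ illustration, not the ANTLR-based demo: the `Node` type, the node kinds (`FOR`, `ADD_ASSIGN`, `MUL`, ...), and all function names are hypothetical stand-ins for whatever a real parser would produce.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <utility>
#include <vector>

// Hypothetical, simplified AST node; a real parser tree would carry token types.
struct Node {
    std::string kind;                         // e.g. "FOR", "ADD_ASSIGN", "MUL", "ID"
    std::vector<std::shared_ptr<Node>> kids;  // child sub-trees
};
using NodePtr = std::shared_ptr<Node>;

NodePtr make(const std::string& kind, std::vector<NodePtr> kids = {}) {
    return std::make_shared<Node>(Node{kind, std::move(kids)});
}

// True if the sub-tree rooted at n contains a node of the given kind.
bool contains(const NodePtr& n, const std::string& kind) {
    if (!n) return false;
    if (n->kind == kind) return true;
    for (const auto& k : n->kids)
        if (contains(k, kind)) return true;
    return false;
}

// ACCUMULATION + MULTIPLY: a "+=" whose right-hand side contains a multiply.
bool isMultiplyAccumulate(const NodePtr& n) {
    return n && n->kind == "ADD_ASSIGN" && n->kids.size() == 2 &&
           contains(n->kids[1], "MUL");
}

// True if any node in the sub-tree is a multiply-accumulate.
bool hasMultiplyAccumulate(const NodePtr& n) {
    if (!n) return false;
    if (isMultiplyAccumulate(n)) return true;
    for (const auto& k : n->kids)
        if (hasMultiplyAccumulate(k)) return true;
    return false;
}

// LOOP + ACCUMULATION + MULTIPLY: a for-loop containing a multiply-accumulate.
bool isFilterLoop(const NodePtr& n) {
    return n && n->kind == "FOR" && hasMultiplyAccumulate(n);
}

// Hand-built AST of: for (...) { acc0 += d_taps[i] * input[i]; }
NodePtr sampleFilterLoop() {
    auto mul = make("MUL", {make("INDEX"), make("INDEX")});
    auto mac = make("ADD_ASSIGN", {make("ID"), mul});
    return make("FOR", {make("BLOCK", {mac})});
}

// Hand-built AST of a loop that only copies: for (...) { out = in; }
NodePtr samplePlainLoop() {
    auto copy = make("ASSIGN", {make("ID"), make("ID")});
    return make("FOR", {make("BLOCK", {copy})});
}
```

With these definitions, `isFilterLoop(sampleFilterLoop())` is true and `isFilterLoop(samplePlainLoop())` is false. A pattern this loose also accepts unrolled loops with several accumulators, but it would equally accept any other multiply-accumulate loop, so a practical matcher would likely need tighter constraints.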

  6. Code Recognition Result
     • To test the idea, I designed a Code Recognition demo (not fully debugged).
     • Source: GNU Radio 3.2.2 (C++)
     • Objective: Recognize and print the filter code.
     • Platform: Ubuntu 10.04 + Java SE 1.6 + ANTLR 3.2
     • Process:
       • Generate an AST for each C++ file.
       • Match the filter sub-tree pattern.
       • Print the matched code segment.

  7. Code Recognition Result
     • Result:
       • 932 C++ source files in total in GNU Radio.
       • 689 files successfully analyzed (to be continued).
       • 59 filter patterns found, for example:

         for (i = 0; i < n; i += N_UNROLL) {
           acc0 += d_taps[i + 0] * input[i + 0];
           acc1 += d_taps[i + 1] * input[i + 1];
           acc2 += d_taps[i + 2] * input[i + 2];
           acc3 += d_taps[i + 3] * input[i + 3];
         }

         for (int j = 0; j < d_len; j++) {
           if (j != 0)
             d_pn = 2.0*d_reference->next_bit()-1.0;
           sum += *in++ * d_pn;
         }

         for (i = 0; i < d_ff_taps.size(); i++)
           acc += conj(d_ff_delayline[(i+d_ff_index) & ff_mask]) * d_ff_taps[i];

  8. CL Modeling
     • Intermediate Representation:
       • AST (Programming Domain)
       • CL Modeling (Signal Processing Domain)

         k = N - i;

  9. CL Modeling
     • Rewrite and map the structure and tokens of the AST to the CL Modeling Tree.

         k = N - i;

  10. CL Modeling Result
     • To test the idea, I designed a CL Modeling demo based on the AST.*
     • A tree rewriter translates and modifies the current AST into a CL Modeling Tree.
     • The CL Modeling XML file is then printed from the CL Modeling Tree.
     https://sites.google.com/site/stevensxingzhong/home/clmb
     *Terence Parr, Language Implementation Patterns: Create Your Own Domain-Specific and General Programming Languages, Pragmatic Programmers, 2010.
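The rewrite-then-print flow on this slide can be sketched in C++ for the statement `k = N - i;`. The transcript does not show the actual CL Modeling Tree or XML schema, so the element names (`assignment`, `subtract`, `variable`), the `Tree` type, and the functions below are purely hypothetical placeholders; only the two-step structure (rewrite the AST, then serialize the result as XML) comes from the slide.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <vector>

// Hypothetical tree type used for both the AST and the rewritten tree.
struct Tree {
    std::string label;       // node kind / element name
    std::string text;        // identifier text for leaves, empty otherwise
    std::vector<Tree> kids;  // child sub-trees
};

// Step 1: rewrite AST token names into (assumed) modeling-tree element names.
Tree rewrite(const Tree& ast) {
    static const std::map<std::string, std::string> rename = {
        {"ASSIGN", "assignment"}, {"SUB", "subtract"}, {"ID", "variable"}};
    Tree out;
    auto it = rename.find(ast.label);
    out.label = (it != rename.end()) ? it->second : ast.label;  // unknown kinds pass through
    out.text = ast.text;
    for (const auto& k : ast.kids) out.kids.push_back(rewrite(k));
    return out;
}

// Step 2: print the rewritten tree as indented XML.
// Leaf text is written as-is; this sketch does no XML escaping.
std::string toXml(const Tree& t, int depth = 0) {
    const std::string pad(depth * 2, ' ');
    if (t.kids.empty())
        return pad + "<" + t.label + ">" + t.text + "</" + t.label + ">\n";
    std::string xml = pad + "<" + t.label + ">\n";
    for (const auto& k : t.kids) xml += toXml(k, depth + 1);
    return xml + pad + "</" + t.label + ">\n";
}

// Hand-built AST of: k = N - i;
Tree sampleAst() {
    return Tree{"ASSIGN", "", {
        Tree{"ID", "k", {}},
        Tree{"SUB", "", {Tree{"ID", "N", {}}, Tree{"ID", "i", {}}}}}};
}
```

Calling `toXml(rewrite(sampleAst()))` yields an `<assignment>` element containing `<variable>k</variable>` and a `<subtract>` element with the two operands `N` and `i`. A real rewriter would presumably also restructure the tree rather than only rename nodes.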

  11. Summary & Future Work
     • The programming-domain AST is a key interface for language applications. In the SSP project it serves:
       • Code Recognition: determining the functionality of a code segment.
       • Cognitive Linguistic Modeling: as an intermediate form for modeling the radio code.
     • Future Work:
       • Cover more code: C++, Matlab, VHDL, etc.
       • Discover more computational and radio primitives.
       • Fully support CL Modeling.

  12. References
     • Jiang, L. and Su, Z. 2009. Automatic mining of functionally equivalent code fragments via random testing. In Proceedings of the Eighteenth International Symposium on Software Testing and Analysis.
     • Gabel, M., Jiang, L., and Su, Z. 2008. Scalable detection of semantic clones. In Proceedings of the 30th International Conference on Software Engineering.
     • Roy, C. K., Cordy, J. R., and Koschke, R. 2009. Comparison and evaluation of code clone detection techniques and tools: A qualitative approach. Science of Computer Programming.
     • Bertran, M., Babot, F., and Climent, A. 2005. An input/output semantics for distributed program equivalence reasoning. Electron. Notes Theor. Comput. Sci. 137, 1 (Jul. 2005).
     • Parr, T. 2007. The Definitive ANTLR Reference: Building Domain-Specific Languages. Pragmatic Programmers.
     • Parr, T. 2010. Language Implementation Patterns: Create Your Own Domain-Specific and General Programming Languages. Pragmatic Programmers.
