Presentation 4 Cross Language Clone Analysis Team 2 October 13, 2010
Agenda • Current Tasks • Spike – GOLD Parser • Demo • Project Layout • Team Collaboration • Path Forward
Our Team • Allen Tucker • Patricia Bradford • Greg Rodgers • Brian Bentley • Ashley Chafin
Current Tasks What we are tackling…
Current Tasks (Review) • Current tasks created for the first user story “Source Code Load & Translate”: • Load & parse C# source code. • Load & parse JAVA source code. • Load & parse C++ source code. • Translate the parsed C# source code to CodeDOM. • Translate the parsed JAVA source code to CodeDOM. • Translate the parsed C++ source code to CodeDOM. • Associate the CodeDOM to the original source code.
GOLD Parsing System Spike
Topics To Discuss • What is it? • How does it work? • What can we use it for? • How can we extend it?
What Is GOLD? • GOLD is a free parsing system that you can use to develop your own programming languages, scripting languages and interpreters. It strives to be a development tool that can be used with numerous programming languages and on multiple platforms. – www.devincook.com/goldparser
How It Works (Block Structure) • Grammar → Builder → Compiled Grammar Table (*.cgt) → Engine • Source Code → Engine → Parsed Data
How It Works (Components) • Three major components • Builder: reads a source grammar to construct a Compiled Grammar Table • Compiled Grammar Table: stores the LALR and DFA parse tables • Engine: performs the actual parsing
How It Works (Process) • Step 1 • Write the grammar for the language being implemented (in the GOLD Meta-Language) • Rules: Backus-Naur Form • Terminals: Regular Expressions • Character sets: Set Notation
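To illustrate how terminals defined by regular expressions drive tokenization, here is a minimal Python sketch. The terminal names and patterns are hypothetical and are not taken from any GOLD grammar; GOLD compiles its terminals into DFA tables, whereas this sketch simply leans on Python's `re` module.

```python
import re

# Hypothetical terminal definitions: (terminal name, regular expression).
# Order matters: earlier entries win when two patterns match at the same position.
TERMINALS = [
    ("Number", r"\d+"),
    ("Identifier", r"[A-Za-z_]\w*"),
    ("Plus", r"\+"),
    ("Times", r"\*"),
    ("Whitespace", r"\s+"),
]

# Combine all terminals into one pattern made of named alternatives.
MASTER = re.compile("|".join(f"(?P<{name}>{pattern})" for name, pattern in TERMINALS))


def tokenize(text):
    """Split text into (terminal_name, lexeme) pairs, skipping whitespace."""
    tokens = []
    pos = 0
    while pos < len(text):
        match = MASTER.match(text, pos)
        if match is None:
            raise ValueError(f"Unexpected character {text[pos]!r} at position {pos}")
        if match.lastgroup != "Whitespace":
            tokens.append((match.lastgroup, match.group()))
        pos = match.end()
    return tokens
```

Calling `tokenize("x + 42 * y")` yields one pair per token, each labelled with the terminal that matched it.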
How It Works (Process) • Step 2 • Analyze the grammar • Construct the LALR and DFA parse tables, which are saved in a Compiled Grammar Table file.
How It Works (Process) • Step 3 • Analyze the source text with the parser engine and construct a parse tree • The engine can be implemented in any number of programming languages
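The split between precomputed tables and a generic engine can be sketched in Python as follows. This is an illustration under stated assumptions: the hand-written `ACTION`, `GOTO`, and `RULES` tables stand in for a Compiled Grammar Table and cover only the toy grammar `E → E + n | n`; the real *.cgt file format and the GOLD engine API are not reproduced here.

```python
# Hand-written LR tables for the toy grammar (an assumption for illustration):
#   rule 1: E -> E + n
#   rule 2: E -> n
ACTION = {
    (0, "n"): ("shift", 2),
    (1, "+"): ("shift", 3),
    (1, "$"): ("accept",),
    (2, "+"): ("reduce", 2),
    (2, "$"): ("reduce", 2),
    (3, "n"): ("shift", 4),
    (4, "+"): ("reduce", 1),
    (4, "$"): ("reduce", 1),
}
GOTO = {(0, "E"): 1}
RULES = {1: ("E", 3), 2: ("E", 1)}  # rule number -> (head, right-hand-side length)


def parse(tokens):
    """Table-driven shift-reduce parse; returns a nested (head, children) tree."""
    states = [0]   # stack of parser states
    nodes = []     # parse-tree nodes, one per state above the initial one
    stream = list(tokens) + ["$"]  # "$" marks end of input
    i = 0
    while True:
        action = ACTION.get((states[-1], stream[i]))
        if action is None:
            raise SyntaxError(f"Unexpected token {stream[i]!r} at index {i}")
        if action[0] == "shift":
            states.append(action[1])
            nodes.append(stream[i])
            i += 1
        elif action[0] == "reduce":
            head, length = RULES[action[1]]
            children = nodes[-length:]
            del nodes[-length:]
            del states[-length:]
            nodes.append((head, children))
            states.append(GOTO[(states[-1], head)])
        else:  # accept
            return nodes[0]
```

The engine itself knows nothing about the grammar; swapping in different tables changes the language it accepts, which is the property that lets one engine serve several grammars.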
Usage within CloneDigger • Source Code + Compiled Grammar Table (*.cgt) → Engine → Parsed Data → CodeDOM Conversion → AST • CodeDOM Conversion • We need to write a routine that moves data from the parse tree to CodeDOM • Parse trees produced by the engine are stored in a consistent data structure, but their contents follow the rules defined within each grammar
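A conversion routine of this kind could look roughly like the Python sketch below. Everything in it is an assumption for illustration: `ParseNode` and `ModelNode` are hypothetical stand-ins for the engine's parse tree and for CodeDOM, and the rule names in `RULE_TO_KIND` are invented rather than copied from the actual Java or C# grammars.

```python
from dataclasses import dataclass, field


@dataclass
class ParseNode:
    """Hypothetical parse-tree node: the grammar rule that produced it, plus children."""
    rule: str
    text: str = ""
    children: list = field(default_factory=list)


@dataclass
class ModelNode:
    """Hypothetical language-neutral node (a stand-in for a CodeDOM element)."""
    kind: str
    name: str = ""
    children: list = field(default_factory=list)


# Invented rule names: each grammar names its rules differently,
# so every language needs its own mapping onto the common kinds.
RULE_TO_KIND = {
    "java": {"ClassDeclaration": "TypeDeclaration", "MethodDeclaration": "Method"},
    "csharp": {"Class Decl": "TypeDeclaration", "Method Dec": "Method"},
}


def convert(node, language):
    """Return the list of common-model nodes corresponding to a parse-tree node."""
    mapping = RULE_TO_KIND[language]
    converted_children = []
    for child in node.children:
        converted_children.extend(convert(child, language))
    kind = mapping.get(node.rule)
    if kind is None:
        # No common-model counterpart for this rule: pass its children upward.
        return converted_children
    return [ModelNode(kind, node.text, converted_children)]
```

With this shape, the tree-walking logic is shared and only the per-grammar mapping table differs between languages.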
Task Understanding • Three-step process • Step 1: Code Translation (Source Files → Translator → Common Model) • Step 2: Clone Detection (Common Model → Inspector → Detected Clones) • Step 3: Visualization (Detected Clones → Clone Visualization UI)
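The three steps can be sketched as a small Python pipeline. This is only a sketch under loud assumptions: whitespace-normalized lines stand in for the common model, a fixed-size window comparison stands in for the clone detection algorithm, and a text summary stands in for the visualization UI. None of this is the team's actual design.

```python
from collections import defaultdict


def translate(source_files):
    """Step 1 (stand-in): map each file to a list of whitespace-normalized statements."""
    model = {}
    for name, text in source_files.items():
        model[name] = [" ".join(line.split()) for line in text.splitlines() if line.strip()]
    return model


def detect_clones(model, window=2):
    """Step 2 (stand-in): find runs of `window` statements that occur in more than one place."""
    seen = defaultdict(list)
    for name, statements in model.items():
        for i in range(len(statements) - window + 1):
            seen[tuple(statements[i:i + window])].append((name, i))
    return {key: places for key, places in seen.items() if len(places) > 1}


def render(clones):
    """Step 3 (stand-in): describe each clone with 1-based statement positions."""
    lines = []
    for key, places in sorted(clones.items()):
        locations = ", ".join(f"{name}:{index + 1}" for name, index in places)
        lines.append(f"{len(key)} statements cloned at {locations}")
    return lines
```

Each step consumes only the output of the previous one, so the translator, inspector, and UI could be replaced independently.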
Extension and Enhancements • Enhance grammars • Update Java • Update C# • Define C++ • Share with classmates who have similar interests • Share with the greater community
Grammars • What is a grammar? • A set of rules of a specific kind, for forming strings in a formal language. The rules describe how to form strings from the language's alphabet that are valid according to the language's syntax. A grammar does not describe the meaning of the strings or what can be done with them in whatever context, only their form.
GOLD Parser Grammars • GOLD Parser uses context-free grammars that are suitable for Look-Ahead LR (LALR) parsing. • LALR-compliant grammars that we already have: • C# • Java • Visual Basic .NET
C++ Grammar Issue • Currently, no LALR-compliant C++ grammar exists, due to the overall complexity of the language. • Other C++ parsers exist, but their output format differs from what the GOLD Parser grammars produce for the languages we already cover. • We are still searching for C++ parsing solutions.
GOLD Parser Conclusion • We plan to use the GOLD Parsing System. • Tasks we have to complete: • Update the Java grammar • Update the C# grammar • Research “Define C++ grammar” • Create a CodeDOM conversion to move data from the parse tree to CodeDOM
Demonstrations GOLD Parsing System
Project Layout Key Points, Architecture, & Unit Test
Key Architecture Points • Multilanguage support • Configurable for different platforms • Standalone application • Plug-in • Backend service • Extendable
Architecture • User Interface • Communication Layer • Core: Clone Detection Algorithms, Code Model, API, Language Service Interface • Language services: C# Service, Java Service, C++ Service
Core Unit • Code Model • Stores the code in a common format • Application Programming Interface • Used to embed clone detection in applications • Language Service Interface • Communication layer between the core and the specific language services
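One way to picture the Language Service Interface is as an abstract class that each language service implements and the core dispatches to. The Python sketch below is a hypothetical illustration: the class names, the `extensions` attribute, and the dictionary returned by `parse` are all assumptions, not the project's actual API.

```python
from abc import ABC, abstractmethod


class LanguageService(ABC):
    """Hypothetical contract between the core and a language-specific service."""

    extensions = ()  # file extensions this service handles

    @abstractmethod
    def parse(self, source_text):
        """Return a common-format representation of the source text."""


class CSharpService(LanguageService):
    extensions = (".cs",)

    def parse(self, source_text):
        # Placeholder result; a real service would run the parser engine here.
        return {"language": "C#", "length": len(source_text)}


class JavaService(LanguageService):
    extensions = (".java",)

    def parse(self, source_text):
        return {"language": "Java", "length": len(source_text)}


class Core:
    """Hypothetical core that routes files to registered language services."""

    def __init__(self):
        self._services = {}

    def register(self, service):
        for extension in service.extensions:
            self._services[extension] = service

    def load(self, filename, source_text):
        for extension, service in self._services.items():
            if filename.endswith(extension):
                return service.parse(source_text)
        raise LookupError(f"No language service registered for {filename!r}")
```

Under this design, adding C++ support later would mean registering one more service, with no change to the core.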
Team Collaboration Team 2 & Team 4
Team Collaboration • Due to Team 4’s team size, we have taken responsibility for gathering & sharing grammars. • Both teams will… • Use the same grammars & engines • We will both have limitations based on this. • Ex: The Java grammar is based on Java 1.4, so we are limited to Java 1.4 source code. • Test the same grammars & engines • We will have two test beds.
Team Collaboration • Method of collaboration: • Google Code project site: • http://code.google.com/p/uah-studio-2010-2011/ • Team 4 team members have access to this site. • Meetings • Email • What does our Google Code project contain? • Source control for grammars & engines • Bugs/Issues • Team 4 will have the ability to document new bugs. • Documents/Artifacts
Path Forward Next Iteration & Schedule
Path Forward • Finalize Iteration 1 • Iteration 2 Planning/Elaboration