TkSee/SN Experiences with the parser benchmark

IWPC 2002, Paris
S. Marchenko and T. Lethbridge

Parser Architecture
[Diagram: the TkSee/SN parser, built on the SN 5.0 parsing engine]
• Components shown: C/C++ source, C/C++ parser, direct code info extractor, cross-reference data file, SN DB Manager, SN DB, SN DB API, TCL API, GXL generator, GXL file(s), system calls
• Pass 1: generating code information
• Pass 2: generating cross-reference information

Source Navigator Database Structure
• Constants
• Typedefs
• Classes
• Referenced-TO…
• Enumerations
• Referenced-BY…
• Enumerated constants
• Method implementations
• Function declarations
• Method declarations
• Function implementations
• Macros
• File symbols
• Instance variables
• Global variables
• Inheritances
• Inclusions

TkSee/SN on the benchmark: Positive points
• Parser always produced output for all tests
• No crashes
• Benchmark results can easily be recreated by others:
  • We use “free” software tools, including our parser
  • Answer-generation scripts are included along with the results
• For “generated code” results, we process the embedded code in the original lex/yacc source along with the generated results

TkSee/SN on the benchmark: Negative points
• Accuracy and completeness of the information extracted by the parser were limited by the supported language dialects:
  • We support C++, K&R C, and ANSI C, but not all examples were written in these
  • Some tests were apparently from other dialects
• Not supported:
  • Pre-processing directives
  • Namespaces
  • Operator overloading
• Parser is not capable of “run-time” call resolution

Feedback on the benchmark (1)
• Benchmark helped us to fix a few bugs in our parser
  • Related to accuracy of results
• Additional questions added for this iteration were useful
• Extraction of position information as byte offsets from the beginning of the source files:
  • This was formerly possible in TkSee
  • We moved away from it to line/character positions in an attempt to conform to what everybody else is doing!
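The two position formats mentioned on this slide can be converted into each other as long as the source file is available. The sketch below is only an illustration and not TkSee/SN code: the function names are hypothetical, lines and columns are assumed to be 1-based, and columns are assumed to be counted in bytes.

```python
from bisect import bisect_right

def line_starts(data: bytes) -> list:
    """Return the byte offset at which each line of the file begins."""
    starts = [0]
    for i, byte in enumerate(data):
        if byte == 0x0A:  # newline: the next line starts one byte later
            starts.append(i + 1)
    return starts

def offset_to_line_col(starts: list, offset: int) -> tuple:
    """Convert a 0-based byte offset to a 1-based (line, column) pair."""
    line_index = bisect_right(starts, offset) - 1
    return line_index + 1, offset - starts[line_index] + 1

def line_col_to_offset(starts: list, line: int, col: int) -> int:
    """Convert a 1-based (line, column) pair back to a 0-based byte offset."""
    return starts[line - 1] + (col - 1)

# Hypothetical sample input, not taken from the benchmark.
source = b"int x;\nint main() {\n  return x;\n}\n"
starts = line_starts(source)
main_offset = source.index(b"main")
main_position = offset_to_line_col(starts, main_offset)
```

Because the line-start table is built once per file, each lookup is a binary search, so answering a benchmark question in either format costs little once the other is known.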

Feedback on the benchmark (2)
• “Heterogeneous” examples
  • E.g. a combination of C and SQL
  • Benchmark should perhaps not ask about the content of the embedded language, since the main intention of the test is a comparison of C/C++ parsers
• Some questions had to be answered rather indirectly
  • E.g. the number of macros was used to compute the number of #define directives
• DMM did not fully support some questions, but we could ‘stretch’ it!
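One reason a macro count is only an indirect answer to a question about #define is that the two numbers can differ, for example when a name is defined more than once. The sketch below illustrates that distinction with a plain regular expression. It is an assumption-laden illustration, not the TkSee/SN method: the pattern and function name are hypothetical, and comments, line continuations, and #undef are ignored.

```python
import re

# Matches "#define NAME" at the start of a line, allowing whitespace
# before and after the "#". Illustrative only: a real preprocessor also
# handles comments and backslash line continuations.
DEFINE_RE = re.compile(r"^[ \t]*#[ \t]*define[ \t]+([A-Za-z_]\w*)", re.MULTILINE)

def defined_macro_names(source: str) -> list:
    """Return the macro name of every #define directive, in source order."""
    return DEFINE_RE.findall(source)

# Hypothetical sample input, not taken from the benchmark.
sample = """\
#define MAX 10
  # define SQR(x) ((x) * (x))
#include <stdio.h>
int a[MAX];
#define MAX 20
"""

names = defined_macro_names(sample)
directive_count = len(names)           # number of #define directives
distinct_macro_count = len(set(names)) # number of distinct macro names
```

In this sample the directive count and the distinct macro count differ, which is the kind of gap an indirect answer has to account for.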
