Fakultät Informatik, Institut Software- und Multimediatechnik, Lehrstuhl Softwaretechnologie
EMFText Code Completion (All the ugly details)
EMFText Meeting, Dresden, 30.11.2009
How does Code Completion work (conceptually)?
• Parse an incomplete document (up to the cursor position)
• Provide a list of strings that may appear next (or that complete the current incomplete element)

Why is this complicated?
• How to (statically) compute what “may appear next”
• How to collect the “may be next” items during parsing
• ANTLR uses predictive parsing (lookahead)
• How to derive concrete proposals (strings) from the list of next elements (terminals)
1. How to (statically) compute what “may appear next”
• Class ExpectationComputer can compute FIRST and FOLLOW sets for syntax definitions
• FIRST is the set of terminals that a syntax element can start with, e.g.:
    M1 ::= ("a")? "b" "c"
    FIRST(M1) = {"a", "b"}
• FOLLOW is the set of terminals that can follow right after a syntax element, e.g.:
    M2 ::= "d" ("e")? "f"
    FOLLOW("d") = {"e", "f"}
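The two example sets can be reproduced with a very small model. The sketch below is not EMFText's ExpectationComputer; it assumes a rule is a flat sequence of terminals, each optionally marked with "?", and all class and method names (FirstFollowSketch, Terminal, reachableFrom) are made up for illustration.

```java
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

// Simplified, hypothetical model (not EMFText's ExpectationComputer):
// a rule is a flat sequence of terminals, each of which may be optional.
class FirstFollowSketch {

    static final class Terminal {
        final String text;
        final boolean optional;

        Terminal(String text, boolean optional) {
            this.text = text;
            this.optional = optional;
        }
    }

    // Collects the terminals reachable from index 'from': every optional
    // terminal that can be skipped, plus the first mandatory one.
    static Set<String> reachableFrom(List<Terminal> rule, int from) {
        Set<String> result = new LinkedHashSet<String>();
        for (int i = from; i < rule.size(); i++) {
            Terminal t = rule.get(i);
            result.add(t.text);
            if (!t.optional) {
                break; // a mandatory terminal cannot be skipped
            }
        }
        return result;
    }

    static Set<String> first(List<Terminal> rule) {
        return reachableFrom(rule, 0);
    }

    // FOLLOW of the element at 'index', restricted to this single rule.
    static Set<String> follow(List<Terminal> rule, int index) {
        return reachableFrom(rule, index + 1);
    }

    public static void main(String[] args) {
        List<Terminal> m1 = Arrays.asList(
                new Terminal("a", true), new Terminal("b", false), new Terminal("c", false));
        List<Terminal> m2 = Arrays.asList(
                new Terminal("d", false), new Terminal("e", true), new Terminal("f", false));
        System.out.println(first(m1));     // prints [a, b]
        System.out.println(follow(m2, 0)); // prints [e, f]
    }
}
```

A real syntax definition also has rule references, alternatives and repetitions, so FOLLOW has to look beyond the end of a single rule; this sketch deliberately ignores that.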
2. How to collect the “may be next” items during parsing
• Add some code in the semantic action sections
• After each matched token, the list of FOLLOW terminals is added via addExpectedElement(new ExpectedTerminal(...))
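The idea of recording expectations after each matched token can be sketched with a hand-written recogniser for M2 ::= "d" ("e")? "f". This is only an illustration, not generated EMFText code: the Expectation class stands in for ExpectedTerminal, and its fields (terminal text and index of the preceding token) are assumptions, since the slide elides the real constructor arguments.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical sketch of "semantic actions" that record expected terminals.
class ExpectationCollectorSketch {

    // Stand-in for an expected terminal; both fields are assumptions.
    static final class Expectation {
        final String terminal;
        final int afterToken; // index of the token this terminal may follow

        Expectation(String terminal, int afterToken) {
            this.terminal = terminal;
            this.afterToken = afterToken;
        }
    }

    private final List<Expectation> expected = new ArrayList<Expectation>();

    void addExpectedElement(Expectation e) {
        expected.add(e);
    }

    private void expectAfter(int tokenIndex, String... followTerminals) {
        for (String t : followTerminals) {
            addExpectedElement(new Expectation(t, tokenIndex));
        }
    }

    private static boolean matches(List<String> tokens, int pos, String terminal) {
        return pos < tokens.size() && tokens.get(pos).equals(terminal);
    }

    // Recogniser for M2 ::= "d" ("e")? "f"; true only for a complete M2.
    boolean parseM2(List<String> tokens) {
        int pos = 0;
        if (!matches(tokens, pos, "d")) return false;
        expectAfter(pos, "e", "f"); // action after "d": FOLLOW("d")
        pos++;
        if (matches(tokens, pos, "e")) {
            expectAfter(pos, "f");  // action after "e": FOLLOW("e")
            pos++;
        }
        if (!matches(tokens, pos, "f")) return false;
        pos++;                      // nothing follows "f" inside M2
        return pos == tokens.size();
    }

    // Terminals recorded as possible successors of the last input token.
    static List<String> proposalsAtEnd(List<String> tokens) {
        ExpectationCollectorSketch parser = new ExpectationCollectorSketch();
        parser.parseM2(tokens);
        List<String> result = new ArrayList<String>();
        for (Expectation e : parser.expected) {
            if (e.afterToken == tokens.size() - 1) {
                result.add(e.terminal);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        System.out.println(proposalsAtEnd(Arrays.asList("d")));      // prints [e, f]
        System.out.println(proposalsAtEnd(Arrays.asList("d", "e"))); // prints [f]
    }
}
```

Note that expectations are only recorded for tokens the parser actually matched, which is exactly why an unconsumed token becomes a problem.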
3. ANTLR uses predictive parsing (lookahead)
• ANTLR discards some parse paths depending on the lookahead, e.g.:
    Field ::= visibility[] type[] name[] ";";
    Method ::= visibility[] "void" name[] ";";

    public class C1 { public <CURSOR> }   (instance of org.emftext.test.cct1)
• Problem: The generated parser never “eats” the second “public” token, because the lookahead finds neither a type nor a “void” token. The expected elements (type and void) are therefore never added.
3. ANTLR uses predictive parsing (lookahead)
• Solution: Let the generated parser run up to the position in the document where the text is complete (i.e., before the second “public”)
• The remaining tokens are parsed “manually” by reducing the FOLLOW set iteratively

    Text: public class C1 { public void m1() {} | public <CURSOR> }
    ANTLR parsing “stops” at the | mark.

    Position in text                      Expected next
    (earlier text)                        …
    at the | mark                         public, }
    after the second “public” (<CURSOR>)  TypedElement.type, void
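One way to picture the “manual” step is as a loop over the tokens that ANTLR did not consume, replacing the expected set with the FOLLOW set of each token in turn. The sketch below assumes a precomputed, context-free table followOf and treats “public” as a plain terminal; the table, the class name and the choice to return an empty set on an unexpected token are all assumptions, not EMFText behaviour.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical sketch: reduce the expected set token by token.
class FollowReductionSketch {

    static Set<String> reduce(Set<String> expectedAtStop,
                              List<String> remainingTokens,
                              Map<String, Set<String>> followOf) {
        Set<String> current = new LinkedHashSet<String>(expectedAtStop);
        for (String token : remainingTokens) {
            if (!current.contains(token)) {
                // The token was not expected here: no proposals (assumed policy).
                return new LinkedHashSet<String>();
            }
            Set<String> next = followOf.get(token);
            current = (next == null)
                    ? new LinkedHashSet<String>()
                    : new LinkedHashSet<String>(next);
        }
        return current;
    }

    public static void main(String[] args) {
        // Expected where the generated parser stopped (from the slide).
        Set<String> atStop = new LinkedHashSet<String>(Arrays.asList("public", "}"));

        // Assumed FOLLOW table for the Field/Method example.
        Map<String, Set<String>> followOf = new HashMap<String, Set<String>>();
        followOf.put("public", new LinkedHashSet<String>(Arrays.asList("type", "void")));

        // One remaining token before the cursor: the second "public".
        System.out.println(reduce(atStop, Arrays.asList("public"), followOf)); // prints [type, void]
    }
}
```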
4. How to derive concrete text proposals from the list of next elements (terminals)
• Must consider prefix
  • Determine prefix using token positions (see setPositions())
  • Match using startsWith() and StringUtil.match()
  • ReferenceResolver should use those methods as well to allow camel case code completion (c4)
• Keywords (CsStrings) are easy
• StructuralFeatures
  • Attributes (pick default value depending on type)
  • Non-containment references (call ReferenceResolverSwitch)
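Prefix handling can be sketched as follows. This is not the code behind setPositions() or StringUtil.match(); it assumes the prefix is the run of identifier characters directly before the cursor, and camelCaseMatch is a hypothetical matcher in which an upper-case prefix letter may skip the rest of the current lower-case word.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical sketch of prefix extraction and proposal filtering.
class PrefixMatchSketch {

    // Assumption: the prefix is the identifier fragment left of the cursor.
    static String prefixAt(String text, int cursor) {
        int start = cursor;
        while (start > 0 && Character.isJavaIdentifierPart(text.charAt(start - 1))) {
            start--;
        }
        return text.substring(start, cursor);
    }

    // "gN" matches "getName": an upper-case prefix letter may jump to the
    // next word start, but never over a different upper-case letter.
    static boolean camelCaseMatch(String prefix, String candidate) {
        int j = 0;
        for (int i = 0; i < prefix.length(); i++) {
            char p = prefix.charAt(i);
            if (j < candidate.length() && candidate.charAt(j) == p) {
                j++;
                continue;
            }
            if (i == 0 || !Character.isUpperCase(p)) {
                return false; // the first letter and lower-case letters must match directly
            }
            while (j < candidate.length() && candidate.charAt(j) != p) {
                if (Character.isUpperCase(candidate.charAt(j))) {
                    return false; // would skip an entire word
                }
                j++;
            }
            if (j == candidate.length()) {
                return false;
            }
            j++;
        }
        return true;
    }

    static List<String> filter(String prefix, List<String> candidates) {
        List<String> result = new ArrayList<String>();
        for (String c : candidates) {
            if (c.startsWith(prefix) || camelCaseMatch(prefix, c)) {
                result.add(c);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        String prefix = prefixAt("public vo", 9);
        System.out.println(prefix);                                                  // prints vo
        System.out.println(filter(prefix, Arrays.asList("void", "int", "volatile"))); // prints [void, volatile]
        System.out.println(filter("gN", Arrays.asList("getName", "getNumber", "setName", "getname")));
        // prints [getName, getNumber]
    }
}
```

Keeping the matcher in one shared helper is what would let keyword proposals and reference resolvers behave the same way for camel case input.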
Open Issues
• Static computation of FOLLOW sets yields an overapproximation (e.g., if a rule is contained in multiple others)
  • Dynamic context needed!
• Interface of ReferenceResolvers needs to be changed for fuzzy resolving (parameter 'resource' instead of 'container', because 'container' often does not exist)
  • Introduce IReferenceResolver2?
• Computation of FIRST and FOLLOW sets could be faster (some sets are computed multiple times)
  • Add cache
• If the cursor is at the end of a document, some proposals are missing because the LocationMap is wrong
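The “add cache” item could look roughly like the memoised FIRST computation below. It is a sketch under strong assumptions: the grammar is a map from rule name to alternatives, every alternative is non-empty, there is no left recursion, and any symbol that is not a rule name counts as a terminal. The example grammar is loosely modelled on the Field/Method rules and is not taken from EMFText.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.HashMap;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical sketch: FIRST sets with a per-rule cache.
class FirstSetCacheSketch {

    private final Map<String, List<List<String>>> grammar;
    private final Map<String, Set<String>> cache = new HashMap<String, Set<String>>();
    int computations = 0; // counts how often a rule's FIRST set is really computed

    FirstSetCacheSketch(Map<String, List<List<String>>> grammar) {
        this.grammar = grammar;
    }

    Set<String> first(String symbol) {
        if (!grammar.containsKey(symbol)) {
            return Collections.singleton(symbol); // terminals start with themselves
        }
        Set<String> cached = cache.get(symbol);
        if (cached != null) {
            return cached; // computed earlier, reuse it
        }
        computations++;
        Set<String> result = new LinkedHashSet<String>();
        for (List<String> alternative : grammar.get(symbol)) {
            // Assumption: no optional elements, so only the first symbol counts.
            result.addAll(first(alternative.get(0)));
        }
        cache.put(symbol, result);
        return result;
    }

    static Map<String, List<List<String>>> exampleGrammar() {
        Map<String, List<List<String>>> g = new HashMap<String, List<List<String>>>();
        g.put("Member", Arrays.asList(Arrays.asList("Field"), Arrays.asList("Method")));
        g.put("Field", Arrays.asList(Arrays.asList("visibility", "type", "name", ";")));
        g.put("Method", Arrays.asList(Arrays.asList("visibility", "void", "name", ";")));
        g.put("visibility", Arrays.asList(Arrays.asList("public"), Arrays.asList("private")));
        return g;
    }

    public static void main(String[] args) {
        FirstSetCacheSketch computer = new FirstSetCacheSketch(exampleGrammar());
        System.out.println(computer.first("Member")); // prints [public, private]
        // "visibility" is referenced by Field and Method but computed only once.
        System.out.println(computer.computations);    // prints 4
    }
}
```

The cache is filled with explicit get/put calls rather than Map.computeIfAbsent, because the computation recurses into the same map.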
Thank you! Questions?
http://www.emftext.org
http://jamopp.inf.tu-dresden.de