110 likes | 126 Views
The Structure of the GNAT Compiler. A target-independent Ada95 front-end for GCC. Ada components C components. GCC tree. Syntax. Sem. Expand. gigi. GCC. AST. Annotated AST. Your Project!. Lexical Scanner. Hand-written for speed
E N D
A target-independent Ada95 front-end for GCC • Ada components • C components GCC tree Syntax Sem Expand gigi GCC AST Annotated AST Your Project!
Lexical Scanner • Hand-written for speed • Must handle different encodings for extended character sets (ISO642). • Input file is read in full to minimize I/O • Data structure: global name table • Scanner is subroutine for parser, delivers tokens on demand • Preprocessor may be integrated with scanner (but Ada has no defined preprocessor).
Parser • Hand-written for speed and error recovery • Organized according to chapters of the language reference (ARM): • par-ch3 handles declarations, par-ch9 handles tasking, par-ch12 handles generics, etc. • Recursive-descent with lookahead • Follows exactly ARM grammar (ambiguous!) • Builds AST: main data structure for front-end • Symbol table is integrated into AST.
Semantic Analysis • Legality rules (1436 error and warning messages) • Tree annotations: add structure to AST to localize semantic information • Name resolution (which x?) • Type and overload resolution (which “+”?) • Dispatching and polymorphism • Static expression evaluation • Simple optimizations: dead code elimination, constant folding, static conditions.
Expansion • Replace complex constructs with simpler one (map Ada semantics into C semantics) • Aggregates: (1..10 => 42, others => 0) • Equality on composite types: if A (x .. y) = B (x . .y) then.. • Tasking (thread communication) • Many others… • Expander builds AST fragments, calls semantics to annotate them: process is recursive
Gigi: GNAT to GCC • Impractical to use GCC data structures in front-end • Gigi traverses GNAT tree fragments and calls tree-building procedures (just like other GCC front-ends) • Each tree fragment is immediately translated into Register Transfer Language (RTL) • Gigi must reflect Ada semantics (syntax and semantic information packages) and must interface to GCC generators: written in C.
Other components • Library management: program is assembled from compilation of multiple files. Need tools to insure coherence (like make) gather object files, record dependencies between units. • gnatbind, gnatlink • Runtime management: Input/Output, numeric libraries, tasking, real time, distributed computing.
The data structures of the AST • Nodes are variant records of fixed size. Each variant corresponds to one non-terminal • A node has at most 4 syntactic descendants • A descendant is either a node or a list • Terminal nodes are identifiers and literals • Node kind is discriminant: determines meaning of descendants for each node • 220 kinds: • Compilation_unit, package_declaration…loop_statement… • Syntactic structure described in sinfo.ads
Semantic annotations • Each tree node carries semantic information • Node-specific: • expressions have types • procedures have scope • array types have index and component types • Literals have a value • Full description in einfo.ads
Symbol table is integrated into AST • Identifiers in declarations are special: defining_occurrence accumulates semantic information for an entity • References to an identifier are pointers to corresponding defining_occurrence: Entity. • Symbol table is set of defining_occurrences. • Identifiers point to names table • Symbol table contains no strings