Why study compilers?
• Ties lots of things you know together:
  • Theory (finite automata, grammars)
  • Data structures
  • Modularization
  • Utilization of software tools
• You might build a parser.
• The theory of computation/formal language still applies today.
  • As long as we still program with 1-D text.
• Helps you to be a better programmer
One-dimensional Text
    int x;
    cin >> x;
    if(x>5)
        cout << "Hello";
    else
        cout << "BOO";
The formatting has no impact on the meaning of the program:
    int x;cin >> x;if(x>5) cout << "Hello"; else …
What is a translator?
• Takes input (SOURCE) and produces output (TARGET)
[Diagram: SOURCE → translator → TARGET, with ERROR as a separate output]
Types of Target Code:
• “Pure” machine code
  • No operating system required.
  • No library routines.
  • Good for developing software for new hardware.
• “Augmented” code
  • More common
  • Executable code relies on OS-provided support and on library routines loaded as the program is prepared to execute.
Conventional Translator
[Diagram: skeletal source program → preprocessor → source program → compiler → target assembly program → assembler → relocatable machine code → loader / linker → absolute machine code; the loader / linker also takes in library and relocatable object files]
Types of Target Code (cont.)
• Virtual code
  • Code consists entirely of “virtual” instructions.
  • Used by “Re-Targetable” compilers
  • Transporting to a new platform only requires implementing a virtual machine on the new hardware.
  • Similar to interpreters
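To make the virtual-code idea concrete, here is a minimal sketch of a stack-based virtual machine. The instruction set (`Push`, `Add`, `Mul`) and the names `Instr` and `run` are invented for illustration and do not come from any real virtual machine. Porting this "target" to new hardware would only require recompiling `run` there.

```cpp
#include <stdexcept>
#include <vector>

// Hypothetical virtual instruction set, for illustration only.
enum class Op { Push, Add, Mul };

struct Instr {
    Op op;
    int operand;  // used only by Push
};

// Execute virtual code on an operand stack and return the value left on top.
int run(const std::vector<Instr>& code) {
    std::vector<int> stack;
    for (const Instr& in : code) {
        if (in.op == Op::Push) {
            stack.push_back(in.operand);
            continue;
        }
        // Add and Mul both pop two operands and push one result.
        if (stack.size() < 2) throw std::runtime_error("stack underflow");
        int rhs = stack.back(); stack.pop_back();
        int lhs = stack.back(); stack.pop_back();
        stack.push_back(in.op == Op::Add ? lhs + rhs : lhs * rhs);
    }
    if (stack.empty()) throw std::runtime_error("no result on stack");
    return stack.back();
}
```

A compiler targeting this machine could translate `(2 + 3) * 4` into `Push 2, Push 3, Add, Push 4, Mul`; passing that sequence to `run` evaluates it on any platform where the VM itself has been built.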
Translator for Java
[Diagram: Java source code → Java compiler → Java bytecode, which is then either run by a Java interpreter or passed through a bytecode compiler to produce absolute machine code]
Types of Translators
• Compilers
  • Conventional (textual source code)
    • Imperative, ALGOL-like languages
    • Other paradigms
• Interpreters
• Macro processors
• Text formatters
• Silicon compilers
Types of Translators (cont.)
• Visual programming language
• Interface
• Database
• User interface
• Operating System
Structure of Compilers
[Diagram: Source Program → Lexical Analyzer (scanner) → Tokens → Syntax Analysis (Parser) → Syntactic Structure → Semantic Analysis → Intermediate Representation → Optimizer → Code Generator → Target machine code, with a Symbol Table alongside the phases]
Structure of Compilers
[Diagram: Source Program → Lexical Analyzer (scanner) → Tokens]
What about white space? Does it matter?
    int x; cin >> x; if(x>5) cout << "Hello"; else cout << "BOO";
    int x ; cin >> x ; if ( x > 5 ) cout << "Hello" ; else cout << "BOO" ;
Tokenize First or as needed?
    int x; cin >> x; if(x>5) cout << "Hello"; else cout << "BOO";
Tokens = meaningful units in a program, held as value/type pairs:
    >> and ;   symbol
    int        datatype
    x          ID
cin is also a token.
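The value/type pairs above can be sketched as a small scanner. This is a simplified illustration, not a full C++ lexer: the type names follow the slide where it gives them (`datatype`, `ID`, `symbol`), while `number` and `string` are added here as assumptions, and every word other than `int` (including `cin`, `if`, and `else`) is classed as `ID` for brevity.

```cpp
#include <cctype>
#include <cstddef>
#include <string>
#include <vector>

// A token as a value/type pair.
struct Token {
    std::string value;
    std::string type;  // "datatype", "ID", "number", "string", or "symbol"
};

std::vector<Token> tokenize(const std::string& src) {
    std::vector<Token> tokens;
    auto at = [&](std::size_t k) { return static_cast<unsigned char>(src[k]); };
    std::size_t i = 0;
    while (i < src.size()) {
        if (std::isspace(at(i))) { ++i; continue; }  // whitespace only separates tokens
        std::size_t start = i;
        std::string type;
        if (std::isalpha(at(i))) {
            while (i < src.size() && std::isalnum(at(i))) ++i;
            type = (src.substr(start, i - start) == "int") ? "datatype" : "ID";
        } else if (std::isdigit(at(i))) {
            while (i < src.size() && std::isdigit(at(i))) ++i;
            type = "number";
        } else if (src[i] == '"') {
            ++i;                                        // opening quote
            while (i < src.size() && src[i] != '"') ++i;
            if (i < src.size()) ++i;                    // closing quote, if present
            type = "string";
        } else {
            // Prefer the two-character symbols >> and << over single characters.
            std::string two = src.substr(i, 2);
            i += (two == ">>" || two == "<<") ? 2 : 1;
            type = "symbol";
        }
        tokens.push_back({src.substr(start, i - start), type});
    }
    return tokens;
}
```

Because whitespace is skipped before each token, `int x;cin >> x;` and `int x ; cin >> x ;` produce the same token sequence, which is the sense in which formatting does not affect meaning.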
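Tokenize First or as needed?
    Array<Array<int>> someArray;
        tokens: Array  <  int  >>
    Array<Array<int> > someArray;
        tokens: Array  <  int  >  >
A scanner that tokenizes everything first reads >> as a single token; with the space it reads two > tokens.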
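The two readings of `>>` can be reproduced with a small longest-match scanner. The function `scan` and its `splitShift` flag are hypothetical names for this sketch; the flag stands in for a parser that requests tokens as needed and knows it is inside a template argument list.

```cpp
#include <cctype>
#include <cstddef>
#include <string>
#include <vector>

// Scanner over identifiers and single-character symbols, plus ">>".
// With splitShift == false, ">>" is matched greedily as one token (tokenize first).
// With splitShift == true, it is emitted as two ">" tokens (tokenize as needed).
std::vector<std::string> scan(const std::string& src, bool splitShift) {
    std::vector<std::string> out;
    std::size_t i = 0;
    while (i < src.size()) {
        unsigned char c = static_cast<unsigned char>(src[i]);
        if (std::isspace(c)) { ++i; continue; }
        if (std::isalpha(c)) {
            std::size_t start = i;
            while (i < src.size() && std::isalnum(static_cast<unsigned char>(src[i]))) ++i;
            out.push_back(src.substr(start, i - start));
        } else if (!splitShift && src.compare(i, 2, ">>") == 0) {
            out.push_back(">>");  // longest match wins
            i += 2;
        } else {
            out.push_back(std::string(1, src[i]));
            ++i;
        }
    }
    return out;
}
```

With `splitShift` set to false, `Array<Array<int>> someArray;` yields a single `>>` token, so a parser expecting two closing brackets would reject it. With the flag set to true, or with a space written between the brackets, the scanner yields two `>` tokens.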
Structure of Compilers
[Diagram: Source Program → Lexical Analyzer (scanner) → Tokens → Syntax Analysis (Parser) → Syntactic Structure (Parse Tree)]
Parse Tree (Parser)
[Figure: parse tree with root Program and a Data Declaration node; labels shown: datatype, ID, >>, ;, cin, int, x]
Who is responsible for errors?
• int x$y;
• int 32xy;
• 45b
• 45ab
• x = x @ y;
Lexical Errors / Token Errors?
Who is responsible for errors?
• X = ;
• Y = x +;
• Z = [;
Syntax errors
Who is responsible for errors?
• 45ab
  • One wrong token?
  • Two tokens (45 & ab)? Is whitespace needed between them?
• Either way is okay.
  • The lexical analyzer can catch the illegal token (45ab).
  • The parser can catch the syntax error: most likely, 45 followed by ab will not be syntactically correct.
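Both policies for `45ab` can be sketched in one scanner. The `strict` flag and the `error` token type are assumptions made for this illustration; the sketch only recognizes digits and letters and skips every other character.

```cpp
#include <cctype>
#include <cstddef>
#include <string>
#include <vector>

struct Token {
    std::string value;
    std::string type;  // "number", "ID", or "error"
};

// strict == true : a number running straight into letters is one "error" token,
//                  so the lexical analyzer reports the problem.
// strict == false: the text is split into a number and an ID, and any complaint
//                  is left to the parser.
std::vector<Token> scan(const std::string& src, bool strict) {
    std::vector<Token> out;
    auto at = [&](std::size_t k) { return static_cast<unsigned char>(src[k]); };
    std::size_t i = 0;
    while (i < src.size()) {
        if (!std::isalnum(at(i))) { ++i; continue; }  // simplification: skip the rest
        std::size_t start = i;
        if (std::isdigit(at(i))) {
            while (i < src.size() && std::isdigit(at(i))) ++i;
            if (strict && i < src.size() && std::isalpha(at(i))) {
                while (i < src.size() && std::isalnum(at(i))) ++i;
                out.push_back({src.substr(start, i - start), "error"});
            } else {
                out.push_back({src.substr(start, i - start), "number"});
            }
        } else {
            while (i < src.size() && std::isalnum(at(i))) ++i;
            out.push_back({src.substr(start, i - start), "ID"});
        }
    }
    return out;
}
```

In strict mode `45ab` comes back as one `error` token, while `45 ab` is still a number followed by an ID. In lenient mode both inputs give the same two tokens, and it is up to the grammar to reject a number immediately followed by an identifier.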
Structure of Compilers
[Diagram: Source Program → Lexical Analyzer (scanner) → Tokens → Syntax Analysis (Parser) → Syntactic Structure → Semantic Analysis, with the Symbol Table]
    int x;
    cin >> x;
    if(x>5)
        x = "SHERRY";
    else
        cout << "BOO";
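The assignment `x = "SHERRY"` is syntactically fine but semantically wrong, because `x` was declared as `int`. A minimal sketch of how a symbol table lets semantic analysis catch this follows; the two-type system and the names `SymbolTable`, `declare`, and `checkAssign` are invented for illustration.

```cpp
#include <map>
#include <string>

// Hypothetical, deliberately tiny type system.
enum class Type { Int, String };

class SymbolTable {
public:
    // Record a declaration such as "int x;". Returns false on redeclaration.
    bool declare(const std::string& name, Type type) {
        return table_.emplace(name, type).second;
    }

    // Check an assignment "name = <expression of type valueType>".
    // On failure, returns false and describes the problem in error.
    bool checkAssign(const std::string& name, Type valueType, std::string& error) const {
        auto it = table_.find(name);
        if (it == table_.end()) {
            error = "undeclared identifier: " + name;
            return false;
        }
        if (it->second != valueType) {
            error = "type mismatch in assignment to " + name;
            return false;
        }
        return true;
    }

private:
    std::map<std::string, Type> table_;
};
```

After `declare("x", Type::Int)`, the call `checkAssign("x", Type::String, error)` fails with a type-mismatch message, which corresponds to rejecting `x = "SHERRY"` at compile time.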
Structure of Compilers
[Diagram, front-end: Source Program → Lexical Analyzer (scanner) → Tokens → Syntax Analysis (Parser) → Syntactic Structure → Semantic Analysis → Intermediate Representation]
[Diagram, back-end: Optimizer → Code Generator → Target machine code; a Symbol Table sits alongside the phases]
Translation Steps:
• Recognize when input is available.
• Break input into individual components.
• Merge individual pieces into meaningful structures.
• Process structures.
• Produce output.
Translation (Compilers) Steps:
• Break input into individual components. (lexical analysis)
• Merge individual pieces into meaningful structures. (parsing)
• Process structures. (semantic analysis)
• Produce output. (code generation)
Compilers
• Two major tasks:
  • Analysis of source
  • Synthesis of target
• Syntax-directed translation
  • Compilation process driven by syntactic structure of the source being translated
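As a small illustration of syntax-directed translation, the sketch below translates infix arithmetic into postfix with a recursive-descent parser: each grammar rule is one function, and output is emitted as the rule is recognized. The grammar, the class name `PostfixTranslator`, and the choice of postfix as the "target" are assumptions for this example.

```cpp
#include <cctype>
#include <cstddef>
#include <stdexcept>
#include <string>

// Grammar assumed for this sketch:
//   expr   -> term   { ('+' | '-') term }
//   term   -> factor { ('*' | '/') factor }
//   factor -> NUMBER | '(' expr ')'
class PostfixTranslator {
public:
    explicit PostfixTranslator(const std::string& src) : src_(src) {}

    // Analysis (parsing) drives synthesis (emitting postfix text).
    std::string translate() {
        expr();
        if (peek() != '\0') throw std::runtime_error("unexpected input");
        return out_;
    }

private:
    // Skip whitespace and return the next character, or '\0' at end of input.
    char peek() {
        while (pos_ < src_.size() && std::isspace(static_cast<unsigned char>(src_[pos_]))) ++pos_;
        return pos_ < src_.size() ? src_[pos_] : '\0';
    }
    void emit(const std::string& s) {
        if (!out_.empty()) out_ += ' ';
        out_ += s;
    }
    void expr() {
        term();
        while (peek() == '+' || peek() == '-') {
            char op = src_[pos_++];
            term();
            emit(std::string(1, op));  // operator goes after both operands
        }
    }
    void term() {
        factor();
        while (peek() == '*' || peek() == '/') {
            char op = src_[pos_++];
            factor();
            emit(std::string(1, op));
        }
    }
    void factor() {
        if (peek() == '(') {
            ++pos_;
            expr();
            if (peek() != ')') throw std::runtime_error("expected ')'");
            ++pos_;
        } else if (std::isdigit(static_cast<unsigned char>(peek()))) {
            std::size_t start = pos_;
            while (pos_ < src_.size() && std::isdigit(static_cast<unsigned char>(src_[pos_]))) ++pos_;
            emit(src_.substr(start, pos_ - start));
        } else {
            throw std::runtime_error("expected number or '('");
        }
    }

    std::string src_;
    std::string out_;
    std::size_t pos_ = 0;
};
```

For example, `PostfixTranslator("2 + 3 * 4").translate()` produces `2 3 4 * +`: the structure found by the parser, with `*` binding tighter than `+`, decides the order in which output is generated.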
Interpreters
• Executes source program without explicitly translating to target code.
• Control and memory management reside in interpreter, not user program.
• Allow:
  • Modification of program as it executes.
  • Dynamic typing of variables
  • Portability
• Huge overhead (time & space)
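The points above can be illustrated with a toy interpreter that executes assignment statements directly, keeps all variables in its own memory, and lets a variable change type at run time. The statement form (`name = value`, with spaces around `=`), the `Value` struct, and the `Interpreter` class are assumptions for this sketch; it handles only non-negative integers, double-quoted strings, and variable names, with no trailing whitespace.

```cpp
#include <cctype>
#include <map>
#include <sstream>
#include <stdexcept>
#include <string>

// A run-time value that carries its own type tag (dynamic typing).
struct Value {
    bool isInt = true;
    int number = 0;
    std::string text;
};

class Interpreter {
public:
    // Execute one statement of the assumed form:  name = 123 | "text" | other
    void exec(const std::string& stmt) {
        std::istringstream in(stmt);
        std::string name, eq, rhs;
        if (!(in >> name >> eq) || eq != "=") throw std::runtime_error("expected: name = value");
        std::getline(in >> std::ws, rhs);
        if (rhs.empty()) throw std::runtime_error("missing value");
        vars_[name] = eval(rhs);  // no target code is produced; the effect happens now
    }

    // Describe a variable's current type and value.
    std::string show(const std::string& name) const {
        const Value& v = lookup(name);
        return v.isInt ? "int " + std::to_string(v.number) : "string " + v.text;
    }

private:
    Value eval(const std::string& rhs) const {
        Value v;
        if (rhs.size() >= 2 && rhs.front() == '"' && rhs.back() == '"') {
            v.isInt = false;
            v.text = rhs.substr(1, rhs.size() - 2);
        } else if (std::isdigit(static_cast<unsigned char>(rhs[0]))) {
            v.number = std::stoi(rhs);
        } else {
            v = lookup(rhs);  // copy another variable's value and type
        }
        return v;
    }
    const Value& lookup(const std::string& name) const {
        auto it = vars_.find(name);
        if (it == vars_.end()) throw std::runtime_error("undefined variable: " + name);
        return it->second;
    }

    std::map<std::string, Value> vars_;  // memory is owned by the interpreter
};
```

Executing `x = 5` and then `x = "SHERRY"` succeeds here, because each value carries its type with it. The cost is the overhead the slide mentions: every statement is re-parsed and every value is tagged and looked up at run time.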
Structure of Interpreters
[Diagram: Source Program and Data → Interpreter → Program Output]
Misc. Compiler Discussions
• History of Modern Compilers
• Front and Back ends
• One pass vs. Multiple passes
• Compiler Construction Tools
  • Compiler-Compilers, Compiler-generators, Translator-writing Systems
  • Scanner generator
  • Parser generator
  • Syntax-directed engines
  • Automatic code generator
  • Dataflow engines