160 likes | 238 Views
Introduction to Compilers. Jianlin Feng School of Software SUN YAT-SEN UNIVERSITY. Computers run 0/1 strings (machine language program). 0010001000000100 0010010000000100 0001011001000010 0011011000000011 1111000000100101 0000000000000101 0000000000000110 0000000000000000.
E N D
Introduction to Compilers Jianlin Feng School of Software SUN YAT-SEN UNIVERSITY
Computers run 0/1 strings(machine language program) 0010001000000100 0010010000000100 0001011001000010 0011011000000011 1111000000100101 0000000000000101 0000000000000110 0000000000000000 • A machine language program that adds two numbers • First 4 bits for opcode • Last 12 bits for operands • Source: • Louden and Lambert’s book: • Programming Languages
Programmers write more readable character strings An assembly language program that adds two numbers, from Louden and Lambert’s book.
Even more readable character strings: high-level languages • Imperative Languages: specifies HOW • Fortran • ALGOL • PASCAL • C • C++ • Java • Declarative Languages: specifies WHAT • SQL, ML, Prolog
Models of Computation in Languages Underlying most programming languages is a model of computation: Procedural: Fortran (1957) Functional: Lisp (1958) Object oriented: Simula (1967) Logic: Prolog(1972) Relational algebra: SQL (1974) Source: A. V. Aho. Lectures of Programming Languages and Translators
What is a compiler? • A Compiler is a translator • between computers and programmers • More generally speaking, a Compiler is a translator between source strings and target strings. • between assembly language and Fortran • between Java and Java Bytecode • between Java and SQL
Assembly language vs Fortran Source: Stephen A. Edwards. Lectures of Programming Languages and Translators
The Structure of a Compiler • Lexical Analysis • Syntax Analysis (or Parsing) • Semantic Analysis • Intermediate Code Generation • Code Optimization • Code Generation
sname rating > 5 bid=100 sid=sid Sailors Reserves Translation of SQL query • Query can be converted to relational algebra • Relational Algebra converts to tree, joins form branches • Each operator has implementation choices • Operators can also be applied in different order! SELECT S.sname FROM Reserves R, Sailors S WHERE R.sid=S.sid AND R.bid=100 AND S.rating>5 (sname)(bid=100 rating > 5) (Reserves Sailors)
Catalog Manager Cost-based Query Sub-System Select * From Blah B Where B.blah = blah Queries Usually there is a heuristics-based rewriting step before the cost-based steps. Query Parser Query Optimizer Plan Generator Plan Cost Estimator Schema Statistics Query Executor
(On-the-fly) sname (On-the-fly) rating > 5 bid=100 (Page-Oriented Nested loops) sid=sid Reserves Sailors Motivating Example SELECT S.sname FROM Reserves R, Sailors S WHERE R.sid=S.sid AND R.bid=100 AND S.rating>5 • Cost: 500+500*1000 I/Os • By no means the worst plan! • Misses several opportunities: • selections could be`pushed’ down • no use made of indexes • Goal of optimization: Find faster plans that compute the same answer. Plan:
(On-the-fly) sname (On-the-fly) bid=100 (On-the-fly) sname (Page-Oriented Nested loops) sid=sid (On-the-fly) rating > 5 bid=100 rating > 5 (On-the-fly) Reserves (Page-Oriented Nested loops) Sailors sid=sid Reserves Sailors Alternative Plans – Push Selects (No Indexes) 500,500 IOs 250,500 IOs
(On-the-fly) sname (On-the-fly) bid=100 (On-the-fly) sname (Page-Oriented Nested loops) sid=sid (Page-Oriented Nested loops) sid=sid rating > 5 (On-the-fly) Reserves bid = 100 rating > 5 Sailors (On-the-fly) (On-the-fly) Sailors Reserves Alternative Plans – Push Selects (No Indexes) 4250 IOs 250,500 IOs 500 + 1000 + 250 + 250*10