E6998-2: Advanced Topics in Programming Languages and Compilers Alfred V. Aho aho@cs.columbia.edu Lecture 1 – Introduction to Course September 6, 2011
Introduction Professor Alfred V. Aho http://www.cs.columbia.edu/~aho aho@cs.columbia.edu Lectures: Tuesdays, 4:10-6:00pm, 337 Mudd Office hours: Tuesdays 2:45-3:45pm, 513 Computer Science Building Course webpage: http://www.cs.columbia.edu/~aho/cs6998
Course Objectives • Understanding how modern language and compiler technology can be used to make more reliable software • Learning the major concepts and design principles underlying programming languages • Understanding modern program analysis techniques and tools • Awareness of language and compiler issues in dealing with parallelism and concurrency • A highlight of this course is a semester-long project in which you can explore an advanced topic of your own choosing in more depth.
Potential Project Topics • Computational thinking in language design • Lambda calculus and functional languages • Concurrency and parallelism • Program analysis techniques • Interprocedural analysis • Pointer analysis • Binary decision diagrams • Model checking • Satisfiability modulo theory solvers • Abstract interpretation • Report on a “most influential PLDI paper” • http://www.sigplan.org/award-pldi.htm
Recent Most Influential PLDI Papers • Automatic predicate abstraction of C programs • Dynamo: A Transparent Dynamic Optimization System • A Fast Fourier Transform Compiler • The Implementation of the Cilk-5 Multithreaded Language • Exploiting Hardware Performance Counters with Flow and Context Sensitive Profiling • TIL: A Type-Directed Optimizing Compiler for ML • Selective Specialization for Object-Oriented Languages • ATOM: a system for building customized program analysis tools • Space Efficient Conservative Garbage Collection • Lazy Code Motion • A data locality optimizing algorithm • Profile guided code positioning [http://www.sigplan.org/award-pldi.htm]
Additional Project Topics • Garbage collection • Data-flow analysis schemas • Instruction-level parallelism • Optimizing for parallelism and locality • Interprocedural analysis • New intermediate representations • Static single-assignment form • New compiler development tools • Phoenix • Compiler collections • GCC • LLVM • .NET
Prerequisites and Background Text • Fluency in at least one major programming language such as C, C++, C#, Java, or OCaml • COMS W4115: Programming Languages and Translators, or equivalent • Text: Compilers: Principles, Techniques, and Tools (Second Edition), Aho, Lam, Sethi, and Ullman, Addison-Wesley, 2007.
Course Project and Grade • Each student should select a programming language or compiler topic to pursue in more depth. • Students will regularly discuss their projects with the class and, at the end of the semester, will present their project and hand in a final project report summarizing their findings. • The project and classroom discussions will determine the final grade.
The Age of Computational Thinking • Computational advertising • Computational biology • Computational chemistry • Computational legal studies • Computational linguistics • Computational physics • Computational science • Computational thinking in programming language design
Computational Thinking – Jeannette Wing Computational thinking is a fundamental skill for everyone, not just for computer scientists. To reading, writing, and arithmetic, we should add computational thinking to every child’s analytical ability. Just as the printing press facilitated the spread of the three Rs, what is appropriately incestuous about this vision is that computing and computers facilitate the spread of computational thinking. Computational thinking involves solving problems, designing systems, and understanding human behavior, by drawing on the concepts fundamental to computer science. Computational thinking includes a range of mental tools that reflect the breadth of the field of computer science. [Jeannette Wing, Computational Thinking, CACM, March, 2006]
Computational Thinking The thought processes involved in formulating problems so their solutions can be represented as computation steps and algorithms. [A. V. Aho, Computation and Computational Thinking, Ubiquity Symposium, ACM, 2010, http://ubiquity.acm.org/symposia.cfm]
A Good Way to Learn Computational Thinking Design and implement your own programming language!
Computational Thinking in Language Design Problem Domain Conceptual Formulation Algorithms+ Computational Model Programming Language
Evolution of Programming Languages • 1971: Fortran, Lisp, Cobol, Algol60, APL, Snobol4, Simula67, Basic, PL/1, Pascal • 2011: Java, C, C++, PHP, C#, Objective-C, Basic, Python, Perl, JavaScript [http://www.tiobe.com, September 2011]
Issues in Language Design • Domain of application: exploit domain restrictions for expressiveness, performance • Computational model: simplicity, ease of expression • Abstraction mechanisms: reuse, suggestivity • Type system: reliability, security • Usability: readability, writability, efficiency
Computational Thinking in Language Design Underlying every programming language is a model of computation: • C, C++, C#, Java: Von Neumann • SQL: Relational algebra • Prolog: Logic • Haskell, ML: Lambda calculus
Computational Model of AWK
AWK is a simple language designed to perform routine data-processing tasks on strings and numbers.
An AWK program is a sequence of pattern-action statements.
Problem: given a list of name-value pairs, print the total value associated with each name.
Program:
    { total[$1] += $2 }
    END { for (x in total) print x, total[x] }
Input:
    alice 10
    eve 20
    bob 15
    alice 30
Output:
    eve 20
    bob 15
    alice 40
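For comparison, the same name-value aggregation can be sketched in Python. This is only an illustration: the function name `total_by_name` and the inline sample records are assumptions, not part of the original slides, and unlike AWK the sketch skips records with fewer than two fields.

```python
from collections import defaultdict


def total_by_name(lines):
    """Sum the second field of each record, grouped by the first field."""
    totals = defaultdict(float)          # plays the role of AWK's array total[]
    for line in lines:
        fields = line.split()            # AWK splits each record into $1, $2, ...
        if len(fields) < 2:
            continue                     # assumption: ignore blank or short records
        totals[fields[0]] += float(fields[1])   # corresponds to { total[$1] += $2 }
    return dict(totals)


if __name__ == "__main__":
    records = ["alice 10", "eve 20", "bob 15", "alice 30"]
    # Corresponds to END { for (x in total) print x, total[x] }
    for name, total in total_by_name(records).items():
        print(name, f"{total:g}")
```

The explicit loop and dictionary make visible what AWK's pattern-action model supplies implicitly: the read loop over records, field splitting, and associative arrays.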
Theory in Practice: Regular Expression Pattern Matching in Perl, Python, Ruby vs. AWK Time to check whether $a?^n a^n$ matches $a^n$, as a function of the regular expression and text size n. Russ Cox, Regular expression matching can be simple and fast (but is slow in Java, Perl, PHP, Python, Ruby, ...) [http://swtch.com/~rsc/regexp/regexp1.html, 2007]
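The automaton-based approach discussed in the cited article can be sketched for this one pattern family. The code below is a hand-built NFA simulation for a?ⁿaⁿ only, not a general regular expression engine, and the name `nfa_match` is hypothetical. It tracks the set of live NFA states, so the work per input character stays polynomial in n instead of growing exponentially as a backtracking matcher can.

```python
import re


def nfa_match(n, text):
    """Decide whether a?^n a^n matches all of text by simulating its NFA.

    States 0..2n: from state s < n the optional 'a' may be consumed or skipped;
    from state n <= s < 2n an 'a' must be consumed. State 2n is accepting.
    """
    accept = 2 * n

    def closure(states):
        out = set(states)
        for s in states:
            if s < n:
                # Skip any number of the remaining optional a's (epsilon moves).
                out.update(range(s + 1, n + 1))
        return out

    current = closure({0})
    for ch in text:
        if ch != "a":
            return False
        # Every non-accepting live state consumes one 'a' and advances.
        current = closure({s + 1 for s in current if s < accept})
        if not current:
            return False
    return accept in current


if __name__ == "__main__":
    n = 30
    print(nfa_match(n, "a" * n))
    # For small n the result can be cross-checked against the re module.
    small = 6
    print(bool(re.fullmatch("a?" * small + "a" * small, "a" * small)))
```

The cross-check against `re.fullmatch` is kept to a small n on purpose, since the backtracking matcher is the slow case the slide is about.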
Evolutionary Forces on Languages and Compilers • Increasing diversity of applications • Stress on increasing productivity • Need to improve software reliability and maintainability • Target machines more diverse • Parallel machine architectures • Massive compiler collections
Target Languages • Another programming language • CISCs • RISCs • Parallel machines • Multicores • GPUs • Quantum computers
Programming Languages Today Today there are thousands of programming languages. The website http://www.99-bottles-of-beer.net has programs in 1,434 different programming languages to print the lyrics to the song “99 Bottles of Beer.”
“99 Bottles of Beer” 99 bottles of beer on the wall, 99 bottles of beer. Take one down and pass it around, 98 bottles of beer on the wall. 98 bottles of beer on the wall, 98 bottles of beer. Take one down and pass it around, 97 bottles of beer on the wall. . . . 2 bottles of beer on the wall, 2 bottles of beer. Take one down and pass it around, 1 bottle of beer on the wall. 1 bottle of beer on the wall, 1 bottle of beer. Take one down and pass it around, no more bottles of beer on the wall. No more bottles of beer on the wall, no more bottles of beer. Go to the store and buy some more, 99 bottles of beer on the wall. [Traditional]
“99 Bottles of Beer” in AWK
    BEGIN {
        for (i = 99; i >= 0; i--) {
            print ubottle(i), "on the wall,", lbottle(i) "."
            print action(i), lbottle(inext(i)), "on the wall."
            print
        }
    }
    function ubottle(n) {
        return sprintf("%s bottle%s of beer", n ? n : "No more", n - 1 ? "s" : "")
    }
    function lbottle(n) {
        return sprintf("%s bottle%s of beer", n ? n : "no more", n - 1 ? "s" : "")
    }
    function action(n) {
        return sprintf("%s", n ? "Take one down and pass it around," : \
                                   "Go to the store and buy some more,")
    }
    function inext(n) {
        return n ? n - 1 : 99
    }
[Osamu Aoki, http://people.debian.org/~osamu]
“99 Bottles of Beer” in Perl ''=~( '(?{' .('`' |'%') .('[' ^'-') .('`' |'!') .('`' |',') .'"'. '\\$' .'==' .('[' ^'+') .('`' |'/') .('[' ^'+') .'||' .(';' &'=') .(';' &'=') .';-' .'-'. '\\$' .'=;' .('[' ^'(') .('[' ^'.') .('`' |'"') .('!' ^'+') .'_\\{' .'(\\$' .';=('. '\\$=|' ."\|".( '`'^'.' ).(('`')| '/').').' .'\\"'.+( '{'^'['). ('`'|'"') .('`'|'/' ).('['^'/') .('['^'/'). ('`'|',').( '`'|('%')). '\\".\\"'.( '['^('(')). '\\"'.('['^ '#').'!!--' .'\\$=.\\"' .('{'^'['). ('`'|'/').( '`'|"\&").( '{'^"\[").( '`'|"\"").( '`'|"\%").( '`'|"\%").( '['^(')')). '\\").\\"'. ('{'^'[').( '`'|"\/").( '`'|"\.").( '{'^"\[").( '['^"\/").( '`'|"\(").( '`'|"\%").( '{'^"\[").( '['^"\,").( '`'|"\!").( '`'|"\,").( '`'|(',')). '\\"\\}'.+( '['^"\+").( '['^"\)").( '`'|"\)").( '`'|"\.").( '['^('/')). '+_,\\",'.( '{'^('[')). ('\\$;!').( '!'^"\+").( '{'^"\/").( '`'|"\!").( '`'|"\+").( '`'|"\%").( '{'^"\[").( '`'|"\/").( '`'|"\.").( '`'|"\%").( '{'^"\[").( '`'|"\$").( '`'|"\/").( '['^"\,").( '`'|('.')). ','.(('{')^ '[').("\["^ '+').("\`"| '!').("\["^ '(').("\["^ '(').("\{"^ '[').("\`"| ')').("\["^ '/').("\{"^ '[').("\`"| '!').("\["^ ')').("\`"| '/').("\["^ '.').("\`"| '.').("\`"| '$')."\,".( '!'^('+')). '\\",_,\\"' .'!'.("\!"^ '+').("\!"^ '+').'\\"'. ('['^',').( '`'|"\(").( '`'|"\)").( '`'|"\,").( '`'|('%')). '++\\$="})' );$:=('.')^ '~';$~='@'| '(';$^=')'^ '[';$/='`'; [Andrew Savage, http://search.cpan.org/dist/Acme-EyeDrops/lib/Acme/EyeDrops.pm]
“99 Bottles of Beer” in the Whitespace Language [Andrew Kemp, http://compsoc.dur.ac.uk/whitespace/]
Conlangs: Made-Up Languages Okrent lists 500 invented languages including: • Lingua Ignota [Hildegard of Bingen, c. 1150] • Esperanto [L. Zamenhof, 1887] • Klingon [M. Okrand, 1984] Huq Us'pty G'm (I love you) • Proto-Central Mountain [J. Burke, 2007] • Dritok [D. Boozer, 2007] Language of the Drushek, long-tailed beings with large ears and no vocal cords [Arika Okrent, In the Land of Invented Languages, 2009] [http://www.inthelandofinventedlanguages.com]
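Grammars are Used for Specifying Syntax The grammar S → aSbS | bSaS | ε generates all strings of a’s and b’s with the same number of a’s as b’s. This grammar is ambiguous: abab has two parse trees. [Parse tree 1: S → a S b S, where the first S → ε and the second S → a S b S with both inner S → ε] [Parse tree 2: S → a S b S, where the first S → b S a S with both inner S → ε and the second S → ε] $(ab)^n$ has $\frac{1}{n+1}\binom{2n}{n}$ parse trees.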
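The ambiguity can be checked mechanically by counting parse trees. The following is a small sketch specific to the grammar S → aSbS | bSaS | ε; the function name `count_trees` is an assumption made for illustration. It tries every position at which the closing symbol of the first production could occur and multiplies the counts for the two remaining substrings.

```python
from functools import lru_cache


@lru_cache(maxsize=None)
def count_trees(s):
    """Number of parse trees for s under S -> aSbS | bSaS | epsilon."""
    if s == "":
        return 1                          # S -> epsilon
    opener = s[0]
    if opener not in ("a", "b"):
        return 0                          # symbol outside the alphabet
    closer = "b" if opener == "a" else "a"
    total = 0
    # S -> opener S closer S: choose the position j of the matching closer.
    for j in range(1, len(s)):
        if s[j] == closer:
            total += count_trees(s[1:j]) * count_trees(s[j + 1:])
    return total


if __name__ == "__main__":
    for n in range(6):
        print(n, count_trees("ab" * n))
```

Each choice of production and split position yields a distinct tree, so the total is a sum of products; for strings of the form (ab)ⁿ this recurrence is the one satisfied by the Catalan numbers.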
Programming Languages are not Inherently Ambiguous This grammar G generates the same language: S → aAbS | bBaS | ε, A → aAbA | ε, B → bBaB | ε. G is unambiguous and has only one parse tree for every sentence in L(G). [Parse tree for abab: S → a A b S, where A → ε and S → a A b S with A → ε and S → ε]
Natural Languages are Inherently Ambiguous I made her duck. [5 meanings: D. Jurafsky and J. Martin, 2000] One morning I shot an elephant in my pajamas. How he got into my pajamas I don’t know. [Groucho Marx, Animal Crackers, 1930] List the sales of the products produced in 1973 with the products produced in 1972. [455 parses: W. Martin, K. Church, R. Patil, 1987]
Methods for Specifying the Semantics of Programming Languages • Operational semantics: program constructs are translated to an understood language. • Axiomatic semantics: assertions called preconditions and postconditions specify the properties of statements. • Denotational semantics: semantic functions map syntactic objects to semantic values.
Phases of a Compiler source program → Lexical Analyzer → token stream → Syntax Analyzer → syntax tree → Semantic Analyzer → annotated syntax tree → Interm. Code Gen. → interm. rep. → Code Optimizer → interm. rep. → Code Gen. → target program. The Symbol Table is shared by the phases. [A. V. Aho, M. S. Lam, R. Sethi, J. D. Ullman, Compilers: Principles, Techniques, & Tools, 2007]
Compiler Component Generators • lex specification → Lexical Analyzer Generator (lex) → Lexical Analyzer • yacc specification → Syntax Analyzer Generator (yacc) → Syntax Analyzer • source program → Lexical Analyzer → token stream → Syntax Analyzer → syntax tree
Lex Specification for a Desk Calculator
    number    [0-9]+\.?|[0-9]*\.[0-9]+
    %%
    [ ]       { /* skip blanks */ }
    {number}  { sscanf(yytext, "%lf", &yylval);
                return NUMBER; }
    \n|.      { return yytext[0]; }
[M. E. Lesk and E. Schmidt, Lex – A Lexical Analyzer Generator]
Yacc Specification for a Desk Calculator
    %{
    #include <stdio.h>
    #define YYSTYPE double   /* semantic values are doubles, matching "%lf" in the lex rule */
    %}
    %token NUMBER
    %left '+'
    %left '*'
    %%
    lines : lines expr '\n'   { printf("%g\n", $2); }
          | /* empty */
          ;
    expr  : expr '+' expr     { $$ = $1 + $3; }
          | expr '*' expr     { $$ = $1 * $3; }
          | '(' expr ')'      { $$ = $2; }
          | NUMBER            /* default action: $$ = $1 */
          ;
    %%
    #include "lex.yy.c"
[Stephen C. Johnson, Yacc: Yet Another Compiler-Compiler]
Creating the Desk Calculator
Invoke the commands
    lex desk.l
    yacc desk.y
    cc y.tab.c -ly -ll
Result: 1.2 * (3.4 + 5.6) → Desk Calculator → 10.8
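To make the division of labor between the lexical analyzer and the parser concrete, here is a hand-written Python sketch of the same desk calculator. It is an illustration under stated assumptions, not what lex and yacc generate: yacc builds a table-driven LALR parser, whereas this sketch uses recursive descent, encoding the `%left '+'` and `%left '*'` precedence as separate `expr`, `term`, and `factor` functions. All function names are hypothetical.

```python
import re

# Same number syntax as the lex definition: [0-9]+\.?|[0-9]*\.[0-9]+
NUMBER = re.compile(r"\d*\.\d+|\d+\.?")


def tokenize(text):
    """Lexical analysis: turn the input into (kind, value) tokens."""
    tokens, pos = [], 0
    while pos < len(text):
        if text[pos] == " ":
            pos += 1                          # skip blanks
            continue
        m = NUMBER.match(text, pos)
        if m:
            tokens.append(("NUMBER", float(m.group())))
            pos = m.end()
        else:
            tokens.append((text[pos], None))  # any other character is its own token
            pos += 1
    return tokens


def evaluate(text):
    """Syntax analysis and evaluation of one expression with +, * and parentheses."""
    tokens = tokenize(text)
    pos = 0

    def peek():
        return tokens[pos][0] if pos < len(tokens) else None

    def expr():                               # expr : term ('+' term)*
        nonlocal pos
        value = term()
        while peek() == "+":
            pos += 1
            value += term()
        return value

    def term():                               # term : factor ('*' factor)*
        nonlocal pos
        value = factor()
        while peek() == "*":
            pos += 1
            value *= factor()
        return value

    def factor():                             # factor : NUMBER | '(' expr ')'
        nonlocal pos
        kind = peek()
        if kind == "NUMBER":
            value = tokens[pos][1]
            pos += 1
            return value
        if kind == "(":
            pos += 1
            value = expr()
            if peek() != ")":
                raise SyntaxError("expected ')'")
            pos += 1
            return value
        raise SyntaxError(f"unexpected token {kind!r}")

    result = expr()
    if pos != len(tokens):
        raise SyntaxError(f"unexpected token {peek()!r}")
    return result


if __name__ == "__main__":
    print(f"{evaluate('1.2 * (3.4 + 5.6)'):g}")
```

Comparing the two versions shows what the generators save: the regular expression and the grammar are still present, but the scanning loop and the parsing functions are written by hand.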
Computational Thinking for Quantum Computing • Physical System: subatomic particles • Mathematical Formulation: quantum mechanics • Algorithms: Shor, Grover, etc. • Model of Computation: quantum circuits
Computational Thinking for Quantum Computing The Four Postulates of Quantum Mechanics [M. A. Nielsen and I. L. Chuang, Quantum Computation and Quantum Information, Cambridge University Press, 2000]
State Space Postulate Postulate 1 The state of an isolated quantum system can be described by a unit vector in a complex Hilbert space.
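Qubit: Quantum Bit • The state of a quantum bit can be described by a unit vector in a 2-dimensional complex Hilbert space (in Dirac notation) $|\psi\rangle = \alpha|0\rangle + \beta|1\rangle$, where $\alpha$ and $\beta$ are complex coefficients called the amplitudes of the basis states $|0\rangle$ and $|1\rangle$, and $|\alpha|^2 + |\beta|^2 = 1$. • In linear algebra, $|\psi\rangle = \alpha \begin{pmatrix} 1 \\ 0 \end{pmatrix} + \beta \begin{pmatrix} 0 \\ 1 \end{pmatrix} = \begin{pmatrix} \alpha \\ \beta \end{pmatrix}$.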
Time-Evolution Postulate Postulate 2 The evolution of a closed quantum system can be described by a unitary operator U. (An operator U is unitary if U†U = I.) state of the system at time t1 → U → state of the system at time t2
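Useful Quantum Operators: Hadamard Operator The Hadamard operator has the matrix representation $H = \frac{1}{\sqrt{2}} \begin{pmatrix} 1 & 1 \\ 1 & -1 \end{pmatrix}$. H maps the computational basis states as follows: $H|0\rangle = \frac{1}{\sqrt{2}}(|0\rangle + |1\rangle)$ and $H|1\rangle = \frac{1}{\sqrt{2}}(|0\rangle - |1\rangle)$. Note that HH = I.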
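The properties of the Hadamard operator can be checked numerically. The sketch below uses plain Python lists for the 2x2 matrix and the state vector; the helper names `apply` and `matmul` are assumptions introduced for illustration.

```python
import math

S = 1 / math.sqrt(2)
H = [[S, S],
     [S, -S]]          # Hadamard matrix


def apply(matrix, state):
    """Multiply a 2x2 matrix by a 2-component state vector."""
    return [sum(matrix[i][j] * state[j] for j in range(2)) for i in range(2)]


def matmul(a, b):
    """Multiply two 2x2 matrices."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]


if __name__ == "__main__":
    print(apply(H, [1, 0]))    # image of |0>
    print(apply(H, [0, 1]))    # image of |1>
    print(matmul(H, H))        # should be the identity up to rounding
```

Because H is real and symmetric, HH = I is the same statement as H†H = I, so H is unitary as the time-evolution postulate requires.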
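Composition-of-Systems Postulate Postulate 3 The state space of a combined physical system is the tensor product space of the state spaces of the component subsystems. If one system is in the state $|\psi_1\rangle$ and another is in the state $|\psi_2\rangle$, then the combined system is in the state $|\psi_1\rangle \otimes |\psi_2\rangle$. $|\psi_1\rangle \otimes |\psi_2\rangle$ is often written as $|\psi_1\rangle|\psi_2\rangle$ or as $|\psi_1 \psi_2\rangle$.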
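Useful Quantum Operators: the CNOT Operator The two-qubit CNOT (controlled-NOT) operator has the matrix representation: $\mathrm{CNOT} = \begin{pmatrix} 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 0 & 1 \\ 0 & 0 & 1 & 0 \end{pmatrix}$. CNOT flips the target bit t iff the control bit c has the value 1: $|c\rangle|t\rangle \mapsto |c\rangle|t \oplus c\rangle$. The CNOT gate maps $|00\rangle \mapsto |00\rangle$, $|01\rangle \mapsto |01\rangle$, $|10\rangle \mapsto |11\rangle$, $|11\rangle \mapsto |10\rangle$.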
Measurement Postulate Postulate 4 Quantum measurements can be described by a collection $\{M_m\}$ of operators acting on the state space of the system being measured. If the state of the system is $|\psi\rangle$ before the measurement, then the probability that the result m occurs is $p(m) = \langle\psi| M_m^\dagger M_m |\psi\rangle$ and the state of the system after measurement is $\frac{M_m|\psi\rangle}{\sqrt{\langle\psi| M_m^\dagger M_m |\psi\rangle}}$.
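Measurement The measurement operators satisfy the completeness equation: $\sum_m M_m^\dagger M_m = I$. The completeness equation says the probabilities sum to one: $\sum_m p(m) = \sum_m \langle\psi| M_m^\dagger M_m |\psi\rangle = 1$.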
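As a worked instance of the measurement postulate, the sketch below measures a single qubit in the computational basis. The operators M0 = |0⟩⟨0| and M1 = |1⟩⟨1|, the sample amplitudes, and the function names are assumptions chosen for illustration. It relies on the identity ⟨ψ|M†M|ψ⟩ = ‖M|ψ⟩‖², so the probability is computed as the squared norm of the projected state.

```python
import math

M0 = [[1, 0], [0, 0]]     # projector onto |0>
M1 = [[0, 0], [0, 1]]     # projector onto |1>


def apply(matrix, state):
    """Multiply a square matrix by a state vector (amplitudes may be complex)."""
    return [sum(row[j] * state[j] for j in range(len(state))) for row in matrix]


def probability(m_op, state):
    """p(m) = <psi| M^dagger M |psi>, computed as the squared norm of M|psi>."""
    return sum(abs(x) ** 2 for x in apply(m_op, state))


def post_measurement_state(m_op, state):
    """State after outcome m: M|psi> divided by sqrt(p(m)); requires p(m) > 0."""
    projected = apply(m_op, state)
    norm = math.sqrt(sum(abs(x) ** 2 for x in projected))
    return [x / norm for x in projected]


if __name__ == "__main__":
    psi = [0.6, 0.8j]     # unit vector: 0.36 + 0.64 = 1
    p0, p1 = probability(M0, psi), probability(M1, psi)
    print(p0, p1, p0 + p1)
    print(post_measurement_state(M1, psi))
```

The two projectors sum to the identity, which is the completeness equation for this measurement, and that is why the two probabilities add up to one for any unit vector.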
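Computational Abstraction: Quantum Circuits Quantum circuit to create Bell (Einstein-Podolsky-Rosen) states: a Hadamard gate H on qubit x followed by a CNOT with control x and target y. Circuit maps $|00\rangle \mapsto \frac{1}{\sqrt{2}}(|00\rangle + |11\rangle)$, $|01\rangle \mapsto \frac{1}{\sqrt{2}}(|01\rangle + |10\rangle)$, $|10\rangle \mapsto \frac{1}{\sqrt{2}}(|00\rangle - |11\rangle)$, $|11\rangle \mapsto \frac{1}{\sqrt{2}}(|01\rangle - |10\rangle)$. Output is an entangled state, one that cannot be written in a product form. (Einstein: “Spooky action at a distance.”)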
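The Bell-state circuit can be simulated directly on a four-amplitude state vector. In this sketch, which assumes the standard circuit of a Hadamard on the first qubit followed by a CNOT controlled by it, amplitudes are indexed by the two-bit number xy; the function name `bell` is hypothetical.

```python
import math


def bell(x, y):
    """Apply H to the first qubit and then CNOT (control first, target second) to |xy>.

    Returns the amplitudes of |00>, |01>, |10>, |11> in that order.
    """
    s = 1 / math.sqrt(2)
    state = [0.0] * 4
    state[2 * x + y] = 1.0                      # basis state |xy>

    # Hadamard on the first qubit mixes amplitudes that differ only in that bit.
    state = [s * (state[0] + state[2]),
             s * (state[1] + state[3]),
             s * (state[0] - state[2]),
             s * (state[1] - state[3])]

    # CNOT with the first qubit as control exchanges |10> and |11>.
    state[2], state[3] = state[3], state[2]
    return state


if __name__ == "__main__":
    for x in (0, 1):
        for y in (0, 1):
            print(f"|{x}{y}> ->", bell(x, y))
```

For input |00⟩ the only nonzero amplitudes are those of |00⟩ and |11⟩, so measuring one qubit fixes the other, which is the correlation that a product state cannot reproduce.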
Alice and Bob’s Qubit-State Delivery Problem • Alice knows that she will need to send to Bob the state of an important secret qubit sometime in the future. • Her friend Bob is moving far away and will have a very low bandwidth Internet connection. • Therefore Alice will need to send her qubit state to Bob cheaply. • How can Alice and Bob solve their problem?
Alice and Bob’s Solution: Quantum Teleportation! • Alice and Bob generate an EPR pair. • Alice takes one half of the pair; Bob the other half. Bob moves far away. • Alice interacts her secret qubit with her EPR-half and measures the two qubits. • Alice sends the two resulting classical measurement bits to Bob. • Bob decodes his half of the EPR pair using the two bits to discover the state of Alice's secret qubit. [Circuit: gates H, X, Z; measurements M1, M2]
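The protocol can be checked with a small state-vector simulation. The sketch assumes the standard textbook teleportation circuit (a CNOT from the secret qubit to Alice's EPR half, a Hadamard on the secret qubit, then X and Z corrections by Bob chosen by the two measured bits); the slides' own circuit diagram is not recoverable here, and the function name `teleport` is hypothetical. Instead of sampling a random measurement, it enumerates all four outcomes.

```python
import math
from itertools import product


def teleport(alpha, beta):
    """Return Bob's corrected qubit for each of Alice's four measurement outcomes.

    Amplitudes are indexed by (a, b, c): a = secret qubit, b = Alice's EPR half,
    c = Bob's EPR half. The input must satisfy |alpha|^2 + |beta|^2 = 1.
    """
    s = 1 / math.sqrt(2)
    psi = (alpha, beta)
    bits = list(product((0, 1), repeat=3))

    # Initial state: |psi> tensor (|00> + |11>)/sqrt(2).
    amp = {(a, b, c): psi[a] * (s if b == c else 0.0) for a, b, c in bits}

    # CNOT with control a and target b: |a, b> -> |a, b XOR a>.
    amp = {(a, b, c): amp[(a, b ^ a, c)] for a, b, c in bits}

    # Hadamard on a.
    amp = {(a, b, c): s * (amp[(0, b, c)] + (-1) ** a * amp[(1, b, c)])
           for a, b, c in bits}

    results = {}
    for m1, m2 in product((0, 1), repeat=2):
        # Alice measures a = m1 and b = m2; Bob's qubit collapses accordingly.
        bob = [amp[(m1, m2, 0)], amp[(m1, m2, 1)]]
        norm = math.sqrt(sum(abs(x) ** 2 for x in bob))
        bob = [x / norm for x in bob]
        if m2:
            bob = [bob[1], bob[0]]      # X correction
        if m1:
            bob = [bob[0], -bob[1]]     # Z correction
        results[(m1, m2)] = bob
    return results


if __name__ == "__main__":
    for outcome, state in teleport(0.6, 0.8j).items():
        print(outcome, state)
```

Only the two classical bits (m1, m2) travel from Alice to Bob, which is why a very low bandwidth connection suffices once the EPR pair has been shared.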
Quantum Computer Architecture Knill [1996]: Quantum RAM, a classical computer combined with a quantum device with operations for initializing registers of qubits and applying quantum operations and measurements. Components: Quantum Logic Unit, Quantum Memory, Classical Computer [E. Knill, Conventions for Quantum Pseudocode, Los Alamos National Laboratory, LAUR-96-2724, 1996]
Quantum Computer Compiler • Pipeline: quantum source program → Front End → QIR → Technology Independent CG+Optimizer → QASM → Technology Dependent CG+Optimizer → QPOL → Technology Simulator • QIR: quantum intermediate representation • QASM: quantum assembly language • QPOL: quantum physical operations language • Computational abstractions: quantum mechanics, quantum circuit, quantum device [K. Svore, A. Aho, A. Cross, I. Chuang, I. Markov, A Layered Software Architecture for Quantum Computing Design Tools, IEEE Computer, 2006, vol. 39, no. 1, pp. 74-83]