Dr Yasser Fouad. STRUCTURE OF PROGRAMMING LANGUAGES. Book. Quote of the Day. “A language that doesn't affect the way you think about programming, is not worth knowing.” - Alan Perlis. Then you decide to get a PhD.
Dr Yasser Fouad STRUCTURE OF PROGRAMMING LANGUAGES
Quote of the Day “A language that doesn't affect the way you think about programming, is not worth knowing.” - Alan Perlis
Then you decide to get a PhD You get tired of the PowerPoint and its animations. You embed a domain-specific language (DSL) into Ruby. …
Reasons for Studying Concepts of Programming Languages • Increased ability to express ideas • Improved background for choosing appropriate languages • Increased ability to learn new languages • Better understanding of significance of implementation • Overall advancement of computing
How is this class different? It’s about: • foundations of programming languages • but also how to design your own languages • how to implement them • PL tools, such as analyzers • some classical C.S. algorithms.
Why a developer needs PL New languages will keep coming • Understand them, choose the right one. Write code that writes code • Be the wizard, not the typist. Develop your own language. • Are you kidding? No. Learn about compilers and interpreters. • Programmer’s main tools.
Overview • how many languages does one need? • how many languages did you use? Let’s list them here:
Develop your own language Are you kidding? No. Guess who developed: • PHP • Ruby • JavaScript • Perl Done by smart hackers like you • in a garage • not in an academic ivory tower Our goal: learn good academic lessons • so that your future languages avoid known mistakes
Programming Domains • Scientific applications • Large number of floating point computations • Fortran • Business applications • Produce reports, use decimal numbers and characters • COBOL • Artificial intelligence • Symbols rather than numbers manipulated • LISP • Systems programming • Need efficiency because of continuous use • C • Web Software • Eclectic collection of languages: markup (e.g., XHTML), scripting (e.g., PHP), general-purpose (e.g., Java)
Figure by Brian Hayes (who credits, in part, Éric Lévénez and Pascal Rigaux): Brian Hayes, “The Semicolon Wars.” American Scientist, July-August 2006, pp. 299-303
Programming Paradigms PS — Introduction
Language Categories • Imperative • Central features are variables, assignment statements, and iteration • Examples: C, Pascal • Functional • Main means of making computations is by applying functions to given parameters • Examples: LISP, Scheme • Logic • Rule-based (rules are specified in no particular order) • Example: Prolog • Object-oriented • Data abstraction, inheritance, late binding • Examples: Java, C++ • Markup • New; not a programming language per se, but used to specify the layout of information in Web documents • Examples: XHTML, XML
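The difference between the imperative and functional categories can be illustrated with one small task written both ways. This is only a sketch in Python (not a language used in the slides); the function names are hypothetical and chosen for illustration.

```python
from functools import reduce

def sum_squares_imperative(numbers):
    # Imperative style: a variable, assignment statements, and iteration.
    total = 0
    for n in numbers:
        total = total + n * n
    return total

def sum_squares_functional(numbers):
    # Functional style: the result comes from applying functions to
    # parameters; no variable is reassigned by this code.
    return reduce(lambda acc, n: acc + n * n, numbers, 0)

print(sum_squares_imperative([1, 2, 3]))  # 14
print(sum_squares_functional([1, 2, 3]))  # 14
```

Both versions compute the same value; what changes is whether the program is described as a sequence of state updates or as a composition of function applications.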
Program • A program is a machine-compatible representation of an algorithm • If no algorithm exists for performing a task, then the task cannot be performed by a machine • Programs and the algorithms they represent are collectively referred to as software
What is a Programming Language? • A formal language for describing computation? • A “user interface” to a computer? • Syntax + semantics? • Compiler, or interpreter, or translator? • A tool to support a programming paradigm? “A programming language is a notational system for describing computation in a machine-readable and human-readable form.” (Louden)
Programming Methodologies Influences • 1950s and early 1960s: Simple applications; worry about machine efficiency • Late 1960s: People efficiency became important; readability, better control structures • structured programming • top-down design and step-wise refinement • Late 1970s: Process-oriented to data-oriented • data abstraction • Middle 1980s: Object-oriented programming • Data abstraction + inheritance + polymorphism
Favorite programming language June 2012 • Python (3,054) • Ruby (1,723) • JavaScript (1,415) • C (970) • C# (829) • PHP (666) • Java (551) • C++ (529) • Haskell (519) • Clojure (459) • CoffeeScript (362) • Objective C (326) • Lisp (322) • Perl (311) • Scala (233) • Scheme (190) • Other (188) • Erlang (162) • Lua (145) • SQL (101)
job listings collected from Dice.com • Python 3,456 (+32.87%) • Ruby 2,141 (+39.03%) • HTML5 (+276.85%) • Flash 1,261 (+95.2%) • Silverlight 865 (-11.91%) • COBOL 656 (-10.75%) • Assembler 209 (-1.42%) • PowerBuilder (-18.71%) • FORTRAN 45 (-33.82%) • Java 17,599 (+8.96%) • XML 10,780 (+11.70%) • JavaScript (+11.64%) • HTML 9,587 (-1.53%) • C# 9,293 (+17.04%) • C++ 6,439 (+7.55%) • AJAX 5,142 (+15.81%) • Perl 5,107 (+3.21%) • PHP 3,717 (+23%)
A Brief Chronology
ENIAC (1946, University of Pennsylvania) ENIAC program for external ballistic equations:
ENIAC (1946, University of Pennsylvania) Programming done by • rewiring the interconnections • to set up desired formulas, etc. Problem (what’s the tedious part?) • programming = rewiring • slow, error-prone Solution: • store the program in memory! • birth of the von Neumann paradigm
Assembly – the language (UNIVAC 1, 1950) Idea: mnemonic (assembly) code • Then translate it to machine code by hand (no compiler yet) • write programs with mnemonic codes (add, sub), with symbolic labels, • then assign addresses by hand
Example of symbolic assembler:
    clear-and-add a
    add b
    store c
Translate it by hand to something like this (understood by the CPU):
    B100 A200 C300
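The hand translation described on this slide is mechanical, which is why it was soon automated. The sketch below shows the idea in Python. It assumes the pairing implied by the slide's example (clear-and-add → B, add → A, store → C, and a, b, c at addresses 100, 200, 300); the table and function names are hypothetical.

```python
# Assumption: opcode letters and addresses are read off the slide's example.
OPCODES = {"clear-and-add": "B", "add": "A", "store": "C"}
SYMBOLS = {"a": 100, "b": 200, "c": 300}

def assemble(source):
    """Translate 'mnemonic operand' lines into opcode+address words."""
    words = []
    for line in source.strip().splitlines():
        mnemonic, operand = line.split()
        words.append(f"{OPCODES[mnemonic]}{SYMBOLS[operand]}")
    return words

program = """
clear-and-add a
add b
store c
"""
print(" ".join(assemble(program)))  # B100 A200 C300
```

A real assembler would also assign the addresses itself; here the symbol table is filled in by hand, as the slide describes.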
Assembly Language Fields: ADDI | R4 | R2 | 21 Assembly: ADDI R4,R2,21 Machine word: 10101100100000100000000000010101 • Use symbols instead of binary digits to describe fields of instructions. • Every aspect of machine visible in program: • One statement per machine instruction. • Register allocation, call stack, etc. must be managed explicitly. • No structure: everything looks the same.
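The 32-bit word on this slide splits into a 6-bit opcode, two 5-bit register fields, and a 16-bit immediate. The Python sketch below packs those fields and reproduces the slide's bit string. The opcode value 101011 is taken from the slide's word rather than from any particular instruction-set manual, and the function name is hypothetical.

```python
def encode_addi(reg_a, reg_b, immediate, opcode=0b101011):
    """Pack opcode(6) | reg_a(5) | reg_b(5) | immediate(16) into 32 bits.

    Assumption: the default opcode is the leading 6 bits of the slide's
    example word, not a value checked against a real ISA.
    """
    if not (0 <= reg_a < 32 and 0 <= reg_b < 32):
        raise ValueError("register numbers must fit in 5 bits")
    if not (-2**15 <= immediate < 2**15):
        raise ValueError("immediate must fit in 16 bits")
    word = (opcode << 26) | (reg_a << 21) | (reg_b << 16) | (immediate & 0xFFFF)
    return format(word, "032b")

# ADDI R4,R2,21
print(encode_addi(4, 2, 21))  # 10101100100000100000000000010101
```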
Assembler – the compiler (Manchester, 1952) • a loop example, in MIPS, a modern-day assembly code:
loop: addi $t3, $t0, -8
      addi $t4, $t0, -4
      lw   $t1, theArray($t3)   # Gets the last
      lw   $t2, theArray($t4)   # two elements
      add  $t5, $t1, $t2        # Adds them together...
      sw   $t5, theArray($t0)   # ...and stores the result
      addi $t0, $t0, 4          # Moves to next "element" of theArray
      blt  $t0, 160, loop       # If not past the end of theArray, repeat
      jr   $ra
High-level Language • Provides notation to describe problem-solving strategies rather than organize data and instructions at machine level. • Improves programmer productivity by supporting features to abstract/reuse code, and to improve reliability/robustness of programs. • Requires a translator: a compiler or an interpreter.
FORTRAN I (1954-57) Language, and the first compiler • Produced code almost as good as hand-written • Huge impact on computer science • Modern compilers preserve its outlines By 1958, >50% of all software was in FORTRAN
FORTRAN I Example: nested loops in FORTRAN • a big improvement over assembler, • but annoying artifacts of assembly remain: • labels and rather explicit jumps (CONTINUE) • lexical columns: the statement must start in column 7 • The MIPS loop from the previous slide, in FORTRAN:
      DO 10 I = 2, 40
      A(I) = A(I-1) + A(I-2)
   10 CONTINUE
“Hello World” in FORTRAN All examples from the ACM "Hello World" project: www2.latech.edu/~acm/HelloWorld.shtml
      PROGRAM HELLO
      DO 10, I=1,10
      PRINT *,'Hello World'
   10 CONTINUE
      STOP
      END
Side note: designing a good language is hard. A good language protects against bugs, but the lessons take a while to learn. An example often cited as the cause of a NASA planetary probe failure:
buggy line: DO 15 I = 1.100
what was intended (a dot had replaced the comma): DO 15 I = 1,100
Because Fortran ignores spaces, the compiler read the buggy line as: DO15I = 1.100
which is an assignment to a variable DO15I, not a loop. This mistake is harder to make (if possible at all) with modern lexical rules (white space not ignored) and loop syntax: for (i=1; i <= 100; i++) { … }
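The effect of ignoring blanks can be reproduced with a toy statement classifier. This is a rough Python sketch, not how a real Fortran compiler is organized: it strips all spaces, then decides between "DO loop" and "assignment" purely from the presence of a comma after the equals sign. The function and pattern names are hypothetical.

```python
import re

# DO <label> <variable> = <start> , <stop>   (after blanks are removed)
LOOP = re.compile(r"^DO(\d+)([A-Z][A-Z0-9]*)=([^,]+),(.+)$")
# <variable> = <expression>
ASSIGN = re.compile(r"^([A-Z][A-Z0-9]*)=(.+)$")

def classify(statement):
    """Classify a statement the way a blank-insensitive lexer would see it."""
    squeezed = statement.replace(" ", "").upper()
    m = LOOP.match(squeezed)
    if m:
        label, var, start, stop = m.groups()
        return f"loop: label {label}, {var} from {start} to {stop}"
    m = ASSIGN.match(squeezed)
    if m:
        return f"assignment: {m.group(1)} = {m.group(2)}"
    return "unrecognized"

print(classify("DO 15 I = 1,100"))  # loop: label 15, I from 1 to 100
print(classify("DO 15 I = 1.100"))  # assignment: DO15I = 1.100
```

One character decides which of two unrelated statements is meant, and neither reading is a syntax error, so the compiler has nothing to complain about.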
“Hello World” in COBOL
000100 IDENTIFICATION DIVISION.
000200 PROGRAM-ID. HELLOWORLD.
000300 DATE-WRITTEN. 02/05/96 21:04.
000400* AUTHOR BRIAN COLLINS
000500 ENVIRONMENT DIVISION.
000600 CONFIGURATION SECTION.
000700 SOURCE-COMPUTER. RM-COBOL.
000800 OBJECT-COMPUTER. RM-COBOL.
001000 DATA DIVISION.
001100 FILE SECTION.
100000 PROCEDURE DIVISION.
100200 MAIN-LOGIC SECTION.
100300 BEGIN.
100400 DISPLAY " " LINE 1 POSITION 1 ERASE EOS.
100500 DISPLAY "HELLO, WORLD." LINE 15 POSITION 10.
100600 STOP RUN.
100700 MAIN-LOGIC-EXIT.
100800 EXIT.
ALGOL 60 History • Committee of PL experts formed in 1955 to design universal, machine-independent, algorithmic language • First version (ALGOL 58) never implemented; criticisms led to ALGOL 60 Innovations • BNF (Backus-Naur Form) introduced to define syntax (led to syntax-directed compilers) • First block-structured language; variables with local scope • Structured control statements • Recursive procedures • Variable size arrays Successes • Highly influenced design of other PLs but never displaced FORTRAN
“Hello World” in BEALGOL
BEGIN
  FILE F (KIND=REMOTE);
  EBCDIC ARRAY E [0:11];
  REPLACE E BY "HELLO WORLD!";
  WHILE TRUE DO
  BEGIN
    WRITE (F, *, E);
  END;
END.
“Hello World” in PL/1
HELLO: PROCEDURE OPTIONS (MAIN);
  /* A PROGRAM TO OUTPUT HELLO WORLD */
  FLAG = 0;
  LOOP: DO WHILE (FLAG = 0);
    PUT SKIP DATA('HELLO WORLD!');
  END LOOP;
END HELLO;
“Hello World” in Functional Languages
SML: print("hello world!\n");
Haskell: hello() = print "Hello World"
Goto considered harmful
    L1: statement
        if expression goto L1
        statement
Dijkstra says: gotos are harmful • use structured programming • lose some performance, gain a lot of readability How do you rewrite the above code into structured form?
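One possible answer to the question on this slide: the label-plus-backward-jump pattern is a loop whose body runs at least once, that is, a do-while loop. Python has no do-while, so the sketch below uses `while True` with a `break`. The concrete "statement" (incrementing a counter) and "expression" (`i < limit`) are made up for illustration.

```python
def run_structured(limit):
    """Structured form of:  L1: statement; if expression goto L1; statement"""
    trace = []
    i = 0
    while True:                # takes the place of the label L1
        i += 1                 # "statement": runs at least once
        trace.append(i)
        if not (i < limit):    # "if expression goto L1" becomes the loop test
            break
    trace.append("done")       # the statement that follows the loop
    return trace

print(run_structured(3))  # [1, 2, 3, 'done']
```

Control now enters the loop only at the top and leaves at one visible point, which is what makes the structured version easier to read than the jump.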
Special-Purpose Languages SNOBOL • First successful string manipulation language • Influenced design of text editors more than other PLs • String operations: pattern-matching and substitution • Arrays and associative arrays (tables) • Variable-length strings
    ...
    OUTPUT = 'Hello World!'
END
Object-Oriented Languages History • Simula was developed by Nygaard and Dahl (early 1960s) in Oslo as a language for simulation programming, by adding classes and inheritance to ALGOL 60 • Smalltalk was developed by Xerox PARC (early 1970s) to drive graphic workstations
Simula:
Begin
  while 1 = 1 do begin
    outtext ("Hello World!");
    outimage;
  end;
End;
Smalltalk:
Transcript show:'Hello World';cr
4GLs “Problem-oriented” languages • PLs for “non-programmers” • Very High Level (VHL) languages for specific problem domains Classes of 4GLs (no clear boundaries) • Report Program Generator (RPG) • Application generators • Query languages • Decision-support languages Successes • Highly popular, but generally ad hoc
“Hello World” in SQL
CREATE TABLE HELLO (HELLO CHAR(12));
INSERT INTO HELLO VALUES ('HELLO WORLD!');
SELECT * FROM HELLO;
Scripting Languages History Countless “shell languages” and “command languages” for operating systems and configurable applications • Unix shell (ca. 1971) developed as user shell and scripting tool • HyperTalk (1987) was developed at Apple to script HyperCard stacks • TCL (1990) developed as embedding language and scripting language for X windows applications (via Tk) • Perl (~1990) became de facto web scripting language
Unix shell:
echo "Hello, World!"
HyperTalk:
on OpenStack
  show message box
  put "Hello World!" into message box
end OpenStack
TCL:
puts "Hello World "
Perl:
print "Hello, World!\n";
How do Programming Languages Differ? Common Constructs: • basic data types (numbers, etc.); variables; expressions; statements; keywords; control constructs; procedures; comments; errors ... Uncommon Constructs: • type declarations; special types (strings, arrays, matrices, ...); sequential execution; concurrency constructs; packages/modules; objects; general functions; generics; modifiable state; ...
Improved background for choosing appropriate languages • C vs. Modula-3 vs. C++ for systems programming • Fortran vs. APL vs. Ada for numerical computations • Ada vs. Modula-2 for embedded systems • Common Lisp vs. Scheme vs. Haskell for symbolic data manipulation • Java vs. C/CORBA for networked PC programs
Evolution of Programming Languages • ALGOL 60 (ALGOrithmic Language) Goals: communicating algorithms Features: block structure (top-down design), recursion (problem-solving strategy), BNF specification • LISP (LISt Processing) Goals: manipulating symbolic information Features: list primitives, interpreters / environment