Programming Languages: Design, Specification, and Implementation. G22.2210-001 Rob Strom December 7, 2006. Programming Languages Core Exam. Syntactic issues: regular expressions, context-free grammars (CFG), BNF. Imperative languages: program organization, control structures, exceptions
Programming Languages: Design, Specification, and Implementation G22.2210-001 Rob Strom December 7, 2006
Programming Languages Core Exam • Syntactic issues: regular expressions, context-free grammars (CFG), BNF. • Imperative languages: program organization, control structures, exceptions • Types in imperative languages: strong typing, type equivalence, unions and discriminated types in C and Ada. • Block structure, visibility and scoping issues, parameter passing. • Systems programming and weak typing: exposing machine characteristics, type coercion, pointers & arrays in C. • Run-time organization of block-structured languages: static scoping, activation records, dynamic and static chains, displays. • Programming in the large: abstract data types, modules, packages and namespaces in Ada, Java, and C++. • Functional programming: list structures, higher order functions, lambda expressions, garbage collection, metainterpreters in Lisp and Scheme. Type inference and ML. • Object-Oriented programming: classes, inheritance, polymorphism, dynamic dispatching. Constructors, destructors and multiple inheritance in C++, interfaces in Java. • Generic programming: parametrized units and classes in C++, Ada and Java. • Concurrent programming: threads and tasks, communication, race conditions and deadlocks, protected methods and types in Ada and Java.
Readings (optional) • Regular expressions: http://www.regular-expressions.info/ • Prolog: Giannesini et al., “Prolog”, Addison-Wesley, 1986. • SQL: C.J. Date, “An Introduction to Database Systems”, Addison-Wesley, 2000, chapters 3, 4. • Hermes: Strom et al., “Hermes: A Language for Distributed Computing”, Prentice-Hall, 1991. • Java RMI: http://java.sun.com/docs/books/tutorial/rmi/overview.html
Guava’s Component Model [Figure: diagram of monitors M1 and M2, values V1 and V2, and objects O1 to O10, with some links marked X]
Java and Guava types • Java • Referenced Objects: classes/methods, access by reference, synchronized or not • Values: built in (primitive types), no classes/methods, no references/sharing, copy by value • Guava • Referenced Monitors: + always synchronized • Referenced Objects: + never synchronized, – restricted references • Values: + may be user-definable, + classes/methods, + move, copy-by-value
Guava changes: summary • – Annotations: synchronized, volatile • + Annotations: [read], update; [local], global; lent, kept, in Region, new [Figure: class hierarchy diagram with nodes Instance, Local, Mobile, Reference, Value, Object, Monitor and methods toString, clone, hashCode, equals, getClass, copy, wait, notify, notifyAll, finalize]
Monitors, Values, Objects • An object has at most one owning monitor/value [Figure: diagram of monitors M1, M2, M3 and Z, with arrays X: BucketList[] and Y: BucketList[] and their bucket chains (A 3, D 9, E; B 3, C 8; G 1)]
Region Analysis: lent, kept • lent = An unknown region (default for parameters of non-objects) • kept = Same region as this (default for parameters of objects) • class M2type extends Monitor { . . . void foo (lent Bucket P1); } • class Ztype extends Object { . . . void bar (lent Bucket P1, kept Bucket P2); } • Call shown: M2.foo(X); • Statements shown: this.N = P1; Z.bar(P1,Y); Z.bar(Y,P1); this.A = P1; this.A = P2; P1.op(); P2.N = this; [Figure: diagram of monitors M1 and M2 with objects X, Y, Z]
Region Analysis: new, in R • new = No region • in R = Same region as other parameters labeled in R • class M2type extends Monitor { . . . new Bucket m (Bucket P1 in R1, Bucket P2 in R1, Bucket P3 in R2, Bucket P4 in R2); } • Call shown: E = M2.m(A,B,C,D); • Statements shown: P1.N = P2; P4.N = P3; P2.N = P4; return new Bucket (3, null); [Figure: diagram of monitors M1 and M2 with objects A, B, C, D, E]
Other paradigms • String processing – lex, yacc, SableCC, regular expressions • Logic programming – Prolog • Transactional Programming – CLU, SQL, XQuery • Distributed – Hermes, Java (and other) RMI tools
Regular expressions • Distinguish between the syntax of the pattern and its semantics. Different engines have slightly different syntax. • A regular expression is a pattern that you apply to a text in order to determine a match, and sometimes to parse the components of the match (e.g. for search/replacement). • A regular language is one that can be recognized by a regular expression.
Patterns • Exact character: a • Any character: . • Zero or more repetitions of a pattern: * • (Patterns can be grouped with parentheses) • Concatenation: (pat1)(pat2) • Alternation: (pat1)|(pat2) • Zero or one: x? (same as (x)|()) • One or more: x+ (same as x(x)*) • Any character in the set, or not in the set: [a-z] [^a-z] • Other special matches that match positions rather than characters: e.g. ^ (start of line), $ (end of line), \b (word boundary) • Modern extensions: {x,y} between x and y occurrences
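The equivalences listed above can be checked mechanically. The following is a minimal sketch using Python's `re` module (one engine among many; the sample strings and variable names are made up for illustration). It compares each shorthand with its expansion on a small sample rather than proving equivalence.

```python
import re

# Each pair should accept exactly the same strings (checked on a small sample).
pairs = [
    (r"ab?", r"a((b)|())"),    # zero or one
    (r"ab+", r"ab(b)*"),       # one or more
    (r"ab{2,3}", r"abb(b)?"),  # between 2 and 3 occurrences
]
samples = ["a", "ab", "abb", "abbb", "abbbb", "b", ""]

agree = all(
    bool(re.fullmatch(short, s)) == bool(re.fullmatch(expanded, s))
    for short, expanded in pairs
    for s in samples
)

# Character sets match characters; \b matches a position (a word boundary).
lower_only = bool(re.fullmatch(r"[a-z]+", "hello"))
no_lower = bool(re.fullmatch(r"[^a-z]+", "HELLO 42"))
word_hits = re.findall(r"\bcat\b", "cat concat cat.")
```

Note that `\bcat\b` skips the `cat` inside `concat`, because there is no word boundary between `n` and `c`.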
Examples • .*\.txt matches a filename ending with .txt • \b[A-Z0-9._%-]+@[A-Z0-9.-]+\.[A-Z]{2,4}\b matches an email address, like robstrom@us.ibm.com (with case-insensitive matching turned on, since the sets list only A-Z) • <[A-Za-z][A-Za-z0-9]*> matches a tag like <Foo>
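As a quick check of these three patterns, here is a small sketch in Python's `re` module. The test strings other than the email address and `<Foo>` are invented for illustration. The email pattern lists only uppercase letters, so the sketch assumes it is meant to be used with the engine's case-insensitive option.

```python
import re

filename = re.compile(r".*\.txt")
# IGNORECASE is needed because the character sets only list A-Z.
email = re.compile(r"\b[A-Z0-9._%-]+@[A-Z0-9.-]+\.[A-Z]{2,4}\b", re.IGNORECASE)
tag = re.compile(r"<[A-Za-z][A-Za-z0-9]*>")

is_txt = bool(filename.fullmatch("notes.txt"))
not_txt = bool(filename.fullmatch("notes.txt.bak"))

found_email = email.search("Contact robstrom@us.ibm.com today").group(0)

# Same pattern without the flag: the lowercase address is not matched.
case_sensitive = re.search(
    r"\b[A-Z0-9._%-]+@[A-Z0-9.-]+\.[A-Z]{2,4}\b", "robstrom@us.ibm.com"
)

found_tag = tag.search("a <Foo> tag").group(0)
```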
More examples • <.+> applied to the string • Here is a test: <test> hello </test> now what? • Matches <test> hello </test> • That’s because + and * are greedy • Suppose you wanted to match only <test> • You should do one of the following: • Select lazy matching: <.+?> • Replace dot with something more restrictive, e.g. <[^<>]+>
Logic Programming – Prolog (from Gelernter et al.) • Facts, relationships that are true: • female(annette). • female(marilyn). • female(audrey). • parents(annette, fred, marilyn). • parents(audrey, fred, marilyn). • parents(marilyn, john, liz). • Rules, inferences that can be made: • sister(X,Y) :- female(X), female(Y), parents(X,M,F), parents(Y,M,F). – two females having the same parents are sisters • Queries: • ?- sister(annette, audrey). – are annette and audrey sisters? • ?- sister(_, audrey). – does audrey have a sister? • ?- sister(X, audrey). – for what X are X and audrey sisters (i.e., who are audrey’s sisters)
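To see what the sister rule actually derives, the facts can be mirrored as plain Python sets and the rule as a comprehension. This is only an emulation of the rule's logic, not how Prolog evaluates it, and the function name `sisters_of` is made up for this sketch.

```python
# Facts from the slide, as sets of tuples.
female = {"annette", "marilyn", "audrey"}
parents = {
    ("annette", "fred", "marilyn"),
    ("audrey", "fred", "marilyn"),
    ("marilyn", "john", "liz"),
}

def sisters_of(y):
    """All X satisfying: female(X), female(Y), parents(X,M,F), parents(Y,M,F)."""
    return sorted(
        x
        for (x, m1, f1) in parents
        for (y2, m2, f2) in parents
        if y2 == y and x in female and y in female and (m1, f1) == (m2, f2)
    )

audrey_sisters = sisters_of("audrey")
marilyn_sisters = sisters_of("marilyn")
```

As written, the rule never requires X and Y to differ, so audrey comes out as her own sister. In Prolog, one common fix is to add a goal such as `X \= Y` to the rule body.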
Examples: • Data types: integers, strings, lists. • Append example: • append([], L, L). – the empty list with L appended yields L • append([X|L1], L2, [X|L3]) :- append(L1,L2,L3). – if L3 is L1 with L2 appended, then X followed by L3 is (X followed by L1) with L2 appended • ?- append([a,b,c],[d,e,f],X). asks what is [a,b,c] with [d,e,f] appended – answer is [a,b,c,d,e,f]. • ?- append([a,b,c], X, [a,b,c,d,e,f]). asks what should I append to [a,b,c] to get [a,b,c,d,e,f]. Notice that this query is just as easy to express.
How does it work? • Our old friend, unification! • Remember, that means finding a substitution of variables that makes two expressions the same. • A goal is satisfied if: • A fact unifies with the goal, or • A rule’s head unifies with the goal and then each subgoal in that rule’s body is satisfied (these subgoals then recursively become goals). • If there are multiple possible unifications, then they must all be tried, backtracking on failure. • To speed up search, and to inhibit multiple paths, Prolog introduces a cut operator. The trouble with “cut” is that it spoils the abstraction of Prolog: the user has to know the order in which goals are evaluated.
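The mechanism above (unify a goal with a fact or a rule head, then solve the body, backtracking on failure) can be sketched in a few dozen lines of Python. This is a toy under stated assumptions, not a Prolog implementation: it knows only the two append clauses, it has no cut and no occurs check, and the names `Var`, `walk`, `unify`, `solve`, `to_term` and `to_list` are invented for this sketch. Lists are encoded as nested `("cons", head, tail)` tuples ending in `"nil"`.

```python
class Var:
    """A logic variable; object identity (not the name) distinguishes variables."""
    def __init__(self, name):
        self.name = name

def walk(t, s):
    """Follow bindings in substitution s until reaching a non-bound term."""
    while isinstance(t, Var) and t in s:
        t = s[t]
    return t

def unify(a, b, s):
    """Return a substitution extending s that makes a and b equal, or None."""
    a, b = walk(a, s), walk(b, s)
    if a is b or (not isinstance(a, (Var, tuple)) and a == b):
        return s
    if isinstance(a, Var):
        return {**s, a: b}
    if isinstance(b, Var):
        return {**s, b: a}
    if isinstance(a, tuple) and isinstance(b, tuple) and len(a) == len(b):
        for x, y in zip(a, b):
            s = unify(x, y, s)
            if s is None:
                return None
        return s
    return None

def append_rules():
    """Yield (head, body) for each append clause, with fresh variables per use."""
    L = Var("L")
    yield ("append", "nil", L, L), []
    X, L1, L2, L3 = Var("X"), Var("L1"), Var("L2"), Var("L3")
    yield ("append", ("cons", X, L1), L2, ("cons", X, L3)), [("append", L1, L2, L3)]

def solve(goals, s):
    """Depth-first search; trying the next clause after a failure is backtracking."""
    if not goals:
        yield s
        return
    first, rest = goals[0], goals[1:]
    for head, body in append_rules():
        s2 = unify(first, head, s)
        if s2 is not None:
            yield from solve(body + rest, s2)

def to_term(items):
    term = "nil"
    for item in reversed(items):
        term = ("cons", item, term)
    return term

def to_list(t, s):
    out, t = [], walk(t, s)
    while isinstance(t, tuple):
        out.append(walk(t[1], s))
        t = walk(t[2], s)
    return out

X, Y = Var("X"), Var("Y")
abc, de_f = to_term(["a", "b", "c"]), to_term(["d", "e", "f"])
whole = to_term(["a", "b", "c", "d", "e", "f"])

forward = [to_list(X, s) for s in solve([("append", abc, de_f, X)], {})]
backward = [to_list(X, s) for s in solve([("append", abc, X, whole)], {})]
splits = [(to_list(X, s), to_list(Y, s))
          for s in solve([("append", X, Y, to_term(["a", "b"]))], {})]
```

The same two clauses answer all three queries. The last one enumerates every way of splitting `[a, b]`, which is where trying all possible unifications becomes visible.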
Transactional Programming with SQL • Actually, transactions and SQL are logically independent. • A transaction executes a set of operations (reads and/or updates) atomically. • Remember that atomically means that if T1 includes [a, b, c] and T2 includes [d, e, f], the only possible results are [a,b,c,d,e,f] and [d,e,f,a,b,c], with no interleavings. • SQL, on the other hand, means using a relational model of data, whether transactional or not.
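The two allowed outcomes can be illustrated with two threads that each append their operations to a shared log. This sketch gets atomicity the crudest way possible, by holding a single global lock for the whole transaction; that choice and the names `log`, `commit_lock` and `run_transaction` are assumptions for illustration, not how a database implements transactions.

```python
import threading

log = []                       # shared state updated by both transactions
commit_lock = threading.Lock()

def run_transaction(ops):
    # Holding one lock for the entire transaction rules out interleavings.
    with commit_lock:
        for op in ops:
            log.append(op)

t1 = threading.Thread(target=run_transaction, args=(["a", "b", "c"],))
t2 = threading.Thread(target=run_transaction, args=(["d", "e", "f"],))
t1.start()
t2.start()
t1.join()
t2.join()

serial_orders = (["a", "b", "c", "d", "e", "f"], ["d", "e", "f", "a", "b", "c"])
is_serial = log in serial_orders
```

Which of the two orders appears depends on thread scheduling; what the lock guarantees is only that the result is one of them.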
What’s a relational database? • A collection of tables, each representing a relation. • Each table’s schema defines • The name and type of each column, e.g. DeptName (String), Budget (Float), Manager (String) • Integrity constraints, e.g. each DeptName has exactly one Budget and one Manager. • Viewed as a collection of rows, each row having one entry in each column • Each row stands for a fact, e.g. “The Marketing Department has a budget of $20M and is managed by Slick”.
What is the SQL query language? • A declarative way of extracting derived facts, e.g. • SELECT Employee, Salary FROM Departments JOIN Employees USING (DeptName) WHERE Manager = 'Slick' (Tell me the names and salaries of all employees who work for Slick.) • It hides: • How data is organized (hashtables, indexes, trees, etc.) • How the database is navigated to answer the query • SQL has a query subset which is purely declarative, plus imperative operations that can be used within transactions. • Operations: • Select (also called “restrict”) – filter rows from a table • Project – select certain columns from a table • Join – combines facts from multiple tables based upon some common column(s) • Others, e.g. top-K
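The query about Slick's employees can be tried against an in-memory SQLite database. In this sketch the column names follow the slides, while the Employees table layout and all rows other than the Marketing/Slick/$20M fact are invented for illustration. The query combines all three operations: a join on DeptName, a restriction on Manager, and a projection onto Employee and Salary.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- PRIMARY KEY enforces: each DeptName has exactly one Budget and Manager.
    CREATE TABLE Departments (DeptName TEXT PRIMARY KEY, Budget REAL, Manager TEXT);
    CREATE TABLE Employees (Employee TEXT, Salary REAL, DeptName TEXT);
    INSERT INTO Departments VALUES ('Marketing', 20e6, 'Slick'),
                                   ('Research', 5e6, 'Curie');
    INSERT INTO Employees VALUES ('Ann', 90000, 'Marketing'),
                                 ('Bob', 80000, 'Research'),
                                 ('Cy', 70000, 'Marketing');
""")

rows = conn.execute("""
    SELECT Employee, Salary                            -- project
    FROM Departments JOIN Employees USING (DeptName)   -- join
    WHERE Manager = 'Slick'                            -- select / restrict
    ORDER BY Employee
""").fetchall()
conn.close()
```

Nothing in the query says how the join is carried out or whether an index is used; that is the part the language hides.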
Distributed Languages • These are languages dealing with multiple relatively independent systems that interact • Sometimes by message passing • Sometimes by remote procedure calls or remote method invocations • What they have in common is local-remote transparency • This means that the syntax and semantics of an interaction between two modules are the same regardless of whether the communicating modules are located on the same machine/process or on different machines/processes.
Hermes • Modular unit is the process • A process behaves like an instance of an Ada task type, but • Processes never leave the module • They communicate by • First, establishing connections between their output ports and another process’s input ports • Then they either send messages on their output ports – they get queued up at the input port and eventually received • Or else, they send call messages on their output ports – these behave like Ada rendezvous calls or Java synchronized method calls; they queue up and get accepted one at a time, but the caller blocks until the call message is returned. • There are no references. Only processes, values, and ports. Everything is passed by value. Inout parameters of calls are passed by value/result. • There are no global variables. The only way a process can talk to something outside itself is via a port. It starts out getting any ports its parent passed it in the constructor, and it exports back to its parent a connection to itself. • Ports are first-class. For example, I can call a service (e.g. a file factory) and receive back a port that lets me send things to the file. • The implementation of Hermes hides from the user whether the process at the other end of an output port is actually running in the same machine or in a different machine. Only the performance is different. • As in all remote procedure call systems, remote ports are implemented via proxy objects. In Hermes these proxies are managed 100% by the runtime.
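The difference between a send and a call on a port can be sketched with threads and queues. This is a loose Python analogy, not Hermes: the class `InputPort`, its methods, and the `doubler` process are hypothetical names, a single class stands in for a connected output/input port pair, and Python does not enforce pass-by-value (the sketch only passes integers, where the distinction does not show).

```python
import queue
import threading

class InputPort:
    """Hypothetical stand-in for a port: a queue of pending messages."""
    def __init__(self):
        self._queue = queue.Queue()

    def send(self, message):
        # Asynchronous send: enqueue and return immediately.
        self._queue.put((message, None))

    def call(self, message):
        # Call: enqueue with a private reply queue, then block until returned.
        reply = queue.Queue(maxsize=1)
        self._queue.put((message, reply))
        return reply.get()

    def receive(self):
        return self._queue.get()

def doubler(port):
    """A process: no shared variables, it only talks through its port."""
    while True:
        message, reply = port.receive()   # accepted one at a time
        if message is None:               # sentinel used here to shut down
            break
        if reply is not None:
            reply.put(message * 2)        # return the call message

port = InputPort()
worker = threading.Thread(target=doubler, args=(port,))
worker.start()

port.send(1)             # queued; the sender does not wait
result = port.call(21)   # caller blocks until the process replies
port.send(None)
worker.join()
```

Because the caller only ever touches the port, the process behind it could in principle be moved elsewhere without changing the calling code, which is the transparency property the slide describes.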
Java RMI • Almost a transparent system, but not quite! • The most important differences: • You have to declare interfaces remote, and the operations have to throw RemoteException • Classes being passed must implement Serializable • References not declared remote are passed by deep copying; those declared remote are passed by reference; those declared static or transient are not passed at all! • This is a problem with any language that has pointers: if an object refers to another and I pass an object, do I mean the object and everything it points to (which might be the world), or just those parts of it that represent its value? • Parameters passed by value cannot be returned! • You need to compile stubs • Servers and clients are asymmetric, and servers need to run security managers. • Example remote interface: package compute; import java.rmi.Remote; import java.rmi.RemoteException; public interface Compute extends Remote { <T> T executeTask(Task<T> t) throws RemoteException; }
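The consequence of passing by deep copy can be shown without any networking. The sketch below is an analogy in Python using `copy.deepcopy`, not Java RMI itself; the `Node` class and the variable names are made up. It shows the two effects mentioned above: passing the head of a structure drags along everything reachable from it, and changes made to the copy never reach the caller's object.

```python
import copy

class Node:
    def __init__(self, value, next_node=None):
        self.value = value
        self.next = next_node

# The caller's object graph: 1 -> 2 -> 3
original = Node(1, Node(2, Node(3)))

# Stand-in for marshalling a non-remote argument: the whole reachable
# graph is copied, not just the head node.
received = copy.deepcopy(original)

# The receiving side mutates its copy.
received.next.value = 99

caller_still_sees = original.next.value
copied_tail = received.next.next.value
```

With a remote reference instead of a copy, the mutation would be visible to the caller, which is why the two kinds of parameters behave so differently.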