This One Time, at PL Camp. Summer School on Language-Based Techniques for Integrating with the External World University of Oregon Eugene, Oregon July 2007. Checking Type Safety of Foreign Function Calls. Jeff Foster University of Maryland
This One Time, at PL Camp ... Summer School on Language-Based Techniques for Integrating with the External World University of Oregon Eugene, Oregon July 2007
Checking Type Safety of Foreign Function Calls Jeff Foster University of Maryland • Ensure type safety across languages • OCaml/JNI – C • Multi-lingual type inference system • Representational types • SAFFIRE • Multi-lingual type inference system
Dangers of FFIs • In most FFIs, programmers write “glue code” • Translates data between host and foreign languages • Typically written in one of the languages • Unfortunately, FFIs are often easy to misuse • Little or no checking done at language boundary • Mistakes can silently corrupt memory • One solution: interface generators
Example: “Pattern Matching”

  type t = A of int | B | C of int * int | D

  if (Is_long(x)) {
    if (Int_val(x) == 0) /* B */ ...
    if (Int_val(x) == 1) /* D */ ...
  } else {
    if (Tag_val(x) == 0) /* A */ Field(x, 0) = Val_int(0);
    if (Tag_val(x) == 1) /* C */ Field(x, 1) = Val_int(0);
  }
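The C code in this example depends on how OCaml represents values at run time: constant constructors are unboxed integers with the low bit set, and constructors with arguments are heap blocks carrying a tag. The following is a minimal Python model of that encoding, not real FFI code. The names `Block`, `make_A`, `make_C` and `constructor_name` are hypothetical, and heap pointers are modelled as Python objects rather than machine words.

```python
# Model of OCaml's value encoding for: type t = A of int | B | C of int * int | D

def val_int(n):
    """Encode an integer the way Val_int does: shift left, set the low bit."""
    return (n << 1) + 1

def int_val(v):
    """Decode an unboxed integer the way Int_val does."""
    return v >> 1

class Block:
    """Stand-in for a heap block (an assumption: real blocks are raw memory)."""
    def __init__(self, tag, *fields):
        self.tag = tag
        self.fields = list(fields)

def is_long(v):
    """Unboxed integers have the low bit set; blocks are modelled as objects."""
    return isinstance(v, int) and (v & 1) == 1

# Constant constructors are numbered separately from non-constant ones.
B, D = val_int(0), val_int(1)

def make_A(n):
    return Block(0, val_int(n))

def make_C(m, n):
    return Block(1, val_int(m), val_int(n))

def constructor_name(x):
    """Mirror the C-side 'pattern match' on the slide."""
    if is_long(x):
        return {0: "B", 1: "D"}[int_val(x)]
    return {0: "A", 1: "C"}[x.tag]
```

Because constant and non-constant constructors are numbered independently, `B` and `A` both carry the number 0. Only the `Is_long` check tells them apart, which is why skipping it in glue code can silently misread data.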
Garbage Collection • C FFI functions need to play nice with the GC • Pointers from C to the OCaml heap must be registered

  value bar(value list) {
    CAMLparam1(list);
    CAMLlocal1(temp);
    temp = alloc_tuple(2);
    CAMLreturn(Val_unit);
  }

• Easy to forget • Difficult to find this error with testing
Multi-Lingual Types • Representational Types • Embed OCaml types in C types and vice versa
SAFFIRE • Static Analysis of Foreign Function InteRfacEs
Programming Models for Distributed Computing Yannis Smaragdakis University of Oregon • NRMI: Natural programming model for distributed computing. • J-Orchestra: Execute unsuspecting programs over a network, using program rewriting. • Morphing: High-level language facility for safe program transformation.
NRMI • Identify all reachable [Diagram: tree t with aliases alias1 and alias2 on the client side; copy on the server side; network in between]
NRMI • Execute remote procedure [Diagram: the server-side copy is modified and gains new nodes]
NRMI • Send back all reachable [Diagram: the modified graph returns to the client side]
NRMI • Match reachable maps [Diagram: returned objects are paired with the originals]
NRMI • Update original objects [Diagram: original nodes take on the new values]
NRMI • Adjust links out of original objects
NRMI • Adjust links out of new objects
NRMI • Garbage collect [Diagram: final client-side graph for t, alias1 and alias2]
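The NRMI steps listed on these slides can be sketched in a single process. This is a simplified Python model under stated assumptions, not the NRMI implementation: the "network" is an in-memory copy, and `Node`, `reachable`, `call_by_copy_restore` and `remote_proc` are hypothetical names.

```python
class Node:
    """A mutable tree node; stands in for any linked object graph."""
    def __init__(self, data, left=None, right=None):
        self.data, self.left, self.right = data, left, right

def reachable(roots):
    """All nodes reachable from roots, each listed once."""
    seen, order, stack = set(), [], list(roots)
    while stack:
        node = stack.pop()
        if node is None or id(node) in seen:
            continue
        seen.add(id(node))
        order.append(node)
        stack += [node.left, node.right]
    return order

def call_by_copy_restore(proc, root):
    originals = reachable([root])                        # identify all reachable
    copy_of = {id(o): Node(o.data) for o in originals}   # "send" copies to the server

    def to_copy(n):
        return None if n is None else copy_of[id(n)]

    for o in originals:
        c = copy_of[id(o)]
        c.left, c.right = to_copy(o.left), to_copy(o.right)
    copies = [copy_of[id(o)] for o in originals]

    proc(copy_of[id(root)])                              # execute remote procedure

    # Send back everything reachable from the copies and match copy -> original.
    original_of = {id(c): o for c, o in zip(copies, originals)}
    new_nodes = [n for n in reachable(copies) if id(n) not in original_of]

    def restore(n):
        return None if n is None else original_of.get(id(n), n)

    for c, o in zip(copies, originals):                  # update originals and their links
        o.data, o.left, o.right = c.data, restore(c.left), restore(c.right)
    for n in new_nodes:                                  # adjust links out of new objects
        n.left, n.right = restore(n.left), restore(n.right)
    # The temporary copies are now unreferenced; Python's collector reclaims them.

# Client-side graph: t plus two aliases into it.
n1, n3 = Node(1), Node(3)
alias1 = Node(9, left=n1)
alias2 = Node(7, left=n3)
t = Node(4, alias1, alias2)

def remote_proc(tree):
    tree.left.data = 0                        # change seen through alias1
    detached = tree.right
    detached.data = 9                         # modified, then unlinked from the tree
    tree.right = Node(2, left=detached.left)  # new object pointing at an old one

call_by_copy_restore(remote_proc, t)
```

After the call, `alias2` sees the new value even though its node is no longer part of the tree. Restoring every object that was reachable before the call, not just those still reachable afterward, is what makes the remote call behave like a local one.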
J-Orchestra • Automatic partitioning system • Works as a bytecode compiler • Lots of indirection using proxies, interfaces, local and remote objects • Partitioned program equivalent to original
Morphing • Ensure program generators are safe • Statically check the generator to determine the safety of any generated program, under all inputs • Ensure that generated programs compile • Early approach – SafeGen • Using theorem provers • MJ • Using types
Fault Tolerant Computing David August and David Walker Princeton University • Processors are becoming more susceptible to intermittent faults. • Moore’s Law, radiation • Alter computation or state, resulting in incorrect program execution. • Goal: Build reliable systems from unreliable components.
Topics • Transient faults and mechanisms designed to protect against them (HW). • The role languages and compilers may play in creating radiation-hardened programs. • New opportunities made possible by languages which embrace potentially incorrect behavior.
Software/Compiler • Duplicate instructions and check at important locations (store) [SWIFT, EDDI]
λzap • λ calculus with fault tolerance • Intermediate language for compilers • Models single fault • Based on replication • Semantics model type of faults

  let x1 = 2 in
  let x2 = 2 in
  let x3 = 2 in
  let y1 = x1 + x1 in
  let y2 = x2 + x2 in
  let y3 = x3 + x3 in
  out [y1,y2,y3]

  let x = 2 in
  let y = x + x in
  out y
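The replication idea behind the triplicated program can be illustrated with a small simulation. This is only a sketch of replicate-and-vote under a single-fault assumption; it does not reproduce λzap's semantics or type system. The names `vote` and `replicated_double` and the bit-flip fault model are assumptions made for illustration.

```python
from collections import Counter

def vote(values):
    """Return the majority value among three replicas, or fail if none exists."""
    value, count = Counter(values).most_common(1)[0]
    if count < 2:
        raise RuntimeError("no majority: more than one replica is corrupted")
    return value

def replicated_double(x, fault_in=None):
    """Compute x + x three times; optionally corrupt one replica of x."""
    xs = [x, x, x]
    if fault_in is not None:
        xs[fault_in] ^= 1          # flip the lowest bit: one transient fault
    ys = [xi + xi for xi in xs]    # y1, y2, y3 are computed independently
    return vote(ys)                # plays the role of: out [y1, y2, y3]
```

With no fault, `replicated_double(2)` returns 4. With a fault in any single replica the two healthy copies still outvote the corrupted one, which is why modelling a single fault is enough to justify triplication.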
Typing Ad Hoc Data Kathleen Fisher AT&T Labs • PADS project* • Data Description Language (DDL) • Data Description Calculus (DDC) • Automatic inference of PADS descriptions *http://padsproj.org
PADS • Declarative description of data source: • Physical format information • Semantic constraints

  Pstruct webRecord {
    Pip ip; " - - [";
    Pdate(':') date; ":";
    Ptime(']') time; "]";
    httpMeth meth; " ";
    Puint8 code; " ";
    Puint8 size; " ";
  };
  Parray webLog { webRecord[] };

  type responseCode = { x : Int | 99 < x < 600}
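To show what such a description buys, here is a hand-written Python analogue of the `webRecord` layout together with the `responseCode` constraint. It is a sketch rather than PADS output: the regular expression, the `WebRecord` class, the `parse_record` function and the sample line are all assumptions chosen to match the field order on the slide.

```python
import re
from dataclasses import dataclass

# Physical format: ip " - - [" date ":" time "]" meth " " code " " size " "
RECORD = re.compile(
    r"(?P<ip>\d{1,3}(?:\.\d{1,3}){3}) - - \["
    r"(?P<date>[^:]+):(?P<time>[^\]]+)\]"
    r"(?P<meth>[A-Z]+) (?P<code>\d+) (?P<size>\d+) "
)

@dataclass
class WebRecord:
    ip: str
    date: str
    time: str
    meth: str
    code: int
    size: int

def parse_record(line):
    """Return (record, error); error is None when the line is fully valid."""
    m = RECORD.fullmatch(line)
    if m is None:
        return None, "syntax error"
    rec = WebRecord(m["ip"], m["date"], m["time"], m["meth"],
                    int(m["code"]), int(m["size"]))
    if not 99 < rec.code < 600:            # semantic constraint on the code
        return rec, "semantic error: response code out of range"
    return rec, None

good, good_err = parse_record("207.136.97.49 - - [15/Oct/1997:18:46:51]GET 200 30 ")
bad, bad_err = parse_record("207.136.97.49 - - [15/Oct/1997:18:46:51]GET 700 30 ")
junk, junk_err = parse_record("not a log line")
```

The parser reports syntactic and semantic errors separately instead of raising, so a caller can keep well-formed records whose values are merely implausible.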
Learning • Problem: Producing useful tools for ad hoc data takes a lot of time. • Solution: A learning system to generate data descriptions and tools automatically. [Diagram labels: Raw Data (Email, ASCII log files, Binary Traces); Data Description (struct { ... }); End-user tools (Visual Information, CSV, XML, Standard formats & schema)]
Format Inference Engine [Diagram components: Input File(s), Chunked Data, Tokenization, Structure Discovery, Format Refinement, Scoring Function, IR to PADS Printer, PADS Description]
Multi-Staged Programming Walid Taha Rice University • Writing generic programs that do not pay a runtime overhead. • Use program generators • Ensure generated programs are syntactically well-formed and well-typed • MetaOCaml
The Abstract View [Diagram: a batch program P with inputs I1 and I2; staged programs P1 and P2 with inputs I1 and I2]
MetaOCaml • Brackets (.< >.) • delay execution of an expression • Escape (.~ ) • Combine smaller delayed values to construct larger ones • Run (.! ) • Compile and execute the dynamically generated code
Power Example

  let rec power (n, x) =
    match n with
      0 -> 1
    | n -> x * (power (n-1, x));;

  let power2 (x) = power (2, x);;
  let power2 = fun x -> power (2, x);;
  let power2 (x) = 1*x*x;;

  let rec power (n, x) =
    match n with
      0 -> .<1>.
    | n -> .<.~x * .~(power (n-1, x))>.;;
  let power2 = .! .<fun x -> .~(power (2, .<x>.))>.;;
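The staged power function can be imitated in Python to make the two stages visible. This is only an analogy: source strings stand in for brackets, string interpolation for escape, and `eval` for run. Unlike MetaOCaml, nothing here guarantees that the generated code is well-formed or well-typed. The names `power_code`, `source` and `power2` are illustrative.

```python
def power_code(n, x):
    """Stage one: build source text for x**n, fully unrolled.

    x is itself a code fragment (a string), like .<x>. in MetaOCaml.
    """
    if n == 0:
        return "1"
    return f"({x} * {power_code(n - 1, x)})"

# Splice the generated body into a function, then compile and load it.
source = f"lambda x: {power_code(2, 'x')}"
power2 = eval(source)
```

The recursion on `n` runs once, at generation time. The resulting `power2` contains only multiplications, so it pays no overhead for the generic definition.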
Scalable Defect Detection Manuvir Das, Daniel Wang, Zhe Yang, Microsoft Research • Program analysis at Microsoft scale • Scalability, accuracy • Combination of a weak global analysis and a slow local one (for some regions of code) • Programmers are required to add interface annotations • Some automatic inference is available
Web and Database Application Security Zhendong Su University of California-Davis • Static analyses for enforcing correctness of dynamically generated database queries. • Runtime checking mechanisms for detecting SQL injection attacks. • Static analyses for detecting SQL injection and cross-site scripting vulnerabilities.
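The problem these analyses target can be reproduced in a few lines. The sketch below uses Python's standard `sqlite3` module to contrast a dynamically generated query with a parameterized one. It illustrates the vulnerability only and is not one of the analyses from the talk; the table, data and function names are made up.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (name TEXT, secret TEXT)")
conn.executemany("INSERT INTO users VALUES (?, ?)",
                 [("alice", "a1"), ("bob", "b2")])

def lookup_unsafe(name):
    """Builds the query by string concatenation: input can change its structure."""
    query = f"SELECT secret FROM users WHERE name = '{name}'"
    return conn.execute(query).fetchall()

def lookup_safe(name):
    """Binds the input as a parameter: it can only ever be a string value."""
    return conn.execute("SELECT secret FROM users WHERE name = ?",
                        (name,)).fetchall()

attack = "x' OR '1'='1"
leaked = lookup_unsafe(attack)   # the WHERE clause becomes always true
blocked = lookup_safe(attack)    # no user has this literal name
```

In the unsafe version the attacker's input changes the syntactic structure of the query, not just a value inside it. That structural change is the kind of property a checker can look for.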
XML and Web Application Programming Anders Møller University of Aarhus • Formal models of XML schemas • Expressiveness of DTD, XML Schema, Relax NG • Type checking XML transformation languages • “Assuming that x is valid according to S_in, is T(x) valid according to S_out?” • Web application frameworks • Java Servlets and JSP, JWIG, GWT
Types for Safe C-Level Programming Dan Grossman University of Washington • Cyclone, a safe dialect of C • Designed to prevent safety violations (buffer overflow, memory management, …) • Mostly underlying theory • Types, expressions, memory regions
Analyzing and Debugging Software • Understanding Multilingual Software [Foster] • Parlez vous OCaml? • Statistical Debugging [Liblit] • you are my beta tester, and there’s lots of you • Scalable Defect Detection [Das, Wang, Yang] • Microsoft programs have no bugs
Programming Models • Types for Safe C-Level Programming [Grossman] • C without the ick factor • Staged Programming [Taha] • Programs that produce programs that produce programs... • Prog. Models for Dist. Comp. [Smaragdakis] • We’ve secretly replaced your centralized program with a distributed application. Can you tell the difference?
The Web • Web and Database Application Security [Su] • How not to be pwn3d by 1337 haxxors • XML and Web Application Programming [Møller] • X is worth 8 points in scrabble...let’s use it a lot
Other Really Important Stuff • Fault Tolerant Computing [August, Walker] • Help, I’ve been hit by a cosmic ray! • Typing Ad Hoc Data [Fisher] • Data, data, everywhere, but what does it mean?
Statistical Debugging Ben Liblit University of Wisconsin-Madison
What’s This All About? • Statistical Debugging & Cooperative Bug Isolation • Observe deployed software in the hands of real end users • Build statistical models of success & failure • Guide programmers to the root causes of bugs • Make software suck less
Motivation “There are no significant bugs in our released software that any significant number of users want fixed.” Bill Gates, quoted in FOCUS Magazine
Software Releases in the Real World [Disclaimer: this may be a caricature.]
Software Releases in the Real World • Coders & testers in tight feedback loop • Detailed monitoring, high repeatability • Testing approximates reality • Testers & management declare “Ship it!” • Perfection is not an option • Developers don’t decide when to ship
Software Releases in the Real World • Everyone goes on vacation • Congratulate yourselves on a job well done! • What could possibly go wrong? • Upon return, hide from tech support • Much can go wrong, and you know it • Users define reality, and it’s not pretty • Where “not pretty” means “badly approximated by testing”
Testing as Approximation of Reality • Microsoft’s Watson error reporting system • Crash reports from 500,000 separate programs • x% of software causes 50% of bugs • Care to guess what x is? • 1% of software errors cause 50% of user crashes • Small mismatch ➙ big problems (sometimes) • Big mismatch ➙ small problem? (sometimes!) • Perfection is not an economically viable option
Real Engineers Measure Things; Are Software Engineers Real Engineers?
Instrumentation Framework “The major difference between a thing that might go wrong and a thing that cannot possibly go wrong is that when a thing that cannot possibly go wrong goes wrong, it usually turns out to be impossible to get at or repair.” Douglas Adams, Mostly Harmless
Bug Isolation Architecture [Diagram: Program Source, Predicates and Sampler feed the Compiler, which produces the Shipping Application; its Counts & ☺/☹ outcomes feed Statistical Debugging, which reports Top bugs with likely causes]
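The last stage of this architecture, turning predicate counts and success/failure labels into a ranked list of likely causes, can be sketched as follows. The scoring rule here is a deliberately simple assumption (the fraction of runs that failed among runs where a predicate was observed true) and is not claimed to be the statistical model used in the talk. The report format and predicate names are likewise hypothetical.

```python
from collections import Counter

def rank_predicates(reports):
    """Rank predicates by how strongly 'observed true' correlates with failure.

    reports: iterable of (predicates_observed_true, run_failed) pairs,
    one pair per run of the deployed application.
    """
    true_in_failed, true_in_ok = Counter(), Counter()
    for true_preds, failed in reports:
        for p in true_preds:
            (true_in_failed if failed else true_in_ok)[p] += 1

    scores = {
        p: true_in_failed[p] / (true_in_failed[p] + true_in_ok[p])
        for p in set(true_in_failed) | set(true_in_ok)
    }
    # Highest failure rate first; break ties by name for a stable order.
    return sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))

reports = [
    ({"ptr == NULL", "n > 0"}, True),
    ({"n > 0"}, False),
    ({"ptr == NULL"}, True),
    ({"n > 0", "len < 10"}, False),
]
ranking = rank_predicates(reports)
```

In this toy data `ptr == NULL` is true only in failing runs, so it ranks first. A deployed system would also have to account for sampling, since each run observes only a fraction of the predicates.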