A Comprehensive Model for Arbitrary Result Extraction

Neal Sample, Gio Wiederhold (Stanford University); Dorothea Beringer (Hewlett-Packard)

Presentation Transcript


  1. A Comprehensive Model for Arbitrary Result Extraction. Neal Sample, Gio Wiederhold (Stanford University); Dorothea Beringer (Hewlett-Packard)

  2. Shift in Programming Tasks [Figure: Coding vs. Integration/Composition; time axis 1970, 1990, 2010]

  3. Sample Composition Tasks • Logistics • Reservation and distribution systems, “find the best transportation route from A to B” • Genomics • Framework for composing various processing tools and repositories • Modeling • Weather prediction, complex chemical systems, basin modeling • Composition of processes (vs. components, data)

  4. CLAM Composition Language • Purely compositional • no primitives for arithmetic • no primitives for I/O, etc. • Splitting up CALL-statement • parallelism by asynchrony in sequential program • novel possibilities for optimizations • reduction in complexity of invocation statements • Higher-level language • assembly → HLLs • HLLs → compositional paradigm • Intent: Enable domain experts

  5. CLAM Primitives
     Pre-invocation:
       SETUP: set up the connection to a service
       SET-, GETPARAM: set or get parameters in a service
       ESTIMATE: for optimization
     Invocation and result gathering:
       INVOKE: begin execution
       EXAMINE: test progress of an invoked method
       EXTRACT: extract results from an invoked method
     Termination:
       TERMINATE: terminate a method invocation/connection to a service
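The order in which a client would use these primitives can be sketched in Python. This is only an illustration under stated assumptions: `MockService`, its method names, the `steps` parameter, and the `tick` helper are all hypothetical and are not CLAM syntax or any real middleware API.

```python
class MockService:
    """Hypothetical in-process stand-in for a remote service (not CLAM itself)."""

    def __init__(self):
        self.connected = False
        self.params = {}
        self.total = 0
        self.done = 0

    def setup(self):                   # SETUP: open the connection
        self.connected = True

    def set_param(self, name, value):  # SETPARAM
        self.params[name] = value

    def get_param(self, name):         # GETPARAM
        return self.params[name]

    def estimate(self):                # ESTIMATE: cost hint for a scheduler
        return self.params.get("steps", 1)

    def invoke(self):                  # INVOKE: start work, return immediately
        self.total = self.params.get("steps", 1)
        self.done = 0

    def tick(self):
        """Mock only: simulate one unit of remote progress."""
        self.done = min(self.done + 1, self.total)

    def examine(self):                 # EXAMINE: (status, percent done)
        if self.total == 0 or self.done == 0:
            return "NOT_DONE", 0
        if self.done < self.total:
            return "PARTIAL", 100 * self.done // self.total
        return "DONE", 100

    def extract(self):                 # EXTRACT: whatever is available so far
        return list(range(self.done))

    def terminate(self):               # TERMINATE: close the connection
        self.connected = False


def run_lifecycle():
    svc = MockService()
    svc.setup()
    svc.set_param("steps", 4)
    svc.invoke()
    while svc.examine()[0] != "DONE":
        svc.tick()                     # a real client would wait or do other work
    result = svc.extract()
    svc.terminate()
    return result


print(run_lifecycle())  # [0, 1, 2, 3]
```

Splitting the call into separate invoke, examine, and extract steps is what lets a sequential-looking program overlap several invocations.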

  6. Data Dependencies & Scheduling [Figure: dependency graph with nodes START, service1, service2, service3, service4, service5, END]
     // begin program
     A = service1();
     B = service2();
     C = service3(A,B);
     D = service4(C);
     E = service5(C);
     // end of program
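One way to see how a schedule falls out of the data dependencies alone is the following Python sketch. It assumes Python 3.9+ for `graphlib`; the lambda bodies are placeholder computations invented for the example, and the wave-by-wave execution is an illustrative strategy, not the CLAM runtime's scheduler.

```python
from concurrent.futures import ThreadPoolExecutor
from graphlib import TopologicalSorter

# name -> (callable, names of the results it consumes); bodies are placeholders
SERVICES = {
    "A": (lambda: 1, []),
    "B": (lambda: 2, []),
    "C": (lambda a, b: a + b, ["A", "B"]),
    "D": (lambda c: c * 10, ["C"]),
    "E": (lambda c: c - 1, ["C"]),
}


def run(services):
    """Run every service as soon as its inputs exist, one wave at a time."""
    sorter = TopologicalSorter(
        {name: set(deps) for name, (_, deps) in services.items()}
    )
    sorter.prepare()
    results, waves = {}, []
    with ThreadPoolExecutor() as pool:
        while sorter.is_active():
            ready = sorted(sorter.get_ready())   # all inputs are available
            waves.append(ready)
            futures = {
                name: pool.submit(
                    services[name][0],
                    *[results[dep] for dep in services[name][1]],
                )
                for name in ready
            }
            for name, future in futures.items():
                results[name] = future.result()
                sorter.done(name)
    return results, waves


results, waves = run(SERVICES)
print(waves)    # [['A', 'B'], ['C'], ['D', 'E']]
print(results)  # {'A': 1, 'B': 2, 'C': 3, 'D': 30, 'E': 2}
```

The program text never mentions threads or ordering; the waves are derived entirely from which results each call consumes.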

  7. Runtime: data extraction is hard • Data extraction with native modules worked • No language-level specifications in CLAM • E.g., Polling, threading, exception handling… • Multiple middleware for transport → difficult mapping • CORBA-RMI, RMI-COM, COM-CPAM, etc. • Crisis of legacy services • To generalize or restrict? • Refine the strategy…

  8. Strategy: hide it & depend on it • Have to respect service capabilities • Or suffer the LCD… (more in a bit) • Simple and flexible programming • Data extraction is a runtime issue; it is not central to the composition task • Simplified Integration • Legacy ambivalence • Simple bridging for middleware • Increase audience for services • Better scheduling • Declarative language, data dependencies

  9. Where are we? • Declarative language for composition • Data is used for synchronization • No primitives to support synchronization • Apparent “mismatch” in data extraction methods & capabilities among various actors • What does the data look like? • How can data be extracted?

  10. Data View: Services [Figure: a service's RESULTS, comprising Result A, Result B, Result C]

  11. Extraction Techniques • Asynchrony • Explicitly controlled: spin-locks, polling, interrupt handling, etc. • Can use with any DAG schedule • Partial extraction • web browsing - HTML text as a schema • SQL cursors (thanks to the reviewer) • Progressive extraction (exceptional) • Adaptive mesh refinements, JPEG interleaving
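As a concrete instance of the SQL-cursor style of partial extraction mentioned above, here is a small sketch using Python's standard `sqlite3` module. The table name, its contents, and the `first_rows` helper are made up for the example.

```python
import sqlite3


def first_rows(n):
    """Fetch only the first n rows of a larger result set through a cursor."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE results (id INTEGER, value INTEGER)")
    conn.executemany(
        "INSERT INTO results VALUES (?, ?)",
        [(i, i * i) for i in range(1000)],
    )
    cursor = conn.execute("SELECT id, value FROM results ORDER BY id")
    rows = cursor.fetchmany(n)   # take what is needed, abandon the rest
    conn.close()
    return rows


print(first_rows(3))  # [(0, 0), (1, 1), (2, 4)]
```

The client decides how much of the result to pull; the remaining rows are never transferred.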

  12. Current Focus
      Pre-invocation:
        SETUP: set up the connection to a service
        SET-, GETPARAM: set or get parameters in a service
        ESTIMATE: for optimization
      Invocation and result gathering:
        INVOKE: begin execution
        EXAMINE: test progress of an invoked method
        EXTRACT: extract results from an invoked method
      Termination:
        TERMINATE: terminate a method invocation/connection to a service

  14. EXAMINE Primitive in CLAM • Returns “status” and “progress” • Status: 2 bits of state • status = {DONE, NOT_DONE, PARTIAL, ERROR} • Progress: open descriptor • Indicates progress in an application-specific way • Could be variance, mean, amplitude, etc. • Default assumption: integer 0-100 = % done • Resolution of EXAMINE • Can apply per service (black box) • Can apply per result (white box) • Not complete for many legacy systems: only “status”, no “progress”

  15. EXAMINE [Figure: a service with results A, B, C, shown in three states]
      Partially complete: Service.EXAMINE() → {PARTIAL, 40}; Service.EXAMINE(A) → {DONE, 100}; Service.EXAMINE(B) → {NOT_DONE, 0}; Service.EXAMINE(C) → {PARTIAL, 20}
      Complete: Service.EXAMINE() → {DONE, 100}; Service.EXAMINE(A) → {DONE, 100}; Service.EXAMINE(B) → {DONE, 100}; Service.EXAMINE(C) → {DONE, 100}
      Not started: Service.EXAMINE() → {NOT_DONE, 0}; Service.EXAMINE(A) → {NOT_DONE, 0}; Service.EXAMINE(B) → {NOT_DONE, 0}; Service.EXAMINE(C) → {NOT_DONE, 0}
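The per-service and per-result views of EXAMINE can be related with a short Python sketch. Several things here are assumptions rather than anything the slides specify: the bit values given to each status, the `Exam` tuple, and the rule that the service-level progress is the integer mean of the per-result values (a rule chosen because it reproduces the numbers shown above).

```python
from enum import IntEnum
from typing import Dict, NamedTuple, Optional


class Status(IntEnum):
    # Four states fit in 2 bits; these particular encodings are an assumption.
    NOT_DONE = 0b00
    PARTIAL = 0b01
    DONE = 0b10
    ERROR = 0b11


class Exam(NamedTuple):
    status: Status
    progress: int  # default reading: integer 0-100 = % done


def examine(results: Dict[str, Exam], name: Optional[str] = None) -> Exam:
    """Per-result (white box) view if name is given, else per-service (black box).

    Assumes the service has at least one result.
    """
    if name is not None:
        return results[name]
    exams = list(results.values())
    statuses = {e.status for e in exams}
    if Status.ERROR in statuses:
        status = Status.ERROR
    elif statuses == {Status.DONE}:
        status = Status.DONE
    elif statuses == {Status.NOT_DONE}:
        status = Status.NOT_DONE
    else:
        status = Status.PARTIAL
    progress = sum(e.progress for e in exams) // len(exams)
    return Exam(status, progress)


service = {
    "A": Exam(Status.DONE, 100),
    "B": Exam(Status.NOT_DONE, 0),
    "C": Exam(Status.PARTIAL, 20),
}
print(examine(service))       # Exam(status=<Status.PARTIAL: 1>, progress=40)
print(examine(service, "A"))  # Exam(status=<Status.DONE: 2>, progress=100)
```

A legacy service that reports only status could still fit this shape by always returning a fixed progress value, at the cost of losing progressive extraction.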

  16. EXTRACT Primitive • Extracts data from a service • Per service (black box) • (var) = Service.EXTRACT(); • Per result (white box) • (varA = A, varC = C) = Service.EXTRACT(); • Allows partial data extraction • saves volume: abandon uninteresting elements • saves time: termination of useless invocation • Allows progressive data extraction with 2-value EXAMINE (status+progress) • Steering, time saving
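The combination of partial and progressive extraction described above can be sketched as a client that polls EXAMINE, extracts only the results it wants once progress is good enough, and then terminates the invocation. `RefiningService`, its result values, and the threshold policy are hypothetical and exist only for this illustration.

```python
class RefiningService:
    """Hypothetical service whose results A, B, C improve with each step."""

    def __init__(self, total_steps):
        self.total_steps = total_steps
        self.step = 0
        self.terminated = False

    def advance(self):
        """Mock only: simulate one unit of remote work."""
        if not self.terminated and self.step < self.total_steps:
            self.step += 1

    def examine(self):
        progress = 100 * self.step // self.total_steps
        if self.step == 0:
            return "NOT_DONE", progress
        if self.step < self.total_steps:
            return "PARTIAL", progress
        return "DONE", progress

    def extract(self, names=None):
        current = {"A": self.step, "B": self.step * 2, "C": self.step * 3}
        if names is None:                      # per service (black box)
            return current
        return {n: current[n] for n in names}  # per result (white box)

    def terminate(self):
        self.terminated = True


def extract_when_good_enough(service, wanted, threshold):
    """Poll until progress reaches threshold, extract a subset, then stop."""
    while True:
        status, progress = service.examine()
        if status == "DONE" or progress >= threshold:
            break
        service.advance()        # stand-in for waiting on the remote service
    data = service.extract(wanted)
    service.terminate()          # saves time: no further useless computation
    return data, progress


svc = RefiningService(total_steps=10)
print(extract_when_good_enough(svc, ["A", "C"], threshold=50))
# ({'A': 5, 'C': 15}, 50)
```

Skipping result B saves volume, and terminating at 50% saves time, which are the two benefits the slide attributes to partial extraction.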

  17. Examine-Extract Relationship
      EXAMINE \ EXTRACT             | EXTRACT per service                                | EXTRACT per result
      per service, status only      | asynchronous procedure call, Java RMI              | limited partial extraction, (binary) thumbnails
      per service, status+progress  | partitioned progressive extract (full result set)  | ?
      per result, status only       | semantic partial extraction (full result set)      | partial extraction; browsing, SQL cursor (no progressive)
      per result, status+progress   | progressive extraction (full result set)           | progressive and partial extraction; CLAM

  25. Conclusions • Data extraction hiding is bueno! • User is not responsible for data management • Synchronizing extractions not in the language → simplicity • Enables effective service scheduling • Simplified integration • Blueprint for proactive design pattern for future services
