
Parallel Computation of Skyline Queries Verification

COSC6490A, Fall 2007. Slawomir Kmiec.



  1. Parallel Computation of Skyline Queries Verification COSC6490A Fall 2007 Slawomir Kmiec

  2. Presentation Outline • Skyline Concepts • The Parallel Algorithm • JPF Experience • JPF Issues • Abstraction • Results • Future Work • Summary • Questions

  3. Skyline Concepts In a set S of points (or records), identify the points that no other point beats on a given set of d attributes (here, smaller values are better). Let xi(p) be the ith attribute of point p. Point pa is said to dominate point pb if for all i such that 1 ≤ i ≤ d we have xi(pa) ≤ xi(pb), and at least one of those inequalities is strict. A point p in S is a skyline point if it is not dominated by any other point in S. The skyline of S is denoted sky(S).
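The dominance test and the skyline definition translate almost directly into code. The following is a minimal sketch, assuming points are stored as double arrays and smaller values are better; the class name SkylineSketch and the naive quadratic loop are illustrative assumptions, not code from the presentation.

```java
import java.util.ArrayList;
import java.util.List;

/** Illustrative sketch only; SkylineSketch is a hypothetical name. */
final class SkylineSketch {

    /** True if pa dominates pb: pa <= pb in every attribute and pa < pb in at least one. */
    static boolean dominates(double[] pa, double[] pb) {
        boolean strictlyBetter = false;
        for (int i = 0; i < pa.length; i++) {
            if (pa[i] > pb[i]) {
                return false;          // pa is worse in attribute i
            }
            if (pa[i] < pb[i]) {
                strictlyBetter = true; // at least one strict inequality
            }
        }
        return strictlyBetter;
    }

    /** Naive O(n^2) skyline: keep every point that no other point dominates. */
    static List<double[]> skyline(List<double[]> points) {
        List<double[]> result = new ArrayList<>();
        for (double[] candidate : points) {
            boolean dominated = false;
            for (double[] other : points) {
                if (dominates(other, candidate)) {
                    dominated = true;
                    break;
                }
            }
            if (!dominated) {
                result.add(candidate);
            }
        }
        return result;
    }
}
```

For the points (1,4), (2,2), (3,3) and (4,1), only (3,3) is dominated (by (2,2)), so the skyline holds the other three.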

  4. The Parallel Algorithm (A) • Principles: → data is divided equally and distributed → a local skyline is computed at each peer → the size of each local skyline is shared with peers → if the combined results fit on every processor then → local skylines are exchanged with peers → processor pi picks the ith chunk of the combined skyline and eliminates the points in it that the combined skyline dominates → local results are sent to the central process → end // of processing
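The chunk step of case (A) can be sketched sequentially as follows. This assumes the combined skyline is simply the concatenation of all local skylines and that chunks are contiguous and of near-equal size; ChunkMergeSketch, filterChunk and the chunk-boundary arithmetic are hypothetical choices for illustration, not the author's implementation.

```java
import java.util.ArrayList;
import java.util.List;

/** Illustrative sketch only; names and chunking scheme are assumptions. */
final class ChunkMergeSketch {

    /** True if a dominates b (smaller is better in every attribute). */
    static boolean dominates(double[] a, double[] b) {
        boolean strictlyBetter = false;
        for (int i = 0; i < a.length; i++) {
            if (a[i] > b[i]) {
                return false;
            }
            if (a[i] < b[i]) {
                strictlyBetter = true;
            }
        }
        return strictlyBetter;
    }

    /**
     * Work of processor i out of p: take the i-th chunk of the combined
     * local skylines and drop every point that the combined set dominates.
     */
    static List<double[]> filterChunk(List<double[]> combined, int i, int p) {
        int n = combined.size();
        int from = (int) ((long) i * n / p);       // first index of chunk i
        int to = (int) ((long) (i + 1) * n / p);   // one past the last index
        List<double[]> kept = new ArrayList<>();
        for (double[] candidate : combined.subList(from, to)) {
            boolean dominated = false;
            for (double[] other : combined) {
                if (dominates(other, candidate)) {
                    dominated = true;
                    break;
                }
            }
            if (!dominated) {
                kept.add(candidate);
            }
        }
        return kept; // these local results would go to the central process
    }
}
```

The union of the chunks returned for i = 0 .. p-1 is the global skyline, because every point of the combined set is checked exactly once against the whole set.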

  5. The Parallel Algorithm (A cont.)

  6. The Parallel Algorithm (B) • Principles (continued) → else // combined results do not fit on some pi → loop until the required number of results is available or all pi have finished do → each processor pi picks a random set of points (in proportion to the size of its local skyline) → this set is submitted to all peers, which mark the points that they dominate and return the marked points to the sender → each processor pi collects back the points submitted to peers, removes the marked ones from the original set and sends the remaining ones to the central processor → end loop → end // of processing
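One round of the loop in case (B) might look like the sketch below. It assumes the sample size k is computed by the caller (the slide only says it is proportional to the local skyline) and that each peer returns its marks as a boolean array; SampleRoundSketch and its method names are hypothetical.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Random;

/** Illustrative sketch only; names and the mark representation are assumptions. */
final class SampleRoundSketch {

    /** True if a dominates b (smaller is better in every attribute). */
    static boolean dominates(double[] a, double[] b) {
        boolean strictlyBetter = false;
        for (int i = 0; i < a.length; i++) {
            if (a[i] > b[i]) {
                return false;
            }
            if (a[i] < b[i]) {
                strictlyBetter = true;
            }
        }
        return strictlyBetter;
    }

    /** Sender side: pick up to k random points from the local skyline. */
    static List<double[]> pickSample(List<double[]> localSkyline, int k, Random rnd) {
        List<double[]> shuffled = new ArrayList<>(localSkyline);
        Collections.shuffle(shuffled, rnd);
        return new ArrayList<>(shuffled.subList(0, Math.min(k, shuffled.size())));
    }

    /** Peer side: marked[j] is true if the peer's local skyline dominates sample.get(j). */
    static boolean[] mark(List<double[]> sample, List<double[]> peerSkyline) {
        boolean[] marked = new boolean[sample.size()];
        for (int j = 0; j < sample.size(); j++) {
            for (double[] q : peerSkyline) {
                if (dominates(q, sample.get(j))) {
                    marked[j] = true;
                    break;
                }
            }
        }
        return marked;
    }

    /** Sender side: drop every point marked by any peer; the rest go to the central processor. */
    static List<double[]> survivors(List<double[]> sample, List<boolean[]> marksFromPeers) {
        List<double[]> kept = new ArrayList<>();
        for (int j = 0; j < sample.size(); j++) {
            boolean markedByAnyPeer = false;
            for (boolean[] marks : marksFromPeers) {
                if (marks[j]) {
                    markedByAnyPeer = true;
                    break;
                }
            }
            if (!markedByAnyPeer) {
                kept.add(sample.get(j));
            }
        }
        return kept;
    }
}
```

A point that survives the marks of all peers is not dominated by any local skyline, so it belongs to the global skyline and can be sent on immediately.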

  7. The Parallel Algorithm (B cont.)

  8. JPF Experience • getting JPF • getting JPF to run • the Eclipse way • the Linux way • incremental examples • configuration options • JPF value-added services

  9. JPF Issues • independent processors - restricted to threads • eliminate native code classes - no Swing, Sockets, NIO, Regex (Eclipse) - out of 15 just java.util.ArrayList left - eliminate Socket-oriented developed classes • search-state-space reduction - input: 10 points - 2 worker threads - operation abstraction - output discarded

  10. Abstraction • 2 types of developed classes left - SkylineMain and SkylineWorker: workflow classes - “Handler” classes: request handling classes • original classes and the Java type each was built on: SkylineMain (Thread), SkylineMainListener (ServerSocket), SkylineMainHandler (Socket), SkylineWorker (Thread), SkylineWorkerListener (ServerSocket), SkylineWorkerHandler (Socket)

  11. Abstraction (cont.) • high volume of work: - due to a lot of original code • removed all GUI: - remove Swing and AWT elements • asynchronous Socket messaging done as: - keep references to workers instead of addresses - eliminate the “Listener” classes - each message done as an instance of the handler - create a handler for the destination worker - execute the synchronous (blocking) part of data sending - start the handler to execute asynchronous processing - each type of message split into a synchronous and an asynchronous part • file IO done as: - store parameters as static constants - store input data as an array - replace input scanning with referencing the array - display or discard output • String.split() method (Regex) done as: - re-done as a String manipulation method
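The replacement of Socket messaging by handler threads can be pictured with the sketch below. It assumes a message is a plain String and that the destination keeps an inbox list; WorkerSketch and HandlerSketch are hypothetical stand-ins, not the SkylineWorker and handler classes of the original application.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical destination: holds an inbox instead of listening on a ServerSocket. */
final class WorkerSketch {
    private final List<String> inbox = new ArrayList<>();

    synchronized void accept(String message) {
        inbox.add(message);
    }

    synchronized int inboxSize() {
        return inbox.size();
    }
}

/** Hypothetical handler: one instance per message, created for the destination worker. */
final class HandlerSketch extends Thread {
    private final WorkerSketch destination; // direct reference instead of a network address
    private String payload;

    HandlerSketch(WorkerSketch destination) {
        this.destination = destination;
    }

    /** Synchronous (blocking) part: runs in the sender's thread, stands in for the Socket write. */
    void send(String message) {
        this.payload = message; // written before start(), so run() sees it
        start();                // hand over to the asynchronous part
    }

    /** Asynchronous part: runs in the handler's own thread, stands in for request handling. */
    @Override
    public void run() {
        destination.accept(payload);
    }
}
```

A sender would write `new HandlerSketch(worker).send("local skyline size: 3")` and continue; the worker sees the message once the handler thread has run.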

  12. Results • issues reported - different issues at different settings - large volume of output to be analyzed • uncaught-exception conditions - issues regarding unsynchronized access - these surfaced as IllegalMonitorStateException • deadlock conditions - issues regarding termination conditions • PreciseRaceDetector - “Unprotected Variable Access” severe warnings • possibly more - it ran for a long time with no other errors - it did not finish in the time given
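The slides do not show the code that raised IllegalMonitorStateException. As a generic illustration of how unsynchronized access produces it, Java throws this exception when wait(), notify() or notifyAll() is called on an object whose monitor the calling thread does not hold. MonitorSketch and its methods are hypothetical.

```java
/** Illustrative sketch only; not code from the verified application. */
final class MonitorSketch {
    private final Object lock = new Object();
    private boolean done;

    /** Broken: notifyAll() without holding the monitor of lock throws IllegalMonitorStateException. */
    void signalBroken() {
        done = true;
        lock.notifyAll();
    }

    /** Fixed: update the flag and notify while holding the monitor. */
    void signalFixed() {
        synchronized (lock) {
            done = true;
            lock.notifyAll();
        }
    }

    /** Reads the flag under the same monitor. */
    boolean isDone() {
        synchronized (lock) {
            return done;
        }
    }
}
```

Because the exception is unchecked, it stays invisible until the faulty path actually executes, which is the kind of condition a model checker can reach by exploring thread interleavings.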

  13. Future Work • atomize code - wrap code fragments into atomic operations • protect shared variable access - use locks or synchronized blocks - re-run PreciseRaceDetector • run it for an extended period of time - to search the complete state space • analyze the applicability of issues found - confirm that they apply to the original application - and are not a result of the abstraction or transformation • reduce shared data interaction - handlers create private data structures that the corresponding main process can accept quickly - this will allow greater robustness and redundancy
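Protecting shared variable access with a lock could look like the following sketch. It assumes the shared state is a list of result points written by several threads; SharedResultsSketch and demo are hypothetical names, and java.util.concurrent.locks.ReentrantLock is used here only as one of the two options the slide mentions.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.locks.ReentrantLock;

/** Illustrative sketch only; not code from the verified application. */
final class SharedResultsSketch {
    private final ReentrantLock lock = new ReentrantLock();
    private final List<double[]> results = new ArrayList<>(); // guarded by lock

    void add(double[] point) {
        lock.lock();
        try {
            results.add(point);
        } finally {
            lock.unlock(); // always release, even if add() throws
        }
    }

    int size() {
        lock.lock();
        try {
            return results.size();
        } finally {
            lock.unlock();
        }
    }

    /** Two threads each add perThread points; returns the final size. */
    static int demo(int perThread) throws InterruptedException {
        SharedResultsSketch shared = new SharedResultsSketch();
        Runnable work = () -> {
            for (int k = 0; k < perThread; k++) {
                shared.add(new double[]{k, k});
            }
        };
        Thread a = new Thread(work);
        Thread b = new Thread(work);
        a.start();
        b.start();
        a.join();
        b.join();
        return shared.size();
    }
}
```

With the lock in place every add is counted; without it, concurrent calls to ArrayList.add can lose updates or corrupt the list.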

  14. Summary • JPF is a flexible and complex tool • JPF is memory- and time-intensive • JPF is a valuable verification tool • the application had to be changed extensively to work with JPF • potential issues were found by JPF • verification = value-added service - extra testing - code refinement (robustness)

  15. Questions ???
