1 / 24

Simultaneous Transducers for Data-Parallel XML Parsing

Simultaneous Transducers for Data-Parallel XML Parsing. Yinfei Pan, Ying Zhang, and Kenneth Chiu. Computer Science SUNY Binghamton. Problem. It is well-known that XML parsing is slow (DOM style) What we want to address? Build faster XML DOM parser How?

eden
Download Presentation

Simultaneous Transducers for Data-Parallel XML Parsing

An Image/Link below is provided (as is) to download presentation Download Policy: Content on the Website is provided to you AS IS for your information and personal use and may not be sold / licensed / shared on other websites without getting consent from its author. Content is provided to you AS IS for your information and personal use only. Download presentation by click this link. While downloading, if for some reason you are not able to download a presentation, the publisher may have deleted the file from their server. During download, if you can't get a presentation, the file might be deleted by the publisher.

E N D

Presentation Transcript


  1. Simultaneous Transducers for Data-Parallel XML Parsing Yinfei Pan, Ying Zhang, and Kenneth Chiu Computer Science SUNY Binghamton

  2. Problem • It is well-known that XML parsing is slow (DOM style) • What we want to address? • Build faster XML DOM parser • How? • Using multiple cores to do parallel processing on single XML document

  3. Parallel parse Single XML document • We do it by: • Decompose the XML document into chunks • Parse each chunk in parallel, use one core per chunk • Merge the results of each chunk into final result • However, we met problem as: • Each chunk cannot be unambiguously parsed independently, since the true state of the parser when start a chunk is unknown until preceding chunks are parsed. • /foo • Element Content: <a>bar /foo</a> • Element end tag: …</bar></foo>…

  4. Previous work • We attempted to solve this problem in previous work by: • A very fast preparse scan (sequential) • Build an outline of the document (skeleton) • Skeleton are used to guide full parse by first decomposing XML document into well-formed fragments on well-defined unambiguous positions • The XML fragments are parsed separately on each core by Libxml2 APIs • Merge the results into final DOM with Libxml2 APIs

  5. Skeleton

  6. Problem of Previous Work • The preparse stage is sequential, which limits the overall performance scale only to 4 cores • Solution in this work: Parallelize the preparse stage • Preparse stage is simple and deterministic, and can easily be model into a Finite State Transducer (FST) • We call it FST since it will accept the XML document as input and output the skeleton. • We then transform this preparse finite state transducer (FST) to another finite state transducer, this transformed FST can start work at arbitrary start position of the XML document, so we then can just divide the XML document into equal sized chunks. This transformed FST is called the Simultaneous Finite Transducers (SFT)

  7. > C 3 5 > 2 C 4 / 6 ! ( END ) " C 1 C 2 0 1 < " C 1 > ( START ) > ' ( END ) C 5 3 7 4 / ' C 1 The preparsing FST • The FST for preparsing, it has 8 states and 2 actions.

  8. START END 0 1 3 3 0 0 0 1 2 2 0 Example of running preparsing DFA <foo>sample</foo>

  9. Simultaneous Finite Transducers (SFT) • The SFT Consider all possibilities, and generate the preparsing results for all possible initial states as a chunk is preparsed. • Its start state is simply the set of all states of the FST • Potentially, each state of the SFT is an element from the power set of the states of the original FST. • For the actual execution, the SFT transitions from a set of states to another set of states

  10. a a,b 0 1 2 a b a1 a2 {0 1 2} a b {1 2} {2} a1(0) a2(1) Execution of SFT Input: ab FST: Apply a1 and move result set Apply a2 and move result set [ {R0}, {R1}, {R2} ] [ { }, {R0, R1}, {R2} ] [ {}, {}, {R0, R1, R2} ] SFT:

  11. < a {0 1 2 3} a0(0) a1(1) > / a2(2) a4(3) a3(1) a a a / > > {1} {3} {0} {0 2 3} a3(1) a4(3) a2(2) a4(3) < a0(0) < a0(0) a a > {2} a1(1) a2(2) Transform FST to SFT

  12. Performance Evaluation • Machine 1: • Sun E6500 with 30 400 MHz US-II processors • Operating System: Solaris 10 • Compiler: g++ 4.0 with the option -O3 • Machine 2: • 8-core Linux machine with two Intel Xeon L5320 CPUs at 1.86 GHz. • Linux kernel version: 2.6.18 • Compiler: g++ 4.0 with the option -O3

  13. Test documents • File 1: 1kzk.xml of size 34MBytes from Protein Data Bank • File 2: gen1.xml of size 29 MBytes generated from XMark project • File 3: gen2.xml of size 19 Mbytes generated from XML Benchmark project

  14. 1kzk.xml

  15. gen1.xml

  16. gen2.xml

  17. Preparse Speedup on Solaris

  18. Preparse Speedup on Linux

  19. Full parsing speedups on Machine 1

  20. Full parsing speedups on Linux

  21. Thank you Questions?

  22. Select the true context and form the final correct result • When all chunks are finished, we can chain all the preparse results together to create the complete, correct skeleton. • The correct state at the end of the first chunk is unambiguous, since the first chunk starts in a known state. • This determines the state at the beginning of the second chunk, and is used to select the correct preparse result of the second chunk. • This in turn determines the correct state at the beginning of the third chunk • so on and so forth…

  23. Parallel efficiency (Machine 1)

  24. Parallel efficiency (Machine 2)

More Related