240 likes | 365 Views
Simultaneous Transducers for Data-Parallel XML Parsing. Yinfei Pan, Ying Zhang, and Kenneth Chiu. Computer Science SUNY Binghamton. Problem. It is well-known that XML parsing is slow (DOM style) What we want to address? Build faster XML DOM parser How?
E N D
Simultaneous Transducers for Data-Parallel XML Parsing Yinfei Pan, Ying Zhang, and Kenneth Chiu Computer Science SUNY Binghamton
Problem • It is well-known that XML parsing is slow (DOM style) • What we want to address? • Build faster XML DOM parser • How? • Using multiple cores to do parallel processing on single XML document
Parallel parse Single XML document • We do it by: • Decompose the XML document into chunks • Parse each chunk in parallel, use one core per chunk • Merge the results of each chunk into final result • However, we met problem as: • Each chunk cannot be unambiguously parsed independently, since the true state of the parser when start a chunk is unknown until preceding chunks are parsed. • /foo • Element Content: <a>bar /foo</a> • Element end tag: …</bar></foo>…
Previous work • We attempted to solve this problem in previous work by: • A very fast preparse scan (sequential) • Build an outline of the document (skeleton) • Skeleton are used to guide full parse by first decomposing XML document into well-formed fragments on well-defined unambiguous positions • The XML fragments are parsed separately on each core by Libxml2 APIs • Merge the results into final DOM with Libxml2 APIs
Problem of Previous Work • The preparse stage is sequential, which limits the overall performance scale only to 4 cores • Solution in this work: Parallelize the preparse stage • Preparse stage is simple and deterministic, and can easily be model into a Finite State Transducer (FST) • We call it FST since it will accept the XML document as input and output the skeleton. • We then transform this preparse finite state transducer (FST) to another finite state transducer, this transformed FST can start work at arbitrary start position of the XML document, so we then can just divide the XML document into equal sized chunks. This transformed FST is called the Simultaneous Finite Transducers (SFT)
> C 3 5 > 2 C 4 / 6 ! ( END ) " C 1 C 2 0 1 < " C 1 > ( START ) > ' ( END ) C 5 3 7 4 / ' C 1 The preparsing FST • The FST for preparsing, it has 8 states and 2 actions.
START END 0 1 3 3 0 0 0 1 2 2 0 Example of running preparsing DFA <foo>sample</foo>
Simultaneous Finite Transducers (SFT) • The SFT Consider all possibilities, and generate the preparsing results for all possible initial states as a chunk is preparsed. • Its start state is simply the set of all states of the FST • Potentially, each state of the SFT is an element from the power set of the states of the original FST. • For the actual execution, the SFT transitions from a set of states to another set of states
a a,b 0 1 2 a b a1 a2 {0 1 2} a b {1 2} {2} a1(0) a2(1) Execution of SFT Input: ab FST: Apply a1 and move result set Apply a2 and move result set [ {R0}, {R1}, {R2} ] [ { }, {R0, R1}, {R2} ] [ {}, {}, {R0, R1, R2} ] SFT:
< a {0 1 2 3} a0(0) a1(1) > / a2(2) a4(3) a3(1) a a a / > > {1} {3} {0} {0 2 3} a3(1) a4(3) a2(2) a4(3) < a0(0) < a0(0) a a > {2} a1(1) a2(2) Transform FST to SFT
Performance Evaluation • Machine 1: • Sun E6500 with 30 400 MHz US-II processors • Operating System: Solaris 10 • Compiler: g++ 4.0 with the option -O3 • Machine 2: • 8-core Linux machine with two Intel Xeon L5320 CPUs at 1.86 GHz. • Linux kernel version: 2.6.18 • Compiler: g++ 4.0 with the option -O3
Test documents • File 1: 1kzk.xml of size 34MBytes from Protein Data Bank • File 2: gen1.xml of size 29 MBytes generated from XMark project • File 3: gen2.xml of size 19 Mbytes generated from XML Benchmark project
Thank you Questions?
Select the true context and form the final correct result • When all chunks are finished, we can chain all the preparse results together to create the complete, correct skeleton. • The correct state at the end of the first chunk is unambiguous, since the first chunk starts in a known state. • This determines the state at the beginning of the second chunk, and is used to select the correct preparse result of the second chunk. • This in turn determines the correct state at the beginning of the third chunk • so on and so forth…