Simultaneous Transducers for Data-Parallel XML Parsing

Yinfei Pan, Ying Zhang, and Kenneth Chiu • Computer Science, SUNY Binghamton

Problem • It is well known that XML parsing (DOM style) is slow • What do we want to address? • Build a faster XML DOM parser • How? • Use multiple cores to process a single XML document in parallel

Parallel parsing of a single XML document • We do it by: • Decomposing the XML document into chunks • Parsing each chunk in parallel, one core per chunk • Merging the results of each chunk into the final result • However, we ran into a problem: • A chunk cannot be parsed unambiguously on its own, since the true state of the parser at the start of a chunk is unknown until the preceding chunks are parsed. • Example: a chunk that begins with /foo could be • Element content: <a>bar /foo</a> • Part of an element end tag: …</bar></foo>…
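A minimal Python sketch of the decomposition step and of the ambiguity it creates. It assumes plain equal-sized character chunks; the function name split_into_chunks and the two sample documents are hypothetical illustrations, not taken from the slides.

```python
def split_into_chunks(doc, n):
    """Split doc into n contiguous chunks of near-equal size."""
    bounds = [len(doc) * i // n for i in range(n + 1)]
    return [doc[bounds[i]:bounds[i + 1]] for i in range(n)]


content_doc = "<a>bar /foo</a>"          # here "/foo" is element content
end_tag_doc = "<bar><foo>x</foo></bar>"  # here "/foo" is part of an end tag

# A chunk boundary may fall anywhere; place it right before "/foo" in both.
tail_of_content = content_doc[content_doc.index("/foo"):]
tail_of_end_tag = end_tag_doc[end_tag_doc.index("/foo"):]

# Both chunks begin with the same characters, so a parser that sees only
# the chunk cannot tell which context it started in.
same_prefix = tail_of_content[:4] == tail_of_end_tag[:4] == "/foo"
```

Joining the chunks returned by split_into_chunks gives back the original document, so no input is lost at the boundaries; only the parser state at each boundary is unknown.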

Previous work • We attempted to solve this problem in previous work with: • A very fast, sequential preparse scan • The scan builds an outline of the document (the skeleton) • The skeleton guides the full parse by first decomposing the XML document into well-formed fragments at well-defined, unambiguous positions • The XML fragments are parsed separately, one per core, using the Libxml2 APIs • The results are merged into the final DOM with the Libxml2 APIs

Skeleton

Problem of Previous Work • The preparse stage is sequential, which limits overall performance scaling to only 4 cores • Solution in this work: parallelize the preparse stage • The preparse stage is simple and deterministic, and can easily be modeled as a Finite State Transducer (FST) • We call it an FST because it accepts the XML document as input and outputs the skeleton. • We then transform this preparse FST into another finite state transducer that can start at an arbitrary position in the XML document, so we can simply divide the document into equal-sized chunks. This transformed FST is called the Simultaneous Finite Transducer (SFT)

The preparsing FST • The FST for preparsing has 8 states (0 to 7) and 2 actions (START and END). • [Figure: state diagram of the preparsing FST; transitions are labeled with <, >, /, !, ", ' and the labels C1 to C5]

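Example of running the preparsing DFA • Input: <foo>sample</foo> • [Figure: state trace over the input, shown on the slide as 0 1 3 3 0 0 0 1 2 2 0, with the START and END actions marked]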
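The preparse step can be sketched in Python as a small transducer. This is a simplified 4-state version, not the 8-state FST from the slides: it assumes no attributes, quotes, comments, or empty-element tags. The state names and the functions step and preparse are hypothetical. It emits START when a start tag closes and END when an end tag closes, each with its character offset.

```python
# Simplified states (an assumption; the slide's FST has 8 states).
CONTENT, OPEN, END_TAG, START_TAG = 0, 1, 2, 3


def step(state, ch):
    """Return (next_state, action) for one input character."""
    if state == CONTENT:
        return (OPEN, None) if ch == "<" else (CONTENT, None)
    if state == OPEN:
        return (END_TAG, None) if ch == "/" else (START_TAG, None)
    if state == END_TAG:
        return (CONTENT, "END") if ch == ">" else (END_TAG, None)
    # state == START_TAG
    return (CONTENT, "START") if ch == ">" else (START_TAG, None)


def preparse(doc, state=CONTENT):
    """Run the transducer over doc; return (final_state, skeleton)."""
    skeleton = []
    for pos, ch in enumerate(doc):
        state, action = step(state, ch)
        if action:
            skeleton.append((action, pos))
    return state, skeleton


result = preparse("<foo>sample</foo>")
# result == (0, [('START', 4), ('END', 16)])
```

The state numbering follows the trace on the slide (0 for content, 1 after "<", 3 inside a start tag, 2 inside an end tag), but that mapping is my reading of the figure.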

Simultaneous Finite Transducers (SFT) • The SFT considers all possibilities, and generates the preparsing results for every possible initial state as a chunk is preparsed. • Its start state is simply the set of all states of the FST • Potentially, each state of the SFT is an element of the power set of the states of the original FST. • In actual execution, the SFT transitions from one set of states to another set of states

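Execution of SFT • Input: ab • FST: states 0, 1, 2; transitions on a and b, with actions a1 and a2 • SFT: {0 1 2} moves to {1 2} on a, applying a1(0); {1 2} moves to {2} on b, applying a2(1) • Result sets at the start: [ {R0}, {R1}, {R2} ] • Apply a1 and move result set: [ { }, {R0, R1}, {R2} ] • Apply a2 and move result set: [ {}, {}, {R0, R1, R2} ] • [Figure: state diagrams of the FST and the SFT]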
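The execution above can be sketched in Python by running the FST from every possible initial state at once. The transition table is my reconstruction of the slide's 3-state diagram and should be treated as an assumption, as should the names FST, STATES, and run_simultaneously. A real SFT would use precomputed set-to-set transitions; this sketch steps each candidate run directly.

```python
# (state, symbol) -> (next_state, action or None).
# Assumed reading of the slide's toy FST, not a confirmed table.
FST = {
    (0, "a"): (1, "a1"),
    (1, "a"): (1, None),
    (1, "b"): (2, "a2"),
    (2, "a"): (2, None),
    (2, "b"): (2, None),
}
STATES = (0, 1, 2)


def run_simultaneously(text):
    """Return {initial_state: (current_state, actions)} after reading text."""
    runs = {s: (s, []) for s in STATES}
    for ch in text:
        next_runs = {}
        for init, (cur, out) in runs.items():
            if (cur, ch) not in FST:
                continue  # this initial state cannot be the true one
            new_state, action = FST[(cur, ch)]
            next_runs[init] = (new_state, out + [action] if action else out)
        runs = next_runs
    return runs


after_a = run_simultaneously("a")
after_ab = run_simultaneously("ab")
# after_ab == {0: (2, ['a1', 'a2']), 1: (2, ['a2']), 2: (2, [])}
```

The set of current states shrinks from {0 1 2} to {1 2} to {2}, matching the SFT path on the slide, while each initial state keeps its own result list (R0, R1, R2).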

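Transform FST to SFT • [Figure: a 4-state FST (states 0 to 3) over the symbols <, a, >, / with actions a0 to a4, and the SFT derived from it. The SFT states are {0 1 2 3}, {1}, {2}, {3}, {0}, and {0 2 3}. Each SFT transition is labeled with the actions to apply and the FST state each one applies to: a0(0), a1(1), a2(2), a3(1), a4(3)]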
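One way to sketch this transformation in Python is a subset construction that starts from the set of all FST states. The transition table below is my reconstruction of the slide's 4-state FST and is an assumption; the function name build_sft is hypothetical. Each SFT transition records the target set and the actions to apply, tagged with the FST state they come from.

```python
from collections import deque

# (state, symbol) -> (next_state, action or None).
# Assumed reading of the slide's diagram, not a confirmed table.
FST = {
    (0, "<"): (1, "a0"), (0, "a"): (0, None),
    (1, "a"): (2, "a1"), (1, "/"): (3, "a3"),
    (2, "a"): (2, None), (2, ">"): (0, "a2"),
    (3, "a"): (3, None), (3, ">"): (0, "a4"),
}


def build_sft(fst, states, alphabet):
    """Return (start_set, {(state_set, symbol): (target_set, actions)})."""
    start = frozenset(states)
    transitions = {}
    queue, seen = deque([start]), {start}
    while queue:
        current = queue.popleft()
        for sym in alphabet:
            moves = [(s, fst[(s, sym)]) for s in sorted(current)
                     if (s, sym) in fst]
            if not moves:
                continue  # no FST state in this set accepts sym
            target = frozenset(nxt for _s, (nxt, _a) in moves)
            actions = [(act, s) for s, (_n, act) in moves if act]
            transitions[(current, sym)] = (target, actions)
            if target not in seen:
                seen.add(target)
                queue.append(target)
    return start, transitions


start, sft = build_sft(FST, range(4), "<a>/")
sft_states = {start} | {target for target, _ in sft.values()}
```

Under the assumed table, the reachable SFT states are the six sets named on the slide, and the transition from {0 1 2 3} on > carries the actions a2(2) and a4(3).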

Performance Evaluation • Machine 1: • Sun E6500 with 30 UltraSPARC-II (US-II) processors at 400 MHz • Operating system: Solaris 10 • Compiler: g++ 4.0 with the option -O3 • Machine 2: • 8-core Linux machine with two Intel Xeon L5320 CPUs at 1.86 GHz • Linux kernel version: 2.6.18 • Compiler: g++ 4.0 with the option -O3

Test documents • File 1: 1kzk.xml, 34 MB, from the Protein Data Bank • File 2: gen1.xml, 29 MB, generated from the XMark project • File 3: gen2.xml, 19 MB, generated from the XML Benchmark project

1kzk.xml

gen1.xml

gen2.xml

Preparse Speedup on Solaris

Preparse Speedup on Linux

Full parsing speedups on Machine 1

Full parsing speedups on Linux

Thank you Questions?

Select the true context and form the final correct result • When all chunks are finished, we can chain all the preparse results together to create the complete, correct skeleton. • The correct state at the end of the first chunk is unambiguous, since the first chunk starts in a known state. • This determines the state at the beginning of the second chunk, and is used to select the correct preparse result of the second chunk. • This in turn determines the correct state at the beginning of the third chunk • And so on for the remaining chunks.
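The chaining step can be sketched in Python as follows. The sketch assumes a simplified 4-state preparse transducer (no attributes, quotes, or comments) rather than the 8-state FST from the slides, and the names step, preparse_chunk, and chain are hypothetical. Each chunk is preparsed once per possible initial state; chain then walks the chunks in order and keeps only the result that matches the true state.

```python
CONTENT, OPEN, END_TAG, START_TAG = 0, 1, 2, 3  # simplified states (assumed)


def step(state, ch):
    """Return (next_state, action) for one input character."""
    if state == CONTENT:
        return (OPEN, None) if ch == "<" else (CONTENT, None)
    if state == OPEN:
        return (END_TAG, None) if ch == "/" else (START_TAG, None)
    if state == END_TAG:
        return (CONTENT, "END") if ch == ">" else (END_TAG, None)
    return (CONTENT, "START") if ch == ">" else (START_TAG, None)


def preparse_chunk(chunk, offset):
    """Return {initial_state: (final_state, skeleton)} for one chunk."""
    results = {}
    for init in (CONTENT, OPEN, END_TAG, START_TAG):
        state, skeleton = init, []
        for i, ch in enumerate(chunk):
            state, action = step(state, ch)
            if action:
                skeleton.append((action, offset + i))
        results[init] = (state, skeleton)
    return results


def chain(chunk_results, start_state=CONTENT):
    """Select the correct result per chunk and concatenate the skeletons."""
    state, skeleton = start_state, []
    for results in chunk_results:
        state, part = results[state]
        skeleton.extend(part)
    return state, skeleton


doc = "<foo>sample</foo>"
chunks = [doc[0:6], doc[6:12], doc[12:]]  # "<foo>s", "ample<", "/foo>"
per_chunk = [preparse_chunk(c, off) for c, off in zip(chunks, (0, 6, 12))]
final_state, skeleton = chain(per_chunk)
# skeleton == [('START', 4), ('END', 16)]
```

The calls to preparse_chunk are independent of one another, so they could run on separate cores; only chain is sequential, and it does a constant amount of work per chunk.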

Parallel efficiency (Machine 1)

Parallel efficiency (Machine 2)
