A 1 Cycle-Per-Byte XML Accelerator
Zefu Dai, Nick Ni and Jianwen Zhu
Presented by Zefu Dai
University of Toronto
What is XML
• Extensible Markup Language
• A platform-independent tool for data exchange and representation
• Widely used in:
  • Web services
  • Database systems
  • Scientific applications
  • …
Performance Threat: XML Parsing
• 70 minutes to load a 3 GB XML file, 26x slower than loading plain text
• More than 1 s per bank transaction: how many transactions per day?
• On average, 175 K instructions to parse 1 KB of XML data (IBM XML4C)
• With network speeds reaching tens of Gbps, XML parsing has overtaken the network as the performance bottleneck
Previous Work
• Cycles Per Byte (CPB) = average number of cycles needed to process each byte of XML data
• Multi-core acceleration
  • Requires a pre-parsing pass, done sequentially
  • 30 CPB on a 4-core processor
• SIMD acceleration
  • Without in-memory tree construction and validation
  • 6-15 CPB
• Hardware accelerators
  • Most commercial products do not reveal performance metrics or design details
  • 10-40 CPB
Our Design
• Causes of the parsing slowdown
  • Text-based data stream
  • Variable-length string comparison
  • Poor memory performance due to streaming and memory back-tracing
• An XML parsing accelerator implemented in FPGA
  • Fixed-length string operations
  • Optimized circuits for string comparison
  • Stallable pipeline optimized for the common case
  • Data structures for high-bandwidth on-chip memory
• Achieves 1 CPB processing speed and saturates a 1 Gbps Ethernet link, running at 125 MHz
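The link between CPB, clock frequency and throughput is simple arithmetic: at 125 MHz and 1 CPB the design consumes 125 MB/s, which is 1 Gbps. The short sketch below makes that conversion explicit. The function name and the 3 GHz software clock are illustrative assumptions, not figures from the slides.

```python
def throughput_gbps(clock_hz, cycles_per_byte):
    """Convert a clock rate and a cycles-per-byte figure into Gbps."""
    bytes_per_second = clock_hz / cycles_per_byte
    return bytes_per_second * 8 / 1e9  # 8 bits per byte, 1e9 bits per Gbit


# Figures from the slide: 125 MHz clock, 1 cycle per byte.
accelerator = throughput_gbps(125e6, 1)

# Hypothetical comparison: a 3 GHz core (assumed clock) at the 30 CPB
# quoted for multi-core software parsing.
software = throughput_gbps(3e9, 30)

print(f"accelerator: {accelerator:.2f} Gbps, software: {software:.2f} Gbps")
```

Under these assumptions a 24x lower clock still yields higher throughput, because the cycle count per byte drops by 30x.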
Outline
• Background
• High-level architecture
• Design details
• Evaluation
Tasks of an XML Parser
• Well-formedness checking
  • Check whether the document conforms to XML syntax rules
• Schema validation
  • Check whether the document conforms to the semantic rules specified in DTD or Schema files
• DOM construction
  • Capture the parent-child relationships between elements and attributes and store them in memory in Document Object Model (DOM) format
Well-formedness Checking Example
• Has a unique root element
• Elements must be closed and nested properly
• Attributes must be unique within an element
• …
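The three rules above can be tried against a software parser. The following is a minimal sketch using Python's standard `xml.etree.ElementTree` module (an expat-based software parser, unrelated to the accelerator); the helper name `is_well_formed` is made up for illustration.

```python
import xml.etree.ElementTree as ET


def is_well_formed(text):
    """Return True if the parser accepts `text` as well-formed XML."""
    try:
        ET.fromstring(text)
        return True
    except ET.ParseError:
        return False


samples = {
    "valid document": "<a><b/></a>",
    "two root elements": "<a/><b/>",
    "improper nesting": "<a><b></a></b>",
    "duplicate attribute": '<a x="1" x="2"/>',
}

for label, text in samples.items():
    print(f"{label}: {is_well_formed(text)}")
```

Only the first sample is accepted; each of the others violates one of the listed rules.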
XML Schema Example
• Specifies permitted child elements/attributes
• Specifies type of content
• Specifies occurrence limits
• …
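To make the three kinds of constraint concrete, here is a toy validator. It does not use DTD or XML Schema syntax: the `RULES` table, its layout, and the `order`/`item` element names are hypothetical, chosen only to show one rule of each kind.

```python
import xml.etree.ElementTree as ET

# Hypothetical rule table, one entry per element name:
#   children:   permitted child name -> (min occurrences, max occurrences)
#   attributes: permitted attribute names
#   content:    converter used to check the text content, or None
RULES = {
    "order": {"children": {"item": (1, 3)}, "attributes": {"id"}, "content": None},
    "item": {"children": {}, "attributes": set(), "content": int},
}


def validate(elem, rules=RULES):
    """Return a list of rule violations found in `elem` and its subtree."""
    rule = rules.get(elem.tag)
    if rule is None:
        return [f"no rule for <{elem.tag}>"]
    errors = []
    for attr in elem.attrib:  # permitted attributes
        if attr not in rule["attributes"]:
            errors.append(f"attribute {attr!r} not permitted on <{elem.tag}>")
    counts = {}
    for child in elem:  # permitted child elements
        counts[child.tag] = counts.get(child.tag, 0) + 1
        if child.tag not in rule["children"]:
            errors.append(f"<{child.tag}> not permitted in <{elem.tag}>")
        else:
            errors.extend(validate(child, rules))
    for name, (low, high) in rule["children"].items():  # occurrence limits
        if not low <= counts.get(name, 0) <= high:
            errors.append(f"<{name}> must occur {low}..{high} times in <{elem.tag}>")
    if rule["content"] is not None:  # type of content
        try:
            rule["content"]((elem.text or "").strip())
        except ValueError:
            errors.append(f"content of <{elem.tag}> has the wrong type")
    return errors


print(validate(ET.fromstring('<order id="7"><item>2</item></order>')))
print(validate(ET.fromstring("<order><item>two</item><note/></order>")))
```

The first document satisfies every rule; the second has a non-integer `item` and an unpermitted `note` child.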
DOM Construction
• Create an in-memory tree structure for the XML document
• Provide application access through tree operations
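As an illustration of what "access through tree operations" means, the sketch below builds a DOM with Python's standard `xml.dom.minidom` module and walks it. It shows the software view of a DOM only; the in-memory layout the accelerator produces is not described on this slide. The `describe` helper is a hypothetical name.

```python
from xml.dom.minidom import parseString


def describe(node, depth=0):
    """Return one indented line per element or text node in the subtree."""
    lines = []
    indent = "  " * depth
    if node.nodeType == node.ELEMENT_NODE:
        attrs = " ".join(f'{k}="{v}"' for k, v in node.attributes.items())
        suffix = f" [{attrs}]" if attrs else ""
        lines.append(f"{indent}element {node.tagName}{suffix}")
        for child in node.childNodes:
            lines.extend(describe(child, depth + 1))
    elif node.nodeType == node.TEXT_NODE and node.data.strip():
        lines.append(f"{indent}text {node.data.strip()!r}")
    return lines


doc = parseString("<Elem attr='xyz'>content</Elem>")
root = doc.documentElement

# Tree operations: navigate from parent to child and back.
text_node = root.firstChild
print("\n".join(describe(root)))
print(text_node.parentNode is root)
```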
Outline
• Background
• High-level architecture
• Design details
• Evaluation
Top Level Diagram
[Figure: top-level diagram, animated with the example input <Elem attr='xyz'>content</Elem>. Recoverable labels: tokens Elem, attr, xyz, content; hash codes H(Elem), H(attr); rule name; rule content.]
Outline
• Background
• High-level architecture
• Design details
• Evaluation
Recurring Idioms (Dwarfs)
• Identified 3 recurring computational idioms (referred to as Dwarfs)
  • One-to-one String Matching
  • One-to-many String Membership Test
  • One-to-many String Search
• These idioms are one of the major reasons for low parsing performance
Dwarf I: One-to-one String Matching
• Tests whether a subject string equals a reference string
• Example: correct nesting
  • The strings are variable-length
  • Not efficient on conventional architectures
• Solution: memory stack
  • Converts variable-length string comparison into fixed-length character comparison
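One plausible reading of the memory-stack idea is sketched below: open-tag names are written into a character stack as they stream in, and a close tag is checked against the top entry one character per step, so every individual comparison has a fixed width. The slide does not give the circuit, so the class, its fields and the event format are assumptions made for illustration.

```python
class TagStack:
    """Character stack holding the names of all currently open elements."""

    def __init__(self):
        self.chars = []   # flat character memory
        self.starts = []  # start offset of each open element name

    def push_open(self, name):
        self.starts.append(len(self.chars))
        for ch in name:  # one character written per step
            self.chars.append(ch)

    def match_close(self, name):
        if not self.starts:
            return False  # close tag with no element open
        start = self.starts.pop()
        expected = self.chars[start:]
        del self.chars[start:]
        ok = len(expected) == len(name)
        for i, ch in enumerate(name):  # one fixed-width compare per step
            if i >= len(expected) or expected[i] != ch:
                ok = False
        return ok


def check_nesting(events):
    """events: sequence of ("open", name) or ("close", name) pairs."""
    stack = TagStack()
    for kind, name in events:
        if kind == "open":
            stack.push_open(name)
        elif not stack.match_close(name):
            return False
    return not stack.starts  # every opened element must have been closed


print(check_nesting([("open", "a"), ("open", "b"), ("close", "b"), ("close", "a")]))
print(check_nesting([("open", "a"), ("open", "b"), ("close", "a"), ("close", "b")]))
```

Because the comparison proceeds in step with the incoming characters, its cost is hidden behind the time needed to receive the close tag itself.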
Dwarf II: One-to-many String Membership Test
• Tests whether a subject string equals any member of a set of reference strings
• Example: unique attributes within an element
  • String comparison against all previously seen attributes of the same element
  • Expensive memory back-tracing
• Solution: Bloom filter
  • Achieved in one memory lookup
Dwarf III: One-to-many String Search
• “Finds” a subject string among a set of reference strings (different from just “testing” membership)
• Example: search for the corresponding schema rule
  • String comparison against all candidates
  • Non-deterministic lookup time
• Solution: Balance Routing Table scheme
  • Achieved in one memory lookup
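The slides do not describe how the Balance Routing Table scheme works, so the sketch below is not that scheme. It substitutes a generic technique with the same goal, a collision-free hash table whose seed is chosen at build time, to show how a search among many rule names can be reduced to a single table access. All names here (`slot`, `build_table`, `find_rule`) and the use of SHA-256 as the hash are illustrative assumptions.

```python
import hashlib


def slot(name, seed, size):
    """Map a name to a table index; SHA-256 stands in for a hardware hash."""
    digest = hashlib.sha256(f"{seed}:{name}".encode("utf-8")).digest()
    return int.from_bytes(digest[:4], "big") % size


def build_table(rules, size=16, max_seeds=1000):
    """Try seeds until every rule name lands in its own slot."""
    for seed in range(max_seeds):
        table = [None] * size
        for name, rule in rules.items():
            index = slot(name, seed, size)
            if table[index] is not None:
                break  # collision: give up on this seed
            table[index] = (name, rule)
        else:
            return seed, table
    raise ValueError("no collision-free seed found; increase the table size")


def find_rule(name, seed, table):
    """One table access, then one comparison to confirm the stored name."""
    entry = table[slot(name, seed, len(table))]
    if entry is not None and entry[0] == name:
        return entry[1]
    return None


rules = {"Elem": "rule for Elem", "attr": "rule for attr", "item": "rule for item"}
seed, table = build_table(rules)
print(find_rule("Elem", seed, table))
print(find_rule("missing", seed, table))
```

The lookup time is independent of how many rules the table holds, which is the property the slide asks for.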
Dwarf II: Bloom Filter
• Example: attribute name uniqueness checking
• Common case: attribute name is unique
  • Filter out the obvious cases using a Bloom filter
  • Look up a bit array instead of comparing strings
• Uncommon case: attribute name may already exist
  • Stall the entire design
  • Do all necessary string comparisons to confirm the existence of the incoming string
  • Assumption: low occurrence rate (high cost)
Solution II: Bloom Filter
• For each attribute name:
  • Generate N independent hash codes
  • Look up the bit array
  • Update the bit array
• Outcomes of the lookup:
  • Unique: at least one of the looked-up bits is still 0
  • False positive: all looked-up bits are already 1 although the name has not been seen before
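The generate, look up, update steps, together with the stall path for the uncommon case, can be sketched in software as follows. This is an illustration under stated assumptions, not the accelerator's circuit: the class name, the 256-bit array, N = 3, and the use of SHA-256 to derive the hash codes are all choices made for the example.

```python
import hashlib


class AttributeUniquenessChecker:
    """Bloom filter fast path plus exact comparison on the slow path."""

    def __init__(self, num_bits=256, num_hashes=3):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.reset()

    def reset(self):
        """Clear all state; call at the start of each element."""
        self.bits = [False] * self.num_bits
        self.seen = []   # exact names, consulted only on the slow path
        self.stalls = 0  # how often the slow path was taken

    def _positions(self, name):
        # N hash codes, derived here by prefixing the hash index (assumption).
        for i in range(self.num_hashes):
            digest = hashlib.sha256(f"{i}:{name}".encode("utf-8")).digest()
            yield int.from_bytes(digest[:4], "big") % self.num_bits

    def is_duplicate(self, name):
        positions = list(self._positions(name))
        maybe_seen = all(self.bits[p] for p in positions)  # look up
        for p in positions:                                 # update
            self.bits[p] = True
        duplicate = False
        if maybe_seen:
            # Uncommon case: stall and confirm with exact string comparisons.
            self.stalls += 1
            duplicate = name in self.seen
        if not duplicate:
            self.seen.append(name)
        return duplicate


checker = AttributeUniquenessChecker()
for attr in ["id", "class", "id"]:
    print(attr, checker.is_duplicate(attr))
print("stalls:", checker.stalls)
```

A false positive costs only a stall: the exact comparison on the slow path still returns the correct answer, which is why the design depends on such cases being rare.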
Bloom Filter Implementation
• Implement the Bloom filter algorithm as a pipeline
• Attribute names usually have multiple characters
• This allows multiple processing cycles for each attribute name
Outline
• Background
• High-level architecture
• Design details
• Evaluation
Experimental Setup
• Software XML parser tests
• XML parsing accelerator testbed
Benchmarks
Test Results
• Metric: raw throughput (Gbps)
Test Results
• Metric: cycles per byte (CPB)
Scalability Examination
• Bloom filter efficiency
  • Test the attribute name uniqueness circuit with generated test files
  • Count the number of false positives
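For reference when reading false-positive counts, the standard textbook approximation for a Bloom filter is given below. The symbols are not from the slides: m is the number of bits in the array, k the number of hash codes (N above), and n the number of attribute names already inserted for the current element. The formula assumes ideal, independent hash functions, so measured counts may differ.

```latex
p_{\text{false positive}} \approx \left(1 - e^{-kn/m}\right)^{k}
```

The rate grows with n, which is why testing with generated files that contain many attributes per element is a meaningful scalability check.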
Implementation Cost
Target device: Xilinx Virtex-5 XC5VSX50T
Conclusion
• FPGA is a valid contender in XML processing
  • Low clock frequency requirement to achieve high throughput
  • Scalable to process large XML documents
  • Moderate hardware cost to achieve high performance
• Future work
  • Full conformance to the XML specification
Questions?