Efficient Representation of Data Structures on Associative Processors Jalpesh K. Chitalia (Advisor Dr. Robert A. Walker) Computer Science Department Kent State University
Presentation Outline • ASC Processor Architecture • Associative Features • Structure Codes • Represent Data Structures • Structure Code Operations • Implementation of Structure Codes • Summary and Future Work
The ASC Processor • A scalable design implemented on a million-gate Altera FPGA • SIMD-like architecture • Currently, 36 8-bit Processing Elements (PEs) are available • 8-bit Instruction Stream (IS) control unit with 8-bit instruction and data addresses and 32-bit instructions
The ASC Architecture • Each PE listens to the IS through the broadcast and reduction network • PEs can communicate amongst themselves using the PE Network • A PE may either execute or ignore the microcode instruction broadcast by the IS, under the control of its Mask Stack
The ASC Features • Associative Search • Each PE can search its local memory for a key under the control of the IS • Responder Resolution • A special circuit signals if ‘at least one’ record was found • Masked Operation • Local Mask Stacks can turn on or off the execution of instructions from the IS
The ASC Example Select * from Students where Grade > 90
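The query above can be read as one associative search followed by a masked operation. The sketch below simulates that flow sequentially in Python; the `PE` class, the helper names, and the student records are all hypothetical illustrations, not part of the ASC design, and the loops stand in for what the hardware would do in a single parallel step.

```python
from dataclasses import dataclass


@dataclass
class PE:
    """One processing element holding a single record (hypothetical layout)."""
    name: str
    grade: int


def associative_search(pes, predicate):
    """Every PE tests the broadcast predicate against its own record.

    Returns one responder bit per PE.
    """
    return [predicate(pe) for pe in pes]


def any_responder(mask):
    """Responder resolution: was at least one record found?"""
    return any(mask)


def masked_apply(pes, mask, action):
    """Masked operation: only PEs whose mask bit is set execute the action."""
    return [action(pe) for pe, active in zip(pes, mask) if active]


# Made-up sample data, one tuple per PE.
students = [PE("Ann", 95), PE("Bob", 82), PE("Cy", 91), PE("Dee", 90)]

# Select * from Students where Grade > 90
mask = associative_search(students, lambda pe: pe.grade > 90)
selected = masked_apply(students, mask, lambda pe: pe.name) if any_responder(mask) else []

print(mask)      # [True, False, True, False]
print(selected)  # ['Ann', 'Cy']
```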
The ASC Features • Constant Time Associative Operations • Associative Search • Finding minimum or maximum in a field • Ideal for Database processing • Data is organized in tabular format • Each tuple in a table can be processed by one PE • PE Network • Many parallel algorithms require all PEs to move contents in a regular pattern • E.g.: Image Convolution, Matrix Multiplication
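The slides do not say how the maximum search is carried out. One classic technique on associative machines is a bit-serial scan from the most significant bit down, whose step count depends on the word width rather than on the number of PEs. The sketch below illustrates that general idea only; the function name and the 8-bit default are assumptions chosen to match the 8-bit PEs mentioned earlier.

```python
def associative_max(values, width=8):
    """Bit-serial maximum search, simulated sequentially.

    values: one unsigned integer per PE.
    Returns one flag per PE; True marks the PEs holding the maximum.
    The outer loop runs `width` times regardless of len(values).
    """
    candidates = [True] * len(values)
    for bit in reversed(range(width)):  # most significant bit first
        # Each remaining candidate reports whether this bit is set.
        has_one = [c and bool((v >> bit) & 1) for v, c in zip(values, candidates)]
        if any(has_one):                # responder check
            candidates = has_one        # candidates with a 0 here drop out
    return candidates


flags = associative_max([17, 200, 96, 200, 3])
print(flags)  # [False, True, False, True, False]
```

The minimum can be found the same way by keeping the candidates whose current bit is 0 instead.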
Non-Tabular Data Structures • Many applications use linked-list-based data structures • For example, plain HTML parsing can be done using a tree structure • Similarly, XML or object-relational databases can be represented using trees alone • Game programming uses tree-based algorithms and demands a great deal of processing power • Compiler construction uses directed acyclic graphs and tree structures
Data Structure Codes • A unique coding scheme • Allows representation of any data structure in a tabular format • Tabular format allows division of data amongst the PEs • Also known as “structure code” • Different coding schemes may be required for different data structures • Uses the Associative Search feature of the ASC Processor
Simple List-based Structures • The left figure shows a possible representation of 1D and 2D arrays using structure codes • The right figure represents stack and queue, depending on the use of appropriate functions
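The figures are not reproduced here, so the exact codes they showed are unknown. As a rough illustration only, the sketch below assumes a 2D array element is coded by its 1-based (row, column) pair, and that a stack and a queue share one coded list and differ only in whether the maximum or the minimum code is removed. All function names are hypothetical.

```python
def encode_2d(matrix):
    """Return (code, value) rows; code = (row, column), both 1-based."""
    return [((r, c), v)
            for r, row in enumerate(matrix, start=1)
            for c, v in enumerate(row, start=1)]


def select_row(table, r):
    """Associative search on the first code digit."""
    return [v for (row, _), v in table if row == r]


def push(table, value):
    """Store value under the next unused integer code."""
    next_code = max((code for code, _ in table), default=0) + 1
    table.append((next_code, value))


def pop_stack(table):
    """Stack behaviour: remove the row with the maximum code."""
    row = max(table, key=lambda r: r[0])
    table.remove(row)
    return row[1]


def pop_queue(table):
    """Queue behaviour: remove the row with the minimum code."""
    row = min(table, key=lambda r: r[0])
    table.remove(row)
    return row[1]


grid = encode_2d([[10, 20], [30, 40]])
print(select_row(grid, 2))  # [30, 40]

items = []
for v in ("a", "b", "c"):
    push(items, v)
print(pop_stack(items))  # c
print(pop_queue(items))  # a
```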
Complex List-based Structures • Each digit-position indicates the level of a tree • Each value in that position indicates the position of that child from the left • Discussions henceforth are confined to trees • Graphs are read in a slightly different manner
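A small sketch of the tree coding just described: one digit per level, each digit giving the child's position from the left. The tuple representation, the choice of `(1,)` for the root, and the sample HTML-like tree are assumptions made for illustration.

```python
def encode_tree(node, code=(1,)):
    """Yield (structure_code, label) rows for a tree given as (label, [children]).

    Each code extends the parent's code with the child's 1-based
    position from the left, so code length equals tree depth.
    """
    label, children = node
    yield code, label
    for position, child in enumerate(children, start=1):
        yield from encode_tree(child, code + (position,))


# Hypothetical document tree.
tree = ("html", [
    ("head", [("title", [])]),
    ("body", [("h1", []), ("p", [])]),
])

table = list(encode_tree(tree))
for code, label in table:
    print(code, label)
# (1,) html
# (1, 1) head
# (1, 1, 1) title
# (1, 2) body
# (1, 2, 1) h1
# (1, 2, 2) p
```

Each row of `table` is independent of the others, which is what lets the nodes be spread across PEs like tuples of a relation.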
Structure Code Operations • Two sets of constant-time operations: scalar and parallel • Scalar instructions are simple mathematical operations • Search parent, child or root • Find the value of the structure code for the next or previous node • Parallel instructions use complex associative operations • Find the code for the next, previous or both siblings • ‘Locate’ the required node for further processing
Scalar Operations • fstcd: leftmost child • nxtcd: right sibling • prvcd: left sibling • trncd: truncate (parent) node • trnacd: truncate all (root) node • Can be used to allocate a new node • Limited use in searching records
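The scalar operations listed above are pure arithmetic on a single code. A minimal sketch follows, assuming a code is a tuple of 1-based child positions with the root at `(1,)`. The mnemonics come from the slide; the Python signatures and error handling are hypothetical.

```python
def fstcd(code):
    """Code of the leftmost child."""
    return code + (1,)


def nxtcd(code):
    """Code of the right sibling position."""
    return code[:-1] + (code[-1] + 1,)


def prvcd(code):
    """Code of the left sibling position."""
    if code[-1] <= 1:
        raise ValueError("leftmost node has no left sibling")
    return code[:-1] + (code[-1] - 1,)


def trncd(code):
    """Truncate one level: code of the parent."""
    if len(code) <= 1:
        raise ValueError("root has no parent")
    return code[:-1]


def trnacd(code):
    """Truncate all levels: code of the root."""
    return code[:1]


node = (1, 2, 3)
print(fstcd(node))   # (1, 2, 3, 1)
print(nxtcd(node))   # (1, 2, 4)
print(prvcd(node))   # (1, 2, 2)
print(trncd(node))   # (1, 2)
print(trnacd(node))  # (1,)
```

Because these functions only compute a code and never consult the stored nodes, they suit allocating a new node but cannot tell whether a node with the resulting code actually exists.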
Parallel Operations • Index instructions: • Flags a node of the result to ease further processing • nxtdex (next or right), prvdex (previous or left) and sibdex (siblings or both left and right) • Value instructions: • Returns the exact structure code of the result • nxtval (next or right) and prvval (previous or left) • Can be used to ‘locate’ nodes in any tree • Uses parallel and associative hardware resources
Applications of DSC Ops • Scalar • Associative searching: the parent or the root in a given data structure space • Allocating elements in a data structure space • Value • Finding the value of the resultant structure code • Index • Most useful in associative search (e.g., tree traversal)
Implementation Requirements • Hardware functionality • Debugging and developing I/O functionality • Debugging a few other instructions • Structure Code Operations • Scalar Operations: nxtcd, prvcd • Parallel Operations: nxtdex, nxtval, prvdex, prvval, sibdex • Parallel Operations • Most complex set amongst ASC operations
Implementation – Value Ops • Processing of Parallel Value Ops • Associative search, associative min/max • Input Cycle • All the elements of data structure are distributed amongst the PEs sequentially • Reference node is broadcast to them • Output Cycle • Multi-byte structure code is stored in destination address (scalar memory)
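The slide names associative search and associative min/max as the ingredients of the value operations. One plausible way they combine is sketched below: search for the siblings lying on one side of the broadcast reference node, then take the minimum or maximum of the responders. The tuple encoding, the helper `same_parent`, and the possibility of gaps between sibling positions are assumptions, not details taken from the slides.

```python
def same_parent(code, ref):
    """True if code and ref sit at the same level under the same parent."""
    return len(code) == len(ref) and code[:-1] == ref[:-1]


def nxtval(codes, ref):
    """Code of the nearest existing right sibling of ref, or None."""
    # Associative search: siblings positioned to the right of ref.
    responders = [c for c in codes if same_parent(c, ref) and c[-1] > ref[-1]]
    # Associative minimum over the responders.
    return min(responders) if responders else None


def prvval(codes, ref):
    """Code of the nearest existing left sibling of ref, or None."""
    responders = [c for c in codes if same_parent(c, ref) and c[-1] < ref[-1]]
    # Associative maximum over the responders.
    return max(responders) if responders else None


# One code per PE; position 2 under the root is deliberately missing.
codes = [(1,), (1, 1), (1, 3), (1, 4), (1, 3, 1)]

print(nxtval(codes, (1, 1)))  # (1, 3)
print(prvval(codes, (1, 3)))  # (1, 1)
print(nxtval(codes, (1, 4)))  # None
```

In this sketch the result is a code that is actually stored in some PE, which is where it differs from simply adding one to the last digit.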
Implementation – Index Ops • Processing of Parallel Index Ops • The resultant element(s) is (are) ‘marked’ for further processing • Note: sibdex may select two results • Input and Output • Input: same as in value operations • Output: Bit flags for all the input nodes are stored sequentially in destination address (scalar memory)
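The index operations can be sketched the same way, except that the output is one bit flag per input node rather than a code. The version below assumes unique tuple codes of 1-based child positions; the helper `_nearest` and the list-of-ints flag format are hypothetical.

```python
def _nearest(codes, ref, right):
    """Nearest existing sibling of ref on the chosen side, or None."""
    siblings = [
        c for c in codes
        if len(c) == len(ref) and c[:-1] == ref[:-1]
        and (c[-1] > ref[-1] if right else c[-1] < ref[-1])
    ]
    if not siblings:
        return None
    return min(siblings) if right else max(siblings)


def nxtdex(codes, ref):
    """Flag the nearest right sibling of ref."""
    target = _nearest(codes, ref, right=True)
    return [int(c == target) for c in codes]


def prvdex(codes, ref):
    """Flag the nearest left sibling of ref."""
    target = _nearest(codes, ref, right=False)
    return [int(c == target) for c in codes]


def sibdex(codes, ref):
    """Flag both nearest siblings; up to two bits may be set."""
    return [a | b for a, b in zip(nxtdex(codes, ref), prvdex(codes, ref))]


codes = [(1,), (1, 1), (1, 3), (1, 4), (1, 3, 1)]
ref = (1, 3)

print(nxtdex(codes, ref))  # [0, 0, 0, 1, 0]
print(prvdex(codes, ref))  # [0, 1, 0, 0, 0]
print(sibdex(codes, ref))  # [0, 1, 0, 1, 0]
```

The flags are listed in the same order as the input nodes, mirroring the sequential storage of bit flags described on the slide.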
Summary • SIMD-based computers are well suited to database processing • The ASC processor, with its associative operations, makes them more efficient • Structure codes translate non-tabular data structures into a tabular format • Trees and graphs can be represented and evenly divided like records in a table • Object-relational databases, HTML processing • Stacks, queues and multi-dimensional arrays can be efficiently processed • Structure codes are not required for even representation
Future Work • Efforts are in place to clean up the architecture • To allow a multiplier/divider on each PE • To accommodate a RISC instruction set • Unpacked bytes are used to support variable-length structure codes • Can be avoided with an efficient divider unit • Input/Output • Special structure code operations are defined but not implemented in this work • Parallel input/output is also required • Developing applications that use structure codes
Acknowledgements • Professor Walker • Professor Potter • Committee members for their time • Kevin Schaffer, Hong Wang, Meiduo Wu, Lei Xie, Ping Xu
Thank You! Questions?