1 / 50

Donghui Zhang CCIS, Northeastern University ccs.neu/home/donghui/research/neustore/

NEUStore: A Simple Java Package for the Construction of Disk-based, Paginated, and Buffered Indices. Donghui Zhang CCIS, Northeastern University http://www.ccs.neu.edu/home/donghui/research/neustore/. A DB index should be stored as a disk file. Let ’ s check what Java.io.RandomAccessFile

vega
Download Presentation

Donghui Zhang CCIS, Northeastern University ccs.neu/home/donghui/research/neustore/

An Image/Link below is provided (as is) to download presentation Download Policy: Content on the Website is provided to you AS IS for your information and personal use and may not be sold / licensed / shared on other websites without getting consent from its author. Content is provided to you AS IS for your information and personal use only. Download presentation by click this link. While downloading, if for some reason you are not able to download a presentation, the publisher may have deleted the file from their server. During download, if you can't get a presentation, the file might be deleted by the publisher.

E N D

Presentation Transcript


  1. NEUStore: A Simple Java Package for the Construction of Disk-based, Paginated, and Buffered Indices Donghui Zhang CCIS, Northeastern University http://www.ccs.neu.edu/home/donghui/research/neustore/

  2. A DB index should be stored as a disk file. Let’s check what Java.io.RandomAccessFile provides…

  3. java.io.RandomAccessFile • Views a file as a sequential list of bytes • Has operators seek, read, write // In the file “./mydata”, copy bytes 10-19 to 0-9. RandomAccessFile file = new RandomAccessFile(“mydata”, “rw”); byte[] buf = new byte[10]; file.seek(10); file.read(buf); file.seek(0); file.write(buf);

  4. java.io.RandomAccessFile • Views a file as a sequential list of bytes buf: 10 bytes in memory // In the file “./mydata”, copy bytes 10-19 to 0-9. RandomAccessFile file = new RandomAccessFile(“mydata”, “rw”); byte[] buf = new byte[10]; file.seek(10); file.read(buf); file.seek(0); file.write(buf);

  5. java.io.RandomAccessFile • Views a file as a sequential list of bytes buf: 10 bytes in memory // In the file “./mydata”, copy bytes 10-19 to 0-9. RandomAccessFile file = new RandomAccessFile(“mydata”, “rw”); byte[] buf = new byte[10]; file.seek(10); file.read(buf); file.seek(0); file.write(buf);

  6. java.io.RandomAccessFile • Views a file as a sequential list of bytes buf: 10 bytes in memory // In the file “./mydata”, copy bytes 10-19 to 0-9. RandomAccessFile file = new RandomAccessFile(“mydata”, “rw”); byte[] buf = new byte[10]; file.seek(10); file.read(buf); file.seek(0); file.write(buf);

  7. java.io.RandomAccessFile • Views a file as a sequential list of bytes buf: 10 bytes in memory // In the file “./mydata”, copy bytes 10-19 to 0-9. RandomAccessFile file = new RandomAccessFile(“mydata”, “rw”); byte[] buf = new byte[10]; file.seek(10); file.read(buf); file.seek(0); file.write(buf);

  8. java.io.RandomAccessFile • Views a file as a sequential list of bytes buf: 10 bytes in memory // In the file “./mydata”, copy bytes 10-19 to 0-9. RandomAccessFile file = new RandomAccessFile(“mydata”, “rw”); byte[] buf = new byte[10]; file.seek(10); file.read(buf); file.seek(0); file.write(buf);

  9. There is a gap between what Java.io.RandomAccessFile provides and what a DB index implementation needs.

  10. Consider implementing a B+-tree class BPlusTree { private RandomAccessFile file; private DBBuffer buffer; private int rootPageID; public BPlusTree(String filename, DBBuffer buffer); public void Insert(Key, Data); public void Delete(Key); public Data Search(Key); }

  11. Consider implementing a B+-tree class BPlusTree { …… public BPlusTree(String filename, DBBuffer buffer) { file = new RandomAccessFile(filename, “rw”); this.buffer = buffer; rootPageID = …… } …… }

  12. public void Insert(Key, Data) { // Find the leaf page where the record should be inserted, // pinning all pages along the path. int pageID = rootPageID; DBPage page = buffer.readPage(file, rootPageID); while ( page.nodeType != LEAF ) { buffer.pinPage(file, pageID); pageID = ((BPlusTreeIndexPage)page).Search(Key); page = buffer.readPage(file, pageID); } buffer.pinPage(file, pageID); // Insert the record into the leaf page ((BPlusTreeLeafPage)page).Insert(Key, Data); // Handle Overflow… }

  13. public void Insert(Key, Data) { // Continued …………………… if (page.isOverflow()) { int newPageID = allocate(); // in Delete, freePage() DBPage newPage = new DBPage(newPageID); Move half of the records from page to newPage. Read the right sibling and fix the sibling pointers. buffer.writePage(file, pageID, page); buffer.unpin(file, pageID); buffer.writePage(file, newPageID, newPage); buffer.writePage for the right sibling page. Recursive split if needed. } Unpin all pages. }

  14. 1 2 3 4 5 A DB index… • Uses a random access disk file to store data. • Views a file as a sequential list of pages. • Needs to support allocate and freePage. • Needs to manage the empty pages in the middle. • Needs to utilize a buffer… 0 6 7

  15. 1 1 4 5 5 4 3 2 2 3 5 4 3 2 1 file1 0 6 7 file2 0 6 7 file3 0 6 7 A set of DB indices readPage writePage A DB buffer file1, page0 file2, page3

  16. Summary of the gap • Java.io.RandomAccessFile provides: seek(), read(), write() • Implementing a DB index needs: allocate(), freePage(), buffer.readPage(), buffer.writePage()

  17. NEUStore fills the gap! • class DBIndex { allocate() freePage() } • class DBBuffer{ readPage() writePage() }

  18. Download and setup • See the PDF file! • Download the .tgz file and unzip. A neustoreroot directory will be created. • Use Eclipse to create a new project. • Use javadoc. • Smartly apply certain changes due to newer version of Java and Eclipse.

  19. neustoreroot has 3 directories • doc: the document generated using Javadoc. • neustore: the storage package. It contains: • base: the base of the storage package. It contains definitions of core classes DBIndex, DBBuffer, DBPage, and so on. • heapfile: a generic implementation of the heapfile. Here the class HeapFile was derived from DBIndex. • test: sample test programs. It contains: • naiveheapfile: implementation of a naive heapfile, and the test program for it. • heapile: test program for using neustore.heapfile.

  20. Packages and classes

  21. class DBPage • Abstract base class for a memory-version page. • E.g. derive BPlusTreeIndexPage from it. • Constructor: DBPage(int nodeType, int pageSize); creates a DBPage. nodeType -1 and 0 are preserved. 1 and above can be user-defined. E.g. nodeType 1 means BPlusTreeIndexPage and nodeType 2 means BPlusTreeLeafPage.

  22. class DBPage • Convert to/from disk-version page (byte[]): • abstract void read( byte[] page ); Reads the content of this DBPage from a byte array. • abstract void write( byte[] page ); Writes the content of this DBPage to a byte array. • Example usage of neustore.base.ByteArray: byte[] b = …… ByteArray ba = new ByteArray( b, ByteArray.READ ); int firstInt = ba.readInt( ); int secondInt = ba.readInt( );

  23. class DBBuffer • Constructor: DBBuffer(int bufferSize, int pageSize); bufferSize: # pages pageSize: # bytes/page • Some useful functions: • pin() • unpin() • clearIOs() • getIOs()

  24. DBBuffer.readPage(file, pageID) • What should the return type be? • DBPage -- problematic to implement: DBPage readPage(RandomAccessFile file, int pageID) { if (file, pageID) exists in the buffer return the corresponding DBPage object; else { choose a buffered page to throw away; read from the file pageSize bytes to a byte[]; convert to a DBPage and return; } } ?? but how??

  25. DBBuffer.readPage(file, pageID) • A DBBuffer is general. It buffers pages from different indices. • So whenever a page is physically read from disk, it should be kept as a byte[], period. • What if store byte[] in DBBuffer? • Works for C++: cast (char*) to (MyDBPage*) • Extremely inefficient for Java!! Every time to access a page, needs to convert from byte[].

  26. Solution: allow storing in both types! • When physically read, store as byte[]. Later when the application calls writePage(file, pageID, DBPage dbpage), store as DBPage. • return type of readPage should be: public class DBBufferReturnElement { public boolean parsed; public Object object; public int nodeType; } parsed means the object is a DBPage; and non-parsed means the object is a byte[].

  27. DBBuffer’s abstract functions • add(): to add a new page into buffer; • find(): to find a buffered page; • flush(): to empty the buffer while writing dirty pages to disk. • A derived class should implement them. • neustore.base includes a derived class: LRUBuffer. But the existing implementation did not use hashing.

  28. DBIndex • Constructor: DBIndex(DBBuffer buffer, String filename, boolean isCreate) Create or open a DBIndex. • int allocate() • void freePage(int pageID) • void close(): // should be called before terminating the application to write head page and flush buffer.

  29. DBIndex • The head page (having pageID=0) uses the first OVERHEAD=16 bytes to store four integers: • -1 : in every disk page, the first four bytes is nodeType. Type -1 means head page. Type 0 means empty page. • pagesize • numOfPages: number of non-empty pages • free: pageID of the first empty page. All empty pages are linked together. • The rest (pageSize – OVERHEAD) bytes store user-specified meta data, e.g. rootPageID.

  30. Meta data in DBIndex head page • To manage the meta data, a derived class needs to implement the three abstract functions: • initIndexHead(): called when the index is built; e.g. set rootPageID = allocate(); • readIndexHead(): called when the index is opened; e.g. read existing rootPageID; • writeIndexHead(): called right before the index is closed; e.g. write the updated rootPageID.

  31. Case study 1: naïve heap file • Simplified requirement: • Insert = add an integer at the end of the file. • Search = tell whether an integer exists. • Three files in neustoreroot/test/naiveheapfile: • NaiveHeapFilePage.java: derive from DBPage • NaiveHeapFile.java: derive from DBIndex • TestNaiveHeapFile.java: contains main()

  32. NaiveHeapFilePage.java: derive from DBPage • Needs to implement abstract functions: read() and write() • NaiveHeapFile.java: derive from DBIndex • TestNaiveHeapFile.java: contains main()

  33. public class NaiveHeapFilePage extends DBPage { private Vector<Integer> records; public NaiveHeapFilePage( int _pageSize ) { super(1, _pageSize); records = new Vector<Integer>(); } public int numRecs() { return records.size(); } public void insert( int key ) { records.add( new Integer(key) ); } public boolean search( int key ) {……} protected void read( byte[] b ) {……} protected void write( byte[] b ) {……} }

  34. public class NaiveHeapFilePage extends DBPage { …… public boolean search( int key ) { for ( Integer e: records ) { if (e.intValue() == key) return true; } return false; } protected void read( byte[] b ) {……} protected void write( byte[] b ) {……} }

  35. public class NaiveHeapFilePage extends DBPage { …… protected void read( byte[] b ) { ByteArray ba = new ByteArray(b, ByteArray.READ); ba.readInt(); // nodeType, ignore int numRecs = ba.readInt(); records.removeAllElements(); for ( int i=0; i<numRecs; i++ ) { int key = ba.readInt(); records.add(new Integer(key)); } } protected void write( byte[] b ) {……} }

  36. public class NaiveHeapFilePage extends DBPage { …… protected void write( byte[] b ) { ByteArray ba=new ByteArray(b, ByteArray.WRITE); int numRecs = records.size(); ba.writeInt( nodeType ); ba.writeInt( numRecs ); for ( Integer key: records ) { ba.writeInt( key.intValue() ); } } }

  37. NaiveHeapFilePage.java: derive from DBPage • NaiveHeapFile.java: derive from DBIndex • Needs to implement abstract functions initIndexHead(), readIndexHead() and writeIndexHead(). • TestNaiveHeapFile.java: contains main()

  38. public class NaiveHeapFile extends DBIndex { private int pageCapacity; // max # recs in a page private int lastPageID; // ID for last page private NaiveHeapFilePage lastPage; public NaiveHeapFile(DBBuffer buffer, String filename, boolean isCreate) {……} protected void initIndexHead() {……} protected void readIndexHead(byte[] head) {……} protected void writeIndexHead(byte[] head) {……} public void insert( int key ) {……} public boolean search( int key ) {……} protected NaiveHeapFilePage myReadPage( int pageID ) {……} }

  39. public class NaiveHeapFile extends DBIndex { …… public NaiveHeapFile(DBBuffer _buffer, String filename, boolean isCreate) { super(_buffer, filename, isCreate); pageCapacity = (pageSize-8)/4; // nodeType & numRecs if(isCreate) lastPage = new NaiveHeapFilePage(pageSize); else lastPage = myReadPage(lastPageID); } …… }

  40. public class NaiveHeapFile extends DBIndex { …… protected void initIndexHead() { lastPageID = allocate(); } protected void readIndexHead(byte[] head) { ByteArray ba = new ByteArray(head, ByteArray.READ); lastPageID = ba.readInt(); } protected void writeIndexHead(byte[] head) { ByteArray ba = new ByteArray(head, ByteArray.WRITE); ba.writeInt(lastPageID); } …… }

  41. public class NaiveHeapFile extends DBIndex { …… public void insert( int key ) { if ( lastPage.numRecs() >= pageCapacity ) { lastPageID = allocate(); lastPage = new NaiveHeapFilePage(pageSize); } lastPage.insert( key ); buffer.writePage(file, lastPageID, lastPage); } …… }

  42. public class NaiveHeapFile extends DBIndex { …… public boolean search( int key ) { for ( int currentPageID=1; currentPageID<=lastPageID; currentPageID++ ) { NaiveHeapFilePage currentPage = myReadPage(currentPageID); if ( currentPage.search(key) ) return true; } return false; } …… }

  43. public class NaiveHeapFile extends DBIndex { …… protected NaiveHeapFilePage myReadPage( int pageID ) { NaiveHeapFilePage page; DBBufferReturnElement ret = buffer.readPage(file, pageID); if ( ret.parsed ) { page = (NaiveHeapFilePage)ret.object; } else { page = new NaiveHeapFilePage(pageSize); page.read((byte[])ret.object); } return page; } }

  44. NaiveHeapFilePage.java: derive from DBPage • NaiveHeapFile.java: derive from DBIndex • TestNaiveHeapFile.java: contains main()

  45. public class TestNaiveHeapFile { public static void main(String[]args) { System.out.println( "***** Creating index *****" ); LRUBuffer buffer = null; buffer = new LRUBuffer( 5, 20 ); NaiveHeapFile hp = new NaiveHeapFile(buffer, filename, DBIndex.CREATE); System.out.println( "***** Inserting keys *****"); hp.insert(10); System.out.println( "***** Searching keys *****" ); hp.search(10); System.out.println( “***** Closing index ******” ); hp.close(); } }

  46. Limitations of the naïve heap file • A record has no data (only key). • No deletion is allowed. • Key type is fixed (as integer).

  47. Naïve heap file  neustore.heapfile • A record has no data (only key). Solution: define a private class HeapFileRecord, inside class HeapFilePage, which contains a key and a data. 2. No deletion is allowed. Solution: allow deletion. To enable leaving space in the middle, maintain a double linked list of full pages and a separate double linked list of non-full pages. • Key type is fixed (as integer). ?? How to implement a generic heapfile, which can be used for any type of key (and data)?

  48. Generic key (data) • Define two interfaces Key and Data. Use them as key and data types to HeapFile (and B+-tree you are about to implement). • Two important member functions of both Key and Data are: • public abstract void read(DataInputStream in): Reads the content of the Key object from an input stream; Example use: to convert a byte[] to a HeapFilePage, read the number of records in the page, and then repeatedly generate a Key object (and a Data object) and call its read function. • public abstract void write(DataOutputStream out): Writes the content of the object to an output stream. • neustore.base implements IntKey and StringData.

  49. A challenge in generic key (data) • In the implementation of the (generic) heap file, we need to create a Key object and call its read() member function. However, Key is an interface and therefore Key key = new Key() does not work!! • Solution: require the constructor of HeapFile (and also HeapFilePage) to take as input a sample key and a sample data. • When a new Key object is needed: sampleKey.clone();

  50. Summary

More Related