A short introduction to Apache Accumulo. What is it, and how does it relate to Google's Bigtable? How does it use Hadoop, ZooKeeper and Thrift in its implementation?
Apache Accumulo
• What is it?
• Design
• Integrity
• Administration
• Squirrel
www.semtech-solutions.co.nz info@semtech-solutions.co.nz
Accumulo – What is it?
• A key / value store
• A column-oriented database
• Based on Google's Bigtable
• Based on
  • Apache Hadoop
  • Apache ZooKeeper
  • Apache Thrift
• Written in Java
• Released under the Apache License
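To make the key / value and column-oriented points concrete, here is a minimal sketch of how an Accumulo-style key is structured and ordered. It is a toy in-memory model, not the Accumulo API: the `Key` tuple, `sort_order` and the sample rows are all hypothetical names chosen for illustration. It assumes the usual ordering, with row, column family, column qualifier and visibility ascending and timestamp descending, so the newest version of a cell comes first.

```python
from typing import NamedTuple


class Key(NamedTuple):
    """Toy stand-in for a multi-part key (not the real Accumulo class)."""
    row: str
    family: str
    qualifier: str
    visibility: str
    timestamp: int


def sort_order(key: Key):
    # Text fields sort ascending; the timestamp is negated so that
    # newer versions of the same cell sort before older ones.
    return (key.row, key.family, key.qualifier, key.visibility, -key.timestamp)


# Hypothetical sample data: two versions of one cell plus two other cells.
entries = {
    Key("user2", "info", "name", "public", 1): "Bob",
    Key("user1", "info", "name", "public", 1): "Alice",
    Key("user1", "info", "name", "public", 2): "Alicia",
    Key("user1", "contact", "email", "private", 1): "alice@example.org",
}

# A sorted key space is what makes range scans over rows and columns cheap.
sorted_entries = sorted(entries.items(), key=lambda kv: sort_order(kv[0]))


def scan_row(row: str):
    """Return every (key, value) pair for one row, in key order."""
    return [(k, v) for k, v in sorted_entries if k.row == row]


values_in_order = [v for _, v in sorted_entries]
```

Because all cells of a row are adjacent in the sorted order, `scan_row("user1")` returns the contact column first, then both versions of the name, newest first.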
Accumulo – Design
• Cell-level security via column visibility
• Server-side programming via iterators
• Table-based constraints written in Java
• Sharding can be used for parallel document storage
• Rows can be larger than available memory
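The idea behind cell-level security can be sketched as follows: every cell carries a visibility expression, and a scan only returns cells whose expression is satisfied by the reader's authorizations. This is a toy evaluator under stated assumptions, not Accumulo's own implementation: `is_visible`, `visible_cells` and the labels are hypothetical, the expression syntax is limited to labels, `&`, `|` and parentheses, operators are applied left to right, and no validation is done.

```python
import re


def is_visible(expression: str, authorizations: set) -> bool:
    """Return True if the authorizations satisfy the visibility expression."""
    if expression == "":
        return True  # in this sketch an unlabeled cell is visible to everyone
    tokens = re.findall(r"[A-Za-z0-9_]+|[&|()]", expression)
    pos = 0

    def parse_term() -> bool:
        nonlocal pos
        token = tokens[pos]
        pos += 1
        if token == "(":
            value = parse_expr()
            pos += 1  # skip the closing ")"
            return value
        return token in authorizations  # a bare label must be held by the reader

    def parse_expr() -> bool:
        nonlocal pos
        value = parse_term()
        while pos < len(tokens) and tokens[pos] in ("&", "|"):
            op = tokens[pos]
            pos += 1
            right = parse_term()
            value = (value and right) if op == "&" else (value or right)
        return value

    return parse_expr()


def visible_cells(cells: dict, authorizations: set) -> dict:
    """Filter a {name: (visibility, value)} mapping down to readable cells."""
    return {
        name: value
        for name, (visibility, value) in cells.items()
        if is_visible(visibility, authorizations)
    }


# Hypothetical cells, each tagged with a visibility expression.
cells = {
    "salary": ("admin|(audit&finance)", 50000),
    "name": ("", "Alice"),
    "notes": ("admin&audit", "confidential"),
}
auditor_view = visible_cells(cells, {"audit", "finance"})
```

Here a reader holding `audit` and `finance` sees `salary` and `name` but not `notes`, because `notes` also requires `admin`.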
Accumulo – Integrity
• ZooKeeper used to manage master failover
• Write-ahead logs written to each server
• Logical time managed for
  • Consistent transactions
  • Bulk data import
• FATE (Fault Tolerant Executor) transactions
  • Transactions complete even after master failure
• Isolation
  • Transactions see a consistent view of data at row level
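The write-ahead log bullet can be illustrated with a small sketch: each mutation is appended to a durable log before it is applied to the in-memory map, so that the map can be rebuilt by replaying the log after a crash. This is a simplified model, not Accumulo code; `ToyTabletServer` and its methods are hypothetical, and a plain Python list stands in for the log file.

```python
class ToyTabletServer:
    """Toy server that logs every write before applying it in memory."""

    def __init__(self, log: list):
        self.log = log      # stands in for a durable write-ahead log
        self.memory = {}    # volatile in-memory map, lost on a crash

    def write(self, key: str, value: str) -> None:
        # Log first, then apply: a crash between the two steps loses nothing,
        # because recovery replays the log.
        self.log.append((key, value))
        self.memory[key] = value

    @classmethod
    def recover(cls, log: list) -> "ToyTabletServer":
        """Rebuild the in-memory state by replaying the log in order."""
        server = cls(log)
        for key, value in log:
            server.memory[key] = value
        return server


# Simulate a crash: the first server object is discarded, the log survives.
log = []
server = ToyTabletServer(log)
server.write("row1", "a")
server.write("row1", "b")
server.write("row2", "c")
del server

recovered = ToyTabletServer.recover(log)
```

Replaying in the original order matters: `row1` ends up with its latest value `b`, not the earlier `a`.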
Accumulo – Administration
• System monitoring and statistics via a web page
• System and table configuration stored in ZooKeeper
• Table names mapped to IDs in ZooKeeper
• Follow threads of execution using tracing
  • Record the times at which actions take place
• Accumulo can be used with Squirrel server
  • As the next slide shows
  • A future presentation will cover Squirrel
Accumulo – with Squirrel
Accumulo – Data Management
Internal Data Management
• Locality groups
  • Group columns within a single file
• Smart compaction
  • Smaller files are merged with larger ones, using a definable ratio, until all files are merged
• Minor compaction
  • To avoid reaching the maximum number of files, in-memory data is merged with existing files
• Loading user-created jars
  • Load jars from HDFS using VFS
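The ratio-based compaction described above can be sketched roughly as follows. The assumption here is a simplified selection rule: a set of files is compacted when the sum of its sizes exceeds the ratio multiplied by the largest file; otherwise the largest file is dropped from consideration and the check is repeated. The function name `select_files_to_compact`, the sample sizes and the ratio of 3.0 are illustrative choices, not values taken from the slides.

```python
def select_files_to_compact(sizes, ratio=3.0):
    """Pick the set of file sizes to merge, or [] if nothing qualifies.

    Simplified rule: compact a set when sum(set) > ratio * largest(set);
    otherwise drop the largest file and try again with the rest.
    """
    candidates = sorted(sizes, reverse=True)
    while len(candidates) > 1:
        if sum(candidates) > ratio * candidates[0]:
            return candidates
        candidates = candidates[1:]  # the largest file is left alone for now
    return []


# One large file and four small ones: only the small files are merged.
small_only = select_files_to_compact([100, 10, 10, 10, 10])

# Two files of very different size: nothing is worth compacting yet.
nothing = select_files_to_compact([100, 1])

# Many similar-sized files: all of them are merged together.
everything = select_files_to_compact([10, 10, 10, 10])
```

A lower ratio makes compaction more eager (fewer files, more rewriting), while a higher ratio tolerates more files before merging.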
Accumulo – Data Management
On Demand Data Management
• Compactions
  • Force tablets (table partitions) to compact to a single file
• Tablet merging
  • Request tablet merging via the shell
• Table cloning
  • Clone a table from an existing one, referencing its data and configuration
• Table import / export
  • Copy a table and its metadata to another cluster
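Table cloning by reference can be pictured with a small model: the clone starts out pointing at the same files as the source and copies its configuration, and later writes to either table only add files to that table. This is a conceptual sketch only; `ToyCluster`, its methods, the table names and the file names are hypothetical and do not correspond to Accumulo's API or file layout.

```python
class ToyCluster:
    """Toy model of tables as lists of file references plus configuration."""

    def __init__(self):
        self.tables = {}  # table name -> {"files": [...], "config": {...}}

    def create(self, name, files, config):
        self.tables[name] = {"files": list(files), "config": dict(config)}

    def clone(self, source, target):
        src = self.tables[source]
        # Copy the file references and the configuration; the files
        # themselves are shared, so no data is duplicated.
        self.tables[target] = {
            "files": list(src["files"]),
            "config": dict(src["config"]),
        }

    def flush(self, name, new_file):
        # New data lands in a new file that belongs to this table only.
        self.tables[name]["files"].append(new_file)


cluster = ToyCluster()
cluster.create("events", ["f1.rf", "f2.rf"], {"compaction.ratio": "3"})
cluster.clone("events", "events_test")
cluster.flush("events_test", "f3.rf")

source_files = cluster.tables["events"]["files"]
clone_files = cluster.tables["events_test"]["files"]
```

After the flush, the clone has three files while the source still has its original two, which is why a clone is a cheap way to experiment without touching the original table.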
Accumulo – Screen Shot
Contact Us
• Feel free to contact us at
  • www.semtech-solutions.co.nz
  • info@semtech-solutions.co.nz
• We offer IT project consultancy
• We are happy to hear about your problems
• You pay only for the hours you need to solve your problems