HBase: A column-oriented database
Overview • An Apache project • Influenced by Google’s BigTable • Built on Hadoop • Uses HDFS, a distributed file system • Supports MapReduce • Goals • Scalability • Versioning • Compression • In-memory tables
Architectural issues • The general architecture is a cluster of nodes • A standalone mode runs everything on a single machine • There is a Java client API • There is a JRuby shell that wraps the Java API (connection sketch below)
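As a concrete illustration of the Java API, here is a minimal connection sketch against the HBase 2.x client. The table name "users" is a hypothetical example; the configuration is read from an hbase-site.xml on the classpath, which in standalone mode points at the local instance.

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.hbase.HBaseConfiguration;
    import org.apache.hadoop.hbase.TableName;
    import org.apache.hadoop.hbase.client.Connection;
    import org.apache.hadoop.hbase.client.ConnectionFactory;
    import org.apache.hadoop.hbase.client.Table;

    public class HBaseConnect {
        public static void main(String[] args) throws Exception {
            // Reads hbase-site.xml from the classpath; in standalone mode
            // this resolves to the local instance.
            Configuration conf = HBaseConfiguration.create();
            try (Connection connection = ConnectionFactory.createConnection(conf);
                 Table table = connection.getTable(TableName.valueOf("users"))) {
                System.out.println("Connected to table: " + table.getName());
            }
        }
    }

The same table can be inspected from the JRuby shell with list or describe 'users'.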
Modeling constructs • Table • Has a row key • A series of column families • Each family holds columns, each with a column name (qualifier) and a value • Operations • Create table • Insert a row with a “Put” command • The shell’s put writes one cell (column) at a time • Query a table with a “Get” command • (uses a table name and a row key; sketch below)
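A minimal sketch of these operations with the HBase 2.x Java client (where, unlike the shell, a single Put may carry several columns). The table "users", family "info", and values are hypothetical; 'connection' is an open Connection from the earlier sketch.

    import org.apache.hadoop.hbase.TableName;
    import org.apache.hadoop.hbase.client.*;
    import org.apache.hadoop.hbase.util.Bytes;

    // Create a table named "users" with one column family, "info".
    Admin admin = connection.getAdmin();
    admin.createTable(TableDescriptorBuilder.newBuilder(TableName.valueOf("users"))
            .setColumnFamily(ColumnFamilyDescriptorBuilder.of("info"))
            .build());

    Table table = connection.getTable(TableName.valueOf("users"));

    // Put: write a cell (family "info", qualifier "name") under row key "row1".
    Put put = new Put(Bytes.toBytes("row1"));
    put.addColumn(Bytes.toBytes("info"), Bytes.toBytes("name"), Bytes.toBytes("Alice"));
    table.put(put);

    // Get: retrieve the row by row key and read the cell back.
    Result result = table.get(new Get(Bytes.toBytes("row1")));
    byte[] name = result.getValue(Bytes.toBytes("info"), Bytes.toBytes("name"));
    System.out.println(Bytes.toString(name)); // prints "Alice"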
Filters • Scan • Retrieves a series of rows between a start and a stop row key • Can attach a filter on such things as column families, qualifiers, or timestamps • Filters are pushed to the servers, so data is pruned before it crosses the network (sketch below)
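A sketch of a range scan with a server-side filter, again with the 2.x Java client; the row keys and the qualifier "name" are hypothetical, and 'table' is the Table from the previous sketch.

    import org.apache.hadoop.hbase.CompareOperator;
    import org.apache.hadoop.hbase.client.*;
    import org.apache.hadoop.hbase.filter.BinaryComparator;
    import org.apache.hadoop.hbase.filter.QualifierFilter;
    import org.apache.hadoop.hbase.util.Bytes;

    // Scan the key range [row1, row9); rows outside it are never read.
    Scan scan = new Scan()
            .withStartRow(Bytes.toBytes("row1"))
            .withStopRow(Bytes.toBytes("row9"));

    // Server-side filter: only cells whose qualifier equals "name" are returned.
    scan.setFilter(new QualifierFilter(CompareOperator.EQUAL,
            new BinaryComparator(Bytes.toBytes("name"))));

    try (ResultScanner scanner = table.getScanner(scan)) {
        for (Result r : scanner) {
            System.out.println(Bytes.toString(r.getRow()));
        }
    }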
Updating • When a column value is written to the db, old values are kept and organized by timestamp • Each such versioned value is a cell • Timestamps can be assigned explicitly • Otherwise the current timestamp is used on insert • A Get returns the most recent version by default • Operations that alter column family structures are expensive (versioning sketch below)
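A sketch of explicit timestamps and version reads, assuming the family "info" was created to keep more than one version (the max-versions setting defaults to 1); the timestamps and values are hypothetical.

    import org.apache.hadoop.hbase.Cell;
    import org.apache.hadoop.hbase.CellUtil;
    import org.apache.hadoop.hbase.client.*;
    import org.apache.hadoop.hbase.util.Bytes;

    byte[] row  = Bytes.toBytes("row1");
    byte[] fam  = Bytes.toBytes("info");
    byte[] qual = Bytes.toBytes("name");

    // Write the same cell twice with explicit timestamps; both versions
    // are retained (up to the family's max-versions setting).
    table.put(new Put(row).addColumn(fam, qual, 1000L, Bytes.toBytes("old")));
    table.put(new Put(row).addColumn(fam, qual, 2000L, Bytes.toBytes("new")));

    // A plain Get would return only the latest version ("new");
    // readVersions(2) asks for the history as well.
    Result result = table.get(new Get(row).readVersions(2));
    for (Cell cell : result.getColumnCells(fam, qual)) {
        System.out.println(cell.getTimestamp() + " -> "
                + Bytes.toString(CellUtil.cloneValue(cell)));
    }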
Other characteristics • Text compression • Rows are stored in order by key value • A region is a contiguous range of rows • Each region is served by a single region server • Regions can be automatically merged and split • Uses write-ahead logging to prevent loss of data on node failures • This is called journaling in Unix file systems • Supports a master/slave replication strategy across clusters (compression sketch below)
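Compression (and the in-memory flag from the overview) is configured per column family. A sketch, assuming the 2.x client; the table "logs" and family "data" are hypothetical, and gzip is chosen only because it needs no native libraries.

    import org.apache.hadoop.hbase.TableName;
    import org.apache.hadoop.hbase.client.*;
    import org.apache.hadoop.hbase.io.compress.Compression;
    import org.apache.hadoop.hbase.util.Bytes;

    // Assumes an open Connection 'connection'. The "data" family compresses
    // its values with gzip and is flagged to be kept in the block cache.
    Admin admin = connection.getAdmin();
    admin.createTable(TableDescriptorBuilder.newBuilder(TableName.valueOf("logs"))
            .setColumnFamily(ColumnFamilyDescriptorBuilder
                    .newBuilder(Bytes.toBytes("data"))
                    .setCompressionType(Compression.Algorithm.GZ)
                    .setInMemory(true)
                    .build())
            .build());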
An HBase cluster (figure taken from: http://www.packtpub.com/article/hbase-basic-performance-tuning)
Tasks of components • The ZooKeeper cluster is a coordination service for the HBase cluster • Clients consult it to find the correct region server • It elects the active master • The master allocates regions and balances load • Region servers hold and serve the regions • Hadoop supplies HDFS and supports MapReduce (client configuration sketch below)
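Because ZooKeeper is the entry point, a remote client only needs the quorum address to locate region servers; it does not contact the master to read or write data. A sketch with placeholder host names:

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.hbase.HBaseConfiguration;
    import org.apache.hadoop.hbase.client.Connection;
    import org.apache.hadoop.hbase.client.ConnectionFactory;

    // Point the client at the cluster's ZooKeeper ensemble (placeholder hosts).
    Configuration conf = HBaseConfiguration.create();
    conf.set("hbase.zookeeper.quorum", "zk1.example.com,zk2.example.com,zk3.example.com");
    conf.set("hbase.zookeeper.property.clientPort", "2181");
    Connection connection = ConnectionFactory.createConnection(conf);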
Some key concepts • De-normalization • Fast random retrieval by row key • Use of a multi-component architecture (Hadoop, ZooKeeper) to leverage existing software tools • Controllable in-memory placement of column families