HBase

HBase. Presented by Chintamani Siddeshwar and Swathi Selvavinayakam. http://www.slideshare.net/amansk/hbase-hadoop-day-seattle-4987041. HBase is an open-source BigTable implementation that uses HDFS as its underlying distributed file system and ZooKeeper as its lock service, and it integrates tightly with Hadoop MapReduce.


Presentation Transcript


  1. HBase • Presented by Chintamani Siddeshwar and Swathi Selvavinayakam • http://www.slideshare.net/amansk/hbase-hadoop-day-seattle-4987041

  2. HBase • Open-source BigTable • HDFS as underlying DFS • ZooKeeper as lock service • Tight integration with Hadoop MapReduce

  3. Why HBase? • Scales out to thousands of nodes • Access granularity is a row: a read or write to a single row is atomic • Designed for workloads consisting of simple operations on individual items • Provides efficient access to random rows • Allows dynamic repartitioning of data
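The last two points on this slide (random row access and dynamic repartitioning) can be illustrated with a small sketch. It assumes range partitioning by row key, as in the BigTable design; the class `ToyRegionMap` and its methods are hypothetical names for illustration and are not part of any HBase API.

```python
import bisect


class ToyRegionMap:
    """Toy model of row-key range partitioning. Not HBase's actual region logic."""

    def __init__(self):
        # Sorted start keys; region i covers [start_keys[i], start_keys[i + 1]).
        # The empty string sorts before every other key, so it opens the first region.
        self.start_keys = [""]

    def region_for(self, row_key):
        """Binary-search for the index of the region whose range contains row_key."""
        return bisect.bisect_right(self.start_keys, row_key) - 1

    def split(self, split_key):
        """Split the region containing split_key into two at split_key."""
        if split_key in self.start_keys:
            return  # already a region boundary
        bisect.insort(self.start_keys, split_key)


regions = ToyRegionMap()
regions.split("m")                      # repartition: one region becomes two
print(regions.region_for("apple"))      # falls in the first region
print(regions.region_for("user42"))     # falls in the second region
```

Because the boundaries stay sorted, locating the region for any row key is a binary search, and a split only adds one boundary without touching other regions.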

  4. Data Model • A sparse, distributed, persistent, multi-dimensional sorted map • (row, column, timestamp) -> cell • Column = Column Family : Column Qualifier
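The map on this slide can be sketched in a few lines of Python. This is a toy in-memory model only, not the HBase client API: the class `ToyTable` and its methods are hypothetical, distribution and persistence are left out, and timestamps are passed explicitly to keep the behaviour deterministic.

```python
from collections import defaultdict


class ToyTable:
    """Toy model of the (row, column, timestamp) -> cell map. Not the HBase API."""

    def __init__(self):
        # row key -> "family:qualifier" -> {timestamp: value}
        # Sparse: a row stores only the columns that were actually written.
        self._rows = defaultdict(lambda: defaultdict(dict))

    def put(self, row, column, timestamp, value):
        family, _, qualifier = column.partition(":")
        if not family or not qualifier:
            raise ValueError("column must look like 'family:qualifier'")
        self._rows[row][column][timestamp] = value

    def get(self, row, column, timestamp=None):
        """Return the newest value at or before `timestamp` (newest overall if None)."""
        versions = self._rows.get(row, {}).get(column, {})
        candidates = [ts for ts in versions if timestamp is None or ts <= timestamp]
        return versions[max(candidates)] if candidates else None

    def scan(self, start_row, stop_row):
        """Yield (row, column, value) for start_row <= row < stop_row, in sorted row order."""
        for row in sorted(self._rows):
            if start_row <= row < stop_row:
                for column in sorted(self._rows[row]):
                    yield row, column, self.get(row, column)


table = ToyTable()
table.put("row1", "info:name", 1, "alice")
table.put("row1", "info:name", 2, "alicia")       # newer version of the same cell
table.put("row2", "info:email", 1, "bob@example.com")
print(table.get("row1", "info:name"))               # newest version
print(table.get("row1", "info:name", timestamp=1))  # version as of timestamp 1
print(list(table.scan("row1", "row3")))
```

Note that `row2` has no `info:name` column at all, which is what "sparse" means here: absent cells take no space, and looking one up simply returns nothing.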

  5. System Structure

  6. Other Features • Compression • In-memory column families • Multiple masters • Rolling restart • Bloom filters • Efficient bulk loads • Source and sink for Hive, Pig, Cascading
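For readers unfamiliar with the Bloom filters listed above, here is a minimal sketch of the general data structure. The slides do not describe how HBase builds or sizes its filters, so the class `ToyBloomFilter`, the bit count, and the hash scheme below are all illustrative assumptions.

```python
import hashlib


class ToyBloomFilter:
    """Generic Bloom filter sketch. Parameters are arbitrary, not HBase defaults."""

    def __init__(self, num_bits=1024, num_hashes=3):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = bytearray(num_bits)  # one byte per bit, for clarity over compactness

    def _positions(self, key):
        # Derive num_hashes bit positions by salting SHA-256 with the hash index.
        for i in range(self.num_hashes):
            digest = hashlib.sha256(f"{i}:{key}".encode("utf-8")).digest()
            yield int.from_bytes(digest[:8], "big") % self.num_bits

    def add(self, key):
        for pos in self._positions(key):
            self.bits[pos] = 1

    def might_contain(self, key):
        # False means "definitely absent"; True means "possibly present".
        return all(self.bits[pos] for pos in self._positions(key))


row_filter = ToyBloomFilter()
row_filter.add("row1")
print(row_filter.might_contain("row1"))  # an added key is always reported as possibly present
```

The useful property is the one-sided error: a negative answer is always correct, so a store can skip reading a file whose filter says a row key is absent, at the cost of occasional false positives.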

  7. Use Cases • Mozilla • Yahoo! • Twitter • Facebook • Adobe

  8. HBase vs. RDBMS • HBase: column-oriented; flexible schema, add columns on the fly; good with sparse tables; no query language; denormalize your data; no multi-row transactions (single-row operations are atomic) • RDBMS: row-oriented (mostly); fixed schema; not optimized for sparse tables; SQL; normalize as far as you can; transactional

  9. Related Chapters • Big Data • Data Modelling

  10. References • http://ofps.oreilly.com/titles/9781449396107/intro.html • http://wiki.apache.org/hadoop/Hbase/DataModel • http://www.slideshare.net/amansk/hbase-hadoop-day-seattle-4987041 • http://static.googleusercontent.com/external_content/untrusted_dlcp/labs.google.com/en/us/papers/bigtable-osdi06.pdf
