HBase Operations. Swati Agarwal, Thomas Pan, eBay Inc.
Overview • Pre-production cluster handling production data sets and workloads • Data storage for listed items, driving eBay Search indexing • Data storage for ranking data in the future • Leverage MapReduce in the same cluster to build the search index
HBase Cluster • 225 Data Nodes • Region Server • Task Tracker • Data Node • 14 Enterprise Nodes • Primary Name Node • Secondary Name Node • Job Tracker Node • 5 ZooKeeper Nodes • HBase Master • CLI Node • Ganglia Reporting Nodes • Spare Nodes for Failover • Node Hardware • 12 × 2TB hard drives • 72GB RAM • 24 cores with hyper-threading
Hadoop/HBase Configuration • Region Server • HBase Region Server JVM heap size: -Xmx15GB • HBase Region Server JVM new size: -XX:MaxNewSize=150m -XX:NewSize=100m (-XX:MaxNewSize=512m) • Number of HBase Region Server handlers: hbase.regionserver.handler.count=50 (matching the number of active regions) • HBase Region Server lease period: hbase.regionserver.lease.period=300000 (5-minute server-side lease timeout) • Region size: hbase.hregion.max.filesize=53687091200 (50GB, to avoid automatic splits) • Turn off automatic major compaction: hbase.hregion.majorcompaction=0 • Read/write cache configuration • HBase block cache size (read cache): hfile.block.cache.size=0.65 (65% of 15GB ~= 9.75GB) • HBase Region Server memstore upper limit: hbase.regionserver.global.memstore.upperLimit=0.10 • HBase Region Server memstore lower limit: hbase.regionserver.global.memstore.lowerLimit=0.09 • Scanner caching: hbase.client.scanner.caching=200 • HBase block multiplier: hbase.hregion.memstore.block.multiplier=4 (for the memstore flush issue) • Client settings • HBase RPC timeout: hbase.rpc.timeout=600000 (10-minute client-side timeout) • HBase client pause: hbase.client.pause=3000 • ZooKeeper • Maximum client count: hbase.zookeeper.property.maxClientCnxns=5000 • HDFS • Block size: dfs.block.size=134217728 (128MB) • Data node xciever count: dfs.datanode.max.xcievers=131072 • Number of mappers per node: mapred.tasktracker.map.tasks.maximum=8 • Number of reducers per node: mapred.tasktracker.reduce.tasks.maximum=6 • Swap turned off
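Most of the server-side settings above live in hbase-site.xml. A minimal excerpt as a sketch, using a few of the values from this slide (not a complete file; property names match the HBase 0.90-era configuration the deck describes):

```xml
<configuration>
  <property>
    <name>hbase.regionserver.handler.count</name>
    <value>50</value> <!-- roughly matches the number of active regions -->
  </property>
  <property>
    <name>hbase.hregion.max.filesize</name>
    <value>53687091200</value> <!-- 50GB, large enough to avoid automatic splits -->
  </property>
  <property>
    <name>hbase.hregion.majorcompaction</name>
    <value>0</value> <!-- major compaction is run on demand only -->
  </property>
  <property>
    <name>hfile.block.cache.size</name>
    <value>0.65</value> <!-- 65% of the 15GB heap for the read cache -->
  </property>
</configuration>
```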
HBase Tables • Multiple tables in a single cluster • Multiple column families per table • Number of columns per column family: < 200 • 1.45 billion rows total • Max row size: ~20KB • Average row size: ~10KB • 13.01TB of data • Bulk load speed: ~500 million items in 30 minutes • Random write updates: 25K records per minute • Scan speed: 2004 rows per second per region server (average version 3), 465 rows per second per region server (average version 10) • Scan speed with filters: 325~353 rows per second per region server
HBase Tables (cont.) • Pre-split into 3600 regions per table • Each table is split into roughly equal-sized regions • Important to pick well-distributed keys • Currently using bit reversal • Region splits disabled by setting a very large region size • Major compaction on demand • Purge rows periodically • Balance regions among region servers on demand
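Pre-splitting into equal-sized regions amounts to handing HBase evenly spaced boundary keys over the 64-bit RowKey space at table-creation time. A minimal sketch of that computation (hypothetical helper, not from the deck; in practice the boundaries would be serialized as 8-byte keys and passed to the table-creation call):

```python
def region_split_keys(num_regions: int) -> list:
    """Boundary keys dividing the 64-bit RowKey space into
    num_regions roughly equal ranges (num_regions - 1 boundaries)."""
    space = 1 << 64
    return [(i * space) // num_regions for i in range(1, num_regions)]

# With 4 regions, boundaries land at 1/4, 2/4, and 3/4 of the key space.
print([hex(k) for k in region_split_keys(4)])
# ['0x4000000000000000', '0x8000000000000000', '0xc000000000000000']
```

Because bit-reversed keys are uniformly spread, these even boundaries keep the 3600 regions roughly the same size.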
RowKey Scheme and Sharding • RowKey • 64-bit unsigned integer • Bit reversal of the document ID • Document ID: 2 • RowKey: 0x4000000000000000 • HBase creates regions with even RowKey ranges • Each map task maps to one region
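The bit-reversal scheme above can be sketched in a few lines. This reproduces the slide's example (document ID 2, whose single set bit at position 1 moves to position 62); the function name is illustrative, not from the deck:

```python
def reverse_bits64(doc_id: int) -> int:
    """Map a document ID to a RowKey by reversing its 64 bits,
    spreading sequential IDs evenly across the key space."""
    result = 0
    for i in range(64):
        result = (result << 1) | ((doc_id >> i) & 1)
    return result

print(hex(reverse_bits64(2)))  # 0x4000000000000000, as on the slide
```

Reversal is its own inverse, so the original document ID can always be recovered from a RowKey by reversing again.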
Monitoring Systems • Ganglia • Nagios alerts • Table consistency – hbck • Table balancing – in-house tool • Region size • CPU usage • Memory usage • Disk failures • HDFS block count • … • In-house job monitoring system • Based on OpenTSDB • Job counters
Challenges/Issues • HBase stability • HDFS issues can impact HBase, such as name node failure • Map/Reduce jobs can impact HBase region servers, such as through high memory usage • Regions stuck in migration • HBase health monitoring • HBase table maintenance • HBase table regions become unbalanced • Major compaction needed after row purges and updates • Software upgrades cause significant downtime • Ordinary hardware failures may cause issues • Stuck regions due to a failed hard disk • Region servers deadlocked due to JVM issues • Testing
Future Direction • High scalability • Scale out a table with more regions • Scale out the whole cluster with more data • High availability • No downtime for upgrades • Adopt co-processor • Near-Real-Time Indexing
Community Acknowledgement • Kannan Muthukkaruppan • Karthik Ranganathan • Lars George • Michael Stack • Ted Yu • Todd Lipcon • Konstantin Shvachko