
HIVE

Nagarjuna K. An introduction to Hive, a data warehousing tool built on top of Hadoop by Jeff's team at Facebook. Covers the prerequisites (knowledge of SQL), Hive's history, why it was needed (a burgeoning social network producing huge volumes of data that had to be analyzed), its place in the Hadoop ecosystem, and what Hive is.

salma

Presentation Transcript


  1. HIVE
     • Nagarjuna K
     • nagarjuna@outlook.com

  2. Prerequisites
     • Knowledge of SQL

  3. History
     • Built by Jeff's team at Facebook
     • A tool built for data warehousing on top of Hadoop

  4. Why Hive
     • Facebook was producing huge volumes of data
     • A burgeoning social network
     • How to analyze the data?

  5. Hadoop Ecosystem

  6. What is Hive
     • Tools to enable easy data extract/transform/load (ETL)
     • A mechanism to impose structure on a variety of data formats
     • Access to files stored either directly in Apache HDFS™ or in other data storage systems such as Apache HBase™
     • Query execution via MapReduce
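To make the "query execution via MapReduce" point concrete, the sketch below writes an SQL-style aggregate to a script file and asks Hive for its execution plan with `EXPLAIN`. The table `employee` and its `address` column are hypothetical names chosen for illustration. The script is only passed to `hive -f` when a `hive` client is on the PATH; otherwise it is just printed.

```shell
# Sketch only: the table `employee` and column `address` are hypothetical.
WORK_DIR="$(mktemp -d)"
QUERY_FILE="$WORK_DIR/count_by_address.hql"

# EXPLAIN asks Hive to print the plan for the query instead of running it.
cat > "$QUERY_FILE" <<'EOF'
EXPLAIN
SELECT address, COUNT(*) AS employees
FROM employee
GROUP BY address;
EOF

# Hand the script to hive only if a client is installed; otherwise show it.
if command -v hive >/dev/null 2>&1; then
  hive -f "$QUERY_FILE"
else
  cat "$QUERY_FILE"
fi
```

On a MapReduce-backed installation, the plan would be expected to list the map and reduce stages that the `GROUP BY` compiles to.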

  7. Who is using Hive

  8. What is Hive for
     • What is Hadoop for?
     • Plus (&&) ad hoc batch processing of data

  9. What is Hive not for
     • What is Hadoop not for?
       • Real-time data processing
       • Row-level updates

  10. What Hive values most
      • What does Hadoop value most?
        • Scalability
        • Extensibility (MapReduce and UDF/UDAF/UDTF)
        • Fault tolerance
        • Loose coupling (input formats)

  11. Hive setup
      • Setting up Hive
      • Derby metastore
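As a rough illustration of the Derby metastore setup, the sketch below writes a minimal `hive-site.xml` that points the metastore at an embedded Derby database. The two `javax.jdo.option.*` properties are the standard metastore connection settings; the throwaway `CONF_DIR` and the database name `metastore_db` are assumptions made for this example, and a real installation would normally place the file in `$HIVE_HOME/conf`.

```shell
# Sketch only: CONF_DIR is a throwaway directory used for illustration.
CONF_DIR="$(mktemp -d)"

cat > "$CONF_DIR/hive-site.xml" <<'EOF'
<?xml version="1.0"?>
<configuration>
  <!-- Embedded Derby database, created on first use in the working directory -->
  <property>
    <name>javax.jdo.option.ConnectionURL</name>
    <value>jdbc:derby:;databaseName=metastore_db;create=true</value>
  </property>
  <property>
    <name>javax.jdo.option.ConnectionDriverName</name>
    <value>org.apache.derby.jdbc.EmbeddedDriver</value>
  </property>
</configuration>
EOF

echo "wrote $CONF_DIR/hive-site.xml"
```

Embedded Derby accepts only one active session at a time, which is fine for trying Hive out on a single machine but not for shared use.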

  12. Configuration files
      • hive-site.xml
        • $HIVE_HOME/conf/hive-site.xml
      • Alternate way
        • hive --config /Users/tom/dev/hive-conf
        • Useful when you have two or more clusters
        • and you alternate between them frequently
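One possible way to alternate between clusters is a small wrapper around `hive --config`. In the sketch below, the layout (one configuration directory per cluster under `CONF_ROOT`), the cluster name `dev`, and the helper names `conf_dir_for` and `hive_on` are all hypothetical. When no `hive` client or matching directory is found, the wrapper only prints the command it would have run.

```shell
# Sketch only: assumes one config directory per cluster under CONF_ROOT.
CONF_ROOT="${CONF_ROOT:-$HOME/hive-conf}"

# Print the config directory for a given cluster name.
conf_dir_for() {
  printf '%s/%s\n' "$CONF_ROOT" "$1"
}

# Run hive against a named cluster, passing any remaining arguments through.
hive_on() {
  hive_on_cluster="$1"
  shift
  hive_on_dir="$(conf_dir_for "$hive_on_cluster")"
  if command -v hive >/dev/null 2>&1 && [ -d "$hive_on_dir" ]; then
    hive --config "$hive_on_dir" "$@"
  else
    echo "would run: hive --config $hive_on_dir $*"
  fi
}

# Example call: list tables on the hypothetical "dev" cluster.
hive_on dev -e 'SHOW TABLES;'
```

Keeping each cluster's `hive-site.xml` in its own directory means switching clusters is a matter of changing one argument rather than editing files under `$HIVE_HOME/conf`.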

  13. Hive tables
      • Two types of tables
      • External table
        • Table created on top of existing data
        • Delete the table → the data still persists
      • Normal (managed) table
        • The table's location is Hive's default location
        • Delete the table → the data is gone
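The difference between the two table types can be sketched in HiveQL as follows. The table names `staff_managed` and `staff_external`, their columns, and the path `/data/staff` are made up for illustration. The statements are written to a script file and only executed when a `hive` client is available.

```shell
# Sketch only: table names, columns, and the LOCATION path are hypothetical.
WORK_DIR="$(mktemp -d)"
TABLES_FILE="$WORK_DIR/table_types.hql"

cat > "$TABLES_FILE" <<'EOF'
-- Normal (managed) table: data lives under Hive's default warehouse location.
CREATE TABLE staff_managed (name STRING, phone STRING);

-- External table: Hive records only metadata for files already at LOCATION.
CREATE EXTERNAL TABLE staff_external (name STRING, phone STRING)
LOCATION '/data/staff';

-- Dropping the managed table deletes its data files as well.
DROP TABLE staff_managed;

-- Dropping the external table removes only the metadata; the files remain.
DROP TABLE staff_external;
EOF

# Execute only if a hive client is installed; otherwise show the script.
if command -v hive >/dev/null 2>&1; then
  hive -f "$TABLES_FILE"
else
  cat "$TABLES_FILE"
fi
```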

  14. Hive usage
      • Shell
        • $HIVE_HOME/bin/hive
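Running `$HIVE_HOME/bin/hive` with no arguments opens the interactive shell. For scripted use, a single statement can be passed with `-e`, as in this minimal sketch. The fallback to a `hive` found on the PATH and the `HIVE_BIN` variable are assumptions of the example, not part of the slides.

```shell
# Locate the hive launcher: prefer $HIVE_HOME/bin/hive, fall back to PATH.
if [ -n "${HIVE_HOME:-}" ] && [ -x "$HIVE_HOME/bin/hive" ]; then
  HIVE_BIN="$HIVE_HOME/bin/hive"
else
  HIVE_BIN="$(command -v hive || true)"
fi

if [ -n "$HIVE_BIN" ]; then
  # -e runs the quoted statement and exits instead of opening the shell.
  "$HIVE_BIN" -e 'SHOW TABLES;'
else
  echo "hive not found; set HIVE_HOME or add hive to PATH"
fi
```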

  15. Hive usage
      • Describing a table
        • desc <table_name>;
      • Listing all the built-in functions
        • show functions;
      • Describing a function
        • desc function <function_name>;
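The inspection commands above can be collected into one script, sketched here with the placeholders filled in. The table name `employee` is hypothetical; `upper` is one of Hive's built-in string functions. `DESCRIBE` is the long form of `desc`.

```shell
# Sketch only: `employee` is a hypothetical table name.
WORK_DIR="$(mktemp -d)"
INSPECT_FILE="$WORK_DIR/inspect.hql"

cat > "$INSPECT_FILE" <<'EOF'
-- Columns and types of a table.
DESCRIBE employee;

-- Every function Hive currently knows about, built-in or registered.
SHOW FUNCTIONS;

-- Short help for one function, then the extended help text.
DESCRIBE FUNCTION upper;
DESCRIBE FUNCTION EXTENDED upper;
EOF

# Execute only if a hive client is installed; otherwise show the script.
if command -v hive >/dev/null 2>&1; then
  hive -f "$INSPECT_FILE"
else
  cat "$INSPECT_FILE"
fi
```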

  16. Create table
      • Sample row: Employee1|Name 1|Address1|Phone 1
      • create external table <table_name> (Key1 String, Name String, Address String, Phone String)
        row format delimited fields terminated by '|'
        location '/….';
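A worked version of this slide might look like the sketch below. It builds a small pipe-delimited sample file, checks locally that every row has four fields, and writes the matching `CREATE EXTERNAL TABLE` statement. The table name `employee`, the second sample row, and the path `/data/employee` are invented for the example, since the slide leaves the table name and location unspecified.

```shell
# Sketch only: table name, second data row, and LOCATION are hypothetical.
WORK_DIR="$(mktemp -d)"
DATA_FILE="$WORK_DIR/employees.txt"
DDL_FILE="$WORK_DIR/create_employee.hql"

# Two rows in the slide's format: key|name|address|phone
printf '%s\n' \
  'Employee1|Name 1|Address1|Phone 1' \
  'Employee2|Name 2|Address2|Phone 2' > "$DATA_FILE"

# Local sanity check: every row must split into exactly four fields on '|'.
awk -F'|' 'NF != 4 { bad = 1 } END { exit bad + 0 }' "$DATA_FILE" \
  && echo "all rows have 4 fields"

cat > "$DDL_FILE" <<'EOF'
-- Column order matches the field order in the data file.
CREATE EXTERNAL TABLE employee (
  key1    STRING,
  name    STRING,
  address STRING,
  phone   STRING
)
ROW FORMAT DELIMITED FIELDS TERMINATED BY '|'
LOCATION '/data/employee';

SELECT name, phone FROM employee LIMIT 10;
EOF

# Execute only if a hive client is installed; otherwise show the script.
if command -v hive >/dev/null 2>&1; then
  hive -f "$DDL_FILE"
else
  cat "$DDL_FILE"
fi
```

For the query to return rows, the sample file would first have to be copied into the table's location on HDFS.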

  17. Operations in Hive
      • https://cwiki.apache.org/confluence/display/Hive/GettingStarted

