130 likes | 174 Views
Apache Hive is a data warehouse software project built on top of Apache Hadoop for providing data summarization, query, and analysis. Hive gives an SQL-like interface to query data stored in various databases and file systems that integrate with Hadoop. Traditional SQL queries must be implemented in the MapReduce Java API to execute SQL applications for more details, please visit us :http://www.orienit.com
E N D
Database Vs Data Warehouse Historical Data is the back bone of any business for mission critical business decisions. Data is stored in some form of tables in the database.So why the Business Intelligence systems are using Data Warehouse rather than Database to pull historical data? What is the difference between Database and DataWarehouse while both of them have some tables with Data, Index and key etc.? Here are Differences…. http://www.orienit.com/
DATA BASE • Used for Online Transactional Processing (OLTP). This records the data from the user for history. • The tables and joins are complex since they are normalized. This is done to reduce redundant data and to save storage space. • Entity – Relational modeling techniques are used for database design. • Optimized for write operation. • Performance is low for analysis queries. http://www.orienit.com/
DATA WAREHOUSE • Used for Online Analytical Processing (OLAP). This reads the historical data for the Users for business decisions. • The Tables and joins are simple since they are de-normalized. This is done to reduce the response time for analytical queries. • Data – Modeling techniques are used for the Data Warehouse design. • Optimized for read operations. • High performance for analytical queries. http://www.orienit.com/
General Data Flow :(Ex: Online Insurance Registration) • Customer enters the details in the Online Registration form. • The details are saved into the Database when the customer presses the Submit button in the form. • Business Intelligence Team of the Insurance Company uses an ETL tool to pull the data from the Database tables to the Data Warehouse tables. • Business Management uses Business Reporting Tools to pull Data from Data Warehouse tables for generating business reports. http://www.orienit.com/
STRING FUNCTIONS IN HIVE The string functions in Hive are listed below: 1.ASCII( string str ): • The ASCII function converts the first character of the string into its numeric ascii value. Example1: ASCII('hadoop') returns 104 Example2: ASCII('A') returns 65 2.CONCAT( string str1, string str2... ) • The CONCAT function concatenates all the stings. Example: CONCAT('hadoop','-','hive') returns 'hadoop-hive' http://www.orienit.com/
3. CONCAT_WS( string delimiter, string str1, string str2... ) • The CONCAT_WS function is similar to the CONCAT function. Here you can also provide the delimiter, which can be used in between the strings to concat. Example: CONCAT_WS('-','hadoop','hive') returns 'hadoop-hive' 4. FIND_IN_SET( string search string, string source_string_list) • The FIND_IN_SET function searches for the search string in the source_string_list and returns the position of the first occurrence in the source string list. Here the source string list should be comma delimited one. It returns 0 if the first argument contains comma. Example: FIND_IN_SET('ha','hao,mn,hc,ha,hef') returns 4 http://www.orienit.com/
5. LENGTH( string str ) : • The LENGTH function returns the number of characters in a string. Example: LENGTH('hive') returns 4 6. LOWER( string str ), LCASE( string str ) • The LOWER or LCASE function converts the string into lower case letters Example: LOWER('HiVe') returns 'hive' 7. LPAD( string str, int len, string pad ) • The LPAD function returns the string with a length of len characters left-padded with pad. Example: LPAD('hive',6,'v') returns 'vvhive' http://www.orienit.com/
8. LTRIM( string str ) : • The LTRIM function removes all the trailing spaces from the string. Example: LTRIM(' hive') returns 'hive' 9. REPEAT( string str, int n ): • The REPEAT function repeats the specified string n times. Example: REPEAT('hive',2) returns 'hivehive' 10.RPAD( string str, int len, string pad ) : • The RPAD function returns the string with a length of len characters right-padded with pad. Example: RPAD('hive',6,'v') returns 'hivevv' http://www.orienit.com/
11. REVERSE( string str ) : • The REVERSE function gives the reversed string. • Example: REVERSE('hive') returns 'evih' 12. RTRIM( string str ): • The RTRIM function removes all the leading spaces from the string. Example: LTRIM('hive ') returns 'hive' 13. SPACE( int number_of_spaces ) : • The SPACE function returns the specified number of spaces. Example: SPACE(4) returns ' ' 14. SPLIT( string str, string pat ) : • The SPLIT function splits the string around the pattern pat and returns an array of strings. You can specify regular expressions as patterns. Example: SPLIT('hive:hadoop',':') returns ["hive","hadoop"] http://www.orienit.com/
15. SUBSTR( string source_str, int start_position [,int length] ), SUBSTRING( string source_str, int start_position [,int length] ) : • The SUBSTR or SUBSTRING function returns a part of the source string from the start position with the specified length of characters. If the length is not given, then it returns from the start position to the end of the string. • Example1: SUBSTR('hadoop',4) returns 'oop' • Example2: SUBSTR('hadoop',4,2) returns 'oo' 16.TRIM( string str ) : • The TRIM function removes both the trailing and leading spaces from the string. Example: LTRIM(' hive ') returns 'hive' http://www.orienit.com/
17.UPPER( string str ), UCASE( string str ) : • The UPPER or LCASE function converts the string into upper case letters. Example: UPPER('HiVe') returns 'HIVE' For More Information : http://www.kalyanhadooptraining.com/ http://www.orienit.com/
Address: Flat no : 204, 2nd floor, Annapurna Block, Aditya Enclave, Ameerpet, Hyderabad-16. Contact us: 040 6514 2345, +91 970 320 2345 Email ID: info@OrienIT.com http://www.orienit.com/