An introduction to Apache Oozie: what is it, what is it used for, and how is it used with Hadoop?
Apache Oozie
• What is it?
• Why use it?
• Architecture
• Examples
www.semtech-solutions.co.nz info@semtech-solutions.co.nz
Oozie – What is it?
• Workflow scheduler for Hadoop
• Manages Hadoop jobs
• Integrated with many Hadoop applications, e.g. Pig
• Scalable
• Schedules jobs
• A workflow is a collection of actions, e.g.
  • MapReduce, Pig, HDFS file system operations
• A workflow is
  • Arranged as a DAG (directed acyclic graph)
  • Stored as hPDL (an XML process definition language)
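To make the "DAG stored as hPDL" point concrete, here is a minimal sketch in Python. It embeds a small hPDL workflow, extracts the transitions between nodes, and checks that they form an acyclic graph. The workflow name, node names, paths and the `${nameNode}` parameter are all hypothetical and chosen only for illustration. The helper functions are not part of Oozie.

```python
import xml.etree.ElementTree as ET

# Hypothetical two-step workflow; every name and path here is illustrative.
WORKFLOW_XML = """
<workflow-app name="demo-wf" xmlns="uri:oozie:workflow:0.5">
    <start to="make-input-dir"/>
    <action name="make-input-dir">
        <fs><mkdir path="${nameNode}/tmp/demo/input"/></fs>
        <ok to="make-output-dir"/>
        <error to="fail"/>
    </action>
    <action name="make-output-dir">
        <fs><mkdir path="${nameNode}/tmp/demo/output"/></fs>
        <ok to="end"/>
        <error to="fail"/>
    </action>
    <kill name="fail">
        <message>Workflow failed</message>
    </kill>
    <end name="end"/>
</workflow-app>
"""

NS = "{uri:oozie:workflow:0.5}"  # namespace prefix that ElementTree adds to tags


def transitions(xml_text):
    """Return {node name: [names of the nodes it can transition to]}."""
    root = ET.fromstring(xml_text.strip())
    edges = {}
    for node in root:
        tag = node.tag.replace(NS, "")
        if tag == "start":
            # The start node carries its transition on its own "to" attribute.
            edges["start"] = [node.get("to")]
        else:
            # Other nodes carry transitions on children such as <ok> and <error>.
            edges[node.get("name")] = [c.get("to") for c in node if c.get("to")]
    return edges


def is_acyclic(edges):
    """Depth-first search; a node revisited while still in progress means a cycle."""
    in_progress, done = set(), set()

    def visit(name):
        if name in done:
            return True
        if name in in_progress:
            return False
        in_progress.add(name)
        ok = all(visit(target) for target in edges.get(name, []))
        in_progress.discard(name)
        done.add(name)
        return ok

    return all(visit(name) for name in list(edges))


if __name__ == "__main__":
    graph = transitions(WORKFLOW_XML)
    for source, targets in graph.items():
        print(source, "->", targets)
    print("acyclic:", is_acyclic(graph))
```

Each action names the node to go to on success (`ok`) and on failure (`error`), which is what turns a flat list of actions into a graph.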
Oozie – Why use it?
• It is designed for Hadoop
• It is open source
• It is designed for big data
• It allows you to design task workflows
• It allows you to interact with jobs
  • Stop, start, suspend, resume, rerun
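The job controls listed above map onto sub-commands of the `oozie job` command-line client. The sketch below only builds the command lines as argument lists and does not execute anything. The server URL (Oozie's default port on localhost), the properties file name and the job id are assumptions for illustration, and the helper names are hypothetical. "Stop" is mapped to the client's `-kill` flag.

```python
import shlex

# Assumed server address; adjust for a real installation.
OOZIE_URL = "http://localhost:11000/oozie"

# Slide wording -> flag of the "oozie job" sub-command that takes a job id.
ID_FLAGS = {
    "start": "-start",
    "suspend": "-suspend",
    "resume": "-resume",
    "stop": "-kill",
}


def run_command(properties_file, oozie_url=OOZIE_URL):
    """Command line that submits and starts a new job from a properties file."""
    return ["oozie", "job", "-oozie", oozie_url, "-config", properties_file, "-run"]


def job_command(action, job_id, oozie_url=OOZIE_URL):
    """Command line for an action on an existing job (start/suspend/resume/stop)."""
    if action not in ID_FLAGS:
        raise ValueError(f"unsupported action: {action}")
    return ["oozie", "job", "-oozie", oozie_url, ID_FLAGS[action], job_id]


def rerun_command(job_id, properties_file, oozie_url=OOZIE_URL):
    """Command line that reruns a finished job; a rerun also needs a config file."""
    return ["oozie", "job", "-oozie", oozie_url,
            "-config", properties_file, "-rerun", job_id]


if __name__ == "__main__":
    job_id = "EXAMPLE-JOB-ID"  # placeholder, not a real Oozie job id
    print(shlex.join(run_command("job.properties")))
    for action in ("suspend", "resume", "stop"):
        print(shlex.join(job_command(action, job_id)))
    print(shlex.join(rerun_command(job_id, "job.properties")))
```

The resulting lists could be passed to `subprocess.run` on a machine where the Oozie client is installed.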
Oozie – Architecture
• Install Oozie on an edge node, not on the cluster
• Oozie has a client
  • Launches jobs and talks to the server
• Oozie has a server
  • Controls jobs
  • Launches jobs
• Pipelines
  • Chained workflows
  • The output of one workflow is the input to the next
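The pipeline idea, where each workflow's output becomes the next workflow's input, can be modelled in a few lines of plain Python. This is only a conceptual sketch and does not use any Oozie API. The `Workflow` class, the `chain` helper, the workflow names and the directory layout are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Workflow:
    name: str
    input_dir: str
    output_dir: str


def chain(names, first_input, base_dir):
    """Build workflows in order so each one reads what the previous one wrote."""
    workflows = []
    current_input = first_input
    for name in names:
        output_dir = f"{base_dir}/{name}/output"  # assumed directory layout
        workflows.append(Workflow(name, current_input, output_dir))
        current_input = output_dir  # output of this step feeds the next step
    return workflows


if __name__ == "__main__":
    for wf in chain(["ingest", "clean", "report"], "/data/raw", "/data/pipeline"):
        print(f"{wf.name}: {wf.input_dir} -> {wf.output_dir}")
```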
Contact Us
• Feel free to contact us at
  • www.semtech-solutions.co.nz
  • info@semtech-solutions.co.nz
• We offer IT project consultancy
• We are happy to hear about your problems
• You can just pay for those hours that you need
  • To solve your problems