An introduction to Apache Chukwa: what is it and how does it work? Why is it important to monitor Hadoop DFS, and how can it help us?
Apache Chukwa
• What is it?
• How does it work?
• What can we collect?
• Architecture
www.semtech-solutions.co.nz info@semtech-solutions.co.nz
Chukwa – What is it?
• For log collection and analysis
• Designed for big data
• Designed for Hadoop
• Uses HDFS and MapReduce
• Scalable
• Robust
• Provides a toolkit to analyse logs
Chukwa – How does it work?
• Chukwa agents run on the source nodes
• Agents transfer data to collectors, which save it to HDFS
• Data sinks contain raw, unsorted data
• Data in the sinks is cleaned
• Demux adds structure to create Chukwa records
• Chukwa records go to a database
• Records are then ready to be analysed
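The flow on this slide can be pictured with a small sketch. It is a toy, in-memory model only: `Chunk`, `agent_collect`, `collector_write` and `demux` are hypothetical names chosen for illustration, not Chukwa's API, and the assumed log line layout (`time level message`) is invented for the example. A plain list stands in for the HDFS data sink.

```python
from dataclasses import dataclass


@dataclass
class Chunk:
    source: str  # host the (hypothetical) agent runs on
    stream: str  # name of the log being collected
    data: str    # raw, unparsed log text


def agent_collect(source, stream, lines, chunk_size=2):
    """Group raw log lines into chunks, as an agent might before sending."""
    return [
        Chunk(source, stream, "\n".join(lines[i:i + chunk_size]))
        for i in range(0, len(lines), chunk_size)
    ]


def collector_write(sink, chunks):
    """Append chunks to the sink in arrival order (raw and unsorted)."""
    sink.extend(chunks)


def demux(sink):
    """Parse each raw line into a structured record, sorted by timestamp."""
    records = []
    for chunk in sink:
        for line in chunk.data.split("\n"):
            # Assumed line layout: "<time> <level> <message>"
            timestamp, level, message = line.split(" ", 2)
            records.append({
                "time": timestamp,
                "level": level,
                "source": chunk.source,
                "body": message,
            })
    return sorted(records, key=lambda record: record["time"])


# Two agents send to one collector; arrival order is not time order.
sink = []
collector_write(sink, agent_collect("node2", "app.log", ["12:00:05 WARN disk at 91%"]))
collector_write(sink, agent_collect("node1", "app.log", ["12:00:01 INFO started", "12:00:03 INFO heartbeat"]))
records = demux(sink)
```

The sink holds the chunks exactly as they arrived, while `records` is structured and ordered, which is the state in which the data would be loaded into a database for analysis.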
Chukwa – What can we collect?
• Metrics
• System logs
• Defined format
• Undefined format
• Low latency
• Access to log data
Chukwa – Architecture
• Chukwa agents
• Reside on the Hadoop machines
• Collect raw data
• Use adaptors for data sources
• Use HTTP to transmit data
• Operate on data chunks
• Can fail over between collectors
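Failover between collectors can be illustrated with a minimal sketch. Everything here is hypothetical: `send_with_failover`, `make_collector` and `CollectorDown` are illustrative names, not part of Chukwa, and a plain function call stands in for the HTTP request so that the example runs without a network.

```python
class CollectorDown(Exception):
    """Raised by a sender when its collector cannot be reached."""


def send_with_failover(chunk, collectors):
    """Try each collector in turn and return the name of the one that accepted.

    `collectors` is a list of (name, send) pairs, where send(chunk) either
    returns normally or raises CollectorDown.
    """
    for name, send in collectors:
        try:
            send(chunk)
            return name
        except CollectorDown:
            continue  # fail over to the next collector in the list
    raise CollectorDown("no collector accepted the chunk")


def make_collector(received, up=True):
    """Build a stand-in sender that stores chunks or simulates an outage."""
    def send(chunk):
        if not up:
            raise CollectorDown("collector unavailable")
        received.append(chunk)
    return send


# The first collector is down, so the chunk should land on the second.
primary, backup = [], []
collectors = [
    ("collector-a", make_collector(primary, up=False)),
    ("collector-b", make_collector(backup)),
]
accepted_by = send_with_failover("raw log chunk", collectors)
```

Here the agent side only needs an ordered list of collectors; a chunk is handed to the first one that accepts it, and an error is raised only when every collector has refused.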
Contact Us
• Feel free to contact us at
• www.semtech-solutions.co.nz
• info@semtech-solutions.co.nz
• We offer IT project consultancy
• We are happy to hear about your problems
• You can just pay for those hours that you need to solve your problems