An introduction to Apache Giraph: what is it? Graph processing (BSP) for Hadoop V2 (YARN).
Apache Giraph • What is it? • How does it work? • Dependencies • Examples www.semtech-solutions.co.nz info@semtech-solutions.co.nz
Giraph – What is it? • Graph processing for Hadoop V2 • For tasks that don't fit MapReduce • Better performance for those tasks • Processing proceeds in iterations called supersteps • Uses Bulk Synchronous Parallel (BSP) computing • See the Apache Hama presentation • Licensed under the Apache License • For distributed computing • For massive calculations
Giraph – How does it work? • Consider an example • The input is a chain graph • The goal is to find the shortest path • It takes three supersteps • Vertices have values • So do edges • Messages are passed between supersteps
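The superstep idea above can be sketched in a few lines of plain Python. This is not Giraph code and does not use the Giraph API; it is a minimal simulation of the BSP pattern, assuming a hypothetical directed chain A → B → C with made-up edge weights (the graph on the original slide is not reproduced here). The function name `shortest_paths_bsp` is an illustration only.

```python
import math


def shortest_paths_bsp(edges, source):
    """Simulate BSP-style single-source shortest paths.

    edges maps each vertex to a list of (target, weight) pairs.
    Returns (vertex values, number of supersteps executed).
    """
    # Every vertex value starts at infinity (no known path yet).
    values = {v: math.inf for v in edges}
    # Model the start as a message of distance 0 delivered to the source.
    inbox = {v: [] for v in edges}
    inbox[source].append(0.0)

    supersteps = 0
    while any(inbox.values()):
        outbox = {v: [] for v in edges}
        for vertex, messages in inbox.items():
            if not messages:
                continue  # no messages: the vertex stays idle this superstep
            best = min(messages)
            if best < values[vertex]:
                values[vertex] = best
                # Tell each neighbour the distance reachable through this vertex.
                for target, weight in edges[vertex]:
                    outbox[target].append(best + weight)
        # Barrier: messages sent now are only visible in the next superstep.
        inbox = outbox
        supersteps += 1
    return values, supersteps


# Hypothetical three-vertex chain with assumed edge values.
chain = {"A": [("B", 1.0)], "B": [("C", 2.0)], "C": []}
values, supersteps = shortest_paths_bsp(chain, "A")
print(values, supersteps)
```

For this assumed chain the run finishes after three supersteps, one per vertex along the chain, because each vertex can only react to messages sent in the previous superstep.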
Giraph – Dependencies • What does Apache Giraph need? • Java 1.6 • Maven 3 or higher • ZooKeeper • Hadoop, either YARN (2.0.3-alpha) or version 0.20.x • So Giraph is graph processing for Hadoop V2! • Based on Google Pregel
Giraph – Examples • Consider the distance-between-friends problem • Facebook friends • (and) LinkedIn connections • Find the shortest distance between friends • It's a graph problem • Processing-intensive to do as a MapReduce job • See next two slides
Giraph – Examples [slide image]
Giraph – Examples [slide image]
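The distance-between-friends problem can be sketched the same way. The following is a plain Python illustration, not Giraph code: every friendship counts as one hop, and each pass of the loop plays the role of one superstep in which the newly reached people notify their friends. The friend names and the function name `degrees_of_separation` are hypothetical.

```python
def degrees_of_separation(friends, start):
    """Return the hop count from start to every reachable person.

    friends maps each person to the set of their friends (symmetric).
    """
    distance = {start: 0}
    frontier = {start}  # people first reached in the previous superstep
    hops = 0
    while frontier:
        hops += 1
        # "Messages": everyone in the frontier notifies all of their friends.
        notified = set()
        for person in frontier:
            notified |= friends[person]
        # Only people not seen before record a distance and carry on.
        frontier = {p for p in notified if p not in distance}
        for person in frontier:
            distance[person] = hops
    return distance


# Hypothetical friendship graph; "eve" has no friends and is never reached.
friends = {
    "ann": {"bob"},
    "bob": {"ann", "cat"},
    "cat": {"bob", "dan"},
    "dan": {"cat"},
    "eve": set(),
}
print(degrees_of_separation(friends, "ann"))
```

People who are not connected to the start, such as "eve" here, simply never appear in the result.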
Contact Us • Feel free to contact us at • www.semtech-solutions.co.nz • info@semtech-solutions.co.nz • We offer IT project consultancy • We are happy to hear about your problems • You pay only for the hours you need to solve your problems