A short introduction to MongoDB: what is it and how does it work? How can it be used with Hadoop to process big data?
MongoDB
• What is it?
• Features
• Tools
• Use with Hadoop
• Hadoop Tools
www.semtech-solutions.co.nz info@semtech-solutions.co.nz
MongoDB – What is it?
• Document-oriented NoSQL database
• BSON (Binary JSON) data format; documents are not bound to a fixed schema
• Released as open source / free
• Can be used as a distributed database
• Has load balancing
• Has replication
• Written in C++
• Server licensed under the GNU AGPL (SSPL for releases since October 2018); drivers licensed under Apache
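To make "Binary JSON" concrete, here is a minimal sketch of how a flat document could be serialised to BSON bytes. It assumes only two value types (UTF-8 strings and 32-bit integers); the function names `encode_cstring`, `encode_element` and `encode_document` are made up for this illustration and are not part of any MongoDB driver.

```python
import struct


def encode_cstring(text: str) -> bytes:
    """Encode a BSON cstring: UTF-8 bytes followed by a null terminator."""
    return text.encode("utf-8") + b"\x00"


def encode_element(name: str, value) -> bytes:
    """Encode one key/value pair; this sketch handles only str and int32."""
    if isinstance(value, str):
        data = value.encode("utf-8") + b"\x00"
        # Type 0x02 (string): int32 byte length including the terminator, then the bytes.
        return b"\x02" + encode_cstring(name) + struct.pack("<i", len(data)) + data
    if isinstance(value, int) and not isinstance(value, bool):
        # Type 0x10 (int32): four little-endian bytes.
        return b"\x10" + encode_cstring(name) + struct.pack("<i", value)
    raise TypeError(f"unsupported type in this sketch: {type(value).__name__}")


def encode_document(doc: dict) -> bytes:
    """Encode a flat dict as BSON: int32 total size, the elements, a final 0x00."""
    body = b"".join(encode_element(key, value) for key, value in doc.items())
    total_size = 4 + len(body) + 1  # size prefix + elements + terminator
    return struct.pack("<i", total_size) + body + b"\x00"


encoded = encode_document({"hello": "world"})
print(encoded)
# prints b'\x16\x00\x00\x00\x02hello\x00\x06\x00\x00\x00world\x00\x00'
```

The length prefixes are what let a reader skip over fields without parsing them, which is one practical difference from plain JSON text.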
MongoDB – Features
• Queries
  • By field
  • By regular expression
  • User-defined JavaScript functions
  • By range
• Indexes
  • Primary and secondary
  • Any document field
• Replication
  • A primary (master) can replicate to multiple secondaries (slaves)
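The query styles listed above (by field, by regular expression, by range) can be sketched with a small in-memory matcher. This is an illustration only, not how MongoDB evaluates queries: `matches` and `find` are hypothetical helpers, the sample data is invented, and only the `$regex`, `$gte` and `$lt` operator names are borrowed from MongoDB's query language.

```python
import re


def matches(doc: dict, query: dict) -> bool:
    """Return True if the document satisfies every condition in the query."""
    for field, condition in query.items():
        if field not in doc:
            return False
        value = doc[field]
        if isinstance(condition, dict):
            # Operator form, e.g. {"age": {"$gte": 30, "$lt": 40}}.
            for op, operand in condition.items():
                if op == "$gte":
                    ok = value >= operand
                elif op == "$lt":
                    ok = value < operand
                elif op == "$regex":
                    ok = isinstance(value, str) and re.search(operand, value) is not None
                else:
                    raise ValueError(f"operator not covered by this sketch: {op}")
                if not ok:
                    return False
        elif value != condition:
            # Plain form, e.g. {"city": "Wellington"}: exact match on the field.
            return False
    return True


def find(collection: list, query: dict) -> list:
    """Return all documents in the collection that match the query."""
    return [doc for doc in collection if matches(doc, query)]


people = [
    {"name": "Alice", "age": 34, "city": "Wellington"},
    {"name": "Bob", "age": 27, "city": "Auckland"},
    {"name": "Alan", "age": 45, "city": "Wellington"},
]

by_field = find(people, {"city": "Wellington"})            # Alice and Alan
by_regex = find(people, {"name": {"$regex": "^Al"}})       # Alice and Alan
by_range = find(people, {"age": {"$gte": 30, "$lt": 40}})  # Alice only
```

A linear scan like this is what an index avoids: with an index on `age`, the range query would not need to touch every document.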
MongoDB – Features
• Load balancing
  • Data split across multiple shards
  • DB scales using shards
  • New machines can be added to a running database
• Map reduce can be used for aggregation
• File storage via GridFS
  • Load balanced file system
  • File system with replication
  • Functions available for file manipulation
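A rough sketch of the two ideas above, splitting documents across shards and aggregating them with map reduce, might look like the following. The hash-modulo routing in `shard_for` is an assumption chosen for brevity and is not MongoDB's actual chunk-based balancing; all function names and the sample orders are hypothetical.

```python
import hashlib
from collections import defaultdict


def shard_for(key: str, shard_count: int) -> int:
    """Pick a shard by hashing the shard key (simplified routing rule)."""
    digest = hashlib.md5(key.encode("utf-8")).hexdigest()
    return int(digest, 16) % shard_count


def distribute(docs: list, shard_key: str, shard_count: int) -> list:
    """Split documents into shard_count lists according to their shard key."""
    shards = [[] for _ in range(shard_count)]
    for doc in docs:
        shards[shard_for(str(doc[shard_key]), shard_count)].append(doc)
    return shards


def map_reduce(shards: list, map_fn, reduce_fn) -> dict:
    """Run map_fn over every document on every shard, then reduce per key."""
    grouped = defaultdict(list)
    for shard in shards:
        for doc in shard:
            for key, value in map_fn(doc):
                grouped[key].append(value)
    return {key: reduce_fn(key, values) for key, values in grouped.items()}


orders = [
    {"order_id": "a1", "customer": "alice", "total": 20},
    {"order_id": "b2", "customer": "bob", "total": 15},
    {"order_id": "c3", "customer": "alice", "total": 5},
    {"order_id": "d4", "customer": "bob", "total": 10},
]

shards = distribute(orders, "order_id", shard_count=3)
totals = map_reduce(
    shards,
    map_fn=lambda doc: [(doc["customer"], doc["total"])],
    reduce_fn=lambda key, values: sum(values),
)
# totals holds 25 for "alice" and 25 for "bob", whichever shard each order landed on
```

The point of the sketch is that the aggregate does not depend on how the documents were distributed, which is what allows shards to be added without changing the query.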
MongoDB – Tools
• mongo – a db access shell and admin tool
• mongostat – a status tool similar to vmstat
• mongotop – shows read/write time per collection, in the style of the Unix top command
• mongosniff – low level traffic sniffing
• mongoimport – import JSON, CSV or TSV data
• mongoexport – export tool (JSON or CSV)
• mongodump – dump database contents
• mongorestore – reload database dumps
MongoDB – With Hadoop
• Hadoop connector available from GitHub
• Lets Hadoop jobs read from and write to MongoDB
• Compiles with SBT build tool
• Supports Hadoop
  • 0.20/0.20.x
  • 1.0/1.0.x
  • 1.1/1.1.x
  • 0.21/0.21.x
  • CDH3
  • CDH4
MongoDB – Attributes
The image on the left shows how Hadoop and its tools are used with MongoDB via a connector. The image on the right shows MongoDB attributes.
MongoDB – Hadoop Tools
• The Hadoop connector supports
  • Map Reduce
  • Pig
  • Hadoop streaming
  • Flume
  • Hive
  • Hive BSON file access
• MongoDB can use HDFS for storage
MongoDB – Architecture
• A db server
  • Has many databases
• A database
  • Has many collections
• A collection
  • Has many documents
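The server, database, collection, document hierarchy above can be modelled in a few lines. This is a toy in-memory model, not a client for a real server; the `Server`, `Database` and `Collection` classes and the sample data are hypothetical. Creating a database or collection on first access is a choice made here to echo the way MongoDB creates them implicitly on first write.

```python
from dataclasses import dataclass, field


@dataclass
class Collection:
    name: str
    documents: list = field(default_factory=list)

    def insert(self, document: dict) -> None:
        """Add a document; documents in one collection need not share fields."""
        self.documents.append(document)


@dataclass
class Database:
    name: str
    collections: dict = field(default_factory=dict)

    def collection(self, name: str) -> Collection:
        """Return the named collection, creating it on first access."""
        return self.collections.setdefault(name, Collection(name))


@dataclass
class Server:
    databases: dict = field(default_factory=dict)

    def database(self, name: str) -> Database:
        """Return the named database, creating it on first access."""
        return self.databases.setdefault(name, Database(name))


server = Server()
server.database("shop").collection("orders").insert({"item": "book", "qty": 2})
server.database("shop").collection("orders").insert({"item": "pen", "colour": "blue"})
server.database("shop").collection("customers").insert({"name": "Alice"})

print(sorted(server.databases["shop"].collections))
# prints ['customers', 'orders']
```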
Contact Us
• Feel free to contact us at
  • www.semtech-solutions.co.nz
  • info@semtech-solutions.co.nz
• We offer IT project consultancy
• We are happy to hear about your problems
• You can just pay for those hours that you need
  • To solve your problems