MongoDB on Azure • Agenda • Me / iQmetrix • Architecture • NoSQL databases • MongoDB • 10gen • Running MongoDB in Azure • History, Issues • MMS [if we have time]
John Woakes • I work for iQmetrix as a Lead Developer • Started as a COBOL programmer • Oracle Developer/DBA • SQL Server and .NET • johnw@iqmetrix.com • This is my first presentation – be gentle.
iQmetrix • Started in Regina 1999-ish • Vancouver, Winnipeg and Charlotte • Creating Software for the mobile retail industry • RQ4 – Retail Management System • XQ – Interactive Retail
Azure Components • 4 Web Roles – HTTP API and Console • 3 Worker Roles – MongoDB replica sets • 1 Worker Role – background functions • 1 Azure Queue • Several Service Bus Queues • Blob storage for images/video and backups • Table storage for logs
Clients • About 1000 devices • 4 customer-facing client applications • Adplay media player • Browse app for detailed product information • Stream app • iPad Browse • RQ4 updates prices and quantities
NoSQL Databases • Relatively new • Often open source • Designed to work well in the cloud • Scale out not up • Good for big data • Not a replacement for Relational Databases
NoSQL Databases • MongoDB - http://www.mongodb.org • CouchDB • RavenDB • BigTable • Cassandra
Users of MongoDB • Craigslist • Foursquare • EA • GitHub • Disney • MTV • Loggly
Why MongoDB? • What uses the most CPU/IO in Relational Databases? • Transactions • Joins • MongoDB is document based (no joins, no transactions) • Documents can contain sub-documents • Can index sub-documents and arrays
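To make the "documents can contain sub-documents" point concrete, here is a minimal sketch in plain Python. The product document and its field names are made up for illustration, and `get_path` is a toy helper, not part of MongoDB or any driver. It only mimics the dotted-path idea (for example `price.currency` or `stock.store`) that MongoDB uses to address fields inside sub-documents and arrays.

```python
from typing import Any

# Hypothetical product document: every field name here is illustrative only.
product = {
    "sku": "PH-1001",
    "name": "Example Phone",
    "price": {"amount": 199.99, "currency": "CAD"},  # sub-document
    "stock": [                                        # array of sub-documents
        {"store": "Regina", "qty": 4},
        {"store": "Vancouver", "qty": 0},
    ],
}


def get_path(doc: Any, path: str) -> list:
    """Collect every value reachable through a dotted path.

    Arrays are fanned out, so "stock.store" yields the "store" field of
    each element. This is a toy lookup, not MongoDB's query engine.
    """
    values = [doc]
    for key in path.split("."):
        next_values = []
        for value in values:
            candidates = value if isinstance(value, list) else [value]
            for item in candidates:
                if isinstance(item, dict) and key in item:
                    next_values.append(item[key])
        values = next_values
    return values
```

Because the price and stock levels live inside the product document, reading them needs no join; `get_path(product, "stock.store")` walks the embedded array directly.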
Change the way you think • Important points • Schema design is very important • MongoDB itself enforces no schema • Resist normalization • Uses a memory-mapped storage model • Goes faster with lots of servers and lots of RAM
10gen • CEO developed MongoDB • It is open source and is on GitHub – C++ • 10gen (http://www.10gen.com/) provides • Support • Training, including online videos • Conferences worldwide • Drivers in lots of languages [including .NET] • Code and documentation, including Mongo in Azure
Main Points • Stores data as BSON • Console uses JavaScript and JSON • 3rd party GUI tools • Database runs as an exe or Windows service • Memory mapped so really needs a 64-bit OS • User group on Google Groups [busy] • C# driver has LINQ support and full functionality • Sharding for fast access and huge storage
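For a feel of what "stores data as BSON" means on the wire, the sketch below hand-encodes a tiny subset of BSON (strings, 32-bit integers, booleans and embedded documents) using only the standard library. It is an illustration of the format, not a replacement for a driver's encoder, and the function names are my own.

```python
import struct


def encode_cstring(text: str) -> bytes:
    """Field names are stored as NUL-terminated UTF-8."""
    return text.encode("utf-8") + b"\x00"


def encode_document(doc: dict) -> bytes:
    """Encode a dict into BSON bytes (small subset of types only)."""
    body = b""
    for name, value in doc.items():
        if isinstance(value, bool):  # check bool first: bool is a subclass of int
            body += b"\x08" + encode_cstring(name) + (b"\x01" if value else b"\x00")
        elif isinstance(value, int):  # int32 only; struct raises if out of range
            body += b"\x10" + encode_cstring(name) + struct.pack("<i", value)
        elif isinstance(value, str):
            data = value.encode("utf-8") + b"\x00"
            body += b"\x02" + encode_cstring(name) + struct.pack("<i", len(data)) + data
        elif isinstance(value, dict):  # embedded sub-document
            body += b"\x03" + encode_cstring(name) + encode_document(value)
        else:
            raise TypeError(f"type not covered by this sketch: {type(value).__name__}")
    # Layout: int32 total length, the elements, then a terminating NUL byte.
    total = 4 + len(body) + 1
    return struct.pack("<i", total) + body + b"\x00"
```

Each document carries its own length prefix and each element carries a type byte, which is what lets the server skip over fields without parsing text the way it would have to with JSON.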
Main Points cont. • Good for terabytes of data and billions of records • Replica Sets for high availability [auto failover] • Sharding spreads queries over multiple servers • Geospatial data and functions • GridFS for storing large binary data
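The "sharding spreads queries over multiple servers" point can be pictured with a toy router. MongoDB partitions a sharded collection into ranges of shard-key values (chunks) and assigns each chunk to a shard. The sketch below imitates only that lookup step; the split points and shard names are invented, and real routing is done by MongoDB itself, not by application code like this.

```python
import bisect

# Hypothetical chunk boundaries on a string shard key.
# Chunks: [min, "h"), ["h", "p"), ["p", max)
SPLIT_POINTS = ["h", "p"]
CHUNK_OWNERS = ["shard0", "shard1", "shard2"]  # one owner per chunk


def route(shard_key: str) -> str:
    """Return the shard that owns the chunk containing shard_key."""
    # bisect_right makes each chunk's lower bound inclusive.
    index = bisect.bisect_right(SPLIT_POINTS, shard_key)
    return CHUNK_OWNERS[index]
```

A query that includes the shard key touches one server; a query without it has to be sent to every shard, which is why the choice of shard key matters.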
Azure • Microsoft and 10gen worked together to make MongoDB work well in Azure • Microsoft support and promote MongoDB on Azure • Microsoft Open Technologies, Inc. • 10gen have code and documentation on their site to run MongoDB on Azure
Azure Options • Run mongo on an Azure Virtual Machine • You have to look after the OS and Firewall • You could use Linux or Windows • It is more hands on • If you have a VM already then you can use that. • Use an Azure Data Drive for persistence
Azure Options • Use an Azure Worker Role to host MongoDB • PaaS so MS looks after OS • Runs on internal IP addresses so no Firewall concerns • This is what we use and will now describe
Azure Code • Original article that got us going http://captaincodeman.com/2010/05/24/mongodb-azure-clouddrive/ • 10gen’s documentation http://www.mongodb.org/display/DOCS/MongoDB+on+Azure+Worker+Roles • 10gen’s source code https://github.com/mongodb/mongo-azure/
Azure Code • Use Worker Roles to run MongoDB • Need Azure SDK 1.7 [June 2012] • Use an Azure Cloud Drive to persist the database • Download the MongoDB binaries http://www.mongodb.org/downloads [get the 64-bit 2008+ build] • In the worker role ServiceConfiguration file add • <ServiceConfiguration osFamily="2" osVersion="*"> • <Role name="MongoWorker" vmName="MongoDB">
Azure Code • Include the MongoDB binaries you need in the Worker Role project [set to content and Copy Always] • mongod.exe – database engine • mongo.exe – console • mongodump.exe – used to backup data • mongorestore.exe – used to restore data • mongostat.exe – useful for real time stats
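As a rough illustration of how the backup and restore tools listed above are driven, the helpers below build argument lists for `mongodump` and `mongorestore`. The `--host`, `--port` and `--out` options are standard for those tools; the function names, host, port and directory values are assumptions for the sketch, and nothing is executed here.

```python
def mongodump_args(host: str, port: int, out_dir: str, exe: str = "mongodump.exe") -> list:
    """Arguments that dump every database on host:port into out_dir."""
    return [exe, "--host", host, "--port", str(port), "--out", out_dir]


def mongorestore_args(host: str, port: int, dump_dir: str, exe: str = "mongorestore.exe") -> list:
    """Arguments that restore a dump directory produced by mongodump."""
    return [exe, "--host", host, "--port", str(port), dump_dir]
```

The resulting list can be handed to whatever process launcher the role uses; the dump directory would then be copied to Blob storage as a separate step.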
Azure Code • Mount a Cloud Drive with Blob storage backing • Run the mongod.exe in a Process • Set up command line arguments • Hook up listeners on the std and error outputs • Start the process • Once the database is running execute the initialize replica set command
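The process-hosting steps above can be sketched as follows. The talk's implementation is a .NET worker role, so this Python version is only an analogy of the same pattern: build the command line, start the child process, and attach listeners to its standard and error output. `--dbpath`, `--port` and `--replSet` are real `mongod` options; the function names and the values passed to them are assumptions.

```python
import subprocess
import threading


def mongod_args(db_path: str, port: int, replica_set: str, exe: str = "mongod.exe") -> list:
    """Command line for one replica set member storing data under db_path."""
    return [exe, "--dbpath", db_path, "--port", str(port), "--replSet", replica_set]


def start_process(args: list, on_stdout, on_stderr):
    """Start a child process and forward each output line to a callback."""
    process = subprocess.Popen(
        args, stdout=subprocess.PIPE, stderr=subprocess.PIPE, text=True
    )

    def pump(stream, callback):
        # Runs on its own thread so neither pipe can fill up and block the child.
        for line in stream:
            callback(line.rstrip("\r\n"))
        stream.close()

    threads = [
        threading.Thread(target=pump, args=(process.stdout, on_stdout), daemon=True),
        threading.Thread(target=pump, args=(process.stderr, on_stderr), daemon=True),
    ]
    for thread in threads:
        thread.start()
    return process, threads
```

In a role, `db_path` would point at the mounted Cloud Drive and the callbacks would write to the role's log, so that anything `mongod` prints ends up somewhere it can be monitored.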
Azure Code • Download the C# driver from NuGet • Other role instances in the hosted service can talk to the database with this driver • Use role instance host names when setting up the connection string • Host names are zero-based and follow the pattern NAMEx, not one-based with the pattern NAME0x as the docs tell you
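Here is a small sketch of the host-name rule and how it feeds both the connection string and the replica set initialization. It assumes the `vmName` of `MongoDB` from the configuration slide, three instances, MongoDB's default port 27017 and a replica set called `rs0`; the replica set name and the helper names are placeholders of mine. The URI form and the shape of the initiate document follow standard MongoDB conventions.

```python
def instance_hosts(vm_name: str, instance_count: int, port: int = 27017) -> list:
    """Zero-based host names: MongoDB0, MongoDB1, ... as described in the talk."""
    return [f"{vm_name}{index}:{port}" for index in range(instance_count)]


def connection_string(hosts: list, replica_set: str) -> str:
    """Standard mongodb:// URI listing every member as a seed."""
    return "mongodb://" + ",".join(hosts) + "/?replicaSet=" + replica_set


def replica_set_config(hosts: list, replica_set: str) -> dict:
    """Document to pass to the replica set initiate command."""
    return {
        "_id": replica_set,
        "members": [{"_id": index, "host": host} for index, host in enumerate(hosts)],
    }
```

Listing every member in the connection string lets a driver find the current primary after a failover without any configuration change.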
History • Went to PDC about 2 years ago • Got Mongo running on a single instance • Went into production about 1 year ago • Added replica sets about 6 months ago • 4-week iterations – just moved to 2-week
Issues • Outages when on one instance • Alpha MongoDB code using replica sets • Azure outage on Service Bus • An unknown outage just this month • Back up data • Make your code as resilient as possible • Log everything • Monitor logs and have alerts
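One common way to act on "make your code as resilient as possible" and "log everything" is to wrap database calls in a retry with exponential backoff. This is a generic sketch, not the approach described in the talk; the helper name, attempt count and delays are arbitrary assumptions.

```python
import time


def with_retries(operation, attempts=5, base_delay=0.5,
                 retry_on=(Exception,), sleep=time.sleep, log=print):
    """Call operation(), retrying on failure with exponential backoff.

    Every failure is logged; the last failure is re-raised to the caller.
    """
    for attempt in range(1, attempts + 1):
        try:
            return operation()
        except retry_on as error:
            log(f"attempt {attempt}/{attempts} failed: {error!r}")
            if attempt == attempts:
                raise
            # Delays grow as base_delay, 2x, 4x, ... between attempts.
            sleep(base_delay * 2 ** (attempt - 1))
```

During a replica set failover the first attempts may fail while a new primary is elected, so a few spaced-out retries are often enough to ride through it. Narrowing `retry_on` to connection-type errors avoids retrying genuine bugs.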
MMS • Very cool and useful free service from 10gen • Install a Python agent on an instance that has access to the MongoDB servers • The agent sends stats data to a 10gen server • You then log in to the MMS dashboard website and monitor your databases in real time • http://www.mongodb.org/display/DOCS/MongoDB+Monitoring+Service