Edmunds’ Pomelo: Automobile Dealership Analytics in Real Time using MongoDB
• April 3rd, 2012
• Greg Rokita, Sharat Nair
• Edmunds.com, Inc.
Prepared by Gregory Rokita
Assumptions
• Understanding of MongoDB
• Experience with Java
• Basic understanding of serialization protocols e.g. Thrift, Protocol Buffers
• Basic understanding of messaging protocols e.g. JMS
Agenda
• Edmunds
• Scale of Big Data operations
• Use case for Pomelo Application
• System Overview & Design
• Real time integration with MongoDB
• Real time data creation for MongoDB
• Implementation
• MongoDB Consumer
• MongoDB REST service
• Q&A
Edmunds.com and Scale
• Premier online resource for automotive information, launched in 1995 as the first automotive information Web site
• 15 million unique visitors
• 210 million page views
• 1 million+ new inventory items per day
• 2 TB of new data every month
• 40-node Hadoop cluster aggregating logs, transactions, calls, referrals, advertising, vehicle, pricing, inventory and other data sets
Pomelo Application
• Analytics tool for Automotive Dealers and Edmunds’ Dealer Sales
• Performance measurement for Edmunds traffic and its correlation to calls & referrals
• iPad, HTML5, Sencha Touch & Charts
Targeting MongoDB - Producer-Consumer matching
[Diagram. Components: Generic Thrift Producer (GTP), MongoDB Consumer, Broker, Destination Interceptor, DealerMetrics Virtual Topic, DealerMetrics Queue. Arrows labeled "Publish DealerMetrics". Environments: Prod and Test, in LAX and EC2.]
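The labels on this slide (virtual topic, per-consumer queue, destination interceptor) suggest a topic-to-queue fan-out in which every consumer queue receives its own copy of each published message. The following is a minimal in-memory sketch of that idea under that assumption. It is not ActiveMQ code, and the class name `VirtualTopicSketch` and the consumer names are hypothetical.

```java
import java.util.ArrayDeque;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Queue;

/** In-memory illustration of topic-to-queue fan-out; not ActiveMQ code. */
final class VirtualTopicSketch {

    // One queue per named consumer (names are hypothetical).
    private final Map<String, Queue<String>> consumerQueues = new LinkedHashMap<>();

    /** Registers a named consumer queue that is fed by the topic. */
    Queue<String> subscribe(String consumerName) {
        return consumerQueues.computeIfAbsent(consumerName, k -> new ArrayDeque<String>());
    }

    /** Publishing to the topic copies the message into every consumer queue. */
    void publish(String message) {
        for (Queue<String> queue : consumerQueues.values()) {
            queue.add(message);
        }
    }

    public static void main(String[] args) {
        VirtualTopicSketch dealerMetricsTopic = new VirtualTopicSketch();
        Queue<String> prod = dealerMetricsTopic.subscribe("mongodb-prod");
        Queue<String> test = dealerMetricsTopic.subscribe("mongodb-test");
        dealerMetricsTopic.publish("DealerMetrics{cddId=123}");
        // Each consumer sees its own copy of the single published message.
        System.out.println(prod.poll() + " / " + test.poll());
    }
}
```

With this shape, a producer publishes once and stays unaware of how many environments consume the data.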
Integration with MongoDB – layered architecture for transport
• Thrift: type safety, versioning and service
• Camel: retries and error handling
• ActiveMQ: message persistence, durability and failover
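To make the "retries and error handling" layer concrete, here is a rough plain-Java sketch of bounded redelivery: a failing delivery is retried a fixed number of times before the error is passed on. This is not Camel's API, only an illustration of the behavior that layer provides. The class `RedeliverySketch`, its parameters and the simulated failure are all hypothetical.

```java
import java.util.concurrent.Callable;

/** Plain-Java illustration of bounded redelivery; this is not Camel's API. */
final class RedeliverySketch {

    /** Runs the task, retrying up to maxRedeliveries times after the first failure. */
    static <T> T deliver(Callable<T> task, int maxRedeliveries, long delayMillis) throws Exception {
        if (maxRedeliveries < 0) {
            throw new IllegalArgumentException("maxRedeliveries must be >= 0");
        }
        Exception last = null;
        for (int attempt = 0; attempt <= maxRedeliveries; attempt++) {
            try {
                return task.call();
            } catch (Exception e) {
                last = e; // remember the failure
                if (attempt < maxRedeliveries) {
                    Thread.sleep(delayMillis); // wait before the next attempt
                }
            }
        }
        throw last; // retries exhausted: hand the failure to the caller
    }

    public static void main(String[] args) throws Exception {
        int[] calls = {0};
        String result = deliver(() -> {
            if (++calls[0] < 3) {
                throw new IllegalStateException("MongoDB unavailable (simulated)");
            }
            return "stored";
        }, 5, 10L);
        System.out.println(result + " after " + calls[0] + " attempts");
    }
}
```

In the layered design on the slide, the message broker keeps the message durable while a layer like this decides how often to try again.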
Mongo Connection

    <bean id="mongo" class="com.edmunds...MongoDBConnectionFactory">
        <property name="address"
                  value="pl1db470.media.edmunds.com:27017,pl1db471.media.edmunds.com:27017"/>
    </bean>
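The `address` property packs two host:port pairs into one comma-separated string. The factory class itself is not shown in the slides, so how it parses that string is unknown. The sketch below shows one way such a value could be split, using only the JDK. The class `SeedListParser` and its default-port behavior are assumptions.

```java
import java.net.InetSocketAddress;
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper; not part of the Edmunds code or the MongoDB driver. */
final class SeedListParser {

    private static final int DEFAULT_PORT = 27017; // MongoDB's default port

    /** Splits "host1:port1,host2:port2" into unresolved socket addresses. */
    static List<InetSocketAddress> parse(String addresses) {
        List<InetSocketAddress> seeds = new ArrayList<>();
        for (String entry : addresses.split(",")) {
            String trimmed = entry.trim();
            if (trimmed.isEmpty()) {
                continue; // tolerate stray commas
            }
            int colon = trimmed.lastIndexOf(':');
            String host = colon < 0 ? trimmed : trimmed.substring(0, colon);
            int port = colon < 0 ? DEFAULT_PORT : Integer.parseInt(trimmed.substring(colon + 1));
            // createUnresolved avoids a DNS lookup while parsing
            seeds.add(InetSocketAddress.createUnresolved(host, port));
        }
        return seeds;
    }

    public static void main(String[] args) {
        List<InetSocketAddress> seeds =
                parse("pl1db470.media.edmunds.com:27017,pl1db471.media.edmunds.com:27017");
        for (InetSocketAddress seed : seeds) {
            System.out.println(seed.getHostString() + " -> port " + seed.getPort());
        }
    }
}
```

In the 2.x Java driver, `Mongo` has a constructor that accepts a list of `ServerAddress` seeds, so a factory along these lines could convert each pair and hand the list to the driver.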
Mongo Connection - cont’d

    @Autowired
    public MongoDbDealerMetricsConsumer(Mongo mongo) {
        collection = mongo.getDB(DB_NAME).getCollection(COLLECTION_NAME);
        // Descending index on the last-active date, which the DAO sorts on.
        collection.ensureIndex(new BasicDBObject(LAST_ACTIVE_DATE, -1));
    }
Mongo consumer

    private void processDealerMetrics(DealerMetrics dealerMetrics) throws TException {
        String cddId = dealerMetrics.getCddDealershipId();
        BasicDBObject query = new BasicDBObject();
        query.put(CDD_ID, cddId);
        DBObject dmObj = (DBObject) JSON.parse(serializeToJson(dealerMetrics));
        /*
         * findAndModify arguments, in order:
         *   query     - query to match
         *   fields    - fields to be returned
         *   sort      - sort to apply before picking first document
         *   remove    - if true, document found will be removed
         *   update    - update to apply
         *   returnNew - if true, the updated document is returned, otherwise the old
         *               document is returned (or it would be lost forever)
         *   upsert    - do upsert (insert if document not present)
         */
        collection.findAndModify(query, null, null, false, dmObj, true, true);
    }
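Because the update document passed to `findAndModify` contains no update operators and `upsert` is true, the call either replaces the dealer's existing document or inserts a new one, so each dealership id maps to at most one document. The sketch below mimics that outcome with an in-memory map. It is a stand-in for the semantics only, not the driver, and the field names `cddDealershipId` and `calls` are made up for illustration.

```java
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.Map;

/** In-memory stand-in for a collection keyed by dealership id; purely illustrative. */
final class UpsertSketch {

    private final Map<String, Map<String, Object>> byDealerId = new HashMap<>();

    /**
     * Mimics findAndModify(query, null, null, false, update, true, true):
     * replace the matching document, insert it if absent, return the new version.
     */
    Map<String, Object> upsertAndReturnNew(String cddId, Map<String, Object> replacement) {
        Map<String, Object> stored = new LinkedHashMap<>(replacement); // defensive copy
        byDealerId.put(cddId, stored); // put() covers both the insert and the replace case
        return stored;
    }

    int size() {
        return byDealerId.size();
    }

    public static void main(String[] args) {
        UpsertSketch collection = new UpsertSketch();

        Map<String, Object> first = new LinkedHashMap<>();
        first.put("cddDealershipId", "123"); // hypothetical field names
        first.put("calls", 4);
        collection.upsertAndReturnNew("123", first); // no match yet: insert

        Map<String, Object> second = new LinkedHashMap<>();
        second.put("cddDealershipId", "123");
        second.put("calls", 7);
        Map<String, Object> result = collection.upsertAndReturnNew("123", second); // match: replace

        System.out.println(collection.size() + " document(s), calls=" + result.get("calls"));
    }
}
```

The consumer can therefore process repeated messages for the same dealer without creating duplicates.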
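Public interface to Mongo - Dealer

    public List<DBObject> getDocument(String cddId) {
        final BasicDBObject query = new BasicDBObject();
        query.put(CDD_ID, cddId);
        final DBObject object = collection.findOne(query);
        if (object == null) {
            // findOne returns null when nothing matches; avoid a NullPointerException below.
            return Collections.emptyList();
        }
        // Strip internal fields before handing the document to the REST layer.
        object.removeField(OBJECT_ID);
        object.removeField(LAST_ACTIVE_DATE);
        return newArrayList(object);
    }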
Public interface to Mongo - Active list

    public List<DBObject> getActiveList() {
        // Match dealers active on the most recent date whose DMA is in the allowed set.
        final BasicDBObject query = new BasicDBObject();
        query.put(LAST_ACTIVE_DATE, getActiveDate());
        query.put(DMA_NAME, getDmaCriteria());
        // Projection: return only the dealership id and name, without _id.
        final BasicDBObject keys = new BasicDBObject();
        keys.put(OBJECT_ID, 0);
        keys.put(CDD_ID, 1);
        keys.put(DEALERSHIP_NAME, 1);
        return collection.find(query, keys).toArray();
    }

    private Object getActiveDate() {
        // Newest last-active date: sort descending and take the first document.
        return collection.find().sort(getSortCriteria()).next().get(LAST_ACTIVE_DATE);
    }

    private BasicDBObject getSortCriteria() {
        return new BasicDBObject(LAST_ACTIVE_DATE, -1);
    }

    private BasicDBObject getDmaCriteria() {
        return new BasicDBObject("$in", DMAS);
    }
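The active-list lookup runs in two steps: first find the newest last-active date in the collection, then return the id and name of every dealer with that date whose DMA is in the allowed set. The plain-Java sketch below walks through the same logic over an in-memory list. The `Dealer` class, its field types and the sample data are hypothetical, since the slides do not show the document schema.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

/** Plain-Java walk-through of the two-step "active list" query; all data is made up. */
final class ActiveListSketch {

    /** Hypothetical document shape; the real schema is not shown in the slides. */
    static final class Dealer {
        final String cddId;
        final String name;
        final String dma;
        final int lastActiveDate; // e.g. 20120402; the real type is an assumption

        Dealer(String cddId, String name, String dma, int lastActiveDate) {
            this.cddId = cddId;
            this.name = name;
            this.dma = dma;
            this.lastActiveDate = lastActiveDate;
        }
    }

    static List<String> activeList(List<Dealer> all, Set<String> dmas) {
        // Step 1: newest date across the whole collection (the descending sort + next()).
        int newest = Integer.MIN_VALUE;
        for (Dealer d : all) {
            newest = Math.max(newest, d.lastActiveDate);
        }
        // Step 2: keep dealers active on that date whose DMA is in the set ($in),
        // and project only id and name.
        List<String> result = new ArrayList<>();
        for (Dealer d : all) {
            if (d.lastActiveDate == newest && dmas.contains(d.dma)) {
                result.add(d.cddId + ":" + d.name);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        List<Dealer> all = Arrays.asList(
                new Dealer("1", "Alpha Motors", "Los Angeles", 20120402),
                new Dealer("2", "Beta Autos", "Los Angeles", 20120401),
                new Dealer("3", "Gamma Cars", "Chicago", 20120402));
        Set<String> dmas = new HashSet<>(Arrays.asList("Los Angeles"));
        System.out.println(activeList(all, dmas)); // prints [1:Alpha Motors]
    }
}
```

Note that the newest date is taken over all dealers, not only those in the selected DMAs, which matches the unfiltered `find()` in `getActiveDate`.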
REST service

    @GET
    @Path("{id}")
    @Produces(MediaType.APPLICATION_JSON)
    public List<DBObject> get(@PathParam("id") String cddId) {
        return dealerMetricsMongoDao.getDocument(cddId);
    }

    @GET
    @Path("list")
    @Produces(MediaType.APPLICATION_JSON)
    public List<DBObject> getDealerList() {
        return dealerMetricsMongoDao.getActiveList();
    }
Q&A
• Greg Rokita, grokita@edmunds.com
• Sharat Nair, snair@edmunds.com