Search with a Key-Value Store

dylan

Presentation Transcript


  1. Search with a Key-Value Store

  2. Intro to NoSQL • Key-value store • Schemaless • Distributed • Eventually Consistent

  3. Key-Value • Single unique key for each value in the database • Extremely fast look-up • Easy distribution (no such thing as joins)

  4. Schemaless • Critical for extremely large data sets • No ALTER TABLE commands; values have no pre-defined fields

  5. Distributed • Data set is designed to be shared across multiple machines • Typically makes use of commodity servers with enough RAM to keep the entire data set in memory
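One way to picture sharing a data set across machines is to hash each key onto a fixed list of servers. The sketch below is a generic illustration only, not how any particular store assigns data; the shard names and the `shard_for` helper are hypothetical.

```ruby
require "zlib"

# Hypothetical shard names; a real cluster would read these from configuration.
SHARDS = %w[shard-a shard-b shard-c].freeze

# Pick a shard by hashing the key. CRC32 is used because it gives the same
# result in every process, unlike String#hash, which is seeded per process.
def shard_for(key, shards = SHARDS)
  shards[Zlib.crc32(key) % shards.size]
end

puts shard_for("story:dgi3ck")
```

Because no value depends on another (no joins), each key can be routed independently like this.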

  6. Eventually Consistent • Replica nodes are not notified of changes before a success response is returned to the client • Makes NoSQL problematic for highly sensitive transactions (finance, etc.)
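The stale-read window described on this slide can be shown with a toy in-memory model. This is a sketch under simplifying assumptions (one primary, one replica, replication triggered by hand); `ToyCluster` and its methods are hypothetical names, not part of any real driver.

```ruby
# Toy model: the primary acknowledges a write immediately and queues it for
# the replica, which only catches up when replicate! is called.
class ToyCluster
  def initialize
    @primary = {}
    @replica = {}
    @pending = []
  end

  # Returns :ok before the replica has seen the change.
  def write(key, value)
    @primary[key] = value
    @pending << [key, value]
    :ok
  end

  def read_from_replica(key)
    @replica[key]
  end

  # Apply all queued changes to the replica.
  def replicate!
    @pending.each { |key, value| @replica[key] = value }
    @pending.clear
  end
end

cluster = ToyCluster.new
cluster.write("balance:42", 100)
p cluster.read_from_replica("balance:42") # prints nil (stale read)
cluster.replicate!
p cluster.read_from_replica("balance:42") # prints 100
```

A client that reads a balance from the replica between the write and the replication step sees old data, which is the problem for financial transactions.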

  7. Database Design in NoSQL • Denormalization is your friend • Think of collections as views on a data set that

  8. A News Site Using SQL

  9. Loading a Story with SQL • SELECT * FROM comments LEFT JOIN users ON users.id = comments.user_id LEFT JOIN comments children ON children.parent_id = comments.id WHERE comments.story_id = x; • SELECT * FROM stories;

  10. Redesigned in a NoSQL Data Store • Story #dgi3ck: date, headline, content, comments • Comment #la529: content, username, user_image_url, user_id, children • Comment #mn34i: content, username, user_image_url, user_id • Comment #5bg26: content, username, user_image_url, user_id, children

  11. Loading a Story with NoSQL Stories::get(dgi3ck)
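To make the single-lookup idea concrete, here is a minimal sketch that models the store as a plain Ruby hash. The IDs and field names come from the slides; the field values, the way the comments nest, and the `get_story` and `count_comments` helpers are assumptions for illustration.

```ruby
# One denormalized document per story: comments and cached user fields are
# embedded, so no joins are needed. All values are invented placeholders.
STORIES = {
  "dgi3ck" => {
    "date" => "2011-01-01",
    "headline" => "Example headline",
    "content" => "Example body",
    "comments" => [
      { "id" => "la529", "content" => "First comment", "username" => "alice",
        "user_image_url" => "http://example.com/alice.png", "user_id" => 1,
        "children" => [
          { "id" => "mn34i", "content" => "A reply", "username" => "bob",
            "user_image_url" => "http://example.com/bob.png", "user_id" => 2,
            "children" => [] }
        ] },
      { "id" => "5bg26", "content" => "Second comment", "username" => "carol",
        "user_image_url" => "http://example.com/carol.png", "user_id" => 3,
        "children" => [] }
    ]
  }
}.freeze

# A single key lookup returns the story with everything needed to render it.
def get_story(id)
  STORIES.fetch(id)
end

# Count comments at every nesting level.
def count_comments(comments)
  comments.sum { |comment| 1 + count_comments(comment["children"]) }
end

story = get_story("dgi3ck")
puts story["headline"]
puts count_comments(story["comments"])
```

Compared with the SQL version, the page is assembled from one read instead of a multi-join query plus a second query.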

  12. Some Design Considerations • What is the context in which we will access this data? • What data do we need to access outside of this context? • How often does the data change?

  13. Embedded Data • NoSQL can support foreign keys • Some data is more appropriately stored “embedded” in a parent context • E.g. Comments are rarely (if ever) accessed outside of their parent Story

  14. Cached Data • Data from an object that needs to be accessed outside of the current context can be cached • Keep in mind that it may need to be updated • E.g. when a user changes their username, the cached copies in Comments can be updated
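The username example can be sketched as a walk over embedded comments that refreshes the cached copy. The `refresh_cached_username` helper and the sample data are hypothetical; a real application would run an equivalent update against the database.

```ruby
# Walk every comment (and nested child) and refresh the cached username for
# one user. Returns the number of comments that were updated.
def refresh_cached_username(comments, user_id, new_username)
  comments.sum do |comment|
    touched = 0
    if comment["user_id"] == user_id
      comment["username"] = new_username
      touched = 1
    end
    children = comment.fetch("children", [])
    touched + refresh_cached_username(children, user_id, new_username)
  end
end

comments = [
  { "user_id" => 7, "username" => "old_name", "children" => [
    { "user_id" => 8, "username" => "sam" },
    { "user_id" => 7, "username" => "old_name" }
  ] }
]

puts refresh_cached_username(comments, 7, "new_name") # prints 2
```

The cost of this fan-out is why the deck asks how often the data changes before deciding to cache it.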

  15. Several common NoSQL Stores • Memcached • BigTable • SimpleDB • MongoDB

  16. Why we chose MongoDB • Auto-sharding and easy setup for distribution • JavaScript API • Powerful indexing capabilities

  17. MongoDB Libraries • ORM: mongo_mapper • https://github.com/jnunemaker/mongomapper • Underlying Connection: mongo • https://github.com/mongodb/mongo-ruby-driver • BSON support: bson_ext • http://rubygems.org/gems/bson_ext

  18. Lifebooker’s Availability Search • Searches across Services • Filters • Time/Date • Geographical Zone • Service Category • Practitioner Gender • Concurrent Availability • (and several more)

  19. Services, Discounts and Practitioners • Services are offered by Providers • Providers have Practitioners (Employees) • Discounts are applied to Providers for a Service in a given time

  20. Modeling this Data in MongoDB

  21. Embedding with MongoMapper

  22. Indexing and Searching • Mongo offers powerful indexing capabilities • Arrays are “first-class citizens” • Complex indices allow for great performance

  23. Creating Meta-Data • With complex data structures, creating meta-data before_save will allow you to make that data easily searchable • E.g. the maximum discount on a given day for a service
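As a rough illustration of the maximum-discount example, the function below derives a per-day summary from a list of discounts. The shape of a discount (`:day`, `:percent`) and the name `max_discount_by_day` are assumptions; the slide suggests computing such a value in a `before_save` callback and storing it on the document so it can be indexed and queried directly.

```ruby
# Derive "maximum discount per day" from the raw discounts.
# Each discount is assumed to look like { day: "mon", percent: 20 }.
def max_discount_by_day(discounts)
  discounts.each_with_object({}) do |discount, max|
    day = discount[:day]
    max[day] = [max.fetch(day, 0), discount[:percent]].max
  end
end

discounts = [
  { day: "mon", percent: 20 },
  { day: "mon", percent: 30 },
  { day: "tue", percent: 10 }
]

summary = max_discount_by_day(discounts)
puts summary["mon"] # prints 30
puts summary["tue"] # prints 10
```

Storing the summary alongside the raw discounts trades a little write-time work for a much simpler search query.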

  24. Creating Indices

  25. Querying • Uses DataMapper/Arel Syntax • Chains conditions, ordering and offset

  26. Filtering Complex Data Structures • MongoDB offers a JavaScript API for MapReduce • Map - transform and filter data • Reduce - combine multiple rows into a single record
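The map and reduce roles can be illustrated with a small in-memory version in plain Ruby. This is not MongoDB's JavaScript API; `map_reduce`, the sample slots, and the two lambdas are hypothetical and only show the shape of the technique.

```ruby
# map:    called per record, returns a list of [key, value] pairs (an empty
#         list filters the record out)
# reduce: called per key with all emitted values, returns one combined value
def map_reduce(records, map, reduce)
  grouped = Hash.new { |hash, key| hash[key] = [] }
  records.each do |record|
    map.call(record).each { |key, value| grouped[key] << value }
  end
  grouped.map { |key, values| [key, reduce.call(key, values)] }.to_h
end

SLOTS = [
  { service: "haircut", available: true },
  { service: "haircut", available: false },
  { service: "massage", available: true },
  { service: "haircut", available: true }
].freeze

# Map: keep only available slots and emit one count per service.
emit_available = ->(slot) { slot[:available] ? [[slot[:service], 1]] : [] }
# Reduce: combine the counts for a service into a single number.
sum_counts = ->(_service, counts) { counts.sum }

p map_reduce(SLOTS, emit_available, sum_counts)
```

Here the map step does the filtering and the reduce step collapses the surviving rows into one record per service.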

  27. A Simple Use-Case

  28. Using MapReduce to Filter

  29. The Results • Scheduled to go live within 2 weeks • With sharding/distribution, tests show almost no dip in response time with more than 10x the current data set • 20x faster than MySQL implementation • 100ms vs 2000ms (or more)
