
Big Data Technologies: HDFS -- Map-Reduce && NoSQL DBs

This article discusses the foundational technologies in Big Data, including HDFS, Map-Reduce, and NoSQL databases. It covers concepts like distributed hash tables and the Chord algorithm.


Presentation Transcript


  1. Big Data Technologies: HDFS -- Map-Reduce && NoSQL DBs. S. Sioutas, Ionian University; Ion Stoica, http://inst.eecs.berkeley.edu/~cs162

  2. Big Data Technology is based on: Hash Functions (input: files, e.g. strings; output: hash keys) • Folding method: int h(String x, int D) { int i, sum; for (sum=0, i=0; i < x.length(); i++) sum += (int) x.charAt(i); return sum % D; /* D is the cluster size */ } • Sums the ASCII values of the letters in the string • The ASCII value of “A” is 65, so the sum falls in the range 650-900 for 10 upper-case letters; good when D is around 100, for example • The order of the characters in the string has no effect
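As a quick check of the last bullet, here is a runnable version of the folding hash (the class name FoldHash and the test strings are only for illustration). Since addition is commutative, anagrams such as “LISTEN” and “SILENT” collide:

    public class FoldHash {
        // Folding hash from the slide: sum of character codes modulo D.
        static int h(String x, int D) {
            int sum = 0;
            for (int i = 0; i < x.length(); i++)
                sum += (int) x.charAt(i);
            return sum % D; // D is the cluster (table) size
        }

        public static void main(String[] args) {
            System.out.println(h("LISTEN", 100)); // 63
            System.out.println(h("SILENT", 100)); // 63 -- same letters, same hash
        }
    }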

  3. Big Data Technology is based on: Distributed Hash Tables (DHTs) • Distribute (partition) a hash table data structure across a large number of servers • Also called a key-value store • Key identifier = SHA-1(key), Node identifier = SHA-1(IP address) • Each key_id is mapped to the node with the smallest node_id >= key_id • Two operations • put(key, data); // insert “data” identified by “key” • data = get(key); // get the data associated with “key” [Figure: key-value pairs stored sorted by key in the DHT table …]
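A minimal sketch of this mapping in plain Java (TreeMap as the sorted key-value store, MessageDigest for SHA-1; the class and variable names are illustrative, not part of any real DHT library):

    import java.math.BigInteger;
    import java.nio.charset.StandardCharsets;
    import java.security.MessageDigest;
    import java.util.Map;
    import java.util.TreeMap;

    public class SimpleDht {
        // Sorted ring of node ids -> node addresses.
        static TreeMap<BigInteger, String> ring = new TreeMap<>();

        static BigInteger sha1(String s) throws Exception {
            byte[] d = MessageDigest.getInstance("SHA-1").digest(s.getBytes(StandardCharsets.UTF_8));
            return new BigInteger(1, d); // non-negative 160-bit identifier
        }

        // key_id maps to the node with the smallest node_id >= key_id (wrapping around).
        static String lookup(String key) throws Exception {
            Map.Entry<BigInteger, String> e = ring.ceilingEntry(sha1(key));
            return (e != null ? e : ring.firstEntry()).getValue();
        }

        public static void main(String[] args) throws Exception {
            for (String ip : new String[]{"10.0.0.1", "10.0.0.2", "10.0.0.3"})
                ring.put(sha1(ip), ip);
            System.out.println(lookup("my-file.txt")); // node responsible for this key
        }
    }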

  4. Hadoop Distributed File System (HDFS) • Files are split into 128MB blocks • Blocks are replicated across several datanodes (often 3) • The namenode stores the metadata (file names, block locations, etc.) • Optimized for large files and sequential reads • Files are append-only [Diagram: File1 split into blocks 1-4, each block replicated on three of the datanodes, with the namenode holding the metadata]
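A hedged sketch of reading such a file through the Hadoop Java client (this assumes the org.apache.hadoop client libraries on the classpath and a configured cluster; the path /data/file1 is illustrative):

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.FSDataInputStream;
    import org.apache.hadoop.fs.FileSystem;
    import org.apache.hadoop.fs.Path;

    public class HdfsRead {
        public static void main(String[] args) throws Exception {
            Configuration conf = new Configuration();  // reads core-site.xml / hdfs-site.xml
            FileSystem fs = FileSystem.get(conf);      // the namenode supplies block metadata
            try (FSDataInputStream in = fs.open(new Path("/data/file1"))) {
                byte[] buf = new byte[4096];
                int n = in.read(buf);                  // bytes stream from the datanodes
                System.out.println("read " + n + " bytes");
            }
        }
    }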

  5. Hadoop Cluster [photo]

  6. Typical Hadoop Cluster • 40 nodes/rack, 1000-4000 nodes per cluster • 1 Gbps bandwidth within the rack, 8 Gbps out of the rack • Node specs (Facebook): 8-16 cores, 32-48 GB RAM, 10×2TB disks [Diagram: rack switches uplinked to an aggregation switch]
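Note what these numbers imply: 40 nodes × 1 Gbps = 40 Gbps of potential intra-rack traffic against an 8 Gbps uplink, i.e. a 5:1 oversubscription of the rack uplink. This is one reason the MapReduce scheduler tries to run tasks on the nodes (or at least in the rack) that already hold the data.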

  7. The lookup cluster architecture of a NoSQL DB (e.g. Cassandra) • N1, N2, ..., Nx are computing nodes in the same rack • M1, M2, ..., My are computing nodes in the same rack • Each rack is structured as a Chord overlay network • The whole CLUSTER (CLOUD) is structured as a Chord overlay between rack switches (each rack switch talks directly to its Master node)

  8. Hadoop Components • Distributed file system (HDFS) • Single namespace for entire cluster • Replicates data 3x for fault-tolerance • MapReduce framework • Runs jobs submitted by users • Manages work distribution & fault-tolerance • Colocated with file system

  9. Distributed Hash Tables (DHTs) (cont’d) • Just need a lookup service: given a key (ID), map it to machine n: n = lookup(key); • Invoking put() and get() at node m: m.put(key, data) { n = lookup(key); // get node “n” mapping “key” n.store(key, data); // store data at node “n” } data = m.get(key) { n = lookup(key); // get node “n” storing the data associated with “key” return n.retrieve(key); // get the data stored at “n” associated with “key” }
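A self-contained toy version of this pattern (in-process stand-ins for remote nodes; a real DHT would implement lookup() with Chord routing and store()/retrieve() as RPCs):

    import java.util.HashMap;
    import java.util.Map;

    public class DhtClient {
        // Stand-in for a remote node.
        static class Node {
            Map<String, String> kv = new HashMap<>();
            void store(String key, String data) { kv.put(key, data); }
            String retrieve(String key) { return kv.get(key); }
        }

        static Node[] nodes = { new Node(), new Node(), new Node() };

        // Toy lookup; in Chord this is the O(log N) ring routing.
        static Node lookup(String key) { return nodes[Math.abs(key.hashCode() % nodes.length)]; }

        static void put(String key, String data) { lookup(key).store(key, data); }
        static String get(String key) { return lookup(key).retrieve(key); }

        public static void main(String[] args) {
            put("title", "MP3 data...");
            System.out.println(get("title")); // MP3 data...
        }
    }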

  10. Chord Lookup Service (Protocol) • Supports just one operation: given a key, Chord maps the key onto a node • Associate with each node and item a unique id/key in a one-dimensional space 0..2^m - 1 • Partition this space across N machines • Each key is mapped to the node with the smallest id >= the key's id (consistent hashing) • Key design decision • Decouple correctness from efficiency • Properties • Routing table size O(log N), where N is the total number of nodes • Guarantees that a file is found in O(log N) steps

  11. The Abstraction: Distributed hash table (DHT)

  12. The lookup problem [Diagram: a publisher stores (Key=“title”, Value=MP3 data…) somewhere in a cloud of nodes N1-N6; a client issues Lookup(“title”) and must find the node holding the value]

  13. Routed queries (Freenet, Chord, etc.) [Diagram: the client’s Lookup(“title”) is forwarded hop by hop through nodes N1-N9 until it reaches the node where the publisher stored (Key=“title”, Value=MP3 data…)]

  14. Chord software • 3000 lines of C++ code • Library to be linked with the application • Provides a lookup(key) function that yields the IP address of the node responsible for the key • Notifies the application of changes in the set of keys the node is responsible for

  15. The Chord algorithm – Construction of the Chord ring • The consistent hash function assigns each node and each key an m-bit identifier using SHA-1 (Secure Hash Standard). m = any number big enough to make collisions improbable • Key identifier = SHA-1(key) • Node identifier = SHA-1(IP address) • Both are uniformly distributed • Both exist in the same ID space

  16. Challenges • System churn: machines can fail or exit the system at any time • Scalability: need to scale to tens or hundreds of thousands of machines • Heterogeneity: • Latency: 1ms to 1000ms • Bandwidth: 32Kb/s to 100Mb/s • Nodes stay in the system from 10 seconds to a year …

  17. The Chord algorithm – Construction of the Chord ring • Identifiers are arranged on an identifier circle modulo 2^m => the Chord ring

  18. The Chord algorithm – Construction of the Chord ring • A key k is assigned to the node whose identifier is equal to or greater than the key’s identifier • This node is called successor(k) and is the first node clockwise from k.

  19. The Chord algorithm – Simple node localization // ask node n to find the successor of id n.find_successor(id) if (id ∈ (n, successor]) return successor; else // forward the query around the circle return successor.find_successor(id); => Number of messages linear in the number of nodes!
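A runnable toy of this linear walk (a 6-bit identifier space and an example ring of ten nodes; all names and values are illustrative):

    public class LinearLookup {
        // Sorted node identifiers on a toy 6-bit Chord ring (m = 6).
        static int[] nodes = {1, 8, 14, 21, 32, 38, 42, 48, 51, 56};

        // Is id in the half-open ring interval (a, b]?
        static boolean in(int id, int a, int b) {
            return a < b ? (id > a && id <= b) : (id > a || id <= b);
        }

        // Follow successor pointers one hop at a time: O(N) messages.
        static int findSuccessor(int startIdx, int id) {
            int i = startIdx;
            while (true) {
                int succIdx = (i + 1) % nodes.length;
                if (in(id, nodes[i], nodes[succIdx])) return nodes[succIdx];
                i = succIdx;
            }
        }

        public static void main(String[] args) {
            System.out.println(findSuccessor(1, 54)); // start at node 8 -> successor(54) = 56
        }
    }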

  20. The Chord algorithm – Scalable node localization • Additional routing information to accelerate lookups • Each node n maintains a routing table with up to m entries (m: number of bits of the identifiers) => the finger table • The i-th entry in the table at node n contains the first node s that succeeds n by at least 2^(i-1) • s = successor(n + 2^(i-1)) • s is called the i-th finger of node n

  21.-30. The Chord algorithm – Scalable node localization (a run of build-up slides stepping through the table entries over successive ring diagrams) Finger table: finger[i] = successor(n + 2^(i-1))
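A small sketch of building such a finger table (reusing the toy 6-bit ring from above; the names are illustrative). For node 8 this prints the fingers 14, 14, 14, 21, 32, 42:

    import java.util.TreeSet;

    public class FingerTable {
        static final int M = 6;                  // identifier bits; ring size 2^m = 64
        static TreeSet<Integer> ring = new TreeSet<>();

        // successor(id): first node at or after id on the ring, wrapping around.
        static int successor(int id) {
            Integer s = ring.ceiling(id % (1 << M));
            return s != null ? s : ring.first();
        }

        public static void main(String[] args) {
            for (int n : new int[]{1, 8, 14, 21, 32, 38, 42, 48, 51, 56}) ring.add(n);
            int n = 8;
            for (int i = 1; i <= M; i++)         // finger[i] = successor(n + 2^(i-1))
                System.out.println("finger[" + i + "] = " + successor(n + (1 << (i - 1))));
        }
    }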

  31. The Chord algorithm – Scalable node localization Important characteristics of this scheme: • Each node stores information about only a small number of nodes (m) • Each node knows more about nodes closely following it than about nodes farther away • A finger table generally does not contain enough information to directly determine the successor of an arbitrary key k

  32.-33. The Chord algorithm – Scalable node localization • Search the finger table for the node which most immediately precedes id • Invoke find_successor from that node (see the sketch below) => Number of messages O(log N)!

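A hedged sketch of the scalable lookup over the same toy ring (it computes fingers on the fly with successor(); a real node would consult its stored finger table instead):

    import java.util.TreeSet;

    public class ChordLookup {
        static final int M = 6, SIZE = 1 << M;   // toy 6-bit identifier space
        static TreeSet<Integer> ring = new TreeSet<>();

        static int successor(int id) {           // first node at or after id mod 2^m
            Integer s = ring.ceiling(((id % SIZE) + SIZE) % SIZE);
            return s != null ? s : ring.first();
        }

        static boolean in(int x, int a, int b) { // x in ring interval (a, b]
            return a < b ? (x > a && x <= b) : (x > a || x <= b);
        }

        // Jump via the farthest finger that still precedes id: O(log N) hops.
        static int findSuccessor(int n, int id) {
            int succ = successor(n + 1);
            if (in(id, n, succ)) return succ;
            for (int i = M; i >= 1; i--) {       // scan fingers, largest first
                int f = successor(n + (1 << (i - 1)));
                if (f != n && f != succ && f != id && in(f, n, id))
                    return findSuccessor(f, id);
            }
            return findSuccessor(succ, id);      // fall back to one hop forward
        }

        public static void main(String[] args) {
            for (int n : new int[]{1, 8, 14, 21, 32, 38, 42, 48, 51, 56}) ring.add(n);
            System.out.println(findSuccessor(8, 54)); // -> 56
        }
    }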

  34.-36. The Chord algorithm – Node joins and stabilization (figure slides)

  37. The Chord algorithm – Node joins and stabilization • To ensure correct lookups, all successor pointers must be up to date • A stabilization protocol runs periodically in the background and updates finger tables and successor pointers

  38. The Chord algorithm – Node joins and stabilization Stabilization protocol (sketched below): • stabilize(): n asks its successor for its predecessor p and decides whether p should be n’s successor instead (this is the case if p recently joined the system) • notify(): notifies n’s successor of its existence, so it can change its predecessor to n • fix_fingers(): updates finger tables
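A compact single-process sketch of these two operations (illustrative names; in a real deployment stabilize() and notifyNode() would be periodic remote calls). Running it replays the N26-join example of the next slides:

    public class ChordNode {
        int id;
        ChordNode successor, predecessor;

        ChordNode(int id) { this.id = id; successor = this; }

        static boolean in(int x, int a, int b) { // x in open ring interval (a, b)
            return a < b ? (x > a && x < b) : (x > a || x < b);
        }

        // stabilize(): adopt the successor's predecessor if it sits between us.
        void stabilize() {
            ChordNode p = successor.predecessor;
            if (p != null && in(p.id, id, successor.id)) successor = p;
            successor.notifyNode(this);
        }

        // notify(): the successor adopts n as predecessor if n is closer.
        void notifyNode(ChordNode n) {
            if (predecessor == null || in(n.id, predecessor.id, id)) predecessor = n;
        }

        public static void main(String[] args) {
            ChordNode n21 = new ChordNode(21), n32 = new ChordNode(32), n26 = new ChordNode(26);
            n21.successor = n32; n32.predecessor = n21;  // stable ring segment
            n26.successor = n32; n26.stabilize();        // N26 joins and notifies N32
            n21.stabilize();                             // N21 learns about N26
            System.out.println(n21.successor.id + " " + n32.predecessor.id); // 26 26
        }
    }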

  39. The Chord algorithm – Node joins and stabilization (figure slide)

  40. The Chord algorithm – Node joins and stabilization • N26 joins the system • N26 acquires N32 as its successor • N26 notifies N32 • N32 acquires N26 as its predecessor

  41. The Chord algorithm – Node joins and stabilization • N26 copies the keys it is now responsible for • N21 runs stabilize() and asks its successor N32 for its predecessor, which is N26.

  42. The Chord algorithm – Node joins and stabilization • N21 acquires N26 as its successor • N21 notifies N26 of its existence • N26 acquires N21 as predecessor

  43. The Chord algorithm – Impact of node joins on lookups • All finger table entries are correct => O(log N) lookups • Successor pointers correct but fingers inaccurate => correct but slower lookups

  44. The Chord algorithm – Impact of node joins on lookups • Incorrect successor pointers => the lookup might fail; retry after a pause • But correctness is still preserved!

  45. The Chord algorithm – Impact of node joins on lookups • Stabilization completed => no influence on performance • Only in the negligible case that a large number of nodes joins between the target’s predecessor and the target is the lookup slightly slower • No influence on performance as long as fingers are adjusted faster than the network doubles in size

  46. The Chord algorithm – Failure of nodes • Correctness relies on correct successor pointers • What happens if N14, N21, and N32 fail simultaneously? • How can N8 acquire N38 as successor?


  48. The Chord algorithm – Failure of nodes • Each node maintains a successor list of size r • If the network is initially stable and every node fails with probability 1/2, find_successor still finds the closest living successor to the query key, and the expected time to execute find_successor is O(log N) • Proofs are in the research paper
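A minimal sketch of the successor list as a failover mechanism (toy, single process; r and the node ids mirror the question on the earlier slide):

    import java.util.List;

    public class SuccessorList {
        static class Node {
            int id;
            boolean alive = true;
            List<Node> successors;               // the r nearest successors, in ring order
            Node(int id) { this.id = id; }

            // The first live entry replaces a failed immediate successor.
            Node liveSuccessor() {
                for (Node s : successors) if (s.alive) return s;
                throw new IllegalStateException("all r successors failed");
            }
        }

        public static void main(String[] args) {
            Node n8 = new Node(8), n14 = new Node(14), n21 = new Node(21),
                 n32 = new Node(32), n38 = new Node(38);
            n8.successors = List.of(n14, n21, n32, n38);            // r = 4
            n14.alive = false; n21.alive = false; n32.alive = false; // simultaneous failures
            System.out.println(n8.liveSuccessor().id);              // -> 38: N8 acquires N38
        }
    }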

  49. The Chord algorithm – Failure of nodes • Massive failures have little impact: the probability that all r = 6 successors fail is (1/2)^6 ≈ 1.6% [Plot: failed lookups (%) vs. failed nodes (%)]

  50. What Can You Run in Cloud Computing? • Almost everything! • Virtual Machine instances • Storage services • Simple Storage Service (S3) • Elastic Block Store (EBS) • Databases: • Database instances (e.g., MySQL, SQL Server, …) • SimpleDB • Content Distribution Network: CloudFront • MapReduce: Amazon Elastic MapReduce • …
