Robots at MySpace: Scaling a .NET Website with Microsoft Robotics Studio. Erik Nelson Group Architect / enelson@myspace-inc.com Akash Patel Senior Architect / apatel@myspace-inc.com Tony Chow Development Manager / tchow@myspace-inc.com Core Platform MySpace.com.
MySpace is the largest Social Network • … based in Los Angeles • The largest .NET website in the world • Can’t be big without Caching • >6 million requests/second to the middle tier at peak • Many TB of user generated cached data • Data must not be stale • Users hate that • New and interesting features require more than just a “cache” • Our middle tier is called Data Relay, because it does much more than just cache. • Data Relay has been in production since 2006!
CCR • What is CCR? • Coordination and Concurrency Runtime • Part of the Robotics Toolkit • Provides • Thread pools (Dispatcher) • Job queues (DispatcherQueue) • Flexible ways of connecting actions to those queues and pools (Ports and Arbiters)
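To make the three roles concrete, here is a minimal sketch in plain C# that uses only the .NET base class library, not the CCR API itself. `MiniDispatcher`, `MiniPort`, and `PortDemo` are hypothetical names invented for this illustration: the first plays the part of a Dispatcher plus its DispatcherQueue (a fixed set of threads draining a job queue), the second plays the part of a Port with a persistent receiver attached.

```csharp
using System;
using System.Collections.Concurrent;
using System.Threading;

// Hypothetical stand-in for Dispatcher + DispatcherQueue:
// a fixed set of threads draining one shared job queue.
sealed class MiniDispatcher : IDisposable
{
    private readonly BlockingCollection<Action> _queue = new BlockingCollection<Action>();
    private readonly Thread[] _threads;

    public MiniDispatcher(int threadCount)
    {
        _threads = new Thread[threadCount];
        for (int i = 0; i < threadCount; i++)
        {
            _threads[i] = new Thread(() =>
            {
                // Blocks until a job arrives; the loop ends after CompleteAdding
                // once the queue has been drained.
                foreach (Action job in _queue.GetConsumingEnumerable()) job();
            });
            _threads[i].IsBackground = true;
            _threads[i].Start();
        }
    }

    public void Enqueue(Action job) => _queue.Add(job);

    // Stop accepting jobs, then wait for the queued work to finish.
    public void Dispose()
    {
        _queue.CompleteAdding();
        foreach (Thread t in _threads) t.Join();
        _queue.Dispose();
    }
}

// Hypothetical stand-in for a Port with a persistent receiver:
// posting a message schedules the handler on the dispatcher.
sealed class MiniPort<T>
{
    private readonly MiniDispatcher _dispatcher;
    private readonly Action<T> _handler;

    public MiniPort(MiniDispatcher dispatcher, Action<T> handler)
    {
        _dispatcher = dispatcher;
        _handler = handler;
    }

    public void Post(T message) => _dispatcher.Enqueue(() => _handler(message));
}

static class PortDemo
{
    // Posts 1..messageCount to a port and returns the sum the handler computed.
    public static int SumViaPort(int messageCount, int threadCount)
    {
        int sum = 0;
        using (var dispatcher = new MiniDispatcher(threadCount))
        {
            var port = new MiniPort<int>(dispatcher, n => Interlocked.Add(ref sum, n));
            for (int i = 1; i <= messageCount; i++) port.Post(i);
        } // Dispose waits for every posted message to be handled.
        return sum;
    }
}
```

The point of the split is that the code posting a message never knows which thread, or how many threads, will run the handler. Pool size and queue layout can change without touching the senders.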
Graphs are Cool • [Graph: Requests / Sec]
The Stream • The stream is everywhere • The stream is extremely volatile data • Both the “who” and the “what” • Updates from our Users don’t just go to us • Twitter, Google, etc • ~60,000 Stream Queries per Second • Over 5 billion a day • 35 million updates a day • ~5 TB of Data in our Stream
Why not a DB? • We decided to be “publisher” based and not “subscriber” based • For us, serving that from a DB would involve a massively distributed query • Hundreds of databases • Decoupling writing from reading
OK So How Then? Robots!
Robots? • Lots of inputs and outputs! • Need for minimum latency and decoupling between jobs! • Just like a robot!
Abusing a Metaphor • Our robots must • Incorporate incoming messages • Tell their neighbors about any messages they receive • Be able to answer lots of questions • Talk to other robots when they need more info • Deal with other robots being slow or missing
How Does CCR Help? • Division of labor • Incorporate incoming messages • Tell their neighbors about any messages they receive • Be able to answer lots of questions • Talk to other robots when they need more info • Deal with other robots being slow or missing
How Does CCR Help? • Queue Control • We can has Buckets • Queue Division • Different destinations have their own queues • Strict Pool Control
Akash Patel Senior Architect
Activity Stream • Activity Stream (News Feed) • Aggregation of your friends’ activities • Activity Stream Generation • Explicitly: Status Update • Implicitly: Post New Photo Album • Auto: 3rd Party App
Friends & Activities • Imagine this is you … • Your friends … • You post a new status update … an index is created • You upload a new photo album … the index is updated • Index grows with new activities • Publisher Based Cache • Activity associated to publishing user • Where’s the Activity Stream?
Friends & Activities • Activity Stream Generated by Querying • Filter & Merge Friends’ Activities • Very Volatile
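The "filter and merge" step can be sketched as follows. This is an illustration only, assuming each publisher's index is an in-memory list. `Activity`, `StreamQuery.BuildStream`, and the dictionary layout are hypothetical names and shapes, not Data Relay types.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical activity record, stored under the user who published it.
sealed class Activity
{
    public Activity(int publisherId, long timestamp, string text)
    {
        PublisherId = publisherId;
        Timestamp = timestamp;
        Text = text;
    }

    public int PublisherId { get; }
    public long Timestamp { get; }
    public string Text { get; }
}

static class StreamQuery
{
    // Builds a reader's stream at query time: look up each friend's index,
    // filter the entries, merge them, and keep the newest maxItems.
    public static List<Activity> BuildStream(
        IReadOnlyDictionary<int, List<Activity>> indexByPublisher,
        IEnumerable<int> friendIds,
        Func<Activity, bool> filter,
        int maxItems)
    {
        return friendIds
            .Where(id => indexByPublisher.ContainsKey(id)) // friends with no activity yet
            .SelectMany(id => indexByPublisher[id])
            .Where(filter)
            .OrderByDescending(a => a.Timestamp)
            .Take(maxItems)
            .ToList();
    }
}
```

Because nothing is stored per reader, a new activity is a single write to the publisher's index. The cost moves to read time, which is why the query side has to be fast and distributed.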
Stream Architecture • Utilizes Data Relay Framework • Message Based System • Fire & Forget Msgs [Save, Delete, Update] • Round Trip Msgs [Get, Query, Execute] • Replication & Clustering Built-in • Index Cache • Not a Key/Value Store • Storage & Querying System • 2 Tiered System (separates index from data)
Data Relay Architecture • Data is partitioned across clusters • Data is replicated within clusters • [Diagram: Group, Cluster, Node hierarchy, showing Groups A and B, Clusters 1 to 3, and nodes N1 to N9]
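A small sketch of what "partitioned across clusters, replicated within clusters" implies for routing. The slide does not say how keys map to clusters, so the modulo scheme, the three-by-three node layout, and the names `Routing`, `ClusterFor`, `WriteTargets`, and `ReadTarget` are all assumptions made for illustration.

```csharp
using System;
using System.Collections.Generic;

static class Routing
{
    // Hypothetical layout: 3 clusters of 3 nodes each.
    public static readonly string[][] Clusters =
    {
        new[] { "N1", "N2", "N3" },
        new[] { "N4", "N5", "N6" },
        new[] { "N7", "N8", "N9" },
    };

    // Partitioning (assumed scheme): each key belongs to exactly one cluster.
    // The double modulo keeps the result non-negative for negative ids.
    public static int ClusterFor(int userId)
    {
        int n = Clusters.Length;
        return ((userId % n) + n) % n;
    }

    // Replication: a write goes to every node in the owning cluster.
    public static IReadOnlyList<string> WriteTargets(int userId)
    {
        return Clusters[ClusterFor(userId)];
    }

    // Any replica in the owning cluster can answer a read.
    public static string ReadTarget(int userId, Random rng)
    {
        string[] nodes = Clusters[ClusterFor(userId)];
        return nodes[rng.Next(nodes.Length)];
    }
}
```

With this layout, user 7 maps to cluster index 1, so writes go to N4, N5, and N6, and a read can be served by any one of them.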
Stream Architecture • [Diagram: nodes N1 to N9 in Clusters 1 to 3, labeled Activities and Index]
Activity Stream Update • [Diagram: a New Activity Msg arriving at nodes N1 to N9 in Clusters 1 to 3]
CCR Perspective • [Diagram: on N1, a New Activity Msg enters the Node 2 Proxy (Destination Node); Fire & Forget Msgs and Round Trip Msgs each pass through a Port, then Arbiters, then a Dispatcher Queue, then a Thread Pool]
Activity Stream Request • [Diagram: a client sends an Activity Stream Request; a distributed query on the FriendList fans out as sub-queries FriendList1, FriendList2, and FriendList3 to nodes N1 to N9 in Clusters 1 to 3]
CCR Perspective • [Diagram: a client Activity Stream Query enters the Node 1 Proxy (Destination Node); Fire & Forget Msgs and Round Trip Msgs each pass through a Port, then Arbiters, then a Dispatcher Queue, then a Thread Pool]
Activity Stream Request • [Diagram: Sub-Query Results 1 to 3 return from Clusters 1 to 3 and form the Query Result]
Activity Stream Request • [Diagram: Sub-Query Results 1 to 3 form the Query Result, which feeds the Activity Stream Response]
Activity Stream Request • [Diagram: Query Result from the Activity Index Cache, the Activities Data Cache, and the Activity Stream Response]
More Graphs • [Graph: Requests / Sec for Stream Requests and Index Gets]
Index Cache • De Facto Distributed Querying Platform • Sort, Merge, Filter • Ubiquitous when Key/Value Store is not enough • Activity Stream • Videos • Music • MySpace Developer Platform
Robots Processing Your Every Move! • CCR constructs in every NodeProxy • Ports • Arbiters • Dispatcher Queues • Dispatchers (Shared) • Message Batching • Arbiter.Choice • Arbiter.MultipleItemReceive • Arbiter.Receive from the TimeoutPort • Thread Pool Flexibility • Number of pools • Flexibility to set & change pool size dynamically*
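The batching bullets combine a multiple-item receive with a timeout under a choice, which reads as "take N messages, or whatever has arrived when the timer fires, whichever comes first". The sketch below shows that behavior with the .NET base class library only. It does not use the CCR arbiters, and `Batcher.TakeBatch` is a hypothetical helper written for this example.

```csharp
using System;
using System.Collections.Concurrent;
using System.Collections.Generic;
using System.Diagnostics;

static class Batcher
{
    // Collects up to maxCount messages, but returns early with a partial
    // (possibly empty) batch once maxWait has elapsed.
    public static List<T> TakeBatch<T>(
        BlockingCollection<T> source, int maxCount, TimeSpan maxWait)
    {
        var batch = new List<T>(maxCount);
        var clock = Stopwatch.StartNew();

        while (batch.Count < maxCount)
        {
            TimeSpan remaining = maxWait - clock.Elapsed;
            if (remaining <= TimeSpan.Zero) break;       // timer fired

            // TryTake returns false on timeout or when the source is
            // completed and empty; either way the batch is done.
            if (!source.TryTake(out T item, remaining)) break;
            batch.Add(item);
        }

        return batch;
    }
}
```

Under load the count limit is hit first and messages go out in full batches. When traffic is light the timeout bounds how long a lone message waits before it is forwarded.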
Activity Stream • Activities are everywhere • Twitter • MySpace • Google
Tony Chow Development Manager
Real-Time Stream • Pushes user activities out to subscribers using the PubSubHubbub standard • Anyone can subscribe to the Real-Time Stream, free of charge • Launched in December 2009 • Major subscribers: Google, Groovy, OneRiot • ~100 million messages delivered per day
The Challenges • Protect the user experience • Constant stream to healthy subscribers • Give all subscribers a fair chance at trying • Prevent unhealthy subscribers from doing damage
Policing the Stream • Queue • Partition • Throttle • Async I/O
Policing the Stream • So far so good—for occasionally slow subscribers • But chronically underperforming subscribers call for more drastic measures
Policing the Stream • Discard • Unsubscribe
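One way to picture the escalation from queueing to discarding to unsubscribing is a bounded queue per subscriber. The sketch below is an assumption about how such a policy could look, not the Real-Time Stream implementation. `SubscriberQueue`, the capacity, and the drop threshold are hypothetical, and the class is not thread-safe as written.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical per-subscriber queue: one slow subscriber only fills its own queue.
sealed class SubscriberQueue
{
    private readonly Queue<string> _pending = new Queue<string>();
    private readonly int _capacity;
    private readonly int _maxConsecutiveDrops;
    private int _consecutiveDrops;

    public SubscriberQueue(int capacity, int maxConsecutiveDrops)
    {
        _capacity = capacity;
        _maxConsecutiveDrops = maxConsecutiveDrops;
    }

    public bool Unsubscribed { get; private set; }
    public int Pending => _pending.Count;

    // Returns true if the message was queued for delivery.
    public bool Offer(string message)
    {
        if (Unsubscribed) return false;

        if (_pending.Count < _capacity)
        {
            _pending.Enqueue(message);
            _consecutiveDrops = 0;
            return true;
        }

        // Queue is full: the subscriber is not keeping up, so discard.
        _consecutiveDrops++;
        if (_consecutiveDrops >= _maxConsecutiveDrops)
        {
            // Chronically behind: stop sending and free the backlog.
            Unsubscribed = true;
            _pending.Clear();
        }
        return false;
    }

    // Called by the delivery side when it is ready to send the next message.
    public bool TryDequeue(out string message)
    {
        if (_pending.Count > 0)
        {
            message = _pending.Dequeue();
            return true;
        }
        message = null;
        return false;
    }
}
```

An occasionally slow subscriber only sees its queue grow and shrink. A subscriber whose queue stays full long enough to hit the drop threshold is cut off, so it cannot consume memory or delivery capacity that healthy subscribers need.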
Transaction Manager is Everywhere @ MySpace! • Generic platform for reliable persistence • Supports SQL, SOAP, REST, and SMTP calls • MySpace Mail • Friend Requests • Status/Mood Update • And much more!
The Role of CCR • CCR is integral to DataRelay • CCR Iterator Pattern for Async I/O
Asynchronous I/O • Synchronous I/O • Needs lots of threads to do lots of I/O • Massive context switching • Doesn’t scale • Asynchronous I/O • Efficient use of threads • Massively scales • Hard to program, harder to read • Gnarly and unmaintainable code
The CCR Iterator Pattern • A better way to write async code • C# iterators make enumerators easier • CCR iterators make async I/O easier • Makes async code look like sync code
The Difference
Before:
    void Before()
    {
        cmd1.BeginExecuteNonQuery(
            result1 =>
            {
                cmd1.EndExecuteNonQuery();
                cmd2.BeginExecuteNonQuery(
                    result2 =>
                    {
                        cmd2.EndExecuteNonQuery();
                    });
            });
    }
After:
    IEnumerable<ITask> After()
    {
        cmd1.BeginExecuteNonQuery(result => port.Post(1));
        yield return Arbiter.Receive(...);
        cmd1.EndExecuteNonQuery();
        cmd2.BeginExecuteNonQuery(result => port.Post(1));
        yield return Arbiter.Receive(...);
        cmd2.EndExecuteNonQuery();
    }
The CCR Iterator Pattern • Improves readability and maintainability • Far less bug-prone • Indispensable for asynchronous programming
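The mechanism behind the pattern can be shown without CCR. In the sketch below, an iterator yields a pending operation at each point where it needs to wait, and a small driver resumes it when that operation completes. `IteratorRunner` and `IteratorDemo` are hypothetical names, and `Task` with `Task.Delay` stand in for CCR's `ITask` and for real async I/O. This illustrates the idea only and is not how CCR schedules iterators internally.

```csharp
using System;
using System.Collections.Generic;
using System.Threading.Tasks;

static class IteratorRunner
{
    // Drives an iterator that yields a Task wherever it needs to wait.
    // No thread blocks between steps: the next step is scheduled as a
    // continuation of the task that was just yielded.
    public static Task Run(IEnumerable<Task> steps)
    {
        var done = new TaskCompletionSource<bool>();
        IEnumerator<Task> e = steps.GetEnumerator();

        void Step()
        {
            try
            {
                if (e.MoveNext())
                {
                    // Resume the iterator once the yielded operation finishes.
                    // For brevity, a faulted task is not inspected here.
                    e.Current.ContinueWith(_ => Step());
                }
                else
                {
                    e.Dispose();
                    done.SetResult(true);
                }
            }
            catch (Exception ex)
            {
                e.Dispose();
                done.SetException(ex);
            }
        }

        Step();
        return done.Task;
    }
}

static class IteratorDemo
{
    // Reads top to bottom like synchronous code, yet each yield is a
    // point where the method gives up its thread until the wait is over.
    public static IEnumerable<Task> TwoWrites(List<string> log)
    {
        log.Add("begin 1");
        yield return Task.Delay(10);   // stands in for the first async call
        log.Add("end 1, begin 2");
        yield return Task.Delay(10);   // stands in for the second async call
        log.Add("end 2");
    }
}
```

Calling `IteratorRunner.Run(IteratorDemo.TwoWrites(log))` returns a task that completes after all three log entries have been written in order. Later versions of C# added `async` and `await`, which build this same state-machine approach into the language.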
What Now? • We didn’t show any code samples… • Because we are going to share more than samples … WE ARE OPEN SOURCING!!
Open Source • http://DataRelay.CodePlex.com • Lesser GPL License for… • Data Relay Base • Our C#/Managed C++ Berkeley DB Wrapper and Storage Component • Index Cache System • Network transport • Serialization System
What Now? • Places in our code with CCR • Bucketed batch • \Infrastructure\DataRelay\RelayComponent.Forwarding\Node.cs - ActivateBurstReceive(int count) • Distributed bulk message handling • \Infrastructure\DataRelay\RelayComponent.Forwarding\Forwarder.cs - HandleMessages • General Message Handling • \Infrastructure\DataRelay\DataRelay.RelayNode\RelayNode.cs • \Infrastructure\SocketTransport\Server\SocketServer.cs
Evaluate Us! Please fill out an evaluation for our presentation! More evaluations = more better for everyone.
Thank You! Questions? • Erik Nelson • enelson@myspace-inc.com • Akash Patel • apatel@myspace-inc.com • Tony Chow • tchow@myspace-inc.com • http://DataRelay.CodePlex.com