incite-ful
Life Beyond Distributed Transactions: An Apostate’s Opinion
By Pat Helland, Amazon.Com, Jan 8th, 2007
Apostate: noun “One who renounces a previously held belief.”
Today’s Goal: Offer hopefully insightful opinions about scaleable apps.

Want Scale-Agnostic Applications
• Two layers to the application: scale-agnostic and scale-aware
• Consider Scale-Agnostic API
[Diagram: the Application's Scale-Agnostic Code (Upper Layer) sits on the Scale-Agnostic API, with the Scale-Aware-Code (Lower Layer) beneath it]

Assumptions (Don’t Gotta Prove These… Just Plain Believe Them)
• Grown-Ups Don’t Use Distributed Transactions
  • The apps using distributed transactions become too fragile…
  • Let’s just consider local transactions.
• Multiple Disjoint Scopes of Serializability
• Want Almost-Infinite Scaling
  • More of everything… Year by year, bigger and bigger
  • If it fits on your machines, multiply by 10, if that fits, multiply by 1000…
  • Strive to scale almost linearly (N log N for some big log).

Uniquely Keyed Entities
[Diagram: The Application’s Data Is Factored into Entities, Each of Which Has a Unique Key (Entity Key = “ABC”, “WPB”, “QLA”, “UNB”). Each Entity Will Reside on a Single Machine (Ignoring Replication & H/A).]
• Not All Data May Be in a Single Transaction
• We Must Collect the Data into Pieces
• We Must Annotate the Boundaries of the Data Guaranteed to Be Transactional
• Must Remain Transactional Even If We Repartition!
• An Entity:
  • A Collection of Data that Fits on a Single Machine
  • Identified by a Unique Key
• Assume the Scale-Aware-Code Never Partitions an Entity
• The Unique Key Defines the Data that Can’t Be Partitioned

Transactions and Entities
[Diagram labels: Entity “ABC”, Entity “DEF”, Transaction]
• A Transaction May Update a Single Entity
• The Scale-Aware-Code (and API) Guarantee It
• The Entity Is Never Partitioned
• A Transaction Must Not Ever Update Two Entities
• Even If the Two Live on One Machine Today
• Tomorrow, They May Repartition to Different Machines…
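
The single-entity rule above can be sketched in a few lines of Python. This is only an illustration, not anything shown in the talk: `EntityStore`, `EntityTransaction`, and `CrossEntityError` are hypothetical names, and the in-memory dict stands in for whatever the scale-aware layer really uses. The transaction is bound to one entity key, works on a private copy, and refuses to touch any other key, even one that happens to live in the same store today.

```python
import copy


class CrossEntityError(Exception):
    """Raised when one transaction tries to touch a second entity."""


class EntityStore:
    """Hypothetical in-memory store: one dict of fields per entity key."""

    def __init__(self):
        self._entities = {}

    def transaction(self, key):
        return EntityTransaction(self, key)


class EntityTransaction:
    """A local transaction whose scope is exactly one entity."""

    def __init__(self, store, key):
        self._store = store
        self._key = key
        self._working = None

    def __enter__(self):
        # Work on a private copy so a failure leaves the entity untouched.
        current = self._store._entities.get(self._key, {})
        self._working = copy.deepcopy(current)
        return self

    def _check(self, key):
        if key != self._key:
            raise CrossEntityError(
                f"transaction on {self._key!r} cannot touch {key!r}"
            )

    def read(self, key, field, default=None):
        self._check(key)
        return self._working.get(field, default)

    def write(self, key, field, value):
        self._check(key)
        self._working[field] = value

    def __exit__(self, exc_type, exc, tb):
        if exc_type is None:
            # Commit: install the private copy as the entity's new state.
            self._store._entities[self._key] = self._working
        # On error the copy is discarded; returning False re-raises.
        return False
```

A transaction opened with `store.transaction("ABC")` can read and write fields of “ABC” freely, while a write to “DEF” inside the same block raises and rolls the whole block back.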

Repartitioning and Entities
• Entities Allow Scaling
• Entities Remain Intact Even when Repartitioning
• The Application Can Count on the Integral Nature of Each Entity
• It Is OK to Know that the Entire Entity Is Local
• It Is OK to Work on Anything in the Entity at Once
[Diagram: many keyed entities (Entity “ABC”, “DEF”, “GHI”, “JKL”, “LMN”, “RST”, “XYZ” and others) with the callout: No Promise that Two Different Entities Stay on the Same Machine!!]
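
As a rough sketch of what "entities remain intact when repartitioning" could mean in code, the Python below places whole entities on machines by hashing their key and then moves them to a different machine count. Hash-based placement is an assumption made for the example; the slides do not say how the scale-aware code chooses a machine. The function names `machine_for`, `place`, and `repartition` are likewise hypothetical.

```python
import hashlib


def machine_for(key, machine_count):
    """Deterministically pick a machine index for an entity key (assumed scheme)."""
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % machine_count


def place(entities, machine_count):
    """Return a list of machines; each machine maps key -> whole entity."""
    machines = [{} for _ in range(machine_count)]
    for key, data in entities.items():
        machines[machine_for(key, machine_count)][key] = data
    return machines


def repartition(machines, new_count):
    """Move entities to a new layout; each entity moves as one unit."""
    merged = {}
    for machine in machines:
        merged.update(machine)
    return place(merged, new_count)
```

After `repartition`, every entity still sits on exactly one machine with all of its data, but nothing in this scheme keeps two different entities together, which is the point of the callout.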

Thinking about Alternate Indices
[Diagram: Entities Indexed by Unique Key; Entity Keys Indexed by 1st Alternate Key; Entity Keys Indexed by 2nd Alternate Key]
  PK:123  A1:ABC  A2:aba
  PK:217  A1:DEF  A2:def
  PK:332  A1:GHI  A2:fgh
  PK:589  A1:JKL  A2:ghu
  PK:719  A1:MNO  A2:klw
• Entities Must Have a Unique Key
• Unless You Begin with the Same Key, You Aren’t the Same
• CANNOT Guarantee the Alternate Index Will Co-locate with the Entity’s Primary Key
• By Definition, Alternate Indices Don’t Have the Same Key!
• We Must Index Them with a Different Key…
• Alternate Indices CANNOT Be Updated in the Same Transaction as the Primary Data
• There Is No Way to Guarantee They Are on the Same Machine
• They Must Be Updated in Different Transactions…
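
One way to picture "updated in different transactions" is the Python sketch below. It is an assumption-laden illustration: `PrimaryStore`, `AlternateIndex`, `drain`, and the outbox queue are hypothetical, and the slides do not prescribe this mechanism. The primary write and the queued index work happen together in one local step; the index itself is changed later, so for a while it lags behind the primary data.

```python
from collections import deque


class PrimaryStore:
    """Entities indexed by their unique (primary) key."""

    def __init__(self):
        self.rows = {}          # primary key -> record
        self.outbox = deque()   # index updates not yet applied

    def put(self, pk, record):
        # One local transaction: write the record and queue the index work.
        old = self.rows.get(pk)
        self.rows[pk] = dict(record)
        self.outbox.append((pk, old, dict(record)))


class AlternateIndex:
    """Maps one alternate key field to the entity's primary key."""

    def __init__(self, field):
        self.field = field
        self.entries = {}       # alternate key value -> primary key

    def apply(self, pk, old, new):
        # A separate, later transaction keyed by the alternate key.
        if old is not None and self.entries.get(old.get(self.field)) == pk:
            del self.entries[old[self.field]]
        self.entries[new[self.field]] = pk


def drain(primary, index):
    """Apply every queued index update, one at a time."""
    while primary.outbox:
        index.apply(*primary.outbox.popleft())
```

Between `put` and `drain`, a lookup through the alternate index can miss a record that already exists under its primary key; the application has to tolerate that window.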

Entities Are Connected by Messaging
[Diagram: Entity-X, inside its own Boundary of Transactions, issues “Send To: Entity-Y”; Entity-Y sits inside a separate Boundary of Transactions]
• Entities Are Key-Named Boundaries for Transactional Work
• Transactions Never Span Entities
• The Scale-Aware-Code May Move Them to Repartition
• The Only Way to Communicate across Entities Is with Messaging!
• The Scale-Aware-Code Is Responsible for Finding the Correct Entity (by Key-Name) and for Routing the Message to It
“Messaging” Is in Quotes… Work Is Invoked -- Potentially across Machines -- Definitely across Transactions!
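
The routing responsibility described above might look roughly like this in Python. Everything here is a hypothetical stand-in: `Router` plays the scale-aware code, `send` is what a sender's transaction would call, and `deliver_one` runs the receiver's work as its own separate step. Real delivery across machines, retries, and ordering are deliberately left out.

```python
from collections import deque


class Entity:
    def __init__(self, key):
        self.key = key
        self.state = {}
        self.inbox = deque()


class Router:
    """Stand-in for the scale-aware layer: finds entities by key name."""

    def __init__(self):
        self._entities = {}

    def entity(self, key):
        if key not in self._entities:
            self._entities[key] = Entity(key)
        return self._entities[key]

    def send(self, to_key, sender_key, body):
        # The sender only names the target key; it never sees a machine.
        self.entity(to_key).inbox.append((sender_key, body))

    def deliver_one(self, key, handler):
        # Consuming a message is a separate transaction on the receiver.
        ent = self.entity(key)
        if not ent.inbox:
            return False
        sender_key, body = ent.inbox.popleft()
        handler(ent, sender_key, body)
        return True


def apply_credit(ent, sender_key, body):
    """Example handler: updates only the receiving entity's own state."""
    ent.state["balance"] = ent.state.get("balance", 0) + body["amount"]
    ent.state["last_sender"] = sender_key
```

Sending changes nothing in the receiver until `deliver_one` runs, which mirrors the slide's point that the work is invoked across transactions even when both entities happen to share a machine.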

Messages Connect Entities
[Diagram: Entity-X with outgoing messages (Send To: Entity-A, Send To: Entity-B, Send To: Entity-C) and incoming messages (From: Entity-A, From: Entity-B, From: Entity-C, From: Entity-D)]
• Messages Are the Only Way into and out of Entities
• They Are Produced by Transactions
• They Are Consumed by Transactions
• Transactions Are Local to the Entity

Entities Connected by Partnerships
[Diagram labels: Entity-W, Entity-X, Entity-Y, Entity-Z]
• Mostly, Messaging Occurs between Two Partner Entities
• Usually, a Two-Way Exchange Moving Both Entities’ State
• Each Keeps Data about How Far Its State Has Advanced…

Tracking a Partner with Activities
[Diagram labels: Entity-W, Entity-X, Entity-Y, Entity-Z, with Activity-W, Activity-X, Activity-Y, and Activity-Z held inside partner entities]
• Activity Refers to the Knowledge about a Partner Entity
• Descriptions of What Messages Have Been Received
• Descriptions of What Obligations Exist to the Partner
• The Foundation for Workflow to Replace Distributed Transactions
• Two Basic Observations Wrapped Up in the “Activity” Concept
• Work Across Entities Is Workflow Based on Two-Party Relationships
• The Granularity of the Workflow Participant Is an Entity (Fine-Grained)
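
A minimal Python sketch of an activity, under stated assumptions: each entity keeps one `Activity` record per partner key, holding the IDs of messages received from that partner and the obligations outstanding toward it. The `Activity` and `Entity` classes, the `message_id` parameter, and the duplicate check in `receive` are all illustrative choices, not details given on the slide.

```python
from dataclasses import dataclass, field


@dataclass
class Activity:
    """What one entity knows about a single partner entity."""
    received_ids: set = field(default_factory=set)
    obligations: list = field(default_factory=list)


class Entity:
    def __init__(self, key):
        self.key = key
        self.activities = {}  # partner key -> Activity

    def activity(self, partner_key):
        if partner_key not in self.activities:
            self.activities[partner_key] = Activity()
        return self.activities[partner_key]

    def receive(self, partner_key, message_id, obligation=None):
        # Assumed behaviour: a message already recorded for this partner
        # is ignored, so a redelivery does not repeat the work.
        act = self.activity(partner_key)
        if message_id in act.received_ids:
            return False
        act.received_ids.add(message_id)
        if obligation is not None:
            act.obligations.append(obligation)
        return True
```

Because the record is per partner, the same message ID arriving from Entity-X and from Entity-Y is tracked independently, which matches the two-party framing of the workflow.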

Vocabulary and Assertions
New Vocabulary for Discussing Scale
• Almost-Infinite Scaling: An Environment Demanding Rapidly Increasing Data and Computation Over Time
• Scale-Agnostic App: An Application that Does Not Need to Change to Support Almost-Infinite Scaling
• Entity: A Collection of Data Referenced by a Single Key; Transactional Scope of the Scale-Agnostic App
• Activity: Data Used Inside One Entity to Describe Its Workflow State with a Single Partner Entity
Assertions about Large Scale Apps
• Alternate Indices Aren’t Transactionally Consistent: As Scale Increases, The Primary and Alternate Indices Cannot Be Guaranteed to Live Together
• Entities Cooperate Using Fine-Grained Two-Party Workflow: No Dist-Txs Workflow; Workflow Participants Are Entities; Work Coordinated across Pairs