Bigger, Better, Faster, More

Bigger, Better, Faster, More: An Introduction to Super-Scalability

But first… The Arms Race

1 ENIAC : 1 Teletype

1 Mainframe : N Terminals

N Servers : N Terminals

N Servers : N PCs

N Web Servers : N Browsers

N Web Servers : N AJAX Apps

N Clusters : N AJAX Apps

N Clusters : N*M Phones

N Cloudlets : N*M Phones

And So On…

What is Scalability?

Scalability = Ability to do More

More What?

More Processing

Processing Takes Resources

Types of Resources
• Network
• CPU
• Disk
• Memory

Types of Utilization
• Time / Throughput
• Space / Capacity
• Complexity
• Locking

Resources & Utilization

We Want More! (but how to scale?)

How to Scale: Just make it bigger (vertical scaling)

We Want Even More! (super-scalability)

Scaling Strategies

Bigger (Space)
Not Super
• One big data store
• One big memory store
• Make it bigger
• Make it redundant
• E.g. Full activity logging
Partitioning
• Sharding / Hashing
• Growth = Add Partition
• Tradeoff: Splitting Partitions
• Tradeoff: Redundancy becomes a distribution problem
(Diagram: partitions A, B, C, …)
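The sharding / hashing idea above can be sketched in a few lines. This is only an illustration, assuming plain modulo placement over a stable hash; the function names and the `user-…` keys are hypothetical and not from the slides. It also shows the "splitting partitions" tradeoff: when a partition is added, many keys land somewhere new.

```python
import hashlib


def partition_for(key: str, num_partitions: int) -> int:
    """Map a key to a partition index using a stable hash and modulo placement."""
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_partitions


def moved_keys(keys, old_count: int, new_count: int) -> int:
    """Count keys whose partition changes when the partition count changes."""
    return sum(
        1
        for key in keys
        if partition_for(key, old_count) != partition_for(key, new_count)
    )


# Hypothetical keys, for illustration only.
keys = [f"user-{i}" for i in range(10_000)]

# Growth = add a partition (3 -> 4). With plain modulo placement most keys
# change partition, so existing data has to be moved.
moved = moved_keys(keys, 3, 4)
print(f"{moved} of {len(keys)} keys change partition")
```

Schemes such as consistent hashing exist to reduce how many keys move on growth; the modulo version is used here only because it is the shortest way to show the tradeoff.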

Better (Complexity)
Not Super
• Number of objects increases
• As relations increase, add time or space requirements
• Common with graph problems
• E.g. PageRank
Distribution
• Chop up problem / workload
• Map/Reduce
• Tradeoff: coordination
• Tradeoff: network
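A toy version of the Map/Reduce idea above, assuming a word-count workload (the workload and all function names are illustrative choices, not from the slides). The map step here runs sequentially in one process; in a distributed system each chunk would go to a separate worker, which is where the coordination and network tradeoffs come from.

```python
from collections import Counter
from functools import reduce


def chunked(items, size):
    """Chop the workload into fixed-size chunks."""
    for start in range(0, len(items), size):
        yield items[start:start + size]


def map_chunk(lines):
    """Map step: count words in one chunk, independently of other chunks."""
    counts = Counter()
    for line in lines:
        counts.update(line.lower().split())
    return counts


def reduce_counts(left, right):
    """Reduce step: merge two partial results into one."""
    return left + right


def word_count(lines, chunk_size=2):
    """Run the map step per chunk, then fold the partial results together."""
    partials = [map_chunk(chunk) for chunk in chunked(lines, chunk_size)]
    return reduce(reduce_counts, partials, Counter())
```

For example, `word_count(["a b", "b c", "c c"])` maps two chunks separately and merges their partial counts.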

Faster (Time)
Not Super
• Tune your code
• Tune your database
• Tune your network
• Better hardware
Optimization
• As fast as possible
• Can’t scale as fast as growth
• Specialization – ONE thing
• Caching – Reduces work in trade for space
• Tradeoff: space
• Tradeoff: coordination
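Caching, as described above, trades space for work. A minimal sketch using Python's standard `functools.lru_cache`; the `expensive_lookup` function and the `calls` counter are hypothetical stand-ins for a slow operation.

```python
from functools import lru_cache

# Counts how often the slow path actually runs (illustration only).
calls = {"count": 0}


@lru_cache(maxsize=1024)  # maxsize bounds the space spent on cached results
def expensive_lookup(key: str) -> str:
    """Placeholder for slow work such as a database or network call."""
    calls["count"] += 1
    return key.upper()


# Three requests for the same key: only the first one does the work.
for _ in range(3):
    expensive_lookup("profile:42")
```

This sketch ignores invalidation. Once several processes each hold their own cache, keeping them in agreement is the coordination tradeoff the slide mentions.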

More (Locking)
Not Super
• One at a time
• Serialized access
Parallelizing / Estimating
• Separate reads & writes
• Non-locking estimation
• Reduce contention
• Tradeoff: space
• Tradeoff: coordination
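One possible reading of "non-locking estimation" is a counter split into slots, one per writer, so that writers never contend and readers sum the slots without taking a lock. The `ShardedCounter` class below is a hypothetical sketch of that reading, not something named in the slides, and it assumes each slot has exactly one writer thread.

```python
import threading


class ShardedCounter:
    """Each writer owns one slot; readers sum all slots without locking."""

    def __init__(self, num_slots: int):
        # Tradeoff: space. N slots are stored instead of a single number.
        self._slots = [0] * num_slots

    def increment(self, slot: int) -> None:
        # Assumption: only one writer ever touches a given slot.
        self._slots[slot] += 1

    def estimate(self) -> int:
        # May lag behind writes still in flight; exact once writers finish.
        return sum(self._slots)


def run_writers(counter: ShardedCounter, num_writers: int, increments_each: int) -> None:
    """Start one writer thread per slot and wait for all of them."""

    def work(slot: int) -> None:
        for _ in range(increments_each):
            counter.increment(slot)

    threads = [threading.Thread(target=work, args=(i,)) for i in range(num_writers)]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()


counter = ShardedCounter(4)
run_writers(counter, 4, 10_000)
print(counter.estimate())
```

While writers are running, `estimate()` returns an approximate value; that imprecision is what is traded for the absence of a shared lock.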

But Which Technique(s)?

It Depends!

All: Divide & Conquer
• Partitions: Data & Processing
  • Sharding
  • Worker Processes
• Coordination: Distribution & Ordering
  • Queues & Managers
  • Separate Read/Write Access
• What does this make the system look like?
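The "queues & managers" plus "worker processes" combination above might look roughly like this. The sketch uses threads and Python's standard `queue.Queue` for brevity; the squaring job is a placeholder, and `run_workers` is a hypothetical name.

```python
import queue
import threading


def run_workers(jobs, num_workers=3):
    """Manager: hand jobs to workers through a queue and collect the results."""
    tasks = queue.Queue()
    results = queue.Queue()

    def worker():
        while True:
            job = tasks.get()
            if job is None:  # sentinel: no more work for this worker
                tasks.task_done()
                break
            results.put((job, job * job))  # placeholder for real processing
            tasks.task_done()

    threads = [threading.Thread(target=worker) for _ in range(num_workers)]
    for thread in threads:
        thread.start()
    for job in jobs:
        tasks.put(job)
    for _ in threads:
        tasks.put(None)  # one sentinel per worker
    tasks.join()
    for thread in threads:
        thread.join()

    collected = {}
    while not results.empty():
        job, value = results.get()
        collected[job] = value
    return collected
```

The queue is the only shared structure, so adding capacity means raising `num_workers`; the order in which results arrive is not guaranteed, which is the ordering concern listed under coordination.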

And now… Some Theory

ACID: reliable transaction systems
• Atomicity – all or nothing
• Consistency – always correct
• Isolation – changesets executed independently
• Durability – once committed, stays so
Really hard to scale in one big block (although SSDs + RAM help!)
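Atomicity ("all or nothing") can be seen in a small example with Python's built-in `sqlite3` module. The `accounts` table, the names, and the `transfer` helper are made up for illustration; the sketch relies on the connection's context manager, which commits on success and rolls back on an exception.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE accounts ("
    "name TEXT PRIMARY KEY, "
    "balance INTEGER CHECK (balance >= 0))"
)
conn.executemany("INSERT INTO accounts VALUES (?, ?)", [("alice", 100), ("bob", 0)])
conn.commit()


def transfer(conn, src, dst, amount):
    """Move money atomically: both updates commit, or neither does."""
    try:
        with conn:  # commit on success, roll back if an exception escapes
            conn.execute(
                "UPDATE accounts SET balance = balance + ? WHERE name = ?",
                (amount, dst),
            )
            # Fails the CHECK constraint if src would go negative.
            conn.execute(
                "UPDATE accounts SET balance = balance - ? WHERE name = ?",
                (amount, src),
            )
        return True
    except sqlite3.IntegrityError:
        return False


def balances(conn):
    """Return the current balances as a dict of name -> balance."""
    return dict(conn.execute("SELECT name, balance FROM accounts"))
```

A transfer that would overdraw `alice` fails on the second update, and the credit already applied to `bob` is rolled back with it.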

Maybe It’s Not so Important? (it depends)

BASE is easier
• Basically Available
• Soft State
• Eventual Consistency
  • A node will either eventually get a change or retire
  • Well…still need conflict resolution
BASE is NOT ACID (get it?)
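The slide notes that eventual consistency still needs conflict resolution. One simple policy is last-write-wins, sketched below; this policy is an assumption chosen for illustration, not something the slides prescribe. Each entry is a hypothetical `(timestamp, replica_id, value)` tuple, and ties on timestamp are broken by replica id so that every replica picks the same winner.

```python
def merge_lww(local: dict, remote: dict) -> dict:
    """Merge two replica states, keeping the newest entry for each key."""
    merged = dict(local)
    for key, entry in remote.items():
        # Compare (timestamp, replica_id); the value itself is not compared.
        if key not in merged or entry[:2] > merged[key][:2]:
            merged[key] = entry
    return merged


# Two replicas that accepted different writes while out of contact.
replica_1 = {"cart": (1, "r1", ["book"])}
replica_2 = {"cart": (2, "r2", ["book", "pen"]), "theme": (1, "r2", "dark")}

# Exchange state in both directions; afterwards the replicas agree.
replica_1 = merge_lww(replica_1, replica_2)
replica_2 = merge_lww(replica_2, replica_1)
```

Last-write-wins silently discards the older write, which may or may not be acceptable for a given application.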

Can we have a Balanced pH?

CAP Theorem
• Choose TWO:
  • Consistency
  • Availability
  • Partition tolerance
(Diagram: a Manager, Replica 1 and Replica 2, Client 1 and Client 2; labels: "Double Outage! Double Outage!")

Designing a scalable system

It Depends!

Understand Your Scale Points
• Log
• Profile
• Tune
• Test
• Divide
• Compare
• Partition
• No, really, log a lot
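As a starting point for the "log" and "profile" steps, a timing decorator can record how long each call takes. This is a minimal sketch; the logger name `scale_points`, the `timings` dictionary, and `handle_request` are all hypothetical.

```python
import logging
import time
from functools import wraps

logging.basicConfig(level=logging.INFO)
log = logging.getLogger("scale_points")  # hypothetical logger name

# Function name -> list of call durations in seconds.
timings = {}


def timed(func):
    """Log and record the wall-clock duration of every call to func."""

    @wraps(func)
    def wrapper(*args, **kwargs):
        start = time.perf_counter()
        try:
            return func(*args, **kwargs)
        finally:
            elapsed = time.perf_counter() - start
            timings.setdefault(func.__name__, []).append(elapsed)
            log.info("%s took %.6f s", func.__name__, elapsed)

    return wrapper


@timed
def handle_request(n: int) -> int:
    """Placeholder for real request handling."""
    return sum(range(n))


handle_request(1000)
```

Comparing the recorded timings before and after a change covers the "tune", "test", and "compare" steps in the list.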

Fallacies of Distributed Computing
• The network is reliable.
• Latency is zero.
• Bandwidth is infinite.
• The network is secure.
• Topology doesn't change.
• There is one administrator.
• Transport cost is zero.
• The network is homogeneous.

Some “Scaly” Tools

CQRS Pattern
• Separate operations for:
  • Command – perform an action
  • Query – returns data about state
• Promotes simpler programs
• Allows Command Queues
• Reduces locking
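The command/query split above can be shown in miniature. In this sketch, commands are queued and applied later, while queries only read state; the `Deposit` command and `Bank` class are hypothetical names, and a real system would typically process the queue in a separate worker.

```python
from collections import deque
from dataclasses import dataclass


@dataclass(frozen=True)
class Deposit:
    """Command: describes an action to perform and returns no data."""

    account: str
    amount: int


class Bank:
    def __init__(self):
        self._commands = deque()  # the command queue
        self._balances = {}

    def submit(self, command: Deposit) -> None:
        """Command side: enqueue the action without applying it yet."""
        self._commands.append(command)

    def process_pending(self) -> None:
        """Apply queued commands one by one, in submission order."""
        while self._commands:
            cmd = self._commands.popleft()
            self._balances[cmd.account] = self._balances.get(cmd.account, 0) + cmd.amount

    def balance(self, account: str) -> int:
        """Query side: return data about state and change nothing."""
        return self._balances.get(account, 0)
```

Because only `process_pending` writes, queries need no lock against each other; the cost is that a query may not yet reflect commands still waiting in the queue.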

A Scaly Stack

Infrastructure as a Service

Platform as a Service

Application as a Service
• Salesforce?
  • (Also sort of a platform)
• Whateva!

Cassandra: An Example

Bigger, Better, Faster, More