Low Latency Computations on Massive Data

Ion Stoica, CS Division, UC Berkeley
Fujitsu Symposium, Mountain View, June 5, 2013

Challenges
• Data grows faster than Moore’s law*
• Data is dirty: uncurated, no schema, no consistent syntax and semantics
• Complex questions, e.g.:
  • Is there a virus outbreak?
  • Is the building structurally safe?
*[IDC report; Kathy Yelick, LBNL]

Low Latency & Massive Data
• May not be able to achieve both!
• Even if all data is in memory, computation may take tens of seconds

Key Insight
Answers don’t always need to be exact
• Input is often noisy: exact computations do not guarantee exact answers
• Error is often acceptable if small and bounded
  • Best scale: ± 0.5 lb error
  • Speedometers: ± 2.5% error (edmunds.com)
  • OmniPod Insulin Pump: ± 0.96% error (www.ncbi.nlm.nih.gov/pubmed/22226273)

Error-bounded Computations
• Error depends on sample size (S), not on original data size:
  • error ~ $1/\sqrt{S}$
• E.g., error of a poll on 1,000 people is “same” for a population of 1M or 100M people
New generation of scale-independent algorithms
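The poll example can be checked with a small calculation. The sketch below is not from the slides: it assumes simple random sampling without replacement, a yes/no question with proportion p = 0.5 (the worst case), and a 95% confidence level (z = 1.96). The function name `margin_of_error` and its parameters are illustrative only.

```python
import math


def margin_of_error(sample_size, population_size, p=0.5, z=1.96):
    """Approximate margin of error for a sampled proportion.

    Assumptions (not from the slides): simple random sampling without
    replacement, normal approximation, z = 1.96 for ~95% confidence.
    """
    # Standard error of a proportion: shrinks as 1 / sqrt(sample_size).
    standard_error = math.sqrt(p * (1 - p) / sample_size)
    # Finite population correction: close to 1 when the sample is a
    # tiny fraction of the population.
    fpc = math.sqrt((population_size - sample_size) / (population_size - 1))
    return z * standard_error * fpc


if __name__ == "__main__":
    for population in (1_000_000, 100_000_000):
        error = margin_of_error(1_000, population)
        print(f"population={population:>11,}  error={error:.4f}")
```

Under these assumptions the two populations give almost the same margin of error for a 1,000-person sample, while quadrupling the sample size roughly halves it.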

What Does It Mean?
• Can trade between an answer’s latency and accuracy
• Rapid data growth is no longer a problem…
Moore’s Law → error halves every two years
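One way to picture the latency/accuracy trade is to invert the error relationship and ask how many samples a target error requires. This is a rough sketch under assumptions that are not in the slides: a proportion estimate with p = 0.5, a 95% confidence level (z = 1.96), and processing time that grows with the number of rows sampled. `required_sample_size` is a hypothetical helper name.

```python
import math


def required_sample_size(target_error, p=0.5, z=1.96):
    """Samples needed so the margin of error is at most target_error.

    Assumptions (not from the slides): proportion estimate, normal
    approximation, population much larger than the sample.
    """
    if target_error <= 0:
        raise ValueError("target_error must be positive")
    # Solve z * sqrt(p * (1 - p) / S) <= target_error for S.
    return math.ceil(z * z * p * (1 - p) / (target_error ** 2))


if __name__ == "__main__":
    for target in (0.04, 0.02, 0.01):
        samples = required_sample_size(target)
        print(f"target error {target:.2f} -> about {samples:,} samples")
```

Each halving of the target error needs about four times as many samples, so a tighter answer costs more processing time, while the required sample size does not depend on how large the full dataset is.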
