Sprint: Speculative Prefetching of Remote Data
Arun Raman, Princeton University
Greta Yorsh, ARM, UK
Martin Vechev, IBM Research & ETH Zurich
Eran Yahav, Technion, Israel
Acknowledgments: Nick Mitchell and Mark Wegman, IBM Research
IBM Yellow Pages Application
• Local Processing (2 sec)
• Remote Processing (2 sec)
• Network Latency (16 sec)
Remote Access Latency (between Client and Datasource)
• Prefetching, Caching
• Async, Batching
• Query Planning, Cache Opti.
[Figure: management hierarchy with nodes Ted, Chandra, Walter, Heywood, Frank, Rama, Ralph, Dimitri, David]
Sprint (Our Technique)
• Compiler: Expose Parallelism across remote accesses
• Execution engine: Optimize remote accesses
IBM Yellow Pages Application
    Node build(String email) {
        Employee emp = getEmployee(email);   // remote access
        if (emp == null) return null;
        Node root = new Node(emp);
        numNodes++;
        for (String reportee_email : emp.getReportees()) {
            Node child = build(reportee_email);
            if (child != null) {
                root.addToList(child);
                child.setParent(root);
            }
        }
        return root;
    }
[Figure: management hierarchy (Ted, Chandra, Walter, Heywood, Frank, Rama, Ralph, Dimitri, David) annotated with Remote Access, Local Dependency, and Remote Dependency edges]
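[Figure, before: Program (input, output) accessing the Remote Datasource directly]
[Figure, after: Optimist (prefetcher program) and Pessimist (original program, input, output) running on the Sprint execution engine, with a cache in front of the Remote Datasource]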
Compiler Transformations
Optimist (prefetcher program)
• Parallelization
• Memory Protection
• Output Protection
Pessimist (original program)
• Initiating the Optimist
• Deadlock Avoidance
IBM Yellow Pages Application
    Node build(String email) {
        Employee emp = getEmployee(email);   // remote access
        if (emp == null) return null;
        Node root = new Node(emp);
        numNodes++;
        for (String reportee_email : emp.getReportees()) {
            Node child = build(reportee_email);
            if (child != null) {
                root.addToList(child);
                child.setParent(root);
            }
        }
        return root;
    }
Abstracted to its remote accesses:
    build(K) {
        V = get(K)
        for (k in V.keys)
            build(k)
    }
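The abstracted build(K) loop lends itself to a parallel prefetching traversal. The following is a minimal sketch of that idea, not the code Sprint's compiler generates: `OptimistSketch`, `prefetchAll`, and the `get` function are hypothetical names, each value is assumed to be simply the list of child keys, and the key graph is assumed to be a tree (no cycles).

```java
import java.util.List;
import java.util.Map;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;

/** Hypothetical sketch: a side-effect-free, parallel version of build(K). */
class OptimistSketch {
    /** Fetches every key reachable from root and returns the fetched values. */
    static Map<String, List<String>> prefetchAll(String root,
                                                 Function<String, List<String>> get) {
        Map<String, List<String>> fetched = new ConcurrentHashMap<>();
        build(root, get, fetched).join(); // wait for the whole traversal
        return fetched;
    }

    private static CompletableFuture<Void> build(String key,
                                                 Function<String, List<String>> get,
                                                 Map<String, List<String>> fetched) {
        // V = get(K), issued asynchronously so sibling subtrees overlap.
        return CompletableFuture.supplyAsync(() -> get.apply(key))
            .thenCompose(children -> {
                fetched.put(key, children);
                // for (k in V.keys) build(k), one task per child.
                CompletableFuture<?>[] subtasks = children.stream()
                    .map(k -> build(k, get, fetched))
                    .toArray(CompletableFuture[]::new);
                return CompletableFuture.allOf(subtasks);
            });
    }
}
```

In this sketch the only state touched is the `fetched` map, which loosely mirrors the requirement that the prefetcher must not modify program memory or produce output.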
Pessimist and Optimist
    build(K) {
        V = get(K)
        for (k in V.keys)
            build(k)
    }
[Figure: Core 0, Core 1, Core 2, and the Sprint Cache with columns Key, Value, St; every entry starts in state A]
St (State): Absent, Present, or Issued
Pessimist and Optimist
    build(K) {
        V = get(K)
        for (k in V.keys)
            build(k)
    }
[Animation: timeline t0 to t3 across Core 0, Core 1, Core 2. The Pessimist launches the Optimist; both call build(K0) and get(K0), and the Pessimist WAITs while the request is outstanding. Optimist tasks build(K1) to build(K4) issue get(K1) to get(K4) in parallel. Sprint Cache entries K0 to K4 move from A to I to P as values V0 to V4 arrive. Later Pessimist gets are marked HIT!]
St (State): Absent, Present, or Issued
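The Absent / Issued / Present states in the animation can be sketched as a small cache in which each key holds at most one outstanding request. This is an illustrative sketch only: `SprintCacheSketch`, `prefetch`, `get`, and `remoteGet` are hypothetical names, and the real Sprint execution engine may be structured quite differently.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;

/** Hypothetical sketch of a three-state prefetch cache. */
class SprintCacheSketch<K, V> {
    enum State { ABSENT, ISSUED, PRESENT }

    private final ConcurrentHashMap<K, CompletableFuture<V>> entries =
        new ConcurrentHashMap<>();
    private final Function<K, V> remoteGet; // stands in for the remote datasource

    SprintCacheSketch(Function<K, V> remoteGet) {
        this.remoteGet = remoteGet;
    }

    /** No entry: Absent. Request in flight: Issued. Value arrived: Present. */
    State state(K key) {
        CompletableFuture<V> f = entries.get(key);
        if (f == null) return State.ABSENT;
        return f.isDone() ? State.PRESENT : State.ISSUED;
    }

    /** Optimist side: issue the request if none exists, and never block. */
    void prefetch(K key) {
        entries.computeIfAbsent(key,
            k -> CompletableFuture.supplyAsync(() -> remoteGet.apply(k)));
    }

    /** Pessimist side: issue if Absent, WAIT if Issued, return at once if Present. */
    V get(K key) {
        prefetch(key);
        return entries.get(key).join();
    }
}
```

Because `computeIfAbsent` installs a single future per key, a Pessimist `get` that arrives while the Optimist's request is still Issued waits on that same request instead of sending a second one.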
Execution time breakdown (Client and Datasource)
                       Local Processing   Remote Processing   Network Latency
Original Execution     2 sec              2 sec               16 sec
Sprint-ed Execution    2 sec              2 sec               3 sec
In the paper:
• Batching optimization
• Task prioritization optimization
• Data access processing algorithm
• Data consistency with remote updates
• Correctness proof
Datasources
• IBM’s Yellow Pages Web Service
• Publications Database (DB2)
• Facebook Web Service
Clients
• Management Hierarchy
• Employee Search
• Citation Count
• Bibliography Agg.
• Friend Connectivity
Cache Statistics (for Sprint with all optimizations turned on)
[Table: cache statistics per client for the datasources and clients listed above]
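Task Priority
    P1() { x = read(M, a); y = read(M, b); assert (y > x); }
    P2() { atomic { write(M, a, 2); write(M, b, 3) } }
S = {(⟨a, 1⟩, ⟨b, 2⟩), (⟨a, 1⟩, ⟨b, 3⟩), (⟨a, 2⟩, ⟨b, 3⟩)}
Interleaving, in order:
• read(b) // by Optimist of P1
• write(a,2), write(b,3) // by P2
• read(a) // by Optimist of P1
• read(a), read(b) // by Pessimist of P1
S′ = (⟨a, 2⟩, ⟨b, 2⟩)
If the Pessimist's reads are served from the values the Optimist cached, it observes S′, which is not in S, and assert (y > x) fails.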
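The interleaving can be replayed step by step to see where the stale pair comes from. This is a small sketch under stated assumptions: the initial remote state is taken to be ⟨a, 1⟩, ⟨b, 2⟩, the cache is a plain map with no invalidation, and `StaleCacheSketch` and `replay` are hypothetical names.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical replay of the slide's interleaving with a naive cache. */
class StaleCacheSketch {
    /** Returns {x, y} as observed by the Pessimist of P1. */
    static int[] replay() {
        Map<String, Integer> remote = new HashMap<>();
        remote.put("a", 1); // assumed initial state
        remote.put("b", 2);
        Map<String, Integer> cache = new HashMap<>();

        cache.put("b", remote.get("b")); // read(b) by Optimist of P1, caches b = 2
        remote.put("a", 2);              // write(a,2) by P2 (atomic with the next write)
        remote.put("b", 3);              // write(b,3) by P2
        cache.put("a", remote.get("a")); // read(a) by Optimist of P1, caches a = 2

        int x = cache.get("a");          // read(a) by Pessimist of P1: cache hit
        int y = cache.get("b");          // read(b) by Pessimist of P1: stale cache hit
        return new int[] { x, y };
    }
}
```

Under these assumptions the Pessimist sees x = 2 and y = 2, a pair that never existed at the datasource, which is why data consistency under remote updates needs separate handling.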