Using Abstractions to Scale Up Applications to Campus Grids
Douglas Thain, University of Notre Dame
28 April 2009
Outline • What is a Campus Grid? • Challenges of Using Campus Grids. • Solution: Abstractions • Examples and Applications • All-Pairs: Biometrics • Wavefront: Economics • Assembly: Genomics • Combining Abstractions Together
What is a Campus Grid? • A campus grid is an aggregation of all available computing power found in an institution: • Idle cycles from desktop machines. • Unused cycles from dedicated clusters. • Examples of campus grids: • 600 CPUs at the University of Notre Dame • 2000 CPUs at the University of Wisconsin • 13,000 CPUs at Purdue University
Condor • Provides robust batch queueing on a complex distributed system. • Resource owners control consumption: • “Only run jobs on this machine at night.” • “Prefer biology jobs over physics jobs.” • End users express needs: • “Only run this job where RAM>2GB.” • “Prefer to run on machines …” • http://www.cs.wisc.edu/condor
The Assembly Language of Campus Grids • User Interface: • N x { run program X with files F and G } (sketched below) • System Properties: • Wildly varying resource availability. • Heterogeneous resources. • Unpredictable preemption. • Effect on Applications: • Jobs can’t run for too long... • But, they can’t run too quickly, either! • Use file I/O for inter-process communication. • Bad choices cause chaos on the network and heartburn for system administrators.
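To make this "assembly language" view concrete, here is a minimal sketch of the job model. The submit_job helper is hypothetical (it is not a real Condor or campus-grid API); the point is only the shape of the workload: N independent batch jobs, each running program X on its own files, with file I/O as the only communication.

# Minimal sketch of the campus-grid job model (hypothetical submit_job helper).
# Each of the N tasks is an independent batch job that reads and writes files;
# there is no shared memory or message passing between tasks.
def submit_job(executable, arguments, input_files, output_files):
    # A real system would hand this to the batch scheduler; here it only
    # records what would be submitted.
    print("submit:", executable, arguments, input_files, output_files)

N = 100
for i in range(N):
    submit_job(
        executable="X",
        arguments=[f"F.{i}", f"G.{i}", f"out.{i}"],
        input_files=[f"F.{i}", f"G.{i}"],   # staged to the execution node
        output_files=[f"out.{i}"],          # returned when the job finishes
    )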
“I have 10,000 iris images acquired in my research lab. I want to reduce each one to a feature space, and then compare all of them to each other. I want to spend my time doing science, not struggling with computers.” • I have a laptop. • I own a few machines. • I can get cycles from ND and Purdue. • Now what?
Observation • In a given field of study, a single person may repeat the same pattern of work many times, making slight changes to the data and algorithms. • If we knew in advance the intended pattern, then we could do a better job of mapping a complex application to a complex system.
Abstractions for Distributed Computing • Abstraction: a declarative specification of the computation and data of a workload. • A restricted pattern, not meant to be a general purpose programming language. • Uses data structures instead of files. • Provides users with a bright path. • Regular structure makes it tractable to model and predict performance.
All-Pairs Abstraction • AllPairs( set A, set B, function F ) returns matrix M where M[i][j] = F( A[i], B[j] ) for all i,j • Invocation: allpairs A B F.exe • [Figure: the result matrix, with F applied to every pair (A[i], B[j]).] • Moretti, Bulosan, Flynn, Thain, AllPairs: An Abstraction…, IPDPS 2008
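As a reference for the semantics only, here is a minimal sequential sketch of what AllPairs computes; the real implementation distributes this work across the campus grid, and the names below are illustrative.

def all_pairs(A, B, F):
    # Returns matrix M with M[i][j] = F(A[i], B[j]) for all i, j.
    return [[F(a, b) for b in B] for a in A]

# Example: compare every element of A against every element of B.
A = [1.0, 2.0, 3.0]
B = [4.0, 5.0]
M = all_pairs(A, B, lambda a, b: abs(a - b))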
Example Application • Goal: design a robust face comparison function F. • [Figure: F scores one pair of face images 0.97 and another pair 0.05.]
Similarity Matrix Construction • Current Workload: 4000 images, 256 KB each, 10s per F (five days) • Future Workload: 60000 images, 1 MB each, 1s per F (three months)
Non-Expert User on a Campus Grid • Try 1: Each F is a batch job. Failure: dispatch latency >> F runtime. • Try 2: Each row is a batch job. Failure: too many small operations on the file system. • Try 3: Bundle all files into one package. Failure: everyone loads 1 GB at once. • Try 4: User gives up and attempts to solve an easier or smaller problem.
All-Pairs Abstraction • AllPairs( set A, set B, function F ) returns matrix M where M[i][j] = F( A[i], B[j] ) for all i,j • [Figure: the result matrix, with F applied to every pair (A[i], B[j]).]
An Interesting Twist • Send the absolute minimum amount of data needed to each of N nodes from a central server: • Each job must run on exactly 1 node. • Data distribution time: O( D sqrt(N) ) • Send all data to all N nodes via spanning tree distribution: • Any job can run on any node. • Data distribution time: O( D log(N) ) • It is both faster and more robust to send all data to all nodes via spanning tree.
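To see why the spanning tree wins at scale, here is a small back-of-the-envelope sketch. The numbers (1 GB of data, 100 nodes, 100 MB/s links) are illustrative assumptions, not measurements from the talk; the point is only the growth rates O(D sqrt(N)) versus O(D log(N)).

import math

D = 1.0        # total data size in GB (assumed)
N = 100        # number of nodes (assumed)
bw = 0.1       # effective link bandwidth in GB/s (assumed)

# Central server sends each node only what it needs: time grows like D*sqrt(N).
t_central = D * math.sqrt(N) / bw

# Spanning tree sends everything to everyone: time grows like D*log2(N).
t_tree = D * math.log2(N) / bw

print(f"central server: ~{t_central:.0f} s, spanning tree: ~{t_tree:.0f} s")
# Under these assumptions the tree is roughly 1.5x faster at N=100,
# and the gap widens as N grows.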
What’s the right metric? • Speedup? • Seq Runtime / Parallel Runtime • Parallel Efficiency? • Speedup / N CPUs? • Neither works, because the number of CPUs varies over time and between runs. • Better Choice: Cost Efficiency • Work Completed / Resources Consumed • Cars: Miles / Gallon • Planes: Person-Miles / Gallon • Results / CPU-hours • Results / $$$
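A tiny sketch of the distinction, with made-up numbers: when the number of CPUs changes during a run, "speedup" has no single N to divide by, while work completed per CPU-hour remains well defined.

# Suppose a run completed 10,000 results while the pool fluctuated between
# 50 and 200 CPUs, consuming 900 CPU-hours in total (assumed numbers).
results = 10_000
cpu_hours = 900.0

cost_efficiency = results / cpu_hours   # results per CPU-hour
print(f"{cost_efficiency:.1f} results per CPU-hour")

# Speedup = sequential_time / parallel_time only becomes "efficiency" when
# divided by a fixed N; with 50-200 CPUs there is no single N to use.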
Wavefront ( R[x,0], R[0,y], F(x,y,d) ) • Given the initial row R[x,0] and column R[0,y], each remaining cell is computed by F from the cell to its left (x), the cell below it (y), and the diagonal cell (d). • [Figure: a 5x5 result matrix R with R[0,0]…R[4,0] and R[0,0]…R[0,4] given; cells along each anti-diagonal depend only on already-computed neighbors, so they can run in parallel as a wavefront sweeping across the matrix.]
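For the semantics only, a minimal sequential sketch of the recurrence; the distributed implementation dispatches blocks of cells as tasks, and the names below are illustrative.

def wavefront(row0, col0, F):
    # row0: initial values R[x][0]; col0: initial values R[0][y]; both include R[0][0].
    nx, ny = len(row0), len(col0)
    R = [[None] * ny for _ in range(nx)]
    for x in range(nx):
        R[x][0] = row0[x]
    for y in range(ny):
        R[0][y] = col0[y]
    for x in range(1, nx):
        for y in range(1, ny):
            # Each cell depends on its left, lower, and diagonal neighbors.
            R[x][y] = F(R[x - 1][y], R[x][y - 1], R[x - 1][y - 1])
    return R

# Example with a toy F; in practice F is a dynamic-programming or economics kernel.
R = wavefront([0, 1, 2, 3, 4], [0, 1, 2, 3, 4], lambda x, y, d: max(x, y, d) + 1)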
The Performance Problem • Dispatch latency really matters: a delay in one holds up all of its children. • If we dispatch larger sub-problems: • Concurrency on each node increases. • Distributed concurrency decreases. • If we dispatch smaller sub-problems: • Concurrency on each node decreases. • Spend more time waiting for jobs to be dispatched. • So, model the system to choose the block size. • And, build a fast-dispatch execution system.
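A deliberately simplified version of such a model is sketched below. The formula and the constants (cell time, dispatch latency, problem size) are assumptions for illustration, not the model from the paper, and it assumes enough workers that every ready block runs immediately.

def estimated_runtime(n, block, t_cell, t_dispatch):
    # Critical path of an n x n wavefront divided into block x block tasks.
    blocks_per_side = n // block
    steps_on_critical_path = 2 * blocks_per_side - 1   # block tasks form a wavefront too
    time_per_block_task = t_dispatch + (block * block) * t_cell
    return steps_on_critical_path * time_per_block_task

# Sweep block sizes for a hypothetical 1000x1000 problem,
# 0.01 s per cell, 30 s dispatch latency (all assumed).
for block in (10, 50, 100, 250, 500):
    print(block, round(estimated_runtime(1000, block, 0.01, 30.0)))

With these made-up numbers the estimate bottoms out near a block size of 50: smaller blocks pay the dispatch latency too many times, while larger blocks serialize too much work inside each task.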
Block Size = 2 • Wavefront ( R[x,0], R[0,y], F(x,y,d) ) • [Figure: the same 5x5 wavefront, with cells grouped into 2x2 blocks; each block is dispatched as a single task, trading dispatch overhead against available parallelism.]
100s of workers dispatched to Notre Dame, Purdue, and Wisconsin. • The wavefront master keeps a work queue of ready tasks and a record of tasks done; each worker runs one task at a time. • Master–worker protocol: put F.exe, put in.txt, exec F.exe <in.txt >out.txt, get out.txt. • [Figure: the wavefront master, its work queue, and many workers, each holding F, in.txt, and out.txt.]
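A minimal sketch of the worker side of this put/exec/get cycle, to make the protocol concrete. This is not the actual Work Queue implementation or wire protocol; local file copies stand in for the network transfers.

import os
import shutil
import subprocess

def handle_task(staging_dir):
    # One put/exec/get cycle, with copies to and from staging_dir standing in
    # for the master-worker transfers.
    shutil.copy(os.path.join(staging_dir, "F.exe"), "F.exe")      # put F.exe
    shutil.copy(os.path.join(staging_dir, "in.txt"), "in.txt")    # put in.txt
    with open("in.txt") as fin, open("out.txt", "w") as fout:
        subprocess.run(["./F.exe"], stdin=fin, stdout=fout, check=True)  # exec F.exe <in.txt >out.txt
    shutil.copy("out.txt", os.path.join(staging_dir, "out.txt"))  # get out.txt

# A worker would loop over handle_task() as long as the master has tasks queued.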
100s of workers dispatched to Notre Dame, Purdue, and Wisconsin. • Same architecture as the previous slide, but the task dispatched to each worker is itself a wavefront over a block of cells, so each worker runs many invocations of F per task. • [Figure: the master's work queue feeding workers; the worker detail now shows a wavefront program with in.txt and out.txt driving multiple F instances.]
The Genome Assembly Problem • Chemical sequencing produces millions of “reads”, each hundreds of bytes long (e.g. AGTCGATCGATCGAT, TCGATAATCGATCCTAGCTA, AGCTAGCTACGA). • Computational assembly overlaps the reads to reconstruct the full sequence (e.g. AGTCGATCGATCGATAATCGATCCTAGCTAGCTACGA).
Assemble( set S, Test(), Align(), Assm() ) • A three-stage pipeline over the sequence data: Test finds candidate pairs (e.g. “0 is similar to 1”, “1 is similar to 3”) and is I/O bound; Align produces the list of alignments and is CPU bound; Assm combines the alignments into the assembled sequence and is RAM bound. • [Figure: sequence data → Test → candidate pairs → Align → list of alignments → Assm → assembled sequence.]
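A structural sketch of the pipeline. The stage functions here are trivial placeholders, not real bioinformatics kernels; in the real system the alignment stage is what gets farmed out to campus-grid workers.

def assemble(reads, test, align, assm):
    # Stage 1 (I/O bound): scan the reads and propose candidate pairs.
    candidates = test(reads)
    # Stage 2 (CPU bound): align each candidate pair; pairs are independent,
    # so this stage parallelizes naturally.
    alignments = [align(reads[i], reads[j]) for (i, j) in candidates]
    # Stage 3 (RAM bound): merge the alignments into one assembled sequence.
    return assm(reads, alignments)

# Toy placeholders so the sketch runs end to end.
reads = ["AGCCTGCATTA", "CATTAACGAAC", "GACTGACTAGC"]
test = lambda rs: [(i, j) for i in range(len(rs)) for j in range(i + 1, len(rs))]
align = lambda a, b: (a, b, 0)          # pretend overlap score
assm = lambda rs, als: "".join(rs)      # pretend merge
print(assemble(reads, test, align, assm))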
Distributed Genome Assembly • 100s of workers dispatched to Notre Dame, Purdue, and Wisconsin. • The test stage feeds candidate pairs to an align master, which drives the work queue; completed alignments flow into the assemble stage. • Detail of a single worker: put align.exe, put in.txt, exec align.exe <in.txt >out.txt, get out.txt. • [Figure: test → align master with work queue and tasks done → many workers, each holding align.exe, in.txt, and out.txt → assemble.]
What’s the Upshot? • We can do full-scale assemblies as a routine matter on existing conventional machines. • Our solution is faster (wall-clock time) than the next fastest assembler run on 1024x BG/L. • You could almost certainly do better with a dedicated cluster and a fast interconnect, but such systems are not universally available. • Our solution opens up research in assembly to labs with “NASCAR” instead of “Formula-One” hardware.
What Other Abstractions Might Be Useful? • Map( set S, F(s) ) • Explore( F(x), x: [a…b] ) • Minimize( F(x), delta ) • Minimax( state s, A(s), B(s) ) • Search( state s, F(s), IsTerminal(s) ) • Query( properties ) -> set of objects • FluidFlow( V[x,y,z], F(v), delta )
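Each of these candidates has the same shape as All-Pairs and Wavefront: a declarative call that expands into many independent tasks over named data. A minimal sequential sketch of two of them, purely to pin down the intended semantics; the signatures (including the steps parameter) are illustrative.

def map_abstraction(S, F):
    # Map( set S, F(s) ): apply F independently to every member of S.
    return [F(s) for s in S]

def explore(F, a, b, steps):
    # Explore( F(x), x: [a…b] ): evaluate F over a sweep of the interval.
    xs = [a + (b - a) * i / (steps - 1) for i in range(steps)]
    return [(x, F(x)) for x in xs]

print(map_abstraction([1, 2, 3], lambda s: s * s))
print(explore(lambda x: (x - 2) ** 2, 0.0, 4.0, 5))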
How do we connect multiple abstractions together? • Need a meta-language, perhaps with its own atomic operations for simple tasks. • Need to manage (possibly large) intermediate storage between operations. • Need to handle data type conversions between almost-compatible components. • Need type reporting and error checking to avoid expensive errors. • If abstractions are feasible to model, then it may be feasible to model entire programs.
Connecting Abstractions in BXGrid • S = Select( color=“brown” ) • B = Transform( S, F ) • M = AllPairs( A, B, F ) • [Figure: Select pulls iris records by eye color (subject, left/right eye, color) from the repository, Transform applies F to each selected image, and AllPairs compares the transformed sets to produce an all-pairs matrix and an ROC curve.] • Bui, Thomas, Kelly, Lyon, Flynn, Thain, BXGrid: A Repository and Experimental Abstraction…, poster at IEEE eScience 2008
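To make the chaining concrete, here is a sequential sketch of what the three lines above compute, with a toy in-memory repository standing in for BXGrid. The function names mirror the slide, but the data and signatures are illustrative, not the real BXGrid API.

# Toy repository: each record has an eye color and an image payload.
repository = [
    {"color": "brown", "image": "img0"},
    {"color": "blue",  "image": "img1"},
    {"color": "brown", "image": "img2"},
]

def select(color):
    # S = Select( color="brown" ): pick matching records from the repository.
    return [r["image"] for r in repository if r["color"] == color]

def transform(S, F):
    # B = Transform( S, F ): apply F to each selected item (a Map).
    return [F(s) for s in S]

def all_pairs(A, B, F):
    # M = AllPairs( A, B, F ): compare every element of A with every element of B.
    return [[F(a, b) for b in B] for a in A]

S = select("brown")
B = transform(S, lambda img: img.upper())          # stand-in for feature extraction F
A = B                                              # the slide compares two sets; reuse B here for brevity
M = all_pairs(A, B, lambda a, b: float(a == b))    # stand-in comparison function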