Grid Platform for Geospatial Applications & Fine Granule Scheduler
Presented by Bin Zhou
Bin Zhou, Jibo Xie, Chaowei Yang
Joint Center for Intelligent Spatial Computing, George Mason University
Agenda
• Grid Computing Introduction
• CISC & SURA Grid
• Geospatial Applications Require Grid
• CISC Fine Granule Scheduler
• Architecture, Strategy
• Progress Status
Grid Computing Introduction
• Definition
  • Grid computing is an emerging computing infrastructure that treats all resources as a collection of manageable entities with common interfaces to such functionality as lifetime management, discoverable properties and accessibility via open protocols (Wikipedia)
• Popular Grid Middleware
  • Condor
  • Globus
  • Condor-G
  • Unicore
GMU grid environment
• SURAgrid: the GMU CISC Grid can access the computing resources contributed by SURAgrid member universities
GMU grid environment
• LambdaRail: the GMU CISC Grid can set up 1-10 Gbps connections to any of the LambdaRail-supported universities, agencies, and centers, such as GSFC and SDSC
Geospatial Requirements
• Large Data Sets
  • Map data and sensor data, in terabytes
• Reliability, Interoperability
  • Collaboration
• Intensive Computation
  • More Complex Algorithms
  • Adaptive Algorithms
  • Intelligent Processing
Grid Computing Could Satisfy These Requirements
• Reliable File Transfer
• Resource Management and Allocation
• Authorization & Control
• Job Control
• Web Service Oriented
Detecting Watersheds from Multi-Scale DEM
• Watershed boundaries are not known before the massive data is processed
• Extract coarse watershed boundaries from the multi-scale DEM
• Use the boundaries to decompose the massive data with some redundancy
(Figure: extraction and resampling steps; Xie 2006)
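The decomposition step above can be sketched as follows. This is only an illustration, not the method from Xie 2006: it assumes the coarse watersheds are given as a small grid of integer labels, that each coarse cell covers `scale` x `scale` fine cells, and that the redundancy is a fixed `margin` of fine cells around each watershed's bounding box. The function name and all parameters are hypothetical.

```python
def padded_bounds(coarse_labels, scale, margin):
    """Map each coarse watershed label to a padded bounding box on the fine grid.

    Returns {label: (row_start, row_stop, col_start, col_stop)} with
    half-open ranges in fine-grid cells. Assumption: rectangular padding
    stands in for the "redundancy" mentioned on the slide.
    """
    rows, cols = len(coarse_labels), len(coarse_labels[0])
    fine_rows, fine_cols = rows * scale, cols * scale

    # Bounding box of each label on the coarse grid (inclusive indices).
    boxes = {}
    for r, row in enumerate(coarse_labels):
        for c, label in enumerate(row):
            r0, r1, c0, c1 = boxes.get(label, (r, r, c, c))
            boxes[label] = (min(r0, r), max(r1, r), min(c0, c), max(c1, c))

    # Scale to the fine grid and pad by the margin, clipped to the data extent.
    padded = {}
    for label, (r0, r1, c0, c1) in boxes.items():
        padded[label] = (
            max(0, r0 * scale - margin),
            min(fine_rows, (r1 + 1) * scale + margin),
            max(0, c0 * scale - margin),
            min(fine_cols, (c1 + 1) * scale + margin),
        )
    return padded


if __name__ == "__main__":
    coarse = [[1, 1, 2],
              [1, 2, 2]]
    for label, box in padded_bounds(coarse, scale=4, margin=2).items():
        print(label, box)
```

Each padded box can then be processed as an independent job; the overlap means neighbouring jobs read some of the same cells, which is the cost of not knowing the exact boundaries in advance.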
24 units were used to test the speedup (each unit is 3.08M) (Xie 2006)
CISC Test Applications: Real Time Routing
Test Result:

Job Amount       30      30      30      30
CPUs             1       10      20      30
Executing Time   1686s   374s    322s    293s
Speed Up         1       4.5     5.2     5.75
Efficiency       1       0.45    0.26    0.19

Efficiency decreases as the number of CPUs grows because overhead increases, but the major problem is that Condor cannot handle small jobs efficiently. This demonstrates the need for a fine granule scheduler.
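The speedup and efficiency rows follow from the measured execution times, assuming the usual definitions: speedup is the 1-CPU time divided by the n-CPU time, and efficiency is speedup divided by n. A minimal sketch that recomputes them from the times in the test result (the function name is illustrative):

```python
def speedup_table(times_by_cpus):
    """Return {cpus: (speedup, efficiency)} relative to the 1-CPU run."""
    base = times_by_cpus[1]
    table = {}
    for cpus, seconds in times_by_cpus.items():
        speedup = base / seconds          # T(1) / T(n)
        table[cpus] = (speedup, speedup / cpus)
    return table


if __name__ == "__main__":
    # Execution times in seconds from the Real Time Routing test.
    measured = {1: 1686, 10: 374, 20: 322, 30: 293}
    for cpus, (speedup, efficiency) in speedup_table(measured).items():
        print(f"{cpus:>2} CPUs  speedup {speedup:.2f}  efficiency {efficiency:.2f}")
```

Going from 10 to 30 CPUs cuts the run time by only about a fifth, which is why efficiency falls so sharply.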
Specific Applications: Fine-Grained Near Real Time Jobs
• Fine-Grained
  • Very Short Executing Time
  • Huge Amount
  • Job Similarity
• Near Real Time
  • Sensitive to scheduling latency
• Examples: Real-Time Routing, Short-Time Stock Prediction
Condor cannot be used for tasks that require less than 3.5 min to complete (Gregg Cooke, IT Technical Council, "Evaluating Condor for Enterprise Use: A UBS Case Study")
CISC Scheduler
• Purpose
  • Improve response time for near real time jobs
  • Improve throughput for large numbers of fine-granularity jobs
• Scheduling Strategy
  • Short Communicating Message
  • Simple Match-Making Function
  • Dynamic Index
  • Multi-Dispatch
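The slides do not define these strategies, so the sketch below is only one possible reading of "Multi-Dispatch" and "Short Communicating Message": several small jobs are grouped into one message per worker, so the per-message scheduling cost is paid once per batch rather than once per job. The function, the round-robin worker queue, and the `batch_size` parameter are all assumptions for illustration, not the CISC scheduler's actual design.

```python
from collections import deque


def multi_dispatch(jobs, workers, batch_size):
    """Group jobs into batches and assign each batch to a worker in rotation.

    Returns a list of (worker, [jobs]) messages. Hypothetical sketch:
    the real scheduler's matching and indexing are not described in the slides.
    """
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    if not workers:
        raise ValueError("at least one worker is required")

    queue = deque(workers)  # trivial stand-in for a worker index
    messages = []
    for start in range(0, len(jobs), batch_size):
        worker = queue.popleft()
        messages.append((worker, jobs[start:start + batch_size]))
        queue.append(worker)  # rotate so the next batch goes elsewhere
    return messages


if __name__ == "__main__":
    for worker, batch in multi_dispatch(list(range(7)), ["w1", "w2"], batch_size=3):
        print(worker, batch)
```

With 30 jobs and a batch size of 3, for example, this sends 10 messages instead of 30; whether that trade-off helps depends on how long each job runs relative to the dispatch overhead.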
System Architecture
(Diagram; labels in order of appearance: Worker, Central Manager, User Interface, Abstract Interface / APIs, Services Container, Algorithm module, Collector, Submitter, Dispatcher, Resource Manager, Lib, File Transfer, Message passing, Process, Memory, Other, TCP/UDP Socket, System Function)
Prototype Overhead Test
• Test Case
  • Insertion sort of 200,000 integers
  • Dataset: 5.56M
  • Executable file: 1.8M
• Test Platform
  • OS: Ubuntu 6.10
  • Network: 100Mbps
  • CPU: Celeron M 1.6G
  • Memory: 1G
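For reference, a workload of the kind used in the test case might look like the sketch below. It is not the prototype's test program (the slides do not show it): the data generation, the seed, and the timing helper are assumptions. Insertion sort is quadratic, so in pure Python the full 200,000-integer case would run for a very long time; the example uses a much smaller input.

```python
import random
import time


def insertion_sort(values):
    """Return a sorted copy of values using insertion sort."""
    items = list(values)
    for i in range(1, len(items)):
        key = items[i]
        j = i - 1
        # Shift larger elements one slot right to make room for key.
        while j >= 0 and items[j] > key:
            items[j + 1] = items[j]
            j -= 1
        items[j + 1] = key
    return items


def time_sort(n, seed=0):
    """Sort n pseudo-random integers and return the elapsed seconds."""
    rng = random.Random(seed)
    data = [rng.randint(0, 10**6) for _ in range(n)]
    start = time.perf_counter()
    result = insertion_sort(data)
    elapsed = time.perf_counter() - start
    assert result == sorted(data)
    return elapsed


if __name__ == "__main__":
    print(f"sorted 2000 integers in {time_sort(2000):.3f}s")
```

Comparing this compute time with the end-to-end time of the same job submitted through a scheduler gives an estimate of the scheduling and file-transfer overhead.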
Thanks
Questions?