Supporting a Volume Rendering Application on a Grid-Middleware For Streaming Data
Liang Chen, Gagan Agrawal
Computer Science & Engineering, Ohio State University
Introduction-Motivation
• What is a data stream?
  • Data arrive continuously
  • The volume is enormous, so the data must be processed online
  • Processing must happen in real time
  • Data sources may be distributed
• Data stream applications:
  • Online network intrusion detection
  • Sensor networks
  • Network fault management systems for telecommunication network elements
Introduction-Motivation
• Example: a Network Fault Management (NFM) system analyzing distributed alarm streams
[Figure: a switch network and the NFM (Network Fault Management) system]
Introduction-Motivation
• Challenges
  • Data and/or computation intensive
  • The system can easily be overloaded
Introduction-Motivation
• Possible solutions
  • Grid computing technologies
  • Automatically adjusting the processing rate
Introduction-Our Approach
• We implemented a middleware (GATES) to meet these needs
• Previous work:
  1. Utilizing existing grid standards: Liang Chen, K. Reddy, and G. Agrawal, "GATES: A Grid-Based Middleware for Processing Distributed Data Streams", HPDC, 2004.
  2. Providing self-adaptation functionality: Liang Chen and G. Agrawal, "Supporting Self-Adaptation in Streaming Data Mining Applications", IPDPS, 2006.
  3. Supporting automatic resource allocation: Liang Chen and G. Agrawal, "A Static Resource Allocation Framework for Grid-Based Streaming Applications", Concurrency and Computation: Practice and Experience, Volume 18, Issue 6, Pages 653-666.
  4. Supporting efficient dynamic migration: Liang Chen, Q. Zhu, and G. Agrawal, "Supporting Dynamic Migration in Tightly Coupled Grid Applications", SC 2006.
Roadmap
• Introduction
• GATES Overview
• Adaptive Volume Rendering
• Conclusions
GATES Architecture and Design
• Uses the Globus Toolkit, which is built on OGSA
• Allows users to specify their algorithms, implemented in Java
• Takes care of plugging user-defined algorithms into the system and running them on the Grid
• Applications need to be broken down into a number of pipelined stages
System Architecture and Design (Architecture)
[Figure: an application split into Stage A, Stage B, and Stage C. Legend: buffers for applications; queues between Grid services; GATES services; stages of an application]
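System Architecture and Design (GATES API Functions)
public class SecondStage implements StreamProcessing {
    …
    void work(Buffer in, Buffer out) {
        …
        while (true) {
            // Take the next item delivered by the previous stage
            data = GATES.getFromInputBuffer(in);
            // Application-specific processing of that item
            interResults = processing(data);
            // Hand the intermediate results to the next stage
            GATES.putToOutputBuffer(out, interResults);
        }
    }
}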
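
The stage skeleton above can be tried outside a grid with plain Java threads. The following is a minimal sketch, not the GATES implementation: the class `StageSketch`, the method `runPipeline`, and the use of `BlockingQueue` as the buffer between stages are all assumptions made for illustration, and an end-of-stream marker replaces the endless `while(true)` loop so that the example terminates.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.function.Function;

// Hypothetical stand-in for one pipelined stage; not part of the GATES API.
class StageSketch implements Runnable {
    // Sentinel marking the end of the stream (assumption: real data never equals it).
    static final Integer END = Integer.MIN_VALUE;

    private final BlockingQueue<Integer> in;
    private final BlockingQueue<Integer> out;
    private final Function<Integer, Integer> processing;

    StageSketch(BlockingQueue<Integer> in, BlockingQueue<Integer> out,
                Function<Integer, Integer> processing) {
        this.in = in;
        this.out = out;
        this.processing = processing;
    }

    @Override
    public void run() {
        try {
            while (true) {
                Integer data = in.take();          // blocks until an item arrives
                if (data.equals(END)) {
                    out.put(END);                  // forward the marker, then stop
                    return;
                }
                out.put(processing.apply(data));   // pass the result downstream
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }

    // Wires two stages (x + 1, then x * 2) into a pipeline and collects the output.
    static List<Integer> runPipeline(List<Integer> source) {
        BlockingQueue<Integer> a = new LinkedBlockingQueue<>();
        BlockingQueue<Integer> b = new LinkedBlockingQueue<>();
        BlockingQueue<Integer> c = new LinkedBlockingQueue<>();
        Thread first = new Thread(new StageSketch(a, b, x -> x + 1));
        Thread second = new Thread(new StageSketch(b, c, x -> x * 2));
        first.start();
        second.start();
        List<Integer> results = new ArrayList<>();
        try {
            for (Integer item : source) {
                a.put(item);
            }
            a.put(END);
            for (Integer r = c.take(); !r.equals(END); r = c.take()) {
                results.add(r);
            }
            first.join();
            second.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("pipeline interrupted", e);
        }
        return results;
    }

    public static void main(String[] args) {
        System.out.println(runPipeline(List.of(1, 2, 3)));  // prints [4, 6, 8]
    }
}
```

Each stage only touches its own input and output queue, which is the property that lets a middleware place stages on different nodes.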
Adaptation Parameter
• Definition:
  • A parameter in an application
  • Changing the parameter's value changes the processing rate of the application and also affects the accuracy of the processing
• Two kinds of adaptation parameters
  • Performance parameter
  • Accuracy parameter
• Example
  • Sampling rate is an accuracy parameter
[Figure: processing rate and accuracy plotted against a performance parameter and against an accuracy parameter]
Pseudocode Again, with Self-Adaptation API Functions
public class SecondStage implements StreamProcessing {
    …
    // Initialize the sampling rate to the midpoint of its range
    samplingRate = (max + min) / 2;

    void work(Buffer in, Buffer out) {
        // Register the sampling rate as an accuracy parameter, with its range
        GATES.specifyAccuracyPara(samplingRate, max, min);
        while (true) {
            data = GATES.getFromInputBuffer(in);
            interResults = processing(data, samplingRate);
            GATES.putToOutputBuffer(out, interResults);
            // Adopt the value currently suggested by the middleware
            samplingRate = GATES.getSuggestedValue();
        }
    }
}
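
These slides do not say how the middleware arrives at the suggested value. Purely as an illustration of the idea, the sketch below lowers an accuracy parameter when the stage's input buffer is filling up and raises it again when the buffer is nearly empty. The class `AccuracyParamSketch`, the 0.75 and 0.25 thresholds, and the fixed step size are all assumptions and should not be read as the GATES algorithm.

```java
// Hypothetical controller for an accuracy parameter; not the GATES algorithm.
class AccuracyParamSketch {
    private final double min;
    private final double max;
    private final double step;
    private double value;

    AccuracyParamSketch(double min, double max, double step) {
        this.min = min;
        this.max = max;
        this.step = step;
        this.value = (max + min) / 2;   // same midpoint start as the pseudocode
    }

    // queueFill is the fraction of the input buffer in use, from 0.0 to 1.0.
    double suggest(double queueFill) {
        if (queueFill > 0.75) {
            value -= step;              // overloaded: trade accuracy for speed
        } else if (queueFill < 0.25) {
            value += step;              // spare capacity: restore accuracy
        }
        value = Math.max(min, Math.min(max, value));  // stay inside [min, max]
        return value;
    }

    public static void main(String[] args) {
        AccuracyParamSketch samplingRate = new AccuracyParamSketch(0.0, 1.0, 0.25);
        System.out.println(samplingRate.suggest(0.9));  // buffer nearly full
        System.out.println(samplingRate.suggest(0.1));  // buffer nearly empty
    }
}
```

A performance parameter would use the opposite sign: its value goes up under load, because a larger value means faster processing.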
Adaptive Volume Rendering
• Motivation – Grid computing is needed
• Visualization involves large volumes of data
  • We focus on streaming volume data
  • Volume data need to be visualized interactively and in real time
• Computationally intensive
  • Consumes substantial resources
  • Real-time processing cannot be guaranteed
• The places where data are generated are distributed
  • A typical client-server architecture is not scalable
  • Network bandwidth in wide-area networks is low
  • The computing capability of an ordinary desktop is not enough
• Grid techniques would be a good solution
  • Divide the procedure into stages organized in a pipeline
  • Allocate nodes close to the data source to pre-process the volume data
  • The intermediate results are much smaller than the raw data
Adaptive Volume Rendering
• Motivation – GATES is desirable
• Automatic adaptation is desirable
  • Volume rendering algorithms running on a grid need to be highly adaptive
  • Adaptation is usually achieved by manually adjusting adaptation parameters
  • Such manual parameter adaptation is very challenging in a grid environment
• Automatic resource allocation is desirable
  • A grid environment is highly changeable
• The GATES middleware could fulfill these needs
  • Grid-based
  • Provides the self-adaptation function to applications
  • Automatically allocates Grid resources
Adaptive Volume Rendering
• Overall design
• Two pipelined steps – the first step:
  • Build octrees from the volume data
  • An octree is a tree data structure in which each internal node has up to 8 children
  • Here, we use an octree to represent multiresolution information for a volume
  • The procedure to build an octree for a volume is as follows:
    • Divide the volume space into 8 subvolumes and create 8 child nodes
    • For each subvolume, calculate the standard deviation of all voxels in the subvolume and store it in the corresponding child node
    • If the deviation is larger than a predefined value, divide the subvolume and repeat the above procedure; otherwise, stop
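
The build procedure above can be sketched in a few lines of Java. This is an illustrative sketch under stated assumptions, not the authors' code: the volume is assumed to be a cubic `double[][][]` array whose side is a power of two, subdivision also stops at single voxels, the population standard deviation is used, and the names `OctreeSketch`, `Node`, `build`, and `countNodes` are hypothetical.

```java
// Hypothetical octree builder following the three steps listed above.
class OctreeSketch {
    static final class Node {
        final int x, y, z, size;      // origin and side length of the subvolume
        final double deviation;       // standard deviation of its voxels
        Node[] children = new Node[0];

        Node(int x, int y, int z, int size, double deviation) {
            this.x = x; this.y = y; this.z = z;
            this.size = size;
            this.deviation = deviation;
        }
    }

    // Population standard deviation of the voxels in one cubic subvolume.
    static double deviation(double[][][] v, int x0, int y0, int z0, int size) {
        double n = (double) size * size * size;
        double sum = 0.0;
        for (int x = x0; x < x0 + size; x++)
            for (int y = y0; y < y0 + size; y++)
                for (int z = z0; z < z0 + size; z++)
                    sum += v[x][y][z];
        double mean = sum / n;
        double squares = 0.0;
        for (int x = x0; x < x0 + size; x++)
            for (int y = y0; y < y0 + size; y++)
                for (int z = z0; z < z0 + size; z++)
                    squares += (v[x][y][z] - mean) * (v[x][y][z] - mean);
        return Math.sqrt(squares / n);
    }

    static Node build(double[][][] volume, double threshold) {
        int size = volume.length;
        Node root = new Node(0, 0, 0, size, deviation(volume, 0, 0, 0, size));
        subdivide(volume, root, threshold);
        return root;
    }

    // Splits a node into 8 children and recurses into the non-uniform ones.
    static void subdivide(double[][][] v, Node node, double threshold) {
        if (node.size < 2) {
            return;                   // assumption: a single voxel is never split
        }
        int half = node.size / 2;
        node.children = new Node[8];
        for (int i = 0; i < 8; i++) {
            int cx = node.x + (i & 1) * half;
            int cy = node.y + ((i >> 1) & 1) * half;
            int cz = node.z + ((i >> 2) & 1) * half;
            Node child = new Node(cx, cy, cz, half, deviation(v, cx, cy, cz, half));
            node.children[i] = child;
            if (child.deviation > threshold) {
                subdivide(v, child, threshold);
            }
        }
    }

    static int countNodes(Node node) {
        int count = 1;
        for (Node child : node.children) {
            count += countNodes(child);
        }
        return count;
    }

    public static void main(String[] args) {
        double[][][] volume = new double[4][4][4];
        volume[0][0][0] = 8.0;        // one bright voxel in an otherwise empty volume
        System.out.println(countNodes(build(volume, 0.5)));  // prints 17
    }
}
```

Only the octant containing the bright voxel is refined, so uniform regions stay coarse and cost little to store or render.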
Adaptive Volume Rendering
• Overall design
• Two pipelined steps – the second step:
  • Use an octree and its corresponding volume to render images
  • Given an error tolerance (or a user-defined resolution), traverse the octree depth-first (DFS) and stop at the nodes whose deviation is less than the error tolerance or resolution
  • Project the corresponding 3D subvolumes onto an image
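
The traversal rule above can be illustrated with a small hand-built tree. The sketch below is an assumption-laden illustration rather than the authors' renderer: it only selects the nodes that would be projected and leaves out the projection itself, it also stops at leaves, and the names `TraversalSketch`, `Node`, `select`, and `sampleTree` are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical depth-first selection of the octree nodes to render.
class TraversalSketch {
    static final class Node {
        final String label;
        final double deviation;
        final List<Node> children = new ArrayList<>();

        Node(String label, double deviation) {
            this.label = label;
            this.deviation = deviation;
        }

        Node add(Node child) {
            children.add(child);
            return this;
        }
    }

    // Stops at a node once its deviation is below the tolerance, or at a leaf.
    private static void collect(Node node, double tolerance, List<String> out) {
        if (node.deviation < tolerance || node.children.isEmpty()) {
            out.add(node.label);      // this subvolume would be projected as a whole
            return;
        }
        for (Node child : node.children) {
            collect(child, tolerance, out);
        }
    }

    static List<String> select(Node root, double tolerance) {
        List<String> out = new ArrayList<>();
        collect(root, tolerance, out);
        return out;
    }

    // Small example tree: a detailed region A and a nearly uniform region B.
    static Node sampleTree() {
        Node detailed = new Node("A", 4.0)
                .add(new Node("A0", 0.5))
                .add(new Node("A1", 1.5));
        return new Node("root", 6.0).add(detailed).add(new Node("B", 0.2));
    }

    public static void main(String[] args) {
        System.out.println(select(sampleTree(), 1.0));  // prints [A0, A1, B]
        System.out.println(select(sampleTree(), 5.0));  // prints [A, B]
    }
}
```

Raising the tolerance makes the traversal stop earlier and select fewer, coarser subvolumes, which means less work per image at the cost of detail.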
Adaptive Volume Rendering
• Make the rendering self-adaptive
• Two adaptation parameters used in the third stage
  • Error tolerance: a performance parameter
  • Image size: an accuracy parameter
• GATES can adjust only one adaptation parameter, so we fix one and adjust the other
Adaptive Volume Rendering • Experiment 1
Adaptive Volume Rendering • Experiment 2
Adaptive Volume Rendering
• Experiment 3: compare the performance of two implementations
  • Java implementation
  • C implementation
Conclusion
• Grid computing could be an effective solution for distributed data stream processing
• GATES
  • Distributed processing
  • Exploits grid web services
  • Self-adaptation to meet real-time constraints
  • Grid resource allocation schemes and dynamic migration