Task Scheduling and Distribution System
Saeed Mahameed, Hani Ayoub
Electrical Engineering Department, Technion – Israel Institute of Technology, 2009
[Diagram labels: Give me Executer, Heart Beat, Schedule]

Features
• Firewall Overcome:
  • A firewall can exist between the user machine and the remote machines.
  • One-sided connection (user side).
• Load Balancing:
  • Tasks can be submitted to the system asynchronously, so load balancing is needed for efficiency.
  • Tasks are distributed according to remote-machine load and priority.
  • Round-robin based.
  • Prevents starvation.
• Auto-Update:
  • Supports updating the code provided by the end user from the System Management UI.
• Efficiency (Chunking):
  • Smaller units of information are grouped into larger coordinated units.
• Execution Transparency:
  • I/O operations and exceptions are simulated on the Client's local machine.
• Fault Tolerance (Failure Detection):
  • The machine's state is saved until it connects again; it then resumes its work.

Internal Communication Data-flow
• Organization of the internal components and the communication data-flow among them, in terms of chunks, threads and buffers.
• Executer main data-flow diagram:
  • Collected Tasks are grouped into Chunks, which the System Manager schedules to be sent to the designated Executer.
• Client main data-flow diagram:
  • Received Chunks are broken into Tasks, which are executed by the concrete Executer (user provided). Results are mapped to the relevant Client in the results buffer.

Abstract
• Main Problem: executing a large set of small computational tasks consumes considerable processing time on a single machine.
  • Tasks are homogeneous and unrelated.
  • Execution is serial.
  • Execution order is not significant.
• The Main Idea of the project is to build a generic distributed system that supplies a user-friendly API allowing remote execution. The target audience is software developers who need to execute tasks over more than one computer without getting involved with complex networking APIs.
• The system can be used in environments where some machines are behind firewalls. Additional features are:
  • Auto-update, which seamlessly propagates code updates to all machines in the system.
  • Logger, used for tracing the system and for finding bugs.
  • Remote Exception Handler, which captures exceptions occurring on remote machines and simulates them on the machine that submitted the task.

High Level Design
• Main Components: System Manager, Client, Executer, UI.
• Task and Result are shared among Executers and Clients.
• The Executer component resides on a remote machine, waiting for tasks or orders from the System Manager.
• The UI component connects to a given System Manager address and allows the user to manage the system.

[Data-flow diagrams. Legend: Component, Buffer, Thread. Labels include: Client System, Executer System, System Manager, Tasks Receiver, Tasks Buffer, Chunk Creator, Chunks Buffer, Chunk Scheduler, Scheduled Chunks, Chunks Sender, Chunk Buffer, Chunk Breaker / Classifier, Special Tasks Handler, Concrete Executer, Task Executer, Results Buffer, Executer Results Buffer, Executer1, Executer2, Executer3]

Communication Design Diagram
• The system is built in multiple levels. Each level is designed to have its own intra-communication infrastructure, and the components communicate with one another using the external-communication infrastructure.

Why Bother?
• Several solutions already exist, such as:
  • Condor (complex syntax, one task per run, not developer-friendly)
  • MPI (networking understanding needed; executing and synchronizing tasks is the user's responsibility)
• Implementing a new solution:
  • User-friendly API (ease of use).
  • User transparency.
  • Dynamic system management.
  • Task generic.
  • Easy to convert user code from serial to parallel.
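The "serial to parallel" goal listed above, together with the idea of surfacing remote exceptions on the submitting machine, can be illustrated with a small local sketch. It is not the project's API: the names `word_count`, `run_serial` and `run_parallel` are hypothetical, and threads from Python's standard `concurrent.futures` module stand in for remote Executers.

```python
from concurrent.futures import ThreadPoolExecutor


def word_count(text: str) -> int:
    """Hypothetical Task Operation: count the words in a string."""
    if not isinstance(text, str):
        raise TypeError("text must be a string")
    return len(text.split())


def run_serial(inputs):
    """Serial version: one task after another on a single machine."""
    return [word_count(x) for x in inputs]


def run_parallel(inputs, workers=4):
    """Parallel version: the loop body is unchanged, only submission differs.

    Assumption: local threads stand in for remote Executers.
    """
    with ThreadPoolExecutor(max_workers=workers) as pool:
        futures = [pool.submit(word_count, x) for x in inputs]
        # Future.result() re-raises, in the caller, any exception that was
        # raised inside the worker. This mirrors the "simulate the remote
        # exception on the submitting machine" idea.
        return [f.result() for f in futures]
```

In this sketch the user's function is untouched when moving from `run_serial` to `run_parallel`, and a `TypeError` raised inside a worker reaches the caller as an ordinary `TypeError`.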
Performance Analysis
• Study on a sample application:
  • Distributed webpage downloading and parsing.
• Test results with the following workload:
  • A Task is downloading a web page given its URL.
  • A Result is the text extracted and parsed from the HTML of the web page.

[Main Architecture Diagram. Components shown: Client, Firewall, System Manager, Executer, UI. Labels include: Task, Result, Tasks Buffer, Results Buffer, Result Collector, Results Organizer, Organized Results, Collect, Send Task, Get Results]
• The Client schedules a Chunk of Tasks through the System Manager, which chooses the best Executer for it.
• The Client sends the Chunk to be executed to the chosen Executer.
• The Executer executes the Tasks and returns the Results to the Client.

Concepts
• Task Operation: the function or functions the user wants to execute remotely.
• Task: an invocation of the Task Operation with specific arguments.
• Result: the component that holds the Task Operation's result for a Task.
• System Manager: manages communication among system components, such as networking, auto-update, failure detection and so on.
• Client: the user's interface; it resides on the user's side and is responsible for starting and stopping the system, submitting Tasks and processing Results.
• Executer: the remote component that implements the Task Operation.

Conclusion
• The more Executers are added, the shorter the execution time.
• Total execution time is bounded by the longest task, which is close to optimal.
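The Task, Result, Executer and chunking concepts described above can be made concrete with a minimal in-process sketch. All class and function names here are assumptions chosen for illustration, not the project's actual code, and the scheduler is plain round-robin: the load and priority weighting mentioned under Load Balancing is left out.

```python
from dataclasses import dataclass
from itertools import islice
from typing import Any, Callable, Dict, Iterable, Iterator, List


@dataclass
class Task:
    """One invocation of a Task Operation with specific arguments."""
    task_id: int
    args: tuple


@dataclass
class Result:
    """Holds the Task Operation's result for one Task."""
    task_id: int
    value: Any


def make_chunks(tasks: Iterable[Task], chunk_size: int) -> Iterator[List[Task]]:
    """Group small Tasks into larger Chunks (the last one may be shorter)."""
    it = iter(tasks)
    while True:
        chunk = list(islice(it, chunk_size))
        if not chunk:
            return
        yield chunk


@dataclass
class Executer:
    """Stand-in for a remote machine that implements the Task Operation."""
    name: str
    operation: Callable[..., Any]
    executed: int = 0  # number of Tasks this Executer has run so far

    def run_chunk(self, chunk: List[Task]) -> List[Result]:
        # Break the Chunk back into Tasks and execute each one.
        self.executed += len(chunk)
        return [Result(t.task_id, self.operation(*t.args)) for t in chunk]


def distribute(tasks: Iterable[Task], executers: List[Executer],
               chunk_size: int) -> Dict[int, Any]:
    """Send Chunks to Executers in round-robin order and collect Results."""
    results: Dict[int, Any] = {}
    for i, chunk in enumerate(make_chunks(tasks, chunk_size)):
        executer = executers[i % len(executers)]  # plain round-robin
        for result in executer.run_chunk(chunk):
            results[result.task_id] = result.value  # map Result back by id
    return results
```

Keying the collected values by `task_id` reflects the point that execution order is not significant: Results can arrive in any order and still be matched to the Task that produced them.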