
Processes and Threads


whitley

Presentation Transcript


  1. Processes and Threads • Processes and their scheduling • Multiprocessor scheduling • Threads • Distributed Scheduling/migration CS677: Distributed OS

  2. Processes: Review • Multiprogramming versus multiprocessing • Kernel data structure: process control block (PCB) • Each process has an address space • Contains code, global and local variables • Process state transitions • Uniprocessor scheduling algorithms • Round-robin, shortest job first, FIFO, lottery scheduling, earliest deadline first (EDF) • Performance metrics: throughput, CPU utilization, turnaround time, response time, fairness
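
The turnaround-time metric listed above can be illustrated with a small sketch. It assumes all jobs arrive at time 0 and run to completion without preemption; the function name `mean_turnaround` and the burst lengths are made up for illustration.

```python
def mean_turnaround(bursts):
    """Mean turnaround time when jobs run to completion in the given order.

    Assumes every job arrives at time 0, so turnaround equals completion time.
    """
    clock = 0
    total = 0
    for burst in bursts:
        clock += burst      # this job finishes at the current clock value
        total += clock
    return total / len(bursts)


bursts = [8, 4, 1]                       # hypothetical CPU bursts, arrival order
fifo = mean_turnaround(bursts)           # FIFO: run in arrival order
sjf = mean_turnaround(sorted(bursts))    # SJF: run shortest job first
```

Under these assumptions, running the shortest jobs first gives a lower mean turnaround than FIFO for the same set of jobs.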

  3. Process Scheduling • Priority queues: multiple queues, each with a different priority • Use strict priority scheduling • Example: page swapper, kernel tasks, real-time tasks, user tasks • Multi-level feedback queue • Multiple queues with priority • Processes dynamically move from one queue to another • Depending on priority/CPU characteristics • Gives higher priority to I/O bound or interactive tasks • Lower priority to CPU bound tasks • Round robin at each level
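
A minimal sketch of a multi-level feedback queue follows. The slide does not give the rule for moving processes between queues, so this sketch assumes a common one: a job that uses its whole quantum is demoted one level. The quanta, the job names, and the simplification that all jobs arrive at time 0 and never block for I/O are also assumptions.

```python
from collections import deque


def mlfq_schedule(bursts, quanta=(2, 4, 8)):
    """Simulate a multi-level feedback queue and return the finish order.

    bursts: dict mapping job name -> CPU time needed.
    quanta: time slice per level; level 0 has the highest priority.
    """
    queues = [deque() for _ in quanta]
    remaining = dict(bursts)
    for name in bursts:
        queues[0].append(name)           # every job starts at the top level
    finished = []
    while any(queues):
        # Strict priority between levels: pick the first non-empty queue.
        level = next(i for i, q in enumerate(queues) if q)
        name = queues[level].popleft()   # round robin within the level
        run = min(quanta[level], remaining[name])
        remaining[name] -= run
        if remaining[name] == 0:
            finished.append(name)
        else:
            # Assumed rule: quantum exhausted, so demote (bottom level stays put).
            queues[min(level + 1, len(queues) - 1)].append(name)
    return finished


order = mlfq_schedule({"interactive": 1, "batch": 20, "medium": 5})
```

Short jobs finish while still in the high-priority queues, and the CPU-bound job sinks to the lowest level, which matches the intent of favoring interactive tasks.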

  4. Processes and Threads • Traditional process • One thread of control through a large, potentially sparse address space • Address space may be shared with other processes (shared memory) • Collection of system resources (files, semaphores) • Thread (lightweight process) • A flow of control through an address space • Each address space can have multiple concurrent control flows • Each thread has access to entire address space • Potentially parallel execution, minimal state (low overheads) • May need synchronization to control access to shared variables
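
The need for synchronization on shared variables can be sketched with Python's standard `threading` module. The function `run_counter` and its parameters are illustrative names, not part of the slides.

```python
import threading


def run_counter(num_threads=4, increments=10_000):
    """Have several threads increment one shared counter under a lock."""
    counter = {"value": 0}          # lives in the address space all threads share
    lock = threading.Lock()

    def worker():
        for _ in range(increments):
            with lock:              # critical section: one thread at a time
                counter["value"] += 1

    threads = [threading.Thread(target=worker) for _ in range(num_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return counter["value"]


total = run_counter()
```

Each thread has its own stack and program counter, but all of them see the same `counter` object; the lock makes the read-modify-write update safe.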

  5. Threads • Each thread has its own stack, PC, registers • Share address space, files,…

  6. Why use Threads? • Large multiprocessors need many computing entities (one per CPU) • Switching between processes incurs high overhead • With threads, an application can avoid per-process overheads • Thread creation, deletion, switching cheaper than processes • Threads have full access to address space (easy sharing) • Threads can execute in parallel on multiprocessors

  7. Why Threads? • Single threaded process: blocking system calls, no parallelism • Finite-state machine [event-based]: non-blocking with parallelism • Multi-threaded process: blocking system calls with parallelism • Threads retain the idea of sequential processes with blocking system calls, and yet achieve parallelism • Software engineering perspective • Applications are easier to structure as a collection of threads • Each thread performs several [mostly independent] tasks

  8. Multi-threaded Clients Example: Web Browsers • Browsers such as IE are multi-threaded • Such browsers can display data before the entire document is downloaded: they perform multiple tasks simultaneously • Fetch main HTML page, activate separate threads for other parts • Each thread sets up a separate connection with the server • Uses blocking calls • Each part (gif image) fetched separately and in parallel • Advantage: connections can be set up to different sources • Ad server, image server, web server…

  9. Multi-threaded Server Example • Apache web server: pool of pre-spawned worker threads • Dispatcher thread waits for requests • For each request, choose an idle worker thread • Worker thread uses blocking system calls to service web request
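
The dispatcher/worker structure can be sketched with Python's `queue` and `threading` modules. This is not how Apache is implemented; it is a simplified illustration in which the dispatcher places requests on a shared queue and whichever worker is idle picks up the next one. The function `serve`, the request strings, and the fake response format are assumptions.

```python
import queue
import threading


def serve(requests, num_workers=3):
    """Dispatch requests to a pool of pre-spawned worker threads."""
    tasks = queue.Queue()
    results = {}
    results_lock = threading.Lock()

    def worker():
        while True:
            req = tasks.get()            # blocking call: idle workers wait here
            if req is None:              # sentinel: shut this worker down
                break
            response = f"200 OK {req}"   # stand-in for servicing the request
            with results_lock:
                results[req] = response

    workers = [threading.Thread(target=worker) for _ in range(num_workers)]
    for w in workers:
        w.start()
    for req in requests:                 # the dispatcher role
        tasks.put(req)
    for _ in workers:                    # one sentinel per worker
        tasks.put(None)
    for w in workers:
        w.join()
    return results


responses = serve(["/index.html", "/a.gif", "/b.gif"])
```

Each worker is written as simple sequential code with blocking calls, while the pool as a whole handles several requests concurrently.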

  10. Thread Management • Creation and deletion of threads • Static versus dynamic • Critical sections • Synchronization primitives: blocking, spin-lock (busy-wait) • Condition variables • Global thread variables • Kernel versus user-level threads
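
As one example of the primitives listed above, a condition variable can guard a bounded buffer. The sketch below uses `threading.Condition` from the Python standard library; the class `BoundedBuffer` and the helper `transfer` are illustrative names.

```python
import threading
from collections import deque


class BoundedBuffer:
    """Fixed-capacity FIFO buffer protected by a condition variable."""

    def __init__(self, capacity):
        self.items = deque()
        self.capacity = capacity
        self.cond = threading.Condition()

    def put(self, item):
        with self.cond:
            while len(self.items) >= self.capacity:
                self.cond.wait()         # block (no busy-wait) until space frees up
            self.items.append(item)
            self.cond.notify_all()

    def get(self):
        with self.cond:
            while not self.items:
                self.cond.wait()         # block until an item is available
            item = self.items.popleft()
            self.cond.notify_all()
            return item


def transfer(values, capacity=2):
    """Move values from one producer thread to one consumer thread."""
    buf = BoundedBuffer(capacity)
    received = []

    def producer():
        for v in values:
            buf.put(v)

    def consumer():
        for _ in values:
            received.append(buf.get())

    threads = [threading.Thread(target=producer), threading.Thread(target=consumer)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return received
```

The `while` loops around `wait()` re-check the condition after every wakeup, which is the usual way to use a condition variable.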

  11. User-level versus kernel threads • Key issues: • Cost of thread management • More efficient in user space • Ease of scheduling • Flexibility: many parallel programming models and schedulers • Process blocking – a potential problem

  12. User-level Threads • Threads managed by a threads library • Kernel is unaware of presence of threads • Advantages: • No kernel modifications needed to support threads • Efficient: creation/deletion/switches don’t need system calls • Flexibility in scheduling: library can use different scheduling algorithms, can be application dependent • Disadvantages: • Need to avoid blocking system calls [all threads block] • Threads compete with one another • Does not take advantage of multiprocessors [no real parallelism]

  13. User-level threads

  14. Kernel-level threads • Kernel aware of the presence of threads • Better scheduling decisions, more expensive • Better for multiprocessors, more overheads for uniprocessors

  15. Process Migration • Transfer of a sufficient amount of the state of a process from one machine to another • The process then executes on the target machine

  16. Motivation • Load sharing • Move processes from heavily loaded to lightly loaded systems • Load can be balanced to improve overall performance • Communications performance • Processes that interact intensively can be moved to the same node to reduce communications cost • May be better to move the process to where the data reside when the data are large

  17. Motivation • Availability • A long-running process may need to move because the machine it is running on is about to go down • Utilizing special capabilities • Process can take advantage of unique hardware or software capabilities

  18. Initiation of Migration • Operating system • When goal is load balancing • Process • When goal is to reach a particular resource

  19. What is Migrated? • Must destroy the process on the source system and create it on the target system • Process control block and any links must be moved

  20. What is Migrated? • Eager (all): Transfer entire address space • No trace of process is left behind • If the address space is large and the process does not need most of it, then this approach may be unnecessarily expensive

  21. What is Migrated? • Precopy: Process continues to execute on the source node while the address space is copied • Pages modified on the source during precopy operation have to be copied a second time • Reduces the time that a process is frozen and cannot execute during migration
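
The effect of precopy on freeze time can be sketched with a toy model. The slide gives no numbers, so everything here is an assumption: pages dirtied during a round are modeled as a fixed percentage of the pages sent in that round, and copying stops once the remaining dirty set is at or below a threshold. The function name `precopy_rounds` and its parameters are hypothetical.

```python
def precopy_rounds(total_pages, dirty_percent, stop_threshold, max_rounds=10):
    """Toy model of precopy migration.

    Returns (pages sent in each live round, pages left to copy while frozen).
    """
    to_send = total_pages
    sent_per_round = []
    for _ in range(max_rounds):
        if to_send <= stop_threshold:
            break                        # small enough: freeze and finish
        sent_per_round.append(to_send)   # copied while the process keeps running
        # Assumed model: a fixed share of the pages just sent were modified
        # on the source meanwhile and must be copied a second time.
        to_send = to_send * dirty_percent // 100
    return sent_per_round, to_send


rounds, frozen_pages = precopy_rounds(1000, dirty_percent=10, stop_threshold=20)
```

In this model the process is frozen only while the last small batch of dirty pages is copied, at the cost of sending some pages more than once.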

  22. What is Migrated? • Eager (dirty): Transfer only that portion of the address space that is in main memory and has been modified • Any additional blocks of the virtual address space are transferred on demand • The source machine is involved throughout the life of the process

  23. What is Migrated? • Copy-on-reference: Pages are only brought over on reference • Variation of eager (dirty) • Has lowest initial cost of process migration

  24. What is Migrated? • Flushing: Pages are cleared from main memory by flushing dirty pages to disk • Relieves the source of holding any pages of the migrated process in main memory

  25. Negotiation of Migration • Migration policy is responsibility of Starter utility • Starter utility is also responsible for long-term scheduling and memory allocation • Decision to migrate must be reached jointly by two Starter processes (one on the source and one on the destination)

  26. Eviction • The system may evict a process that has been migrated to it • If a workstation is idle, processes may have been migrated to it • Once the workstation is active, it may be necessary to evict the migrated processes to provide adequate response time

  27. Distributed Scheduling: Motivation • Distributed system with N workstations • Model each w/s as identical, independent M/M/1 systems • Utilization u, P(system idle)=1-u • What is the probability that at least one system is idle and one job is waiting?
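
The question above can be worked out under the stated M/M/1 model. A sketch, assuming "a job is waiting" means a node holds at least two jobs (one in service, one queued): for M/M/1, P(k jobs) = (1-u)u^k, so a node is idle with probability 1-u and has a waiting job with probability u^2. Inclusion-exclusion over the N independent nodes then gives P = 1 - u^N - (1-u^2)^N + (u(1-u))^N. The function name is illustrative.

```python
def p_idle_and_waiting(n, u):
    """P(at least one node idle AND at least one node has a waiting job).

    n: number of identical, independent M/M/1 nodes; u: utilization, 0 <= u < 1.
    """
    no_idle = u ** n                  # every node has at least one job
    no_waiting = (1 - u * u) ** n     # no node has two or more jobs
    neither = (u * (1 - u)) ** n      # every node has exactly one job
    return 1 - no_idle - no_waiting + neither


for u in (0.05, 0.5, 0.99):
    print(u, p_idle_and_waiting(20, u))
```

For N = 20 the probability is close to 1 at moderate utilization and much smaller at very low or very high utilization.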

  28. Implications • Probability high for moderate system utilization • Potential for performance improvement via load distribution • High utilization => little benefit • Low utilization => a job is rarely waiting • Distributed scheduling (aka load balancing) potentially useful • What is the performance metric? • Mean response time • What is the measure of load? • Must be easy to measure • Must reflect performance improvement

  29. Components • Transfer policy: when to transfer a process? • Threshold-based policies are common and easy • Selection policy: which process to transfer? • Prefer new processes • Transfer cost should be small compared to execution cost • Select processes with long execution times • Location policy: where to transfer the process? • Polling, random, nearest neighbor • Information policy: when and from where? • Demand driven [only if sender/receiver], time-driven [periodic], state-change-driven [send update if load changes]

  30. Sender-initiated Policy • Transfer policy • Selection policy: newly arrived process • Location policy: three variations • Random: may generate lots of transfers => limit max transfers • Threshold: probe n nodes sequentially • Transfer to first node below threshold, if none, keep job • Shortest: poll Np nodes in parallel • Choose least loaded node below T
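
The threshold and shortest location policies can be sketched as follows. The slide leaves the transfer policy blank, so the sketch assumes a threshold rule (only try to transfer when the local load exceeds T). Load is taken to be a queue length, and all function and parameter names are hypothetical.

```python
def sender_threshold_place(local_load, node_loads, probe_order, T, probe_limit):
    """Threshold policy: probe nodes one by one, pick the first below T.

    Returns the chosen node, or None to keep the job locally.
    """
    if local_load <= T:                  # assumed transfer policy
        return None
    for node in probe_order[:probe_limit]:
        if node_loads[node] < T:
            return node
    return None                          # no suitable node found: keep the job


def sender_shortest_place(node_loads, polled, T):
    """Shortest policy: among the polled nodes below T, pick the least loaded."""
    candidates = [n for n in polled if node_loads[n] < T]
    if not candidates:
        return None
    return min(candidates, key=lambda n: node_loads[n])


loads = {"a": 5, "b": 1, "c": 0}         # made-up queue lengths
first_fit = sender_threshold_place(4, loads, ["a", "b", "c"], T=2, probe_limit=2)
best_fit = sender_shortest_place(loads, ["a", "b", "c"], T=2)
```

Threshold stops at the first acceptable node and so sends fewer probes; shortest pays for polling every candidate in exchange for a better placement.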

  31. Receiver-initiated Policy • Transfer policy: If departing process causes load < T, find a process from elsewhere • Selection policy: newly arrived or partially executed process • Location policy: • Threshold: probe up to Np other nodes sequentially • Transfer from first one above threshold, if none, do nothing • Shortest: poll n nodes in parallel, choose node with heaviest load above T
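
The receiver-initiated location policies mirror the sender-initiated ones and can be sketched the same way. Load is again assumed to be a queue length, and the function and parameter names are hypothetical.

```python
def receiver_threshold_source(local_load, node_loads, probe_order, T, probe_limit):
    """Threshold policy: probe nodes one by one, take work from the first above T.

    Returns the node to pull a process from, or None to do nothing.
    """
    if local_load >= T:                  # transfer policy: act only when load < T
        return None
    for node in probe_order[:probe_limit]:
        if node_loads[node] > T:
            return node
    return None


def receiver_shortest_source(node_loads, polled, T):
    """Shortest policy: among the polled nodes above T, pick the heaviest."""
    candidates = [n for n in polled if node_loads[n] > T]
    if not candidates:
        return None
    return max(candidates, key=lambda n: node_loads[n])


loads = {"a": 5, "b": 1, "c": 7}         # made-up queue lengths
first_found = receiver_threshold_source(0, loads, ["b", "a", "c"], T=2, probe_limit=2)
heaviest = receiver_shortest_source(loads, ["a", "b", "c"], T=2)
```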

  32. Symmetric Policies • Nodes act as both senders and receivers: combine previous two policies without change • Use average load as threshold • Improved symmetric policy: exploit polling information • Two thresholds: LT, UT, LT <= UT • Maintain sender, receiver and OK nodes using polling info • Sender: poll first node on receiver list … • Receiver: poll first node on sender list …
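
The bookkeeping for the improved symmetric policy can be sketched as below. The slide does not say how the thresholds map to the three lists; the sketch assumes a node is a receiver when its load is below LT, a sender when it is above UT, and OK otherwise. The steps the slide elides with "…" are not filled in, and all names are hypothetical.

```python
def classify(load, lt, ut):
    """Assumed rule: below LT -> receiver, above UT -> sender, otherwise ok."""
    if lt > ut:
        raise ValueError("expected LT <= UT")
    if load < lt:
        return "receiver"
    if load > ut:
        return "sender"
    return "ok"


def build_lists(polled_loads, lt, ut):
    """Sort polled nodes into sender, receiver and ok lists."""
    lists = {"sender": [], "receiver": [], "ok": []}
    for node, load in polled_loads.items():
        lists[classify(load, lt, ut)].append(node)
    return lists


def next_poll_target(local_load, lists, lt, ut):
    """A sender polls the first known receiver; a receiver the first known sender."""
    role = classify(local_load, lt, ut)
    if role == "sender" and lists["receiver"]:
        return lists["receiver"][0]
    if role == "receiver" and lists["sender"]:
        return lists["sender"][0]
    return None


lists = build_lists({"a": 0, "b": 3, "c": 9}, lt=2, ut=6)
target = next_poll_target(8, lists, lt=2, ut=6)
```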
