Linux Kernel 2.6.24.3 Processes Yeh Tsung Tai
Outline
• Processes, Lightweight Processes
  • Process and Thread
• Process Descriptor
  • Process State
  • Process ID
  • Process Data Structure
  • Process relationship
  • Process Resource Limit
• Creating Processes
• Process Switch
• Destroying Processes
[Figure labels: Process 0 at the start of the outline, Process 1 at the end]
Outline • Processes, Lightweight Processes • Process Descriptor • Creating Processes • Process Switch • Destroying Processes
What is a Process?
• The concept
  • An instance of a program in execution
  • The basic unit of execution in an operating system
  • Several processes may run the same program at once, each as a separate instance
• The purpose
  • Acts as the entity to which system resources are allocated
• The framework
  • The collection of data structures that describe how far the execution of the program has progressed
Process Concept
[Figure: layered diagram. The command interpreter, user applications, window system, and middleware appear as processes above the operating system, which sits on instruction execution and interrupt processing, I/O devices, and memory.]
Anatomy of a Process: the Process Control Block (PCB)
• Process state: running, waiting, halted, or ready
• Process number: the process ID used to identify the process
• Program counter: the address of the next instruction to be executed for this process
• CPU registers: hold the state information saved when an interrupt occurs
• Memory limits: the page and segment tables and the base and limit register values
• List of open files
The Linux Process Descriptor • A structure whose fields contain all the information related to a single process. • /include/linux/sched.h 821~1079
The Linux Process Descriptor
Linux 2.6.24 process structure (excerpt)
    struct task_struct {
        volatile long state;    /* -1 unrunnable, 0 runnable, >0 stopped */
        void *stack;
        atomic_t usage;
        unsigned int flags;     /* per process flags */
        /* ... */
        struct list_head run_list;
        /* ... */
        struct list_head tasks;
        /* ... */
        struct mm_struct *mm, *active_mm;
        /* ... */
        struct task_struct *real_parent;  /* real parent process (when being debugged) */
        struct task_struct *parent;       /* parent process */
        /* ... */
        /* CPU-specific state of this task */
        struct thread_struct thread;
        /* filesystem information */
        struct fs_struct *fs;
        /* open file information */
        struct files_struct *files;
        /* ... */
    };
/include/linux/sched.h 821~1079
Process descriptors handling
• Linux packs two different data structures into a single per-process memory area: the thread_info structure and the Kernel Mode process stack.
• This memory area is 8192 bytes long (2 page frames).
• esp is the CPU stack pointer register; it holds the address of the top of the stack.
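Because the 8 KB area holds both structures, the base of the area (where thread_info sits) can be recovered from any stack pointer inside it by clearing the low 13 bits, provided the area is aligned on an 8 KB boundary. The following is a user-space sketch of that arithmetic only; the struct, union, and function names are hypothetical stand-ins, not the kernel's definitions.

```c
#include <assert.h>
#include <stdint.h>

#define THREAD_SIZE 8192u   /* two 4 KB page frames, as stated on the slide */

/* Hypothetical, simplified stand-in for the kernel's thread_info. */
struct thread_info_sketch {
    void *task;   /* would point at the process descriptor */
    int   cpu;
};

/* One memory area shared by the descriptor (low addresses) and the
 * stack, which grows downward from the top of the area. */
union thread_union_sketch {
    struct thread_info_sketch info;
    unsigned char stack[THREAD_SIZE];
};

/* Clear the low 13 bits of a stack pointer to reach the base of the
 * area. This assumes the area is aligned on a THREAD_SIZE boundary. */
uintptr_t thread_area_base(uintptr_t sp)
{
    return sp & ~(uintptr_t)(THREAD_SIZE - 1);
}
```

For example, a stack pointer of 0xc0a01f3c maps to the base address 0xc0a00000.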
Introducing Threads
• A thread (or lightweight process) is a basic unit of CPU utilization; it consists of:
  • Program counter
  • Register set
  • Stack space
• A thread shares with the other threads of the same process its:
  • Code section
  • Data section
  • Operating-system resources
• A traditional or heavyweight process is equal to a task with one thread
[Figure: two processes, one containing a single thread and one containing several threads]
Introducing Threads
• A process defines the address space that may be shared by multiple threads.
• The Process Control Block (PCB) contains process-specific information
  • Owner, PID, heap pointer, priority, active thread, and pointers to thread information
• The Thread Control Block (TCB) contains thread-specific information
  • Stack pointer, PC, thread state, registers
[Figure: a process's address space (code, initialized data, heap, mapped segments, DLLs, and one stack per thread), with a TCB for each thread holding its PC, SP, state, and registers]
Thread vs. Processes
• Processes
  • A process has code/data/heap & other segments
  • There must be at least one thread in a process
  • There can be more than one thread in a process
  • Expensive creation
  • Expensive context switching
  • If a process dies, its resources are reclaimed & all threads die
• Threads
  • A thread has no data segment or heap
  • A thread can't live on its own; it must live within a process
  • Threads within a process share code/data/heap and I/O, but each has its own stack & registers
  • Inexpensive creation
  • Inexpensive context switching
  • If a thread dies, its stack is reclaimed
The Thread Structure
Linux 2.6.24 thread information
    struct thread_info {
        struct task_struct *task;           /* main task structure */
        struct exec_domain *exec_domain;    /* execution domain */
        unsigned long flags;                /* low level flags */
        unsigned long status;               /* thread-synchronous flags */
        __u32 cpu;                          /* current CPU */
        int preempt_count;                  /* 0 => preemptable, <0 => BUG */
        /* ... */
        mm_segment_t addr_limit;            /* thread address space:
                                               0-0xBFFFFFFF for user-thread
                                               0-0xFFFFFFFF for kernel-thread */
        /* ... */
        void *sysenter_return;
        struct restart_block restart_block;

        unsigned long previous_esp;         /* ESP of the previous stack in case
                                               of nested (IRQ) stacks */
        /* ... */
        __u8 supervisor_stack[0];
    };
/include/asm-i386/thread_info.h 27~47
Processes State
[Figure: process state diagram with the states Created, Running, Waiting, Blocked, and Terminated; waiting processes reside in main memory]
Processes State
• There are 9 states in the Linux kernel 2.6
• /include/linux/sched.h 167~177
    #define TASK_RUNNING            0
    #define TASK_INTERRUPTIBLE      1
    #define TASK_UNINTERRUPTIBLE    2
    #define TASK_STOPPED            4
    #define TASK_TRACED             8
    /* in tsk->exit_state */
    #define EXIT_ZOMBIE             16
    #define EXIT_DEAD               32
    /* in tsk->state again */
    #define TASK_NONINTERACTIVE     64
    #define TASK_DEAD               128
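Every state except TASK_RUNNING is a distinct power of two, so several states can be OR-ed into one mask and tested with a single AND. A minimal sketch of that idea follows; state_in_mask is a hypothetical helper written for illustration, not a kernel function.

```c
#include <assert.h>

/* Values copied from the slide (include/linux/sched.h, 2.6.24). */
#define TASK_RUNNING          0
#define TASK_INTERRUPTIBLE    1
#define TASK_UNINTERRUPTIBLE  2
#define TASK_STOPPED          4
#define TASK_TRACED           8

/* Hypothetical helper: nonzero if `state` is one of the states in `mask`.
 * TASK_RUNNING is 0, so it can never match a mask. */
int state_in_mask(long state, unsigned int mask)
{
    return (state & mask) != 0;
}
```

With the mask TASK_INTERRUPTIBLE | TASK_STOPPED | TASK_TRACED, a stopped task matches and an uninterruptible sleeper does not.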
Processes State
• TASK_RUNNING
  • The process is either executing on a CPU or waiting to be executed.
• TASK_INTERRUPTIBLE
  • The process is suspended (sleeping) until some condition becomes true. Delivering a signal is one of the conditions that can wake it up.
    void signal_wake_up(struct task_struct *t, int resume)
    {
        unsigned int mask;

        set_tsk_thread_flag(t, TIF_SIGPENDING);

        /* ... */
        mask = TASK_INTERRUPTIBLE;
        if (resume)
            mask |= TASK_STOPPED | TASK_TRACED;
        if (!wake_up_state(t, mask))
            kick_process(t);
    }
kernel/signal.c 453~471
Processes State
• TASK_UNINTERRUPTIBLE
  • Delivering a signal to the sleeping process leaves its state unchanged.
  • Used when a process must wait until a given event occurs without being interrupted.
    void fastcall __lock_page(struct page *page)
    {
        DEFINE_WAIT_BIT(wait, &page->flags, PG_locked);

        __wait_on_bit_lock(page_waitqueue(page), &wait, sync_page,
                           TASK_UNINTERRUPTIBLE);
    }
__lock_page: get a lock on the page, assuming we need to sleep to get it
/mm/filemap.c 568~574
Processes State
• TASK_STOPPED
  • Process execution has been stopped after the process received a SIGSTOP, SIGTSTP, SIGTTIN, or SIGTTOU signal.
• TASK_TRACED
  • Process execution has been stopped by a debugger; the process is being monitored by another process.
• EXIT_DEAD
    if (tsk->exit_signal == -1 && likely(!tsk->ptrace))
        state = EXIT_DEAD;
kernel/exit.c 833~834
Processes State
• EXIT_ZOMBIE
  • Process execution is terminated, but the parent process has not yet collected the child's termination status. The parent process must issue a wait()-like system call to retrieve information about the dead process.
[Figure: normal case over time. Parent A calls fork() to create child B and then blocks in wait(); B runs and calls exit(); A's wait() returns, and A continues until its own wait() or exit().]
Processes State
• EXIT_ZOMBIE
[Figure: zombie case over time. Parent A calls fork() to create child B and keeps working without calling wait(); B runs and calls exit(); B remains a zombie until A calls wait() or exit().]
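The two timelines can be reproduced from user space with fork() and waitpid(). The sketch below is an illustration with a hypothetical helper name, spawn_and_reap; between the child's _exit() and the parent's waitpid() the child is in the zombie state, and the waitpid() call is what lets the kernel release it.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Fork a child that exits with `code`, then reap it.
 * Returns the child's exit status, or -1 on error. */
int spawn_and_reap(int code)
{
    pid_t pid = fork();
    if (pid < 0)
        return -1;              /* fork failed */
    if (pid == 0)
        _exit(code);            /* child: terminate immediately */

    /* Parent: until this call returns, the terminated child is a zombie. */
    int status = 0;
    if (waitpid(pid, &status, 0) != pid)
        return -1;
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

A parent that never calls a wait()-like function leaves the child in the zombie state for as long as the parent itself keeps running.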
Process ID
• What is the Process ID?
  • The kernel's internal notion of a process identifier
  • A process ID can refer to individual tasks, process groups, and sessions
• How process IDs are stored
  • Process IDs live in a hash table
  • Hashing lets the kernel find a process quickly from its numeric PID value.
Identifying a Process
• PID recycling
  • When the kernel reaches the maximum PID, it must start recycling the lower unused PIDs.
• pidmap_array bitmap
  • Tracks which PIDs are in use and which are free
  • In 32-bit architectures, the bitmap is stored in a single page.
• tgid
  • All the threads of a multithreaded application share the same identifier (tgid).
  • tgid is the PID of the thread group leader.
[Figure: a thread group whose members each have their own pid and share one tgid, which is the value getpid() returns]
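The recycling rule can be sketched with a small bitmap allocator: search forward from the last PID handed out, and wrap around to the low PIDs when the maximum is reached. This is a simplified illustration, not the kernel's pidmap code; all names are hypothetical, the limit is deliberately tiny, and the sketch restarts at PID 1 after wrapping, which is an assumption made for brevity.

```c
#include <assert.h>
#include <limits.h>

#define SKETCH_PID_MAX 64   /* deliberately tiny; the slides give 0x8000 as the default */
#define BITS_PER_WORD  (sizeof(unsigned long) * CHAR_BIT)

struct pidmap_sketch {
    unsigned long bits[(SKETCH_PID_MAX + BITS_PER_WORD - 1) / BITS_PER_WORD];
    int last_pid;           /* most recently allocated PID */
};

static unsigned long *pid_word(struct pidmap_sketch *m, int pid)
{
    return &m->bits[(unsigned)pid / BITS_PER_WORD];
}

static unsigned long pid_bit(int pid)
{
    return 1UL << ((unsigned)pid % BITS_PER_WORD);
}

/* Returns a free PID in [1, SKETCH_PID_MAX), or -1 if all are in use. */
int alloc_pid_sketch(struct pidmap_sketch *m)
{
    int pid = m->last_pid;
    for (int tries = 0; tries < SKETCH_PID_MAX - 1; tries++) {
        pid++;
        if (pid >= SKETCH_PID_MAX)
            pid = 1;                        /* wrap: recycle low PIDs */
        if (!(*pid_word(m, pid) & pid_bit(pid))) {
            *pid_word(m, pid) |= pid_bit(pid);
            m->last_pid = pid;
            return pid;
        }
    }
    return -1;
}

/* Mark a PID as free again so that a later wrap-around can reuse it. */
void free_pid_sketch(struct pidmap_sketch *m, int pid)
{
    *pid_word(m, pid) &= ~pid_bit(pid);
}
```

Note that a freed PID is not reused immediately: allocation keeps moving forward and only returns to it after wrapping.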
Identifying a Process
• getpid() system call
  • It returns the tgid of the current process, so all the threads of a multithreaded application share the same identifier.
• Process ID (PID)
  • It is stored in the pid field of the process descriptor
  • In 32-bit architectures, the maximum PID number is 32767 (PID_MAX_DEFAULT - 1)
  • In 64-bit architectures, the maximum PID number can be raised up to 4194303.
    #define PID_MAX_DEFAULT (CONFIG_BASE_SMALL ? 0x1000 : 0x8000)
/include/linux/threads.h
Process ID
• Find a process in the hash table
  • Looks up PID number nr within the namespace ns in the PID hash table. The function returns a pointer to the matching struct pid if one is found, and NULL otherwise.
    struct pid * fastcall find_pid_ns(int nr, struct pid_namespace *ns)
    {
        struct hlist_node *elem;
        struct upid *pnr;

        hlist_for_each_entry_rcu(pnr, elem,
                &pid_hash[pid_hashfn(nr, ns)], pid_chain)
            if (pnr->nr == nr && pnr->ns == ns)
                return container_of(pnr, struct pid,
                        numbers[ns->level]);

        return NULL;
    }
/kernel/pid.c
Process ID
• Attach a process to a PID
  • Links the process descriptor pointed to by task to the given struct pid, placing it on that pid's task list for the given type.
    int fastcall attach_pid(struct task_struct *task, enum pid_type type,
                            struct pid *pid)
    {
        struct pid_link *link;

        link = &task->pids[type];
        link->pid = pid;
        hlist_add_head_rcu(&link->node, &pid->tasks[type]);

        return 0;
    }
/kernel/pid.c
Doubly linked list • The Linux kernel uses hundreds of doubly linked lists that store the various kernel data structures. • For each list, a set of primitive operations must be implemented. /include/linux/list.h
Doubly linked list
• List head structure
    struct list_head {
        struct list_head *next, *prev;
    };
• list_add(n,p)
  • Inserts the element pointed to by n right after the element pointed to by p
    static inline void __list_add(struct list_head *new,
                                  struct list_head *prev,
                                  struct list_head *next)
    {
        next->prev = new;
        new->next = next;
        new->prev = prev;
        prev->next = new;
    }
/include/linux/list.h
Doubly linked list
• list_del(p)
  • Deletes the element pointed to by p.
    static inline void __list_del(struct list_head *prev, struct list_head *next)
    {
        next->prev = prev;
        prev->next = next;
    }

    static inline void list_del(struct list_head *entry)
    {
        __list_del(entry->prev, entry->next);
        entry->next = LIST_POISON1;
        entry->prev = LIST_POISON2;
    }
/include/linux/list.h
    #define LIST_POISON1 ((void *) 0x00100100)
    #define LIST_POISON2 ((void *) 0x00200200)
/include/linux/poison.h
Doubly linked list
• list_replace(old, new)
  • Replaces the old entry with the new one
    static inline void list_replace(struct list_head *old,
                                    struct list_head *new)
    {
        new->next = old->next;
        new->next->prev = new;
        new->prev = old->prev;
        new->prev->next = new;
    }
• list_empty(p)
  • Checks whether the list specified by the address of its conventional first element is empty.
    static inline int list_empty(const struct list_head *head)
    {
        return head->next == head;
    }
/include/linux/list.h
Doubly linked list
• list_for_each(p,h)
  • Scans the elements of the list specified by the address h of the conventional first element.
    #define list_for_each(pos, head) \
        for (pos = (head)->next; prefetch(pos->next), pos != (head); \
             pos = pos->next)
/include/linux/list.h
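The list primitives are easiest to follow when used together. The sketch below reimplements them in user space under simplified assumptions: the function and macro names are hypothetical, struct task_sketch is an invented payload, and deleted entries are reset to NULL where the kernel stores poison values. It shows the intrusive pattern, in which the list node is embedded in the payload and the payload is recovered from the node address.

```c
#include <assert.h>
#include <stddef.h>

struct list_head { struct list_head *next, *prev; };

/* Recover the enclosing structure from a pointer to its embedded node. */
#define container_of_sketch(ptr, type, member) \
    ((type *)((char *)(ptr) - offsetof(type, member)))

/* An empty circular list: the head points at itself. */
void list_init(struct list_head *h) { h->next = h; h->prev = h; }

/* Insert `n` right after `p`, like list_add(n, p). */
void list_add_after(struct list_head *n, struct list_head *p)
{
    n->next = p->next;
    n->prev = p;
    p->next->prev = n;
    p->next = n;
}

/* Unlink `e` from its list, like list_del(e). */
void list_unlink(struct list_head *e)
{
    e->prev->next = e->next;
    e->next->prev = e->prev;
    e->next = e->prev = NULL;   /* the kernel writes poison values here */
}

/* Hypothetical payload with an embedded list node. */
struct task_sketch { int pid; struct list_head tasks; };

/* Walk the list as list_for_each does and add up the PIDs. */
int sum_pids(struct list_head *head)
{
    int sum = 0;
    for (struct list_head *pos = head->next; pos != head; pos = pos->next)
        sum += container_of_sketch(pos, struct task_sketch, tasks)->pid;
    return sum;
}
```

Because the head is an ordinary node, insertion and removal need no special case for an empty list.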
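Doubly linked list
• The Linux kernel 2.6 also provides hlist, a second kind of doubly linked list
  • It is mainly used for hash tables
  • It is a linear (non-circular) list, so the list head needs only one pointer rather than the two required by the circular list.
  • Using hlist can halve the memory consumption of the hash bucket array.
  • With a well-distributed hash function, chains stay short and an element can be found in expected constant time, O(1)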
Doubly linked list
• Hash table list head
    struct hlist_head {
        struct hlist_node *first;
    };
• Hash table list node
    struct hlist_node {
        struct hlist_node *next, **pprev;
    };

    static inline void INIT_HLIST_NODE(struct hlist_node *h)
    {
        h->next = NULL;
        h->pprev = NULL;
    }
/include/linux/list.h
Doubly linked list
• Hash table list delete
    static inline void __hlist_del(struct hlist_node *n)
    {
        struct hlist_node *next = n->next;
        struct hlist_node **pprev = n->pprev;
        *pprev = next;
        if (next)
            next->pprev = pprev;
    }

    static inline void hlist_del(struct hlist_node *n)
    {
        __hlist_del(n);
        n->next = LIST_POISON1;
        n->pprev = LIST_POISON2;
    }
/include/linux/list.h
Doubly linked list
• Hash table list add
    static inline void hlist_add_head(struct hlist_node *n, struct hlist_head *h)
    {
        struct hlist_node *first = h->first;
        n->next = first;
        if (first)
            first->pprev = &n->next;
        h->first = n;
        n->pprev = &h->first;
    }
• Hash table list add with RCU (read-copy-update)
    static inline void hlist_add_head_rcu(struct hlist_node *n,
                                          struct hlist_head *h)
    {
        struct hlist_node *first = h->first;
        n->next = first;
        n->pprev = &h->first;
        smp_wmb();
        if (first)
            first->pprev = &n->next;
        h->first = n;
    }
/include/linux/list.h
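The pprev field points at whichever pointer currently refers to the node, either the bucket's first field or the previous node's next field. That is why deletion needs no special case for the first element. The user-space sketch below combines the hlist operations with a small PID-keyed table; the table size, hash function, and all names ending in _sketch are hypothetical, and no RCU or locking is modeled.

```c
#include <assert.h>
#include <stddef.h>

struct hlist_node { struct hlist_node *next, **pprev; };
struct hlist_head { struct hlist_node *first; };

#define PID_HASH_BUCKETS 16     /* hypothetical table size */

/* Hypothetical entry: a PID number with an embedded chain node. */
struct pid_entry_sketch { int nr; struct hlist_node chain; };

/* Hypothetical hash function: the PID modulo the table size. */
unsigned int pid_hash_sketch(int nr)
{
    return (unsigned int)nr % PID_HASH_BUCKETS;
}

/* Insert `n` at the front of bucket `h`. */
void hlist_add_head_sketch(struct hlist_node *n, struct hlist_head *h)
{
    n->next = h->first;
    if (h->first)
        h->first->pprev = &n->next;
    h->first = n;
    n->pprev = &h->first;
}

/* Unlink `n`; *pprev is the bucket head or the previous node's next. */
void hlist_del_sketch(struct hlist_node *n)
{
    *n->pprev = n->next;
    if (n->next)
        n->next->pprev = n->pprev;
}

/* Walk one bucket's chain looking for the entry with number `nr`. */
struct pid_entry_sketch *find_pid_sketch(struct hlist_head *table, int nr)
{
    struct hlist_node *pos;
    for (pos = table[pid_hash_sketch(nr)].first; pos; pos = pos->next) {
        struct pid_entry_sketch *e = (struct pid_entry_sketch *)
            ((char *)pos - offsetof(struct pid_entry_sketch, chain));
        if (e->nr == nr)
            return e;
    }
    return NULL;
}
```

In this sketch PIDs 5 and 21 land in the same bucket, so the lookup has to compare the stored number rather than trust the bucket index.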
RCU (Read-Copy-Update)
• RCU is a synchronization mechanism that was added to the Linux kernel in the 2.6 series.
• RCU supports concurrency between a single updater and multiple readers.
• RCU ensures that old versions of the data are not freed until all pre-existing read-side critical sections have completed.
Relationships Among Processes
• Processes created by a program have a parent/child relationship.
• When a process creates multiple children, these children have sibling relationships.
    struct task_struct *real_parent;
    struct task_struct *parent;
    /* ... */
    struct list_head children;
    struct list_head sibling;
/include/linux/sched.h
Parenthood Relationships
[Figure: parenthood relationships among processes P0 to P4, linked through the parent, children.next, children.prev, sibling.next, and sibling.prev pointers]
Process Resource Limits
• Each process has an associated set of resource limits.
• Resource limits specify the amount of system resources the process can use.
• These limits keep a user from overwhelming the system.
• The resource limits for the current process are stored in the current->signal->rlim field
    struct rlimit {
        unsigned long rlim_cur;
        unsigned long rlim_max;
    };
/include/linux/resource.h
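From user space, the same pair of values is visible through the POSIX getrlimit() call: rlim_cur is the current (soft) limit and rlim_max is the ceiling (hard limit) up to which the soft limit may be raised. A minimal sketch, where read_stack_limit is a hypothetical wrapper name:

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <sys/resource.h>

/* Read the soft and hard stack size limits of the calling process.
 * Returns 0 on success, -1 on failure. */
int read_stack_limit(rlim_t *soft, rlim_t *hard)
{
    struct rlimit rl;
    if (getrlimit(RLIMIT_STACK, &rl) != 0)
        return -1;
    *soft = rl.rlim_cur;    /* limit currently enforced */
    *hard = rl.rlim_max;    /* ceiling for rlim_cur */
    return 0;
}
```

Either value may be RLIM_INFINITY, which means that no limit is imposed on the resource.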
Process Resource Limits
• Rlimit macros
/include/asm-generic/resource.h
    #define RLIMIT_CPU      0   /* CPU time in sec */
    #define RLIMIT_FSIZE    1   /* Maximum filesize */
    #define RLIMIT_DATA     2   /* max data size */
    #define RLIMIT_STACK    3   /* max stack size */
    #define RLIMIT_CORE     4   /* max core file size */
• RLIMIT_STACK example
    static inline unsigned long mmap_base(struct mm_struct *mm)
    {
        unsigned long gap = current->signal->rlim[RLIMIT_STACK].rlim_cur;
        unsigned long random_factor = 0;

        if (current->flags & PF_RANDOMIZE)
            random_factor = get_random_int() % (1024*1024);

        if (gap < MIN_GAP)
            gap = MIN_GAP;
        else if (gap > MAX_GAP)
            gap = MAX_GAP;

        return PAGE_ALIGN(TASK_SIZE - gap - random_factor);
    }
linux/arch/x86_64/ia32/mm/mmap.c
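The arithmetic in mmap_base() can be tried outside the kernel: the stack limit is clamped to the range [MIN_GAP, MAX_GAP], subtracted from the top of the user address space together with the random offset, and the result is rounded up to a page boundary. The constants below are assumptions chosen for illustration (a 3 GB user address space, 4 KB pages, a 128 MB minimum gap, and a maximum gap of five sixths of the address space); the real values are architecture-specific.

```c
#include <assert.h>

/* Assumed values for illustration only. */
#define SK_TASK_SIZE 0xC0000000UL               /* top of user address space */
#define SK_PAGE_SIZE 4096UL
#define SK_MIN_GAP   (128UL * 1024 * 1024)
#define SK_MAX_GAP   (SK_TASK_SIZE / 6 * 5)

/* Same steps as mmap_base() on the slide, with the stack limit and the
 * random offset passed in as plain parameters. */
unsigned long mmap_base_sketch(unsigned long stack_limit,
                               unsigned long random_factor)
{
    unsigned long gap = stack_limit;

    if (gap < SK_MIN_GAP)
        gap = SK_MIN_GAP;           /* always leave room for the stack */
    else if (gap > SK_MAX_GAP)
        gap = SK_MAX_GAP;           /* never give the stack everything */

    unsigned long addr = SK_TASK_SIZE - gap - random_factor;

    /* Round up to the next page boundary, as PAGE_ALIGN does. */
    return (addr + SK_PAGE_SIZE - 1) & ~(SK_PAGE_SIZE - 1);
}
```

Under these assumptions, an 8 MB stack limit is raised to the 128 MB minimum gap, which gives a base of 0xB8000000 when the random offset is zero.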
Process Switch
• The activity of suspending the execution of the process running on the CPU and resuming the execution of some other process that was previously suspended.
• Process switching occurs only in Kernel Mode
Hardware Context
• The set of data that must be loaded into the registers before the process resumes its execution
• Part of the hardware context is stored in the process descriptor; the rest is saved on the Kernel Mode stack
• Minimizing the time spent saving and loading the hardware context is important
Task State Segment
• A specific segment type used to store hardware context
• Linux sets up one TSS for each CPU
64-bit definition:
    struct tss_struct {
        u32 reserved1;
        u64 rsp0;
        u64 rsp1;
        u64 rsp2;
        u64 reserved2;
        u64 ist[7];
        u32 reserved3;
        u32 reserved4;
        u16 reserved5;
        u16 io_bitmap_base;
        /* ... */
        unsigned long io_bitmap[IO_BITMAP_LONGS + 1];
    } __attribute__((packed)) ____cacheline_aligned;
/include/asm-x86/processor_64.h
32-bit definition:
    struct tss_struct {
        struct i386_hw_tss x86_tss;
        /* ... */
        unsigned long io_bitmap[IO_BITMAP_LONGS + 1];
        /* ... */
        unsigned long io_bitmap_max;
        struct thread_struct *io_bitmap_owner;
        /* ... */
        unsigned long __cacheline_filler[35];
        /* ... */
        unsigned long stack[64];
    } __attribute__((packed));
/include/asm-x86/processor_32.h
Thread field
• At every process switch, the hardware context of the process being replaced is saved in the thread field of its process descriptor.
    struct thread_struct {
        /* ... */
        struct desc_struct tls_array[GDT_ENTRY_TLS_ENTRIES];
        unsigned long esp0;
        unsigned long sysenter_cs;
        unsigned long eip;
        unsigned long esp;
        unsigned long fs;
        unsigned long gs;
        /* ... */
        unsigned long debugreg[8];
        /* ... */
        unsigned long cr2, trap_no, error_code;
        /* ... */
        union i387_union i387;
        /* ... */
        struct vm86_struct __user *vm86_info;
        unsigned long screen_bitmap;
        unsigned long v86flags, v86mask, saved_esp0;
        unsigned int saved_fs, saved_gs;
        /* ... */
        unsigned long *io_bitmap_ptr;
        unsigned long iopl;
        /* ... */
        unsigned long io_bitmap_max;
    };
Performing the process switch
• Every switch consists of two steps
  • Switching the Page Global Directory to install a new address space
  • Switching the Kernel Mode stack and the hardware context
[Figure: process A calls switch_to(A, B, A) with prev=A and next=B on its stack. Later, process C calls switch_to(C, A, C) with prev=C and next=A. The eax register carries the address of C across the switch, so when A resumes, last refers to C.]
switch_to macro
    #define switch_to(prev,next,last) do {                                  \
        unsigned long esi,edi;                                              \
        asm volatile("pushfl\n\t"               /* Save flags */            \
                     "pushl %%ebp\n\t"                                      \
                     "movl %%esp,%0\n\t"        /* save ESP */              \
                     "movl %5,%%esp\n\t"        /* restore ESP */           \
                     "movl $1f,%1\n\t"          /* save EIP */              \
                     "pushl %6\n\t"             /* restore EIP */           \
                     "jmp __switch_to\n"                                    \
                     "1:\t"                                                 \
                     "popl %%ebp\n\t"                                       \
                     "popfl"                                                \
                     :"=m" (prev->thread.esp),"=m" (prev->thread.eip),      \
                      "=a" (last),"=S" (esi),"=D" (edi)                     \
                     :"m" (next->thread.esp),"m" (next->thread.eip),        \
                      "2" (prev), "d" (next));                              \
    } while (0)
include/asm-x86/system_32.h
The __switch_to Function
• The __switch_to() function does the bulk of the process switch started by the switch_to() macro.
    struct task_struct fastcall * __switch_to(struct task_struct *prev_p,
                                              struct task_struct *next_p)
    {
        struct thread_struct *prev = &prev_p->thread,
                             *next = &next_p->thread;
        int cpu = smp_processor_id();
        struct tss_struct *tss = &per_cpu(init_tss, cpu);
        /* ... */
        __unlazy_fpu(prev_p);
        /* ... */
        if (next_p->fpu_counter > 5)
            prefetch(&next->i387.fxsave);
        /* ... */
        load_esp0(tss, next);
        /* ... */
        savesegment(gs, prev->gs);
        /* ... */
        load_TLS(next, cpu);
        /* ... */
        if (get_kernel_rpl() && unlikely(prev->iopl != next->iopl))
            set_iopl_mask(next->iopl);
        /* ... */
        arch_leave_lazy_cpu_mode();
        /* ... */
        x86_write_percpu(current_task, next_p);

        return prev_p;
    }
arch/x86/kernel/process_32.c