
Managing Program Complexity Through Modularization in UNIX Systems

Learn strategies to reduce program complexity in UNIX systems by breaking functionality into modules and limiting module interactions. Explore stack management, recursion, and enforced modularity techniques. Dive into system interactivity and communication via file interfaces in the UNIX philosophy.

dariusd

Presentation Transcript


  1. UNIX! Landon Cox September 3, 2012

  2. Dealing with complexity • How do you reduce the complexity of large programs? • Break functionality into modules • Goal is to “decouple” unrelated functions • Narrow the set of interactions between modules • Hope to make whole system easier to reason about • How do we specify interactions between code modules? • Procedure calls (or objects = data + procedure calls) • int foo(char *buf) • Procedure calls reduce complexity by • Limiting how modules can interact with one another • Hiding implementation details

  3. Dealing with complexity • Monolithic version: int main () { cout << "input: "; cin >> input; output = sqrt (input); output = pow (output, 3); cout << output << endl; } • Modular version: int main () { getInput (); computeResult (); printOutput (); } void getInput () { cout << "input: "; cin >> input; } void computeResult () { output = sqrt (input); output = pow (output, 3); } void printOutput () { cout << output << endl; }

  4. int P(int a) {…} void C(int x) { int y = P(x); } How do C and P share information? • Via a shared, in-memory stack

  5. int P(int a) {…} void C(int x) { int y = P(x); } What info is stored on the stack? • C’s registers, call arguments, RA (return address) • P’s local vars

  6. Review of the stack • Each stack frame contains a function’s • Local variables • Parameters • Return address • Saved values of calling function’s registers • The stack enables recursion
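
The point that each call gets its own frame can be observed from C. The sketch below is illustrative only: `record_frames` is a hypothetical helper (not from the slides) that records the address of a local variable at every recursion level. It makes no assumption about which direction the stack grows, only that live frames do not share storage.

```c
#include <stdint.h>

/* Recurses until max_depth levels exist and records where each level's
   local variable lives. Returns the number of frames visited. */
int record_frames(int depth, int max_depth, uintptr_t *addrs) {
    volatile int local = depth;              /* stored in this call's frame */
    addrs[depth] = (uintptr_t)&local;
    int visited = 1;
    if (depth + 1 < max_depth)
        visited += record_frames(depth + 1, max_depth, addrs);
    /* local is read after the recursive call, so this frame must stay
       alive while the deeper frames exist. The difference is always 0. */
    return visited + (local - depth);
}
```

Calling `record_frames(0, 4, addrs)` fills `addrs` with four distinct addresses, one per active frame.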

  7. Code: void C () { A (0); } void B () { C (); } void A (int tmp) { if (tmp) B (); } int main () { A (1); return 0; } Memory runs from 0x0 to 0xfffffff; return addresses in the code are 0x8048347, 0x8048354, 0x8048361, 0x804838c. Stack frames, oldest first (SP moves to each new frame): main [const1=1, const2=0] • A [tmp=1, RA=0x804838c] • B [RA=0x8048361] • C [const=0, RA=0x8048354] • A [tmp=0, RA=0x8048347]

  8. Code: void A (int bnd) { if (bnd) A (bnd-1); } int main () { A (3); return 0; } Memory runs from 0x0 to 0xfffffff. Stack frames, oldest first (SP moves to each new frame): main [const1=3, const2=0] • A [bnd=3, RA=0x804838c] • A [bnd=2, RA=0x8048361] • A [bnd=1, RA=0x8048361] • A [bnd=0, RA=0x8048361] How can recursion go wrong? • Can overflow the stack • Keep adding frame after frame

  9. Code: void cap (char* b) { for (int i=0; b[i]!='\0'; i++) b[i]+=32; } int main (char* arg) { char wrd[4]; strcpy (wrd, arg); cap (wrd); return 0; } Memory runs from 0x0 to 0xfffffff; wrd lives at 0x00234. Stack frames, oldest first: main [wrd[3], wrd[2], wrd[1], wrd[0], const2=0] • cap [b=0x00234, RA=0x804838c] What can go wrong? • Can overflow wrd variable • Overwrite cap’s RA
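
One way to avoid overflowing a fixed-size buffer like `wrd` is to check the input length before copying. The following is a minimal sketch, not the slide author's fix; `copy_checked` is a hypothetical helper name.

```c
#include <stddef.h>
#include <string.h>

/* Copies src into dst only if it fits, including the terminating '\0'.
   Returns 0 on success, -1 if src is too long (dst is left untouched). */
int copy_checked(char *dst, size_t dst_size, const char *src) {
    size_t len = strlen(src);
    if (len >= dst_size)
        return -1;                 /* would write past the end of dst */
    memcpy(dst, src, len + 1);     /* len bytes plus the terminator */
    return 0;
}
```

With `char wrd[4]`, a three-character argument is accepted and anything longer is rejected instead of overwriting neighboring stack contents.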

  10. int P(int a) {…} void C(int x) { int y = P(x); } Can think of this as a contract • P agrees to return • P agrees to resume where C left off • P agrees to restore the stack pointer • P agrees to leave rest of stack alone

  11. int P(int a) {…} void C(int x) { int y = P(x); } Is the call contract enforced? • At a low level, NO! • P can violate all terms of the contract • Sources of violations: attacks + bugs

  12. int P(int a) {…} void C(int x) { int y = P(x); } Enforcing the contract is feasible • Interaction is purely mechanical • Programmer’s intention is clear • No semantic gap to cross

  13. int P(int a) {…} void C(int x) { int y = P(x); } How does Java enforce the call contract? • Language restricts expressiveness • Programmers can’t access the stack • Special “invoke” instruction expresses intent • JVM trusted to transfer control between C, P

  14. int P(int a) {…} void C(int x) { int y = P(x); } Awesome, so why not run only Java programs? • Lower-level languages are faster • (trusted JVM interposes on every instruction) • Restricts programmer’s choice • (maybe I hate programming in Java)

  15. int P(int a) {…} void C(int x) { int y = P(x); } Another approach to enforced modularity • Put C and P in separate processes • Code is fast when processes not interacting • Trust kernel to handle control transfers • Kernel ensures transitions are correct

  16. int P(int a) {…} void C(int x) { int y = P(x); } Key question: What should the interface be? • Put C and P in separate processes • Want a general interface for inter-process communication (IPC) • Should be simple and powerful (i.e., elegant)

  17. UNIX philosophy • OS by programmers for programmers • Support high-level languages (C and scripting) • Make interactivity a first-order concern (via shell) • Allow rapid prototyping • How should you program for a UNIX system? • Write programs with limited features • Do one thing and do it well • Support easy composition of programs • Make data easy to understand • Store data in plaintext (not binary formats) • Communicate via text streams (Photo caption: Thompson and Ritchie, Turing Award ’83)

  18. UNIX philosophy (Diagram: Process C and Process P connected through “?” in the Kernel) What is the core abstraction? • Communication via files

  19. UNIX philosophy (Diagram: Process C and Process P connected through a File in the Kernel) What is the interface? • Open: get a file reference (descriptor) • Read/Write: get/put data • Close: stop communicating
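
The four operations map directly onto the POSIX calls `open`, `read`, `write`, and `close`. Below is a small sketch of that interface in use; `write_then_read` is a hypothetical helper and the file path is whatever the caller supplies.

```c
#include <fcntl.h>
#include <sys/types.h>
#include <unistd.h>

/* Writes len bytes of msg to the file at path, then reads them back
   into out. Returns the number of bytes read, or -1 on error. */
ssize_t write_then_read(const char *path, const char *msg, size_t len,
                        char *out, size_t out_size) {
    int fd = open(path, O_WRONLY | O_CREAT | O_TRUNC, 0600); /* open: get a descriptor */
    if (fd < 0)
        return -1;
    ssize_t written = write(fd, msg, len);                   /* write: put data */
    close(fd);                                               /* close: stop communicating */
    if (written != (ssize_t)len)
        return -1;

    fd = open(path, O_RDONLY);
    if (fd < 0)
        return -1;
    ssize_t got = read(fd, out, out_size);                   /* read: get data */
    close(fd);
    return got;
}
```

Every step goes through a descriptor handed out by the kernel; the process never touches the file's storage directly.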

  20. UNIX philosophy (Diagram: Process C and Process P connected through a File in the Kernel) Why is this safer than procedure calls? • Interface is narrower • Access file in a few well-defined ways • Kernel ensures things run smoothly

  21. UNIX philosophy (Diagram: Process C and Process P connected through a File in the Kernel) How do we transfer control to kernel? • Special system call instruction! • CPU pauses process, runs kernel • Kind of like Java’s invoke instruction

  22. UNIX philosophy (Diagram: Process C and Process P connected through a File in the Kernel) Key insight: • Interface can be used for lots of things • Persistent storage (i.e., “real” files) • Devices, temporary channels (i.e., pipes)

  23. UNIX philosophy (Diagram: Process C and Process P connected through a File in the Kernel) Two questions • How do processes start running? • How do we control access to files?

  24. Course administration • Heap manager project • Due a week from Friday • Sorry, but I can’t help you … • Questions for Vamsi? • Piazza • Should have received account info • Email Jeff if not • Other questions?

  25. UNIX philosophy (Diagram: Process C and Process P connected through a File in the Kernel) Two questions • How do processes start running? • How do we control access to files?

  26. UNIX philosophy (Diagram: Process C and Process P connected through a File in the Kernel) Two questions • How do processes start running?

  27. UNIX philosophy (Diagram: Process C and Process P connected through a File in the Kernel) Maybe P is already running? • Could just rely on kernel to start processes

  28. UNIX philosophy (Diagram: Process C and Process P connected through a File in the Kernel) What might we call such a process? • Basically what a server is • A process C wants to talk to that someone else launched

  29. UNIX philosophy (Diagram: Process C and Process P connected through a File in the Kernel) Not all processes should be servers • Want to launch processes on demand • C needs primitives to create P

  30. UNIX shell (Diagram: Shell running on top of the Kernel) Program that runs other programs • Interactive (accepts user commands) • Essentially just a line interpreter • Allows easy composition of programs

  31. UNIX shell • How does a UNIX process interact with a user? • Via standard in (fd 0) and standard out (fd 1) • These are the default input and output for a program • Establishes well-known data entry and exit points for a program • How do UNIX processes communicate with each other? • Mostly communicate with each other via pipes • Pipes allow programs to be chained together • Shell and OS can connect one process’s stdout to another’s stdin • Why do we need pipes when we have files? • Pipes create unnamed temporary buffers between processes • Communication between programs is often ephemeral • OS knows to garbage collect resources associated with pipe on exit • Consistent with UNIX philosophy of simplifying programmers’ lives

  32. UNIX shell • Pipes simplify naming • Program always receives input on fd 0 • Program always emits output on fd 1 • Program doesn’t care what is on the other end of fd • Shell/OS handle input/output connections • How do pipes simplify synchronization? • Pipe accessed via read system call • Read can block in kernel until data is ready • Or can poll, checking to see if read returns enough data
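
The blocking-read behavior can be sketched with `pipe` and `fork`. This is an illustration under assumptions, not code from the lecture: `read_from_child` is a hypothetical helper in which the child plays the producer and the parent the consumer.

```c
#include <string.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* The child writes msg into a pipe; the parent blocks in read() until
   data arrives. Returns the number of bytes read into buf, or -1. */
ssize_t read_from_child(const char *msg, char *buf, size_t buf_size) {
    int fds[2];                          /* fds[0]: read end, fds[1]: write end */
    if (pipe(fds) < 0)
        return -1;

    pid_t pid = fork();
    if (pid < 0) {
        close(fds[0]);
        close(fds[1]);
        return -1;
    }
    if (pid == 0) {                      /* child: producer */
        close(fds[0]);
        ssize_t w = write(fds[1], msg, strlen(msg));
        close(fds[1]);
        _exit(w < 0 ? 1 : 0);
    }

    close(fds[1]);                       /* parent must close its write end to see EOF */
    size_t total = 0;
    ssize_t n;
    while (total < buf_size &&
           (n = read(fds[0], buf + total, buf_size - total)) > 0)
        total += (size_t)n;              /* read sleeps in the kernel until data or EOF */
    close(fds[0]);
    waitpid(pid, NULL, 0);
    return (ssize_t)total;
}
```

No file name is involved, and the kernel releases the pipe's buffer once both ends are closed.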

  33. How kernel starts a process • Allocates process control block (bookkeeping data structure) • Reads program code from disk • Stores program code in memory (could be demand-loaded too) • Initializes machine registers for new process • Initializes translator data for new address space • E.g., page table and PTBR • Virtual addresses of code segment point to correct physical locations • Sets processor mode bit to “user” • Jumps to start of program (Callout: need hardware support)

  34. Creating processes • Through what commands does UNIX create processes? • Fork: create a child process that is a copy of the caller • Exec: initialize address space with new program • What’s the problem with creating an exact copy of a process? • Child needs to do something different than parent • i.e., child needs to know that it is the child • How does child know it is child? • Pass in return point • Parent returns from fork call, child jumps into other region of code • Fork works slightly differently now

  35. Fork • Child can’t be an exact copy • Is distinguished by one variable (the return value of fork) if (fork () == 0) { /* child */ execute new program } else { /* parent */ carry on }
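
Filled in with real calls, the pattern on this slide might look like the sketch below. `run_program` is a hypothetical helper name; `fork`, `execvp`, and `waitpid` are the standard POSIX calls.

```c
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Runs the program named by argv[0] in a child process and waits for it.
   Returns the child's exit status, or -1 on failure. */
int run_program(char *const argv[]) {
    pid_t pid = fork();
    if (pid < 0)
        return -1;
    if (pid == 0) {                  /* fork returned 0: this is the child */
        execvp(argv[0], argv);       /* replace the address space with the new program */
        _exit(127);                  /* reached only if exec failed */
    }
    int status;                      /* fork returned the child's pid: this is the parent */
    if (waitpid(pid, &status, 0) < 0)
        return -1;
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

The only thing distinguishing the two copies right after `fork` is its return value, which is what the `if` tests.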

  36. Creating processes • Why make a complete copy of parent? • Sometimes you want a copy of the parent • Separating fork/exec provides flexibility • Allows child to inherit some kernel state • E.g., open files, stdin, stdout • Very useful for shell • How do we efficiently copy an address space? • Use “copy on write” • Make copy of page table, set pages to read-only • Only make physical copies of pages on write fault

  37. Copy on write Physical memory Parent memory Child memory What happens if parent writes to a page?

  38. Copy on write Physical memory Parent memory Child memory Have to create a copy of pre-write page for the child.
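
Copy on write is invisible to programs; what a program can observe is that parent and child have separate copies after `fork`. The sketch below shows only that observable behavior, not the page-table mechanism, and `parent_value_after_child_write` is a hypothetical helper.

```c
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

static int counter = 1;   /* same virtual address in parent and child */

/* The child overwrites counter and reports what it saw through its exit
   status. Returns the parent's value of counter afterwards, or -1. */
int parent_value_after_child_write(int *child_saw) {
    counter = 1;
    pid_t pid = fork();
    if (pid < 0)
        return -1;
    if (pid == 0) {
        counter = 42;             /* the write goes to the child's private copy */
        _exit(counter);
    }
    int status;
    if (waitpid(pid, &status, 0) < 0)
        return -1;
    *child_saw = WIFEXITED(status) ? WEXITSTATUS(status) : -1;
    return counter;               /* still 1: the parent's copy was not touched */
}
```

With copy on write, the kernel defers the physical copy of the page holding `counter` until the child's store faults.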

  39. Alternative approach • Windows CreateProcess • Combines the work of fork and exec • UNIX’s approach • Supports arbitrary sharing between parent and child • Windows’ approach • Supports sharing of most common data via params

  40. Shells (bash, explorer, finder) • Shells are normal programs • Though they look like part of the OS • How would you write one? while (1) { print prompt ("crocus% "); ask for input (cin); /* e.g., "ls /tmp" */ first word of input is command; /* e.g., ls */ fork a copy of the current process (shell); if (child) { redirect output to a file if requested (or a pipe); exec new program (e.g., with argument "/tmp") } else { wait for child to finish, or run child in background and ask for another command } }
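
The body of that loop can be sketched in C as a function that handles one command line. This is a minimal illustration, not the lecture's shell: `run_line` and `MAX_ARGS` are hypothetical names, tokens are split on whitespace only (no quoting), and the only redirection supported is `> file`.

```c
#include <fcntl.h>
#include <string.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

#define MAX_ARGS 32

/* Runs one command line such as "ls /tmp" or "echo hi > out.txt".
   Modifies line in place. Returns the exit status, or -1 on error. */
int run_line(char *line) {
    char *argv[MAX_ARGS];
    char *outfile = NULL;
    int argc = 0;

    for (char *tok = strtok(line, " \t\n"); tok != NULL;
         tok = strtok(NULL, " \t\n")) {
        if (strcmp(tok, ">") == 0) {
            outfile = strtok(NULL, " \t\n");   /* next token names the file */
            if (outfile == NULL)
                return -1;
        } else if (argc < MAX_ARGS - 1) {
            argv[argc++] = tok;                /* first token is the command */
        }
    }
    if (argc == 0)
        return -1;
    argv[argc] = NULL;

    pid_t pid = fork();                        /* copy of the shell */
    if (pid < 0)
        return -1;
    if (pid == 0) {                            /* child */
        if (outfile != NULL) {
            int fd = open(outfile, O_WRONLY | O_CREAT | O_TRUNC, 0644);
            if (fd < 0 || dup2(fd, STDOUT_FILENO) < 0)
                _exit(126);
            if (fd != STDOUT_FILENO)
                close(fd);                     /* stdout (fd 1) now refers to the file */
        }
        execvp(argv[0], argv);
        _exit(127);                            /* exec failed */
    }
    int status;                                /* parent: wait for the child */
    if (waitpid(pid, &status, 0) < 0)
        return -1;
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

Because the redirection happens between `fork` and `exec`, the new program simply inherits fd 1 and never learns that its output goes to a file.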

  41. Shell demo

  42. UNIX philosophy (Diagram: Process C and Process P connected through a File in the Kernel) Two questions • How do processes start running? • How do we control access to files?

  43. UNIX philosophy (Diagram: Process C and Process P connected through a File in the Kernel) Two questions • How do processes start running? • How do we control access to files?

  44. Access control • Where is most trusted code located? • In the operating system kernel • What are the primary responsibilities of a UNIX kernel? • Managing the file system • Launching/scheduling processes • Managing memory • How do processes invoke the kernel? • Via system calls • Hardware shepherds transition from user process to kernel • Processor knows when it is running kernel code • Represents this through protection rings or mode bit

  45. Access control • How does kernel know if system call is allowed? • Looks at user id (uid) of process making the call • Looks at resources accessed by call (e.g., file or pipe) • Checks access-control policy associated with resource • Decides if policy allows uid to access resources • How is a uid normally assigned to a process? • On fork, child inherits parent’s uid
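
As a rough illustration of the check, the sketch below assumes the policy is the classic UNIX permission bits and looks only at the owner and "other" classes. It is deliberately simplified (groups are ignored, and uid 0 is allowed everything) and is not how any particular kernel implements it; `may_access` and the `WANT_*` constants are hypothetical names.

```c
#include <sys/types.h>

enum { WANT_READ = 4, WANT_WRITE = 2, WANT_EXEC = 1 };

/* Returns 1 if a process running as uid may perform the wanted access
   on a resource with the given owner and permission bits, else 0. */
int may_access(uid_t uid, uid_t owner, mode_t mode, int want) {
    if (uid == 0)
        return 1;                               /* simplification: root passes every check */
    int bits = (uid == owner) ? (int)((mode >> 6) & 7)   /* owner's rwx bits */
                              : (int)(mode & 7);         /* everyone else's rwx bits */
    return (bits & want) == want;
}
```

For example, a file with mode 0644 owned by uid 1000 is readable by uid 1001 but writable only by its owner.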

  46. MOO accounting problem • Multi-player game called Moo • Want to maintain high score in a file • Should players be able to update score? • Yes • Do we trust users to write file directly? • No, they could lie about their score (Diagram: game clients running as uid x and uid y write “x’s score = 10” and “y’s score = 11” directly to the high-score file)

  47. MOO accounting problem • Multi-player game called Moo • Want to maintain high score in a file • Could have a trusted process update scores • Is this good enough? (Diagram: game clients running as uid x and uid y send “x’s score = 10” and “y’s score = 11” to a game server, which writes “x:10 y:11” to the high-score file)

  48. MOO accounting problem • Multi-player game called Moo • Want to maintain high score in a file • Could have a trusted process update scores • Is this good enough? • Can’t be sure that reported score is genuine • Need to ensure score was computed correctly (Diagram: the client running as uid x sends “x’s score = 100” and the client running as uid y sends “y’s score = 11” to the game server, which writes “x:100 y:11” to the high-score file)

  49. Access control • Insight: sometimes simple inheritance of uids is insufficient • Tasks involving management of “user id” state • Logging in (login) • Changing passwords (passwd) • Why isn’t this code just inside the kernel? • This functionality doesn’t really require interaction w/ hardware • Would like to keep kernel as small as possible • How are “trusted” user-space processes identified? • Run as super user or root (uid 0) • Like a software kernel mode • If a process runs under uid 0, then it has more privileges

  50. Access control • Why does login need to run as root? • Needs to check username/password correctness • Needs to fork/exec process under another uid • Why does passwd need to run as root? • Needs to modify password database (file) • Database is shared by all users • What makes passwd particularly tricky? • Easy to allow process to shed privileges (e.g., login) • passwd requires an escalation of privileges • How does UNIX handle this? • Executable files can have their setuid bit set • If setuid bit is set, process inherits uid of image file’s owner on exec
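
Whether an executable has its setuid bit set can be inspected with `stat` and the `S_ISUID` mask. The helpers below are a small sketch with hypothetical names (`mode_has_setuid`, `file_has_setuid`); they only read the bit and do not change any privileges.

```c
#include <sys/stat.h>
#include <sys/types.h>

/* Returns 1 if the permission bits in mode include the setuid bit. */
int mode_has_setuid(mode_t mode) {
    return (mode & S_ISUID) != 0;
}

/* Returns 1 if the file at path has its setuid bit set, 0 if it does
   not, and -1 if the file cannot be examined. */
int file_has_setuid(const char *path) {
    struct stat st;
    if (stat(path, &st) < 0)
        return -1;
    return mode_has_setuid(st.st_mode);
}
```

On a system where `passwd` is installed setuid, `file_has_setuid` on its path would return 1; the exact path and mode vary by installation.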
