1 / 28

Documenting and Automating Collateral Evolutions in Linux Device Drivers

Documenting and Automating Collateral Evolutions in Linux Device Drivers. Yoann Padioleau Ecole des Mines de Nantes (now at UIUC) with Julia Lawall and René Rydhof Hansen (DIKU) Gilles Muller (Ecole des Mines de Nantes). the Coccinelle project. foo(1 );. bar(1 );. foo(2 );. bar(2 );.

sugar
Download Presentation

Documenting and Automating Collateral Evolutions in Linux Device Drivers

An Image/Link below is provided (as is) to download presentation Download Policy: Content on the Website is provided to you AS IS for your information and personal use and may not be sold / licensed / shared on other websites without getting consent from its author. Content is provided to you AS IS for your information and personal use only. Download presentation by click this link. While downloading, if for some reason you are not able to download a presentation, the publisher may have deleted the file from their server. During download, if you can't get a presentation, the file might be deleted by the publisher.

E N D

Presentation Transcript


  1. Documenting and Automating Collateral Evolutions in Linux Device Drivers Yoann Padioleau Ecole des Mines de Nantes (now at UIUC) with Julia Lawall and René Rydhof Hansen (DIKU) Gilles Muller (Ecole des Mines de Nantes) the Coccinelle project

  2. foo(1); bar(1); foo(2); bar(2); The problem: Collateral Evolutions lib.c int foo(int x){ Evolution in a library becomes int bar(intx){ Can entail lots of Collateral Evolutions (CE)in clients before Legend: after clientn.c client1.c client2.c foo(foo(2)); bar(bar(2)); if(foo(3)) { if(bar(3)) {

  3. foo(1); bar(1,?); foo(2); bar(2,?); The problem: Collateral Evolutions lib.c int foo(int x){ Evolution in a library becomes int bar(intx, int y){ Can entail lots of Collateral Evolutions (CE)in clients before Legend: after clientn.c client1.c client2.c foo(foo(2)); bar(bar(2,?),?); if(foo(3)) { if(bar(3,?)) {

  4. Our target: Linux device drivers • Many libraries and many clients: • Lots of driver support libraries: one per device type, one per bus (pci library, sound library, …) • Lots of device specific code: Drivers make up more than 50% of Linux • Many evolutions and collateral evolutions [Eurosys’06] • 1200 evolutions in Linux 2.6 • For each evolution, lots of collateral evolutions • Some collateral evolutions affect over 400 files at over 1000 code sites

  5. Our goal • Currently, Collateral Evolutions in Linux are done nearly manually: • Difficult • Time consuming • Error prone • The highly concurrent and distributed nature of the Linux development process makes it even worse: • Patches that miss code sites (because of newly introduced sites and newly introduced drivers) • Out of date patches, conflicting patches • Drivers outside the Linux source tree are not updated • Misunderstandings Need a tool to document and automate Collateral Evolutions

  6. Taxonomy of transformations • Taxonomy of evolutions (library code): add parameter, split data structure, change protocol sequencing, change return type, add error code, etc • Taxonomy of collateral evolutions (client code)? • Very wide variety of program transformations, affecting wide variety of C and CPP constructs • Often depends on context, e.g.for add argument the newvalue must be constructed from enclosing code • Note that not necesseraly semantic preserving Can not be done by current refactoring tools (more than just renaming entities). Need a flexible tool.

  7. Complex Collateral Evolutions (2.5.71) • Evolution: scsi_get()/scsi_put()dropped from SCSI library • Collateral evolutions: SCSI resource now passed directly to proc_info callback functions via a new parameter int a_proc_info(int x ) { scsi *y; ... y = scsi_get(); if(!y) { ... return -1; } ... scsi_put(y); ... } From local var to parameter ,scsi *y; Delete calls to library Delete error checking code before Legend: after

  8. Our idea int a_proc_info(int x ,scsi *y ) { scsi *y; ... y = scsi_get(); if(!y) { ... return -1; } ... scsi_put(y); ... } The example How to specify the required program transformation ? In what programming language ?

  9. Our idea: Semantic Patches metavariable declarations int a_proc_info(int x + ,scsi *y ) { - scsi *y; ... - y = scsi_get(); - if(!y) { ... return -1; } ... - scsi_put(y); ... } @@ Patch-like syntax functiona_proc_info; identifierx,y; @@ Transform if everything “matches” metavariable references the ‘...’ operator Declarative language modifiers

  10. Affected Linux driver code drivers/scsi/53c700.c drivers/scsi/pcmcia/nsp_cs.c int s53c700_info(int limit) { char *buf; scsi *sc; sc = scsi_get(); if(!sc) { printk(“error”); return -1; } wd7000_setup(sc); PRINTP(“val=%d”, sc->field+limit); scsi_put(sc); return 0; } int nsp_proc_info(int lim) { scsi *host; host = scsi_get(); if(!host) { printk(“nsp_error”); return -1; } SPRINTF(“NINJASCSI=%d”, host->base); scsi_put(host); return 0; } Similar, but not identical

  11. Applying the semantic patch int s53c700_info(int limit) { char *buf; scsi *sc; sc = scsi_get(); if(!sc) { printk(“error”); return -1; } wd7000_setup(sc); PRINTP(“val=%d”, sc->field+limit); scsi_put(sc); return 0; } @@ functiona_proc_info; identifierx,y; @@ int a_proc_info(int x + ,scsi *y ) { - scsi *y; ... - y = scsi_get(); - if(!y) { ... return -1; } ... - scsi_put(y); ... } int nsp_proc_info(int lim) { scsi *host; host = scsi_get(); if(!host) { printk(“nsp_error”); return -1; } SPRINTF(“NINJASCSI=%d”, host->base); scsi_put(host); return 0; } proc_info.sp $ spatch *.c < proc_info.sp

  12. Applying the semantic patch int s53c700_info(int limit, scsi *sc) { char *buf; wd7000_setup(sc); PRINTP(“val=%d”, sc->field+limit); return 0; } @@ functiona_proc_info; identifierx,y; @@ int a_proc_info(int x + ,scsi *y ) { - scsi *y; ... - y = scsi_get(); - if(!y) { ... return -1; } ... - scsi_put(y); ... } int nsp_proc_info(int lim, scsi *host) { SPRINTF(“NINJASCSI=%d”, host->base); return 0; } proc_info.sp $ spatch *.c < proc_info.sp

  13. SmPL: Semantic Patch Language • A single small semantic patch can modify hundreds of files, at thousands of code sites • The features of SmPL make a semantic patch generic. Abstract away irrelevant details: • Differences in spacing, indentation, and comments • Choice of the names given to variables (metavariables) • Irrelevant code (‘...’, control flow oriented) • Other variations in coding style (isomorphisms) e.g. if(!y) ≡ if(y==NULL) ≡ if(NULL==y)

  14. Sequences and the ‘…’ operator C file Semantic patch 1 y = scsi_get(); 2 if(exp) { 3 scsi_put(y); 4 return -1; 5 } 6 printf(“%d”,y->f); 7 scsi_put(y); 8 return 0; - y = scsi_get(); ... - scsi_put(y); Control-flow graph(CFG) of C file 1 path 1: 2 path 2: 6 “. . .” means for all subsequent paths 3 7 4 8 exit One ‘-’ line can erase multiple lines

  15. Isomorphisms, C equivalences • Examples: • Boolean : X == NULL!XNULL == X • Control : if(E)S1 else S2if(!E) S2 else S1 • Pointer : E->field*E.field • etc. • How to specify isomorphisms ? @@expression *X; @@ X == NULL <=> !X<=> NULL == X Reuses SmPL syntax

  16. How does it work ?

  17. The transformation engine architecture Parse Semantic Patch Parse C file Expand isomorphisms Translate to CFG Translate to extended CTL Match CTL against CFG using a model checking algorithm Computational Tree Logic [Clark86] with extra features Modify matched code Unparse

  18. CTL and Model checking • Model checking a CTL formula against a model answers just yes/no (with counter example). • We do program transformations, not just “pattern checking”. Need: • Bind metavariables and remember their value • Remember where we have matched sub-formulas • We have extended CTL : existential variables and program transformation annotations @@expX,Y;@@ f(X); ... - g(Y); + g(X,Y); 9X.f(X);ÆAXA[true U 9v.9Y.g-(-Y-)-;-+g(X,Y);v

  19. Other issues • Need to produce readable code • Keep space, indentation, comments • Keep CPP instructions as-is. Also programmer may want to transform some #define,iterator macros (e.g. list_for_each) • Interactive engine, partial match • Isomorphisms • Rewriting the Semantic patch (not the C code), • Generate disjunctions Very different from most other C tools 60 000 lines of OCaml code

  20. Evaluation

  21. Experiments • Methodology • Detect past collateral evolutions in Linux 2.5 and 2.6 using patchparse tool [Eurosys’06] • Select representative ones • Test suite of over 60 CEs • Study them and write corresponding semantic patches • Note: we are not kernel developers • Going "back to the future". Compare: • what Linux programers did manually • what spatch, given our SPs, does automatically

  22. Test suite • 20 Complex CEs : mistakes done by the programmers • In each case 1-16 errors or misses • 23 Mega CEs : affect over 100 sites • Up to 40 people for up to two years • 26 “typical” CEs: • The whole set of CEs affecting a “typical” (median) directory from 2.6.12 to 2.6.20 More than 5800 driver files

  23. Results • Our SPs are on average 106 lines long • SPs often 100 times smaller than “human-made” patches. A measure of time saved: • Not doing manually the CE on all the drivers • Not reading and reviewing big patches, for people with drivers outside source tree • Overall: correct and complete automated transformation for 93% of files • Problems on the remaining 7%: We miss code sites • CPP issues, lack of isomorphisms (data-flow and inter-procedural) • We are not kernel developers … don’t know how to specify • No false positives, just false negatives • Average processing time of 0.7s per relevant file Sometimes the tool was right and human wrong

  24. Impact on the Linux kernel • We also wrote some SPs for current collateral evolutions (looking at linux kernel mailing lists) • use DIV_ROUND_UP, BUG_ON, FIELD_SIZE • convert kmalloc-memset to kzalloc • Total diffstat: 154 files changed, 203 insertions(+), 375 deletions(-) • We wrote other SPs, for “bug-fixing” (good side effects of our tool) • Add missing put functions (reference counting) • Drop unnecessary put functions (reference counting) • Remove unused variables • Total diffstat: 111 files changed, 340 insertions(+), 355 deletions(-) Accepted in latest Linux kernel

  25. Future work • Are semantic patches and spatch useful • Only for Linux device drivers? • Only for Linux programmers? • Only for collateral evolutions program transformations? • Only for program transformations? • Our thesis: We don’t think so. • But first device driver CEs are an important problem! • All software evolves. Software libraries are more and more important, and have more and more clients We may also help that software, those libraries

  26. Related work • Refactoring • CatchUp[ICSE’05], tool-based, replay refactorings from Eclipse • JunGL[ICSE’06], language-based, but based on ML, less Linux-programmer friendly • Program transformation engines • Stratego[04] • C front-ends • CIL[CC’02]

  27. Conclusion • Collateral Evolution is an important problem, especially in Linux device drivers • SmPL: a declarative language to specify collateral evolutions • Looks like a patch; fits with Linux programmers’ habits • But takes into account the semantics of C hence the name Semantic Patches • A transformation engine to automate collateral evolutions based on model checking technology.

  28. Thank you • You can download our tool, spatch, at http://www.emn.fr/x-info/coccinelle • Questions ?

More Related