
Sociotechnical production systems for software in science


Presentation Transcript


  1. Sociotechnical production systems for software in science James Howison and Jim Herbsleb Institute for Software Research School of Computer Science Carnegie Mellon University School of Information University of Texas at Austin http://james.howison.name/pubs/HowisonHerbsleb2011SciSoftIncentives.pdf

  2. How does a cubic km of ice become a scientific paper?

  3. First find some ice Image Credit: NASA

  4. Build a big drill Image Credit: IceCube

  5. and some Digital Optical Modules Image Credit: IceCube

  6. Combine Image Credit: IceCube

  7. Collect and filter data Image Credit: IceCube

  8. Store and analyze it Image Credit: http://www.flickr.com/photos/theplanetdotcom

  9. Simulate light in ice Photo credit: http://www.flickr.com/photos/rainman_yukky/

  10. Simulate Atmosphere Image Credit: NASA

  11. Model

  12. Analyze

  13. Plots

  14. Publish

  15. Software is everywhere

  16. An appealing vision of software … • Enhancing reproducibility and correctness • Saving money • Driving innovation • Coalescing into widely used software platforms • All linked to software as an information artifact: Re-playable, Re-usable, Extendable

  17. Yet software also has constraints • Maintenance (avoiding “bit rot”) • Software must be maintained (“synchronization work”) • Kept in sync with complements and dependencies • Coordination • Rapid development and changes can lead to breakdown • Path dependencies • Easy to start, hard to architect for widespread use

  18. How to achieve the Software Vision? • Better technologies? • Better engineering methods? • Leadership/Norms/Ethics? • Policy? • Rewards?

  19. A sociotechnical understanding • Understand software work in existing institutions of science • Specific research questions: • What software is used? • Who creates and maintains it? • What incentives drive its creation? • Why is it trusted?

  20. Method: Data • Route into complex practice • Chose paper as unit of analysis: “Focal Paper” • Trace back from paper to work that produced it • Semi-structured interviews • Supported by artifacts (e.g., paper/methods and materials) • Elicit workflow, focus on software work • Identify software authors/sources, and seek introductions • Qualitative analysis • Phenomenological exhaustion

  21. Case 1: STAR Image Credit: RHIC

  22. Our focal paper

  23. Workflow

  24. Software Production • Employed core software development • Professional software developers • ROOT4STAR framework • Core simulation code • Scientists undertaking “service work” • Analysis code “to get the plots” • Locally written, frozen at publication

  25. Case 3: Bioinformatic microbiology Image Credit: http://www.flickr.com/photos/grytr

  26. Studying the nitrogen cycle Image Credit: Focal Paper

  27. A field revolutionized by software

  28. Personal software infrastructure • “Power user scripts” • Personal competitive advantage: “that is something that most biologists can’t do. period.” • Shares methods, but neither the personal infrastructure code nor active support for others • The methods and materials section should provide enough information; if not, he’ll fix it • But he is not going “to do their homework for them”

  29. “Publishing on” software • Tools potentially useful to others described in separate publications, “Software pubs” • Ambivalence: • Can you make a career out of this? “Definitely” • But: “he’s known for his software rather than his science … he’s known for facilitating science rather than … and some people have that reputation” • Advise a student to do this? • “Yes, but … if you happen to get a publication out of it and it becomes a tool that’s widely used, then great, that’s fantastic, better props for you … but there’s a danger … Tool developers are greatly under-appreciated”

  30. Algorithm people • Self-described member of the “algorithm people” as distinguished from biologists • Muscle: “biology == strcmp()” • Builds from scratch (“avoid tricky dependencies”) • “Obvious” that they don’t collaborate • Credit accrues to the “original publications” • Little credit in perceived incremental improvements • Politics of improvement acceptance “at the mercy of” • Competition is appropriate and productive
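The “biology == strcmp()” quip reduces sequence analysis to plain string comparison. A minimal illustrative sketch of that view in Python follows; it is hypothetical example code, not anything collected from a study participant:

```python
# Illustrative sketch of the "biology == strcmp()" quip: to the self-described
# "algorithm people", biological sequences are, at bottom, strings to compare.
# Hypothetical example, not code from any study participant.

def seqcmp(a: str, b: str) -> int:
    """C-style strcmp over sequences: negative, zero, or positive."""
    return (a > b) - (a < b)

def hamming(a: str, b: str) -> int:
    """Count point mismatches between two equal-length sequences."""
    if len(a) != len(b):
        raise ValueError("sequences must be the same length")
    return sum(x != y for x, y in zip(a, b))

print(seqcmp("ACGT", "ACGT"))   # 0: identical sequences
print(hamming("ACGT", "ACGA"))  # 1: one substitution
```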

  31. Software production systems • Practice that is similar along four aspects: • Incentives for the work • The type of artifacts produced • The way it is organized • The logic of correctness

  32. Context: Academic reputation system

  33. Software as support

  34. Collaboration service-work

  35. Academic credit: Incidental software

  36. Academic credit: Parallel software practice

  37. Systemic threats to software vision • The type of software work needed to realize the cyberinfrastructure vision is poorly motivated • “Invisible work” (Star and Ruhleder) • Especially, little incentive to collaborate • Projects “owned” by initial creators • Initial publications receive citations • Extension dominated by fork-and-rename

  38. Academic reputation and integration James Howison and Jim Herbsleb (2013) Sharing the spoils: incentives and integration in scientific software production. ACM CSCW

  39. Where to for science policy? • Exhortations? • Training? • Forcing “open source” through funding lever? • Risk of substituting logics of correctness • “Kleenex” code as open source? • Risk of undermining appropriate competition • Turn scientists into open source community managers? • When there is little reward for this work?

  40. Scientific Software Network Map • Imagine it as a live, dynamic data set!

  41. Techniques for measuring use • Software that reports its own use • Instrumentation • Analysis of traces in papers • Mentions, citations • Characteristic artifacts • Analysis of collections of software • On supercomputing resources (TACC, NICS) • Through workflow systems (Galaxy, Pegasus, Taverna)
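To make the “mentions in papers” technique concrete, here is a minimal sketch that scans paper full text for whole-word mentions of known software names. The seed list and matching rules are assumptions for illustration, not the authors' actual measurement pipeline:

```python
# Sketch of counting software mentions in paper full text (slide 41).
# KNOWN_SOFTWARE is a hypothetical seed list; a real pipeline would need
# curated name variants, disambiguation, and citation matching.
import re
from collections import Counter

KNOWN_SOFTWARE = ["ROOT", "Galaxy", "Pegasus", "Taverna", "MATLAB"]

def count_mentions(paper_text: str) -> Counter:
    """Count whole-word mentions of each known software name."""
    counts = Counter()
    for name in KNOWN_SOFTWARE:
        pattern = r"\b" + re.escape(name) + r"\b"
        counts[name] = len(re.findall(pattern, paper_text))
    return counts

text = "Analyses used ROOT; workflows ran in Galaxy and Pegasus."
print(count_mentions(text))
# Counter({'ROOT': 1, 'Galaxy': 1, 'Pegasus': 1, 'Taverna': 0, 'MATLAB': 0})
```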

  42. Contact James Howison http://james.howison.name jhowison@ischool.utexas.edu This material is based upon work supported by the US National Science Foundation under Grant No. 0943168.
