SimpleR: Taking on the “Evil Empire” to Build Simple R Applications for Non-Statistical Users
Nicholas Lewin-Koh, Bert Gunter
Genentech Nonclinical Statistics
Outline:
• Background and Context: The working environment and needs
• Strategy: The Approach
• Example: Tumor Xenograft Study Analysis
Context: Pharmaceutical industry, but regulation is not an issue
• We collaborate on many projects that investigate drug efficacy, toxicity, biomarkers, dose determination, manufacturing methods, assay methods, etc.
• Data may be complex, so analyses can be tricky.
• We need to provide consistent, clear, interpretable analyses to aid scientific assessment.
• Complex statistical analyses are unsuitable.
Example: Tumor Xenograft Studies
• Implant special tumor cell lines in mice, then compare tumor growth under different treatment regimens.
[Figure: Measure Tumor Volume]
Example: Tumor Xenograft Studies
• Xenograft studies help determine which drugs to work on in which cancers, dosing in human studies, biomarkers that can identify subgroups who may or may not benefit, …
• Data are challenging: they consist of repeated measures of tumor volume over time per animal, with
  • nonlinear growth/stasis/shrinkage
  • dropouts due to toxicity or animal care requirements
  • left censoring when tumors shrink below the limit of detection (LOD)
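One way to picture the shape of such data is a long-format table with one row per animal per measurement day. The sketch below is illustrative only: the column names, the LOD value, and every number in it are invented, not taken from a real study or from the authors' code.

```r
# Hypothetical long-format layout: one row per animal per measurement day.
# All values below are invented for illustration.
lod <- 8  # assumed limit of detection (mm^3)

tumor <- data.frame(
  animal    = rep(c("m1", "m2", "m3"), each = 4),
  treatment = rep(c("vehicle", "drug", "drug"), each = 4),
  day       = rep(c(0, 4, 8, 12), times = 3),
  volume    = c(150, 260, 410, 700,   # m1: steady growth
                140,  90,  20,   5,   # m2: shrinks below the LOD by day 12
                160, 120,  NA,  NA)   # m3: dropped out after day 4
)

# Flag left-censored readings: measured, but below the LOD.
tumor$censored <- !is.na(tumor$volume) & tumor$volume < lod

# Dropouts appear as missing volumes, not as zeros.
tumor$dropout <- is.na(tumor$volume)

# Record a censored reading as "at most the LOD" rather than as an exact value.
tumor$volume_obs <- ifelse(tumor$censored, lod, tumor$volume)
```

Keeping the censoring and dropout flags as separate columns lets a later analysis step treat the two cases differently instead of silently averaging over them, which is the usual failure of a spreadsheet-based summary.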
Ad hoc analyses and plots using Excel are the most widely used approaches
[Figure: panels labeled DRUG 1, DRUG 2, DRUG 3]
Poor analyses compromise scientific decision making and our ability to find and develop good drugs.
• Realities:
  • Scientists/engineers usually have neither the background nor the time to learn and use sophisticated statistical methods.
  • The wider audience of decision makers cannot consume fancy statistical results anyway.
  • There are not nearly enough of us (statisticians) to handle all of this for them (scientists and engineers).
Context for Solutions
• Rapid change in technologies, needs, methods, computer hardware and software…
• Need safe and robust methods: reasonable answers quickly in a variety of real circumstances, alert or failure otherwise.
  • Searching for statistical “optimality” is a waste of time.
• Communicate all results via graphs and tables.
• Users will treat software as a “black box” yielding answers.
  • The user interface, not software documentation, is key.
• Developers need to meet rapidly evolving user needs.
  • Rapid prototyping, development, ease of modification, and feature addition are important factors.
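The "reasonable answer, or else a clear alert" requirement can be sketched with base R condition handling. This is a minimal illustration, not the authors' implementation: `safe_fit` is a hypothetical helper name, the log-linear growth model is a stand-in chosen for brevity, and the data values are invented.

```r
# Hypothetical wrapper: run a fit and return either the result or a plain alert.
# Warnings are treated as alerts too, so a doubtful fit is never passed on silently.
safe_fit <- function(fit_fun, data) {
  tryCatch(
    list(ok = TRUE, fit = fit_fun(data), message = NULL),
    warning = function(w) list(ok = FALSE, fit = NULL, message = conditionMessage(w)),
    error   = function(e) list(ok = FALSE, fit = NULL, message = conditionMessage(e))
  )
}

# Stand-in model (an assumption): log tumor volume grows linearly with time.
fit_growth <- function(d) lm(log(volume) ~ day, data = d)

# Invented measurements for one animal.
growth <- data.frame(day = c(0, 3, 7, 10, 14),
                     volume = c(100, 130, 180, 240, 330))
empty <- data.frame(day = numeric(0), volume = numeric(0))

good <- safe_fit(fit_growth, growth)  # fit succeeds
bad  <- safe_fit(fit_growth, empty)   # lm() stops on empty data; the error is captured
```

A menu-driven front end can then check `ok` and show either the graphs and tables or the `message` text, so a black-box user never sees a raw traceback.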
R provides a way to meet these challenges
• Many built-in procedures and packages enable rapid prototyping.
• Graphics packages (lattice, ggplot, …) provide a framework for informative, flexible graphical displays.
• Changes the paradigm!
  • Close collaboration with customers during development: an iterative cycle of Try, Modify, Review/Test.
Strategy
• Initially, a Windows desktop application on only a very few (1 or 2) desktops
• A simple menu interface starts automatically when the user clicks on the R icon.
  • e.g., use startup options to read in an .RData file with all functions and execute code that sets up menus, etc.
  • We do it with an .Rprofile file, but many alternatives are available.
• Once customers are satisfied and the code has stabilized, port to a Web-based interface to ease maintenance for a larger user base.
• So far, we haven’t found the extra overhead of converting to packages worthwhile, but this may change.
• Remember, for users it’s a black box that provides solutions, not a tool.
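The startup approach described above might look roughly like the following .Rprofile fragment. It is a sketch under stated assumptions: the file name `SimpleR.RData`, the menu labels, and the function `run_xenograft_analysis()` are hypothetical placeholders, and the slides do not show the authors' actual startup code. `winMenuAdd()` and `winMenuAddItem()` are functions in R's utils package that exist only in the Windows Rgui console, so the sketch checks for that environment first.

```r
# Hypothetical .Rprofile fragment. R calls .First() at the end of startup.
.First <- function() {
  # Assumed file name: a saved workspace holding the analysis functions.
  app_file <- "SimpleR.RData"
  if (file.exists(app_file)) {
    load(app_file, envir = globalenv())
  }

  # Console menus are only available in the Windows Rgui front end.
  has_rgui <- .Platform$OS.type == "windows" && .Platform$GUI == "Rgui"
  if (has_rgui) {
    # utils is not yet attached when .First() runs, so qualify the calls.
    utils::winMenuAdd("Xenograft")
    utils::winMenuAddItem("Xenograft", "Analyze study...",
                          "run_xenograft_analysis()")  # hypothetical entry point
  }
  invisible(has_rgui)
}
```

The third argument of `winMenuAddItem()` is a string of R code that is run when the menu item is selected, so the user only ever clicks a menu entry and never types at the prompt.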
Output: Model fit
[Figure: model fit output]
Web Interface
[Figure: Web interface]
Summary:
• Excel is the ubiquitous data analysis software, so opportunities for major improvements abound.
• To replace it, we need:
  • rapid development of flexible, robust solutions
  • “intelligent” graphs and tables to communicate results
  • workable user interfaces that shield users from technical details
  • a way to scale solutions that does not require a large ongoing effort to support
• R and its supporting packages meet these needs.
Thanks:
• Translational Oncology: Bruno Alicke, Steven Gould
• Bioinformatics: Dana Caulder, Vivek Ramaswamy, Kathryn Woods