Basic reading, writing and informatics skills for biomedical research Segment 6. Experimental design Ganesha Associates
Introduction • What is Experimental Design ? • Statistics? • “The organization of an experiment to allow effective testing of the research hypothesis.” • “The design of any information-gathering exercises where variation is present, whether under the full control of the experimenter or not.”
Some myths about ED • Myth 1 • “It’s better to spend time collecting data than sitting around thinking about collecting data, just get into the lab/field and start making measurements!” • Reality • No! A well-designed experiment will save you a lot of time by eliminating unnecessary repetition and improving the precision of your measurements • Hint: analyse data as you generate it – quality control, test assumptions
Some myths about ED • Myth 2 • “It doesn’t matter how you collect your data, there will always be a statistical ‘fix’ that will allow you to analyse them” • Reality • No! Common problems are non-independence of data, lack of prior knowledge of the likely variance of parameters being measured, and absence of appropriate control or reference groups.
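The point about prior knowledge of variance can be made concrete with a rough sample-size estimate. The sketch below is only an illustration, assuming a two-sided comparison of two group means and the usual normal approximation; the function name and the numbers are hypothetical and do not come from the slides.

```python
from math import ceil
from statistics import NormalDist


def samples_per_group(sd, difference, alpha=0.05, power=0.80):
    """Approximate group size for a two-sided, two-group comparison of means.

    Uses the normal approximation n = 2 * ((z_alpha + z_power) * sd / difference)^2.
    sd is the assumed standard deviation of the measurement (an assumption you
    must supply from pilot data or the literature).
    """
    z = NormalDist()
    z_alpha = z.inv_cdf(1 - alpha / 2)  # critical value for the chosen significance level
    z_power = z.inv_cdf(power)          # quantile matching the desired power
    return ceil(2 * ((z_alpha + z_power) * sd / difference) ** 2)


# Same target difference (10 units), three different guesses at the variability.
for sd in (5, 10, 20):
    print(f"sd={sd:>2}: about {samples_per_group(sd, difference=10)} animals per group")
```

Doubling the assumed standard deviation roughly quadruples the required group size, which is why an experiment planned without any estimate of variance is so hard to rescue afterwards.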
Some myths about ED • Myth 3 • “If you collect lots of data something interesting will come out and you will be able to detect even very subtle effects” • Reality • No! Generally collecting lots of data without a well-formulated hypothesis wastes your time and someone else’s money. • Remember that if you analyse a data set many different ways, you are bound to discover effects that have a p-value of less than 0.05.
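The last point can be checked with a few lines of arithmetic. This is a minimal sketch that assumes the analyses are independent and that no real effect exists; real analyses of one data set are usually correlated, so treat the numbers as a rough guide only. The function name is hypothetical.

```python
def chance_of_false_positive(n_tests, alpha=0.05):
    """Probability that at least one of n_tests independent tests gives p < alpha
    when there is no real effect anywhere."""
    return 1 - (1 - alpha) ** n_tests


for n in (1, 5, 20, 100):
    print(f"{n:>3} analyses: {chance_of_false_positive(n):.2f}")
```

Under these assumptions, twenty different analyses of pure noise already give better-than-even odds of at least one "significant" result.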
Start with the project proposal – quick check list • Why is the problem under study of importance ? • Economic, medical significance ? • What are the underlying key issues of basic scientific significance • Establish strong links to the consensus view ? • How is the problem to be addressed experimentally ? • Has an appropriate model system been chosen ? • What information needs to be collected ? • Which methods have been chosen for this purpose and why ? • Limitations • Have the most-likely reasons for failure been identified ? • What is the ‘Fail early’ strategy ? • Literature review • Is it up-to-date ? • Are all key points of logical development in the text backed by an appropriate reference ?
The anatomy of an experiment
The anatomy of an omics experiment • A sample of cells (≈ obs units), from an organism (≈ exp units), with a history, on which the measurements are to be compared in one or more possible states, e.g. genotype, disease, chemical treatment (≈ treatments) • to which a lot of delicate wet lab preliminaries - extraction, amplification, labelling are applied • leading to the analyte being submitted to a complex piece of equipment, on which lots (10s - 1,000,000s) of measurements are made.
The anatomy of an omics experiment • The measurements can be fluorescence intensities or counts, often highly pre-processed, with transformation, normalization and summarization usually being needed before analysis. • There is rarely a single objective, and frequently no clearly stated goal; often it’s a screen ≈ fishing expedition, e.g. a good outcome can be finding ≥1, many, if not all genes or proteins or metabolites satisfying some condition.
Key factors to think about • Baseline assumptions • Sources of variance • Need for technical, biological replication • Sample size, statistical power and significance • Choice of controls • Non-independence of data • Confounding variables • Randomization, stratification • Bias, blinding • Multiple testing, data mining • Inferring causality
Collecting data – keep a notebook
Collecting data – make a spreadsheet
Collecting data – check key assumptions
Publication doesn’t guarantee that the design is correct!
“Why Most Published Research Findings Are False” John Ioannidis, PLoS Medicine, 2005 • The smaller the studies conducted in a scientific field, the less likely the research findings are to be true. • The smaller the effect sizes in a scientific field, the less likely the research findings are to be true. • The greater the number and the lesser the selection of tested relationships in a scientific field, the less likely the research findings are to be true. • The greater the flexibility in designs, definitions, outcomes, and analytical modes in a scientific field, the less likely the research findings are to be true. • The hotter a scientific field (with more scientific teams involved), the less likely the research findings are to be true.
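The first corollary follows from the positive predictive value used in that paper, PPV = (1 − β)R / (R − βR + α), where R is the pre-study odds that a tested relationship is true, 1 − β is the power and α the significance level. The sketch below evaluates the formula in its simplest form, ignoring the bias term; the value R = 0.1 is an illustrative assumption, not a figure from the slides.

```python
def positive_predictive_value(power, prior_odds, alpha=0.05):
    """Probability that a 'significant' finding is true, without any bias term.

    power      -- 1 - beta, the chance of detecting a real effect
    prior_odds -- R, ratio of true to false relationships among those tested
    """
    true_positives = power * prior_odds
    return true_positives / (true_positives + alpha)


# Assumed pre-study odds of 1 true relationship for every 10 false ones.
for power in (0.8, 0.5, 0.2):
    print(f"power={power:.1f}: PPV={positive_predictive_value(power, 0.1):.2f}")
```

With the same pre-study odds, an underpowered study yields a much smaller share of true findings among its significant results than a well-powered one.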
How citation distortions create unfounded authority Steven Greenberg, BMJ, 2009 Many published biomedical belief systems are built on sound data, with authors repeating claims after trusting the published expert opinion of their colleagues. However, there are incentives for generating and joining information cascades regardless of their soundness. Joining an information cascade aids publication as articles have to say something and negative results are biased against. Generating and joining an information cascade may improve the likelihood of obtaining research funding because hypothesis driven research is an essential requirement at many research funding agencies and successful funding generally requires a “strong hypothesis . . .
In life sciences there are many unknowns “As we know, There are known knowns. There are things we know we know. We also know There are known unknowns. That is to say We know there are some things We do not know. But there are also unknown unknowns, The ones we don't know We don't know.” Donald Rumsfeld, US Secretary of Defense (sic) Feb. 12, 2002, Department of Defense news briefing from "The Poetry of D.H. Rumsfeld" http://slate.msn.com/id/2081042/
Presenting your ideas • Create a slide show that is an outline, not a script • Use the slide show... • to select important information and visuals • to organize content • to create a hierarchy • Many of the subsequent slides were adapted from work done by the Cain Project in Engineering & Professional Communication • www.owlnet.rice.edu/~cainproj
Selecting Content • Consider your audience – not everyone will have your knowledge of the problem! • State problem/question clearly, early and repeat (in the title, in the introduction) • Explain the significance, context • Include background: organism/system/model • State the point of departure for work precisely
Displaying Text • Remember that your audience... • skims each slide • looks for critical points, not details • needs help reading/ seeing text • So keep to an outline only • Help your audience by… • Projecting a clear font • Using bullets • Using content-specific headings • Using short phrases • Using grammatical parallelism
Project a clear font • Serif: easy to read in printed documents • Times New Roman, Palatino, Garamond • Sans serif: easy to see projected across the room • Arial, Helvetica, Geneva
Use bullets – but not too many • Bullets help your audience • to skim the slide • to see relationships between information • to organize information in a logical way • For example, this is Main Point 1, which leads to... • Sub-point 1 • Further subordinated point 1 • Further subordinated point 2 • Sub-point 2
Use content-specific headings • “Results” suggests the content area for a slide • “Substance X up-regulates gene Y” (with data shown below) shows the audience what is observed
Use short phrases • Be clear, concise, accurate • Write complete sentences only in certain cases: • Hypothesis / problem statement • Quote • ??? • Difficult to read: DNA polymerase catalyzes elongation of DNA chains in the 5’ to 3’ direction • Better: DNA polymerase extends 5’ to 3’
Use grammatical parallelism • Use same grammatical form in lists • Not Parallel: • Lyse cells in buffer • 5 minute centrifuging of lysate • Supernatant is removed • Parallel: • Lysed cells in buffer • Centrifuged lysate for 5 minutes • Removed supernatant
Use grammatical parallelism How would you revise this list? Telomeres • Contain non-coding DNA • Telomerases can extend telomeres • Cells enter senescence/apoptosis when telomeres are too short
Use grammatical parallelism One possible revision… Telomeres • Contain non-coding DNA • Are extended by telomerase • Cause senescence/apoptosis when shortened too much
Displaying visuals • Select visuals that enhance understanding • Figures from your work: evidence for argument • Figures from other sources (web; review articles): • Model a process or concept • Help explain background, context • Design easy-to-read visuals • Are the visuals easy to read by all members of your audience? • Draw attention to aspects of visuals
Simplify and draw attention http://www.indstate.edu/thcme/mwking/tca-cycle.html
Cite others’ visuals Harvey et al. (2005) Cell 122:407-20 http://www.bioc.rice.edu/~shamoo/shamoolab.html
Samples Features to consider: • Text • Fonts, use of phrases, parallelism • Visuals • Readability, drawing attention • Slide design • Organization/ hierarchy • Titles, bullets, arrangement of information, font size
The Calcium Ion Calcium is a crucial cell-signaling molecule • Calcium is toxic at high intracellular concentrations because of the phosphate-based energy system • Intracellular concentrations of calcium are kept very low, which allows an influx of calcium to be a signal to alter transcription
Microarrays Phillips G. (2004) Iowa State University College of Veterinary Medicine.
Presenting • Delivery • Handling questions
Delivery • Physical Environment • Stance • Body language • Handling notes • Gestures • Eye contact • Voice quality • Volume • Inflection • Pace
Handling Questions • LISTEN • Repeat or rephrase • Watch body language • Don’t pretend to know
Practical activity 6a - Developing and presenting your project • Total duration - ca. 2 hours. • Identify the five most important research articles that frame your hypothesis, i.e. the fundamental facts and assumptions upon which your idea is based. • Describe the basis for your hypothesis in a paragraph of no more than seven sentences. • Read the article by Peter Norvig on experimental design. (For Firefox users the alternative URL is here.) • What alternative experimental approaches are available to answer your question ? • How do you intend to verify your hypothesis? • Identify and justify the journal you want to publish the results of your research in. • Give a 5-slide presentation to justify your choices at the next session.
Practical activity 6b - Thinking about probability and statistics • Total duration - ca. 3 hours. • First read the series of articles published recently by Wai-Ching Leung in the British Medical Journal. Although intended for a medical audience, these articles provide the basis for a useful primer for almost all fields of biomedical research. The articles are: • Why and when do we need medical statistics • Measuring chances • Summarising information • Testing hypotheses • Now answer the following questions: • I have a plant extract which I believe has an effect on blood pressure. I measure its effects by injecting the substance into rats and measuring their blood pressure before and after the injection. The statistical test I use tells me that the probability of collecting this sample of results is less than 0.05. What does this mean ? • 1% of women aged forty who participate in routine mammography screening have breast cancer. 80% of the women with breast cancer get a positive result. 9.6% of women without breast cancer will also get a positive result. So, if a woman from this group gets a positive result, what is the probability that she has breast cancer ? • In the UK, car registration plates can typically consist of a string of 6 or 7 alphanumeric characters (A, B, C, etc, 1, 2, 3 etc). So the probability of a specific sequence of characters (e.g. DB1979) is less than 1 in 2 billion. I send a small group of people out into a car park and ask them to look for a registration plate that has personal significance for them. What is the likelihood of this happening ? • A friend of mine has consistently predicted the results of 5 of the football matches leading to today's final. He is offering to sell me his prediction for the final match so that I can place a bet and make some money. What are the odds that he will predict the outcome of the last match correctly ? • A murder is committed. Traces of your fingerprints are found on the murder weapon. What is the probability that you are guilty ?
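One way to check your reasoning on the screening question is to apply Bayes' rule directly, and the "1 in 2 billion" figure in the registration plate question can be checked by counting sequences. The sketch below uses only the numbers given in the questions; the function name is hypothetical, and the plate count assumes any of 36 characters (26 letters, 10 digits) is allowed in each of 6 positions.

```python
def probability_of_disease_given_positive(prevalence, sensitivity, false_positive_rate):
    """Bayes' rule for a screening test: P(disease | positive result)."""
    true_positives = prevalence * sensitivity
    false_positives = (1 - prevalence) * false_positive_rate
    return true_positives / (true_positives + false_positives)


# Figures from the screening question: 1% prevalence, 80% sensitivity,
# 9.6% of women without cancer also test positive.
p = probability_of_disease_given_positive(0.01, 0.80, 0.096)
print(f"P(cancer | positive) = {p:.3f}")

# Number of distinct 6-character alphanumeric sequences (assumed 36 symbols per position).
print(f"6-character sequences: {36 ** 6:,}")
```

The posterior is far lower than the 80% sensitivity because the large group of women without cancer produces many more positive results than the small group with cancer.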
Practical activity 6c - Presenting data • Total duration - ca. 1 hour. • Read Mary Purugganan's presentation about data visualisation. Identify some examples of illustrations used in recent primary research papers which illustrate some of the points she makes.