
Research Integrity Scientific Misconduct Scientific rigor


Presentation Transcript


  1. Research Integrity Scientific Misconduct Scientific rigor Oswald Steward, Ph.D.

  2. Responsible Conduct of Research (RCR) • Public concern surfaced in the early 1980s following reports of egregious misbehavior. • One researcher republished under his own name dozens of articles previously published by others. • Other researchers falsified or fabricated research results. • It seemed as if research institutions ignored or deliberately covered up problems. • Eventually, Congress stepped in and required Federal agencies and research institutions to develop research misconduct policies.

  3. Purpose of Research Misconduct Policies • Establish definitions for research misconduct • Outline procedures for reporting and investigating misconduct • Provide protection for whistleblowers and persons accused of misconduct

  4. Federal Policy on Research Misconduct. Federal Register, October 14, 1999, Vol. 64, No. 198

  5. Research Misconduct Defined • Research misconduct is defined as FFP • Fabrication, Falsification, or Plagiarism in proposing, performing or reviewing research, or in reporting research results (“hoaxing, forging, trimming, and cooking”). • The data may be in laboratory notebooks, grant applications, progress reports to NIH, publications, patent applications or similar documents. • If non-rigorous practices are used to conceal and/or misrepresent, this could amount to “falsifying” and thus be scientific misconduct.

  6. Research Misconduct Defined • Fabrication is making up results and reporting them • Falsification is manipulating research materials, equipment, or processes, or changing or omitting data • Plagiarism is the appropriation of another person's ideas, processes, results, or words without giving appropriate credit, including those obtained through confidential review of others' research proposals and manuscripts

  7. Legal Criteria for Research Misconduct • Represents a significant departure from accepted practices • Has been committed intentionally, or knowingly, or recklessly; and • Can be proven by a preponderance of evidence • What is NOT MISCONDUCT: honest, unintentional error or differences of opinion

  8. Plagiarism is the Most Common Research Misconduct • 25 percent of the allegations received by the ORI in the last three years • 60 percent of the allegations received by the National Science Foundation.

  9. NIH Requires Instruction in RCR • Since 1990, the NIH has required all applications for NRSA Training Grants (T32, T34) to provide instruction in RCR • It also applies to all Fellowships (F & K awards)

  10. NSF Also Requires RCR Instruction • The NSF requirement applies to all proposals to conduct research (not just training grants and fellowships) • This requirement was established in 2010.

  11. Top ten “POOR” behaviors 1. Falsifying or ‘cooking’ research data 2. Ignoring major aspects of human-subject requirements 3. Not properly disclosing involvement in firms whose products are based on one’s own research 4. Relationships with students, research subjects or clients that may be interpreted as questionable 5. Using another’s ideas without obtaining permission or giving due credit (plagiarism) 6. Unauthorized use of confidential information in connection with one’s own research 7. Failing to present data that contradict one’s own previous research 8. Circumventing certain minor aspects of human-subject requirements

  12. Top ten poor behaviors (continued) 9. Overlooking others' use of flawed data or questionable interpretation of data 10. Changing the design, methodology or results of a study in response to pressure from a funding source (falsification) Other poor behaviors 11. Publishing the same data or results in two or more publications 12. Inappropriately assigning authorship credit 13. Withholding details of methodology or results in papers or proposals 14. Using inadequate or inappropriate research designs 15. Dropping observations or data points from analyses based on something other than defined exclusion criteria 16. Inadequate record keeping related to research projects

  13. II. Findings of Research Misconduct

  14. A finding of research misconduct requires that: • There be a significant departure from accepted practices of the scientific community for maintaining the integrity of the research record • The action be committed intentionally, or knowingly, or in reckless disregard of accepted practices • The allegation be proven by a preponderance of evidence

  15. What does “intentionally” mean? • Intentionally does not mean that the intent was to commit misconduct • Intentionally means that the intent was to perform the act • For example, copying a paragraph without realizing that it is plagiarism is still intentional • It makes no difference if the individual doesn’t realize that the action represents misconduct • Ignorance is not an excuse

  16. Each federal agency has a separate office that monitors research integrity • NIH • Office of Research Integrity (ORI) • http://ori.hhs.gov • NSF • Office of the Inspector General (OIG) • http://www.oig.nsf.gov

  17. Research misconduct is a minimal standard • The responsibility to avoid misconduct in research is a minimum standard for the responsible conduct of research • The fact that most researchers do not engage in research misconduct does not mean that the level of integrity in research overall is high. This is where issues of “scientific rigor” come in.

  18. Scientific rigor • The Reproducibility Crisis • Identifying factors/practices that contribute to lack of reproducibility • Defining a core set of standards for rigorous study design • Identifying and coping with perverse incentives

  19. The “Reproducibility Crisis” • Multiple reports from different fields that published findings can’t be replicated. • High profile replication attempts have been unable to reproduce most studies that were examined. • Congress has noticed and is expressing concerns. • NIH has modified its requirements for proposals by requiring four new sections to address rigor and is requiring training in scientific rigor for training grants. • Journals have changed and are continuing to change their submission requirements and review practices.

  20. The Reproducibility Crisis • “Why Most Published Research Findings are False” has been cited >4000 times (PLoS Med 2:e124, 2005). • Empirical attempts to replicate published studies in various fields find that a large percentage of findings cannot be replicated, even by the original researchers (e.g., Science 349, 910-, 2015). • Prinz et al. (2011) Believe it or not: How much can we rely on published data on potential drug targets? Nature Reviews Drug Discovery. Inconsistencies were found in about two-thirds of the studies examined. • Begley and Ellis (2012) Raise standards for preclinical cancer research. Nature. Only 6/53 “landmark” papers replicated. • In neuroscience, there is published evidence that most animal studies lack sufficient statistical power to generate reliable conclusions (Nat. Rev. Neurosci. 14, 365-, 2013) and that the statistical analyses used in many fMRI studies result in false-positive rates of >50% (PNAS 113, 7900-, 2016). • It has been estimated that $28 billion per year in preclinical research funding is spent on irreproducible studies (PLoS Biology 13[6], 2015) and that 85% of research resources are wasted (The Lancet 383, 101-, 2014).

  21. The Reproducibility Crisis (continued) • The reproducibility problem has been widely covered in the general-interest press (e.g., The Economist, Oct. 19, 2013). • The NIH, the NSF, the National Academies, and some scientific societies have acknowledged the problem and have begun to advance solutions, which will necessarily involve universities, scientific societies, scholarly journals, and research funding agencies.

  22. Public awareness of lack of reproducibility has eroded faith in science. Nature, 543, March 30, 2017

  23. Terminology • Recommendations from the Federation of American Societies for Experimental Biology • Scientists, policy makers, and journalists should use precisely defined terms and definitions when discussing research rigor and transparency to promote understanding. • a. Replicability: the ability to duplicate (i.e., repeat) a prior result using the same source materials and methodologies. This term should only be used when referring to repeating the results of a specific experiment rather than an entire study. • b. Reproducibility: the ability to achieve similar or nearly identical results using comparable materials and methodologies. This term may be used when specific findings from a study are obtained by an independent group of researchers. • c. Generalizability: the ability to apply a specific result or finding more broadly across settings, systems, or other conditions. • d. Translatability: the ability to apply research discoveries from experimental models to human health applications • e. Rigor: the use of unbiased and stringent methodologies to analyze, interpret, and report experimental findings • f. Transparency: the reporting of experimental materials and methods in a manner that provides enough information for others to independently assess and/or reproduce experimental findings.

  24. Improving scientific rigor requires recognition and avoidance of practices that compromise rigor. What are these practices?

  25. Identifying factors/practices that contribute to lack of reproducibility • Begley (2013) Six red flags for suspect work. Nature. • -Were experiments performed blinded? • -Were basic experiments repeated? • -Were all the results presented? • -Were there positive and negative controls? • -Were reagents validated? • -Were statistical tests appropriate?

  26. Practices that compromise scientific rigor: • -Failure to delineate the starting and stopping points of an experiment • -Testing to a foregone conclusion • -Bias • -Lack of contemporaneous controls • -Improper data exclusion • -Improper statistics • -p-hacking • -insufficient statistical power • -Selective reporting • -Lack of self-replication • -Lack of transparency • -Publication bias • -Perverse incentives

  27. Practices that compromise scientific rigor: • -Failure to delineate the starting and stopping points of an experiment. Many studies involve pooling data from experiments done over time and then compiling groups at the end. This is especially problematic for interventions that take time to produce effects (e.g., animal models of neurological disorders). • -Testing to a foregone conclusion. This involves doing interim statistical analyses and increasing "n" until a significant effect is seen. This is related to the first because most studies in academic labs don’t have stopping rules (defined “n”). • -Failure to control for unintentional bias, including lack of randomization, lack of blinding, failure to use systematic “unbiased sampling techniques”. • -Failure to include contemporaneous controls. Compiling data from different groups or treatment conditions done at different times (hard to detect in published papers unless the authors tell you). • -Improper data exclusion.

  28. Practices that compromise scientific rigor (continued): • -Use of improper statistics (the most common example is using t-tests for multiple comparisons; see the sketch below). Rule of thumb: if there are more than 2 bars on a graph, the proper statistic is ANOVA. • -p-hacking: If one statistic doesn’t give a statistically significant difference, try another, and another... • -Lack of sufficient statistical power. • -Selective reporting. Failure to report the entire set of analyses in a particular study. • -Lack of self-replication. • -Lack of transparency: Failure to report methods completely and transparently, especially in terms of pooling data from experiments done at different times, randomization, group compilation. • -Publication bias for positive results. • -Perverse incentives
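
The multiple-comparisons point lends itself to a quick demonstration. Below is a minimal Python sketch (using scipy, with simulated data, so all numbers are illustrative) of why running separate pairwise t-tests across several groups inflates the familywise error rate, and how a single omnibus ANOVA avoids it:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
# Four groups drawn from the SAME population: any "significant" difference is a false positive.
groups = [rng.normal(loc=0.0, scale=1.0, size=10) for _ in range(4)]

# Improper: 6 separate pairwise t-tests, each at alpha = 0.05.
# Chance of at least one false positive is roughly 1 - 0.95**6, about 26%.
pairwise_p = [stats.ttest_ind(groups[i], groups[j]).pvalue
              for i in range(4) for j in range(i + 1, 4)]
print("smallest pairwise p:", min(pairwise_p))

# Proper: one omnibus ANOVA keeps the familywise error rate at alpha;
# follow up with corrected post-hoc tests only if it is significant.
f_stat, p_anova = stats.f_oneway(*groups)
print("ANOVA p:", p_anova)
```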

  29. P-hacking

  30. P-hacking: Some examples. • Trying different analyses/comparisons until you find one that gives a statistically significant difference. • There is a trend in the data, so additional experiments are done to increase “n” (testing to a foregone conclusion). • There is a trend in the data, so “outliers” are excluded so that statistical comparisons become significant. • Lack of transparency in reporting all analyses that were done (required for valid corrections for multiple analyses).
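
The "increase 'n' until significant" pattern above can be made concrete with a small simulation. The sketch below (Python with scipy; all parameters are assumptions chosen for illustration) draws two groups from the same population, peeks at the p-value after every added subject, and stops as soon as p < 0.05, showing how far the false-positive rate climbs above the nominal 5%:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)
n_sims, false_positives = 2000, 0
for _ in range(n_sims):
    # Two groups from the SAME population: there is no real effect to find.
    a = list(rng.normal(size=5))
    b = list(rng.normal(size=5))
    while len(a) < 50:                          # keep adding subjects up to n = 50
        if stats.ttest_ind(a, b).pvalue < 0.05:
            false_positives += 1                # "significant" -- stop and report
            break
        a.append(rng.normal())
        b.append(rng.normal())

print(f"False-positive rate with peeking: {false_positives / n_sims:.2f}")
# Well above the 0.05 that a fixed, pre-defined "n" would give.
```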

  31. Avoiding P-hacking: • Trying out different analyses/comparisons until you find one that gives a statistically significant difference. • Pre-plan statistical analyses with advice from a biostatistician. • Pre-identify primary vs. secondary measures • There is a trend in the data, so additional experiments are done to increase “n” (testing to a foregone conclusion). • Distinguish between pilot experiments vs. pre-planned analyses. • Define stopping rules. What constitutes an “experiment” for purposes of analysis? • If possible, repeat experiment rather than adding to existing data set. • There is a trend in the data, so “outliers” are excluded so that statistical comparisons become significant. • Prospectively define inclusion/exclusion criteria • Describe handling of outliers • Lack of transparency in reporting all analyses that were done. • Report all analyses including ones that are not shown. This is now a requirement in some journals. • To the extent possible, state that original data will be included in publications or made available on request for future meta-analyses.
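
For the "report all analyses" remedy, a standard tool is a multiple-comparison correction applied across every test actually performed, not just the one that worked. Below is a minimal sketch using statsmodels' multipletests with Holm's method; the p-values are hypothetical placeholders standing in for a study's full set of analyses:

```python
from statsmodels.stats.multitest import multipletests

# Every p-value from every analysis performed, including the ones "not shown".
all_p_values = [0.04, 0.20, 0.03, 0.65, 0.01]   # hypothetical placeholders
reject, p_adjusted, _, _ = multipletests(all_p_values, alpha=0.05, method="holm")
for raw, adj, sig in zip(all_p_values, p_adjusted, reject):
    print(f"raw p = {raw:.2f}   Holm-adjusted p = {adj:.2f}   significant: {sig}")
```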

  32. Delineating starting and stopping points. • Deciding on the endpoint of “assay development, exploratory observations, preliminary studies, pilot studies, etc.” and the starting point of a formal hypothesis testing study. Do you include data gathered during the preliminary study phase? • Power calculations to pre-define “n”. • Group formulation, repetition, technical vs. biological replicates, pre-planned statistical analysis to avoid “testing to a foregone conclusion.”

  33. Testing to a foregone conclusion • Preliminary/pilot studies often involve collecting some data and then running statistics to test/validate methods and approaches. • A common practice is to fail to distinguish between preliminary studies and a rigorous test of concept. • The problem is especially evident when studies are done over time, accruing data and compiling groups. • If you are doing an experiment, run a statistical analysis, and find that differences are “close”, so you continue to increase “n” to get statistical significance, this is testing to a foregone conclusion. • If you find yourself in this situation, the best practice is to repeat the study if feasible using the “pilot” to determine “n” required for adequate statistical power. • If this is impractical (time, cost, precious resource, etc.) one possible solution is to transparently report that you did an interim statistical analysis and then increased “n”.

  34. Rigorous study design to minimize bias “Bias is unintentional and unconscious. It is defined broadly as the systematic erroneous association of some characteristic with a group in a way that distorts a comparison with another group… The process of addressing bias involves making everything equal during the design, conduct and interpretation of a study, and reporting those steps in an explicit and transparent way.” (David F. Ransohoff, 2010. Sources of bias in specimens for research about molecular markers for cancer. J Clin Oncol 28:698-704.) It’s important to use specific techniques to minimize unintentional bias. Blinding and randomization are examples of such techniques, but other formal techniques are important for minimizing sampling bias (for example, stereology).

  35. Steps toward solutions: NINDS convened a workshop in June 2012. Consensus requirements from NINDS workshop: sample size estimation, whether and how animals were randomized, blinding, appropriate data handling (data inclusion, exclusion) and thorough and transparent reporting. Nature, 490, 187-191, 2012.

  36. Some recommendations for best practices for preclinical research in neuroscience Steward and Balice-Gordon, 2014, Neuron 84, 572-581 Latin scholars will note this should be “Rigor or Mort”.

  37. A core set of standards for rigorous study design • Randomization • Blinding • Allocation concealment • Blinded outcome assessment • Sample size determination (pre-experiment power calculations) • Data handling • Stopping rules, defining what constitutes an “experiment” for purposes of analysis • Prospective inclusion/exclusion criteria • Handling of outliers • Pre-identification of primary outcome measures

  38. Considerations for rigorous study design • Pre-experiment power calculations (endpoint sensitivity, variability, effect size, desired level of confidence, definition and rationale for n); see the sketch after this slide • Controls to reduce unrecognized bias in data collection • Random assignment to groups • Procedures to achieve blinding • Data handling and analyses • Positive and negative controls • Thorough and transparent reporting • Steward and Balice-Gordon, (2014) Neuron, 84, 572-581.
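
A pre-experiment power calculation of the kind listed above takes only a few lines. The sketch below uses statsmodels' TTestIndPower to solve for the per-group "n" needed for 80% power at alpha = 0.05; the effect size (Cohen's d = 0.8) is an assumed value that would normally come from pilot data:

```python
from statsmodels.stats.power import TTestIndPower

# Effect size (Cohen's d) is an assumption; in practice, estimate it from pilot data.
n_per_group = TTestIndPower().solve_power(effect_size=0.8,
                                          alpha=0.05,
                                          power=0.80,
                                          alternative="two-sided")
print(f"Required n per group: {n_per_group:.1f}")   # ~25.5 -> plan for 26 per group
```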

  39. Considerations for rigorous study design • Experimental procedures to protect against unrecognized bias. • Is bias minimized through blinding, recoding, and systematic random sampling? (See the sketch after this slide.) • “Bias is unintentional and unconscious. It is defined broadly as the systematic erroneous association of some characteristic with a group in a way that distorts a comparison with another group… The process of addressing bias involves making everything equal during the design, conduct and interpretation of a study and reporting those steps in an explicit and transparent way” (Ransohoff and Gourlay, 2010). • Random assignment to groups. • Allocation concealment • Prospective inclusion/exclusion • Blinding • Blinded outcome assessment • Separation of data collection and analysis. 3rd-party data management • Re-coding data • Exceptions to blinding, and resulting interpretive caveats • Biases due to outcome expectation: Is the guiding philosophy to “test” a hypothesis or to “prove” a hypothesis? • Steward and Balice-Gordon, (2014) Neuron, 84, 572-581.
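
Random assignment, blinding codes, and allocation concealment can be operationalized with a simple coding workflow. The sketch below (plain Python; subject IDs and codes are hypothetical) randomizes subjects to groups and generates neutral blind codes, with the key held by a third party until analysis is complete:

```python
import random

random.seed(42)                          # fixed here only so the sketch is reproducible
subjects = [f"animal_{i:02d}" for i in range(1, 21)]   # hypothetical subject IDs
random.shuffle(subjects)                 # random assignment to groups
treated, control = subjects[:10], subjects[10:]

# Allocation concealment: a third party generates and holds this key;
# the experimenter and the analyst see only the neutral blind codes.
codes = [f"code_{j:02d}" for j in range(1, 21)]
random.shuffle(codes)
blind_key = dict(zip(treated + control, codes))

# What the analyst works with: codes only, no group labels.
print(sorted(blind_key.values())[:5])
```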

  40. Considerations for rigorous study design • Data analysis: • Plan statistical analysis and get consultation BEFORE collecting data. An important part of good statistical analysis is in experiment execution. • Understand corrections for multiple comparisons. If you are collecting different data sets from a single group, this constitutes multiple comparisons even if the data are from different analyses. • Explain how you will avoid “testing to a foregone conclusion”. • Red flag: collecting data, analyzing as you go, and continuing to increase “n” until differences are significant. This is the way pilot experiments are often done. The way to avoid the problem is to clearly distinguish between “pilot” experiments and the start of the definitive experiment. • Explain how you will avoid “p-hacking”. • Steward and Balice-Gordon, (2014) Neuron, 84, 572-581.

  41. Journal rules are changing: Failure to follow rigorous practices may mean that your hard work will not be publishable.

  42. Scientific Record Keeping Alan L. Goldin, M.D./Ph.D.

  43. Laboratory Notebooks • Bound, serially numbered pages • All entries should be dated • Permanent ink • Table of contents • Include the actual data, such as photographs, negatives, autoradiograms and printouts

  44. Data in Laboratory Notebooks • Original data should be included • Photographs, negatives and similar can be glued or taped • Other materials can be inserted in plastic sleeves (including CD or DVD) • Oversize material and magnetic media should be stored, with the location and coding scheme included in the lab book

  45. Laboratory Notebook Requirements Can be More Detailed • Data book paper should be acid-free • Bindings should be sewn or glued • Plastic comb, wire spiral, or ring binders are considered unacceptable • Data books may be inventoried • Master data book log • This policy applies in industry

  46. Best Practices • Schreier A.A., Wilson K., Resnik D. Academic Research Record-Keeping: Best Practices for Individuals, Group Leaders, and Institutions. Academic Medicine: Journal of the Association of American Medical Colleges. 2006;81(1):42-47.

  47. Policies in Industry • Only bound laboratory notebooks are acceptable • Entries must be countersigned weekly or more often • The rules are stricter because the notebooks may be used as evidence to gain patent protection

  48. How long to keep notebooks? • NIH policy mandates 3 years after the end of the project (grant funding period) • FDA policy mandates 10 years after use • Patent policy mandates 23 years after issue of the patent • When multiple policies apply, the longest retention period takes priority
