Estimation of Authenticity in Statistical Research II: Biostatistics

Estimation of authenticity of results of statistical research (part II)

Biostatistics: Commonly, the word statistics means the arranging of data into charts, tables, and graphs, along with the computation of various descriptive numbers about the data. This is one part of statistics, called descriptive statistics, but it is not the most important part.

Why Do Statistics? Statistics lets us extrapolate from the data collected to make general conclusions about the larger population from which the data sample was derived. It allows general conclusions to be made from limited amounts of data. To do this we must assume that all data are randomly sampled from an infinitely large population, then analyse this sample and use the results to make inferences about the population.

The most important part: The most important part is concerned with reasoning in an environment where one doesn’t know, or can’t know, all of the facts needed to reach conclusions with complete certainty. One deals with judgments and decisions in situations of incomplete information. In this introduction we will give an overview of statistics along with an outline of the various topics in this course.

Basic criteria of authenticity (representation): the error of representation (m); confidence limits; the coefficient of authenticity (Student's criterion, t), which assesses the authenticity of the difference between mean or relative values.

Basic criteria of authenticity (representation): The error of representation (m) measures the degree of authenticity of an average or relative value. It shows how much the results of a sample study differ from the results that would be obtained from a complete study of the general population.
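As a rough illustration of the error of representation, the sketch below uses the standard textbook formulas, which the slides do not spell out and which are therefore assumed here: m = s/√n for a mean and m = √(p(100 − p)/n) for a relative value given in percent. The function names and the sample data are hypothetical.

```python
import math
import statistics

def error_of_mean(values):
    """Error of representation of a sample mean: m = s / sqrt(n)."""
    s = statistics.stdev(values)  # sample standard deviation (n - 1 in the denominator)
    return s / math.sqrt(len(values))

def error_of_percent(p_percent, n):
    """Error of representation of a relative value expressed in percent."""
    q_percent = 100.0 - p_percent
    return math.sqrt(p_percent * q_percent / n)

pulse = [72, 75, 68, 80, 77, 70, 74, 76]  # hypothetical sample of pulse rates
m_mean = error_of_mean(pulse)
m_pct = error_of_percent(20.0, 400)       # hypothetical: 20% observed among 400 subjects

print(round(m_mean, 2), round(m_pct, 2))
```

The larger the sample, the smaller m becomes, which is why a bigger sample represents the general population more closely.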

Basic criteria of authenticity (representation): Confidence limits allow the properties of the sample to be carried over to the general population. They show the range within which the index may vary in the general population, that is, the minimum and maximum values between which the population value can lie with a given probability.
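One common way to compute such limits is estimate ± t·m, where m is the error of representation. The sketch below assumes a large sample, so the multiplier t is taken from the normal distribution rather than from a Student's t table; the function name and the input values are hypothetical.

```python
from statistics import NormalDist

def confidence_limits(estimate, m, probability=0.95):
    """Return (lower, upper) limits computed as estimate ± t * m.

    Assumption: the sample is large, so t comes from the normal distribution.
    """
    t = NormalDist().inv_cdf(0.5 + probability / 2)  # about 1.96 for 95%
    return estimate - t * m, estimate + t * m

# Hypothetical sample result: 20% with an error of representation of 2%
low, high = confidence_limits(20.0, 2.0)
print(round(low, 2), round(high, 2))
```

For small samples the multiplier should instead be read from a Student's t table for the appropriate degrees of freedom.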

Basic criteria of authenticity (representation): The coefficient of authenticity (Student's criterion, t) measures the authenticity of the difference between mean or relative values. Student's criterion assesses the difference between the corresponding indices in two separate samples.
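A minimal sketch of this comparison, assuming the usual formula for two independent samples, t = |M1 − M2| / √(m1² + m2²), which the slides do not state explicitly. The rule of thumb that t ≥ 2 corresponds to roughly 95% probability holds only for large samples; all names and numbers below are hypothetical.

```python
import math

def student_criterion(value1, m1, value2, m2):
    """Coefficient of authenticity for the difference of two independent samples."""
    return abs(value1 - value2) / math.sqrt(m1 ** 2 + m2 ** 2)

# Hypothetical means and their errors of representation in two groups
t = student_criterion(74.0, 1.5, 68.0, 2.0)
is_authentic = t >= 2  # large-sample rule of thumb, an assumption
print(round(t, 2), is_authentic)
```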

Measuring the Occurrence of Disease: Counting (cases and populations; measurement; risk). Comparisons (methods: descriptive, analytic). Inference (association and causality; generalisability). Action (clinical/health policy; further research).

Descriptive Statistics: concerned with summarising or describing a sample, e.g. mean, median. Inferential Statistics: concerned with generalising from a sample to make estimates and inferences about a wider population, e.g. t-test, chi-square test.

Meaning of P: The P value is the probability of observing a result as extreme as, or more extreme than, the one actually observed, by chance alone if the null hypothesis is true. It lets us decide whether to reject or fail to reject the null hypothesis. P > 0.05: not significant. P = 0.01 to 0.05: significant. P = 0.001 to 0.01: very significant. P < 0.001: extremely significant.
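The bands above can be written as a small lookup. The slide does not say which band the boundary values 0.01 and 0.001 belong to, so the assignment below is an assumption, and the function name is hypothetical.

```python
def describe_p(p):
    """Map a P value to the verbal labels used on the slide."""
    if not 0.0 <= p <= 1.0:
        raise ValueError("a P value must lie between 0 and 1")
    if p < 0.001:
        return "extremely significant"
    if p < 0.01:   # assumption: 0.001 itself counts as "very significant"
        return "very significant"
    if p <= 0.05:  # assumption: 0.01 itself counts as "significant"
        return "significant"
    return "not significant"

for p in (0.2, 0.03, 0.004, 0.0002):
    print(p, describe_p(p))
```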

Epidemiological Measurements: rates, ratios, and proportions; incidence rates; prevalence rates; mortality rates; fatality rates; infection rates.

Ratios: A ratio expresses the relationship between two numbers in the form x:y or x/y.

Ratios: The ratio of male to female births in the United States in 1979 was 1,791,000 : 1,703,000, or 1.052:1. $\text{Sex ratio} = \frac{\text{number of live-born males}}{\text{number of live-born females}}$

Proportions: A proportion is a specific type of ratio in which the numerator is included in the denominator, and the resulting value is expressed as a percentage. For example, the proportion of all births that were male is: $\frac{\text{male births}}{\text{male} + \text{female births}} = \frac{179 \times 10^4}{(179 + 170) \times 10^4} = 51.3\%$
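The difference between a ratio and a proportion can be checked with the 1979 birth counts quoted above. This is only a sketch; the function names are hypothetical.

```python
def sex_ratio(males, females):
    """Ratio: the numerator is NOT part of the denominator."""
    return males / females

def proportion_percent(part, rest):
    """Proportion: the numerator IS included in the denominator."""
    return part / (part + rest) * 100

males, females = 1_791_000, 1_703_000  # U.S. live births, 1979 (from the slide)
ratio = sex_ratio(males, females)
male_share = proportion_percent(males, females)
print(f"{ratio:.3f}:1", f"{male_share:.1f}%")
```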

The proportion of male students of the current class is %.

Proportion of overweight children aged 7-18 years, Urumqi, 2003

Rates: A rate measures the occurrence of a particular event in a population during a given time period. The particular event is, for example, the development of disease or the occurrence of death.

Rates are defined as follows: $\text{Rate} = \frac{\text{number of events in a specified period}}{\text{population at risk of these events in the specified period}} \times K$, where K = 100%, 1000‰, ...
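The definition translates directly into code. In this sketch the function name and the counts are hypothetical; only the formula comes from the slide.

```python
def rate(events, population_at_risk, k=1000):
    """Events in a period divided by the population at risk, times the multiplier K."""
    if population_at_risk <= 0:
        raise ValueError("population at risk must be positive")
    return events / population_at_risk * k

# Hypothetical: 50 events among 25,000 people at risk during one year
per_thousand = rate(50, 25_000, k=1000)
per_hundred_thousand = rate(50, 25_000, k=100_000)
print(per_thousand, per_hundred_thousand)
```

The multiplier K only changes the unit in which the rate is reported; it is usually chosen so that the result is a convenient number to read.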

Five components of rate

Rate is: The rate is the measure that most clearly expresses the probability or risk of disease in a defined population over a specified period of time. In a rate, the numerator is part of the denominator.

What does a rate tell us? Rates tell us how fast the disease is occurring in a population. Proportions tell us what fraction of the population is affected.

For example, the death rate from cancer in the United States in 1980 was 186.3 per 100,000 population. The formula: $\frac{\text{deaths from cancer among U.S. residents in 1980}}{\text{U.S. population in 1980}} \times 100{,}000$

Incidence Rates: Incidence is defined as the number of new cases of a disease that occur during a specified period of time in a population at risk for developing the disease.

1. Time of onset and the numerator

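3. Specification of the denominator: The denominator is the population at risk, taken as the average population. We can get this number in two ways: (a) (population on 31 December of last year + population on 31 December of this year) / 2; (b) the midyear population, counted at 24:00 on 30 June (that is, 00:00 on 1 July).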
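A short sketch of the first method, using the average of the two year-end populations as the denominator of an incidence rate. The counts and function names are hypothetical.

```python
def average_population(end_of_last_year, end_of_this_year):
    """Average population: mean of the two year-end counts."""
    return (end_of_last_year + end_of_this_year) / 2

def incidence_rate(new_cases, population, k=1000):
    """New cases during the period per K persons at risk."""
    return new_cases / population * k

# Hypothetical town: 98,000 residents at the end of last year, 102,000 at the end of this year
avg_pop = average_population(98_000, 102_000)
incidence = incidence_rate(new_cases=250, population=avg_pop, k=1000)
print(avg_pop, incidence)
```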

Prevalence Rates: Prevalence measures the number of people in a population who have the disease at a given time. Two forms are distinguished: point prevalence and period prevalence.

Formula: $\text{Prevalence} = \frac{\text{number of existing cases of a disease at a point in time}}{\text{total population}} \times K$

5 points. 1. Numerator: It refers to existing cases, that is, everyone currently affected, including both new and old cases. No matter when a person developed the disease, anyone who has the disease at the time of the study is counted in the numerator.

2. Denominator: The total population, not the population at risk.

3. A point in time: In a prevalence survey, the time period should be very short. Generally it should be no more than 1 month, for example 1 week or 2 weeks (point prevalence).
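The points above can be sketched as a count of everyone who has the disease on the survey date, regardless of onset date, divided by the total population. The record layout (onset date, recovery date or None), the function name, and the data are hypothetical.

```python
from datetime import date

def point_prevalence(cases, total_population, survey_date, k=100):
    """Existing cases on the survey date per K persons of the total population.

    Each case is a pair (onset_date, recovery_date); recovery_date is None
    while the person still has the disease.
    """
    existing = sum(
        1
        for onset, recovery in cases
        if onset <= survey_date and (recovery is None or recovery > survey_date)
    )
    return existing / total_population * k

cases = [
    (date(2001, 5, 1), None),               # old case, still ill: counted
    (date(2003, 3, 10), date(2003, 4, 1)),  # recovered before the survey: not counted
    (date(2003, 6, 1), None),               # new case, still ill: counted
    (date(2003, 9, 1), None),               # onset after the survey: not counted
]
prevalence = point_prevalence(cases, total_population=200, survey_date=date(2003, 7, 1))
print(prevalence)
```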

The coefficient of variation is a relative measure of variability: it is the ratio of the standard deviation to the arithmetic mean, expressed as a percentage.
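A minimal sketch of the calculation, assuming the sample standard deviation (n − 1 in the denominator) is intended; the data are hypothetical.

```python
import statistics

def coefficient_of_variation(values):
    """Standard deviation as a percentage of the arithmetic mean."""
    mean = statistics.mean(values)
    if mean == 0:
        raise ValueError("coefficient of variation is undefined for a zero mean")
    return statistics.stdev(values) / mean * 100

cv = coefficient_of_variation([4, 5, 6])  # mean 5, standard deviation 1
print(cv)
```

Because it is unit-free, the coefficient of variation allows the variability of indices measured in different units to be compared.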

Terms Used To Describe The Quality Of Measurements: Reliability is the variability between subjects divided by the sum of the between-subject variability and the measurement error. Validity refers to the extent to which a test or surrogate is measuring what we think it is measuring.

Measures Of Diagnostic Test Accuracy: Sensitivity is defined as the ability of the test to identify correctly those who have the disease. Specificity is defined as the ability of the test to identify correctly those who do not have the disease. Predictive values are important for assessing how useful a test will be in the clinical setting at the individual patient level. The positive predictive value is the probability of disease in a patient with a positive test. Conversely, the negative predictive value is the probability that the patient does not have the disease if the test result is negative. The likelihood ratio indicates how much a given diagnostic test result will raise or lower the odds of having a disease relative to the prior probability of disease.
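These measures all follow from the four cells of a 2×2 table of test result against true disease status. The sketch below uses the standard definitions; the function name and the counts are hypothetical.

```python
def diagnostic_accuracy(tp, fp, fn, tn):
    """Accuracy measures from a 2x2 table.

    tp: diseased, test positive      fp: not diseased, test positive
    fn: diseased, test negative      tn: not diseased, test negative
    """
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "ppv": tp / (tp + fp),  # probability of disease given a positive test
        "npv": tn / (tn + fn),  # probability of no disease given a negative test
        "lr_positive": sensitivity / (1 - specificity),
        "lr_negative": (1 - sensitivity) / specificity,
    }

# Hypothetical study: 100 diseased and 200 non-diseased subjects
result = diagnostic_accuracy(tp=90, fp=30, fn=10, tn=170)
for name, value in result.items():
    print(f"{name}: {value:.3f}")
```

Unlike sensitivity and specificity, the predictive values depend on how common the disease is in the group tested, so they do not transfer between populations with different prevalence.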


Expressions Used When Making Inferences About Data: Confidence intervals: the results of any study sample are an estimate of the true value in the entire population. The true value may actually be greater or less than what is observed. Type I error (alpha) is the probability of incorrectly concluding there is a statistically significant difference in the population when none exists. Type II error (beta) is the probability of incorrectly concluding that there is no statistically significant difference in a population when one exists. Power is a measure of the ability of a study to detect a true difference.
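The relationship between alpha, beta, and power (power = 1 − beta) can be illustrated with a simple case that the slides do not cover in detail: a two-sided z-test comparing two means with equal group sizes and a known standard deviation. That setting, the function name, and the numbers are all assumptions made for the sketch.

```python
from math import sqrt
from statistics import NormalDist

def power_two_sample_z(delta, sigma, n_per_group, alpha=0.05):
    """Power of a two-sided z-test for a true difference `delta` between two means."""
    norm = NormalDist()
    z_crit = norm.inv_cdf(1 - alpha / 2)             # rejection threshold set by alpha
    shift = delta / (sigma * sqrt(2 / n_per_group))  # true difference in standard-error units
    return norm.cdf(shift - z_crit) + norm.cdf(-shift - z_crit)

power = power_two_sample_z(delta=5.0, sigma=10.0, n_per_group=64)
beta = 1 - power  # Type II error probability
print(round(power, 3), round(beta, 3))
```

When the true difference is zero, the same function returns alpha, the Type I error probability; power rises as the sample size or the true difference grows.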

Multivariable Regression Methods: Multiple linear regression is used when the outcome is a continuous variable such as weight. For example, one could estimate the effect of a diet on weight after adjusting for the effect of confounders such as smoking status. Logistic regression is used when the outcome is binary, such as cure or no cure. Logistic regression can be used to estimate the effect of an exposure on a binary outcome after adjusting for confounders.

Survival Analysis: Kaplan-Meier analysis measures the ratio of surviving subjects (or those without an event) divided by the total number of subjects at risk for the event. Every time a subject has an event, the ratio is recalculated. These ratios are then used to generate a curve to graphically depict the probability of survival. Cox proportional hazards analysis is similar to the logistic regression method described above, with the added advantage that it accounts for time to a binary event in the outcome variable. Thus, one can account for variation in follow-up time among subjects.
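The recalculation described above can be sketched in a few lines: at each event time the running survival probability is multiplied by the fraction of at-risk subjects who survive that time. This is a minimal product-limit estimator in plain Python, not a replacement for a statistics package; the function name and follow-up data are hypothetical.

```python
def kaplan_meier(times, events):
    """Return [(time, survival_probability)] at each distinct event time.

    times:  follow-up time of each subject
    events: 1 if the event occurred at that time, 0 if the subject was censored
    """
    data = sorted(zip(times, events))
    n_at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        n_events = 0
        n_leaving = 0
        # Group all subjects who share the same follow-up time
        while i < len(data) and data[i][0] == t:
            n_events += data[i][1]
            n_leaving += 1
            i += 1
        if n_events > 0:
            survival *= (n_at_risk - n_events) / n_at_risk
            curve.append((t, survival))
        # Subjects with an event and censored subjects both leave the risk set
        n_at_risk -= n_leaving
    return curve

# Hypothetical follow-up in months; 0 marks a censored subject
curve = kaplan_meier(times=[2, 3, 3, 5, 8], events=[1, 1, 0, 1, 0])
for t, s in curve:
    print(t, round(s, 2))
```

Censored subjects contribute to the number at risk up to their last follow-up time, which is how the method handles unequal follow-up.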

Kaplan-Meier Survival Curves

Why Use Statistics?

Descriptive Statistics: identifies patterns in the data; identifies outliers; guides the choice of statistical test.
