INTRODUCTION TO BIOSTATISTICS
Dr. S. Shaffi Ahamed, Asst. Professor, Dept. of Family and Comm. Medicine, KKUH
This session covers:
• Origin and development of Biostatistics
• Definition of Statistics and Biostatistics
• Reasons to know about Biostatistics
• Types of data
• Graphical representation of data
• Frequency distribution of data
“Statistics is the science which deals with collection, classification and tabulation of numerical facts as the basis for explanation, description and comparison of phenomena.” - Lovitt
Origin and development of statistics in Medical Research
• In 1929, a huge paper on the application of statistics was published by Dunn in Physiology Journal.
• In 1937, 15 articles on statistical methods by Austin Bradford Hill were published in book form.
• In 1948, an RCT of streptomycin for pulmonary TB was published, in which Bradford Hill had a key influence.
• From 1952 to 1982, the use of statistics in medicine grew 8-fold.
C.R. Rao, Ronald Fisher, Karl Pearson, Douglas Altman, Gauss
“BIOSTATISTICS”
• (1) Statistics arising out of biological sciences, particularly from the fields of medicine and public health.
• (2) The methods used in dealing with statistics in the fields of medicine, biology and public health for planning and conducting investigations in these branches and analyzing the data which arise from them.
Reasons to know about biostatistics:
• Medicine is becoming increasingly quantitative.
• The planning, conduct and interpretation of much of medical research are becoming increasingly reliant on statistical methodology.
• Statistics pervades the medical literature.
Example: Evaluation of Penicillin (treatment A) vs Penicillin & Chloramphenicol (treatment B) for treating bacterial pneumonia in children < 2 yrs.
• What sample size is needed to demonstrate a significant difference between the two groups?
• Is treatment A better than treatment B, or vice versa?
• If so, how much better?
• What is the normal variation in clinical measurement (mild, moderate & severe)?
• How reliable and valid is the measurement (clinical & radiological)?
• What is the magnitude and effect of laboratory and technical error?
• How does one interpret abnormal values?
CLINICAL MEDICINE
• Documentation of medical history of diseases.
• Planning and conduct of clinical studies.
• Evaluating the merits of different procedures.
• Providing methods for the definition of “normal” and “abnormal”.
PREVENTIVE MEDICINE
• To provide the magnitude of any health problem in the community.
• To find out the basic factors underlying ill-health.
• To evaluate the health programs which were introduced in the community (success/failure).
• To introduce and promote health legislation.
WHAT DOES STATISTICS COVER?
• Planning
• Design
• Execution (data collection)
• Data processing
• Data analysis
• Presentation
• Interpretation
• Publication
HOW CAN A “BIOSTATISTICIAN” HELP?
• Design of study
• Sample size & power calculations
• Selection of sample and controls
• Designing a questionnaire
• Data management
• Choice of descriptive statistics & graphs
• Application of univariate and multivariate statistical analysis techniques
TYPES OF DATA
• QUALITATIVE DATA
• DISCRETE QUANTITATIVE
• CONTINUOUS QUANTITATIVE
QUALITATIVE
NOMINAL
Example:
Sex (M, F)
Exam result (P, F)
Blood group (A, B, O or AB)
Color of eyes (blue, green, brown, black)
ORDINAL
Example:
Response to treatment (poor, fair, good)
Severity of disease (mild, moderate, severe)
Income status (low, middle, high)
QUANTITATIVE (DISCRETE)
Example:
The no. of family members
The no. of heart beats
The no. of admissions in a day
QUANTITATIVE (CONTINUOUS)
Example: Height, Weight, Age, BP, Serum Cholesterol and BMI
Discrete data -- gaps between possible values (e.g. number of children)
Continuous data -- theoretically, no gaps between possible values (e.g. Hb)
CONTINUOUS DATA can be grouped into DISCRETE DATA
wt. (in kg): under wt., normal & over wt.
Ht. (in cm): short, medium & tall
Table 1. Distribution of blunt injured patients according to hospital length of stay
Scale of measurement
Qualitative variable: a categorical variable
Nominal (classificatory) scale - gender, marital status, race
Ordinal (ranking) scale - severity scale, good/better/best
Scale of measurement
Quantitative variable: a numerical variable (discrete or continuous)
Interval scale: Data are placed in meaningful intervals and order. The units of measurement are arbitrary.
- Temperature: the differences 37 °C - 36 °C and 38 °C - 37 °C are equal, but there is no implication of ratio (30 °C is not twice as hot as 15 °C).
Ratio scale: Data are presented in a frequency distribution in logical order. A meaningful ratio exists.
- Age, weight, height, pulse rate
- A pulse rate of 120 is twice as fast as 60.
- A person weighing 80 kg is twice as heavy as one weighing 40 kg.
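The contrast between the interval and ratio scales can be checked numerically. The sketch below is an illustration only and is not part of the original slides: the conversions to Fahrenheit, Kelvin and pounds (using the approximate factor 2.20462 lb per kg) are assumptions introduced to show that differences survive a change of unit on an interval scale, while ratios survive only on a ratio scale.

```python
import math

def celsius_to_fahrenheit(c):
    return c * 9 / 5 + 32

def celsius_to_kelvin(c):
    return c + 273.15

def kg_to_lb(kg):
    # Approximate conversion factor, used here only for illustration.
    return kg * 2.20462

# Interval scale: equal differences remain equal after changing the unit.
diff_a = celsius_to_fahrenheit(37) - celsius_to_fahrenheit(36)
diff_b = celsius_to_fahrenheit(38) - celsius_to_fahrenheit(37)
equal_differences = math.isclose(diff_a, diff_b)

# Interval scale: the ratio depends on the arbitrary zero point,
# so "30 is twice 15" holds in Celsius but not in Kelvin.
ratio_celsius = 30 / 15
ratio_kelvin = celsius_to_kelvin(30) / celsius_to_kelvin(15)

# Ratio scale: the ratio is the same whatever unit is used.
ratio_kg = 80 / 40
ratio_lb = kg_to_lb(80) / kg_to_lb(40)

print(equal_differences, ratio_celsius, round(ratio_kelvin, 3), ratio_kg, round(ratio_lb, 3))
```

The Celsius ratio of 2 shrinks to about 1.05 in Kelvin, whereas the weight ratio stays at 2 in both kilograms and pounds.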
Scales of Measure
• Nominal - qualitative classification of equal value (gender, race, color, city)
• Ordinal - qualitative classification which can be rank ordered (socioeconomic status of families)
• Interval - numerical or quantitative data which can be rank ordered and whose sizes can be compared (temperature)
• Ratio - quantitative interval data along with a meaningful ratio (time, age)
Frequency Distributions
• data distribution: pattern of variability
• the center of a distribution
• the ranges
• the shapes
• simple frequency distributions
• grouped frequency distributions
• midpoint
Tabulate the hemoglobin values of 30 adult male patients listed below
Steps for making a table
Step 1: Find the minimum (9.1) and maximum (15.7)
Step 2: Calculate the difference: 15.7 - 9.1 = 6.6
Step 3: Decide the number and width of the classes (7 class intervals): 9.0 - 9.9, 10.0 - 10.9, ...
Step 4: Prepare a dummy table with columns Hb (g/dl), Tally marks, No. of patients
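The steps above can be sketched in code. This is a minimal illustration, assuming a class width of 1 g/dl as on the slide. The function names and the five sample values are hypothetical, since the original list of 30 hemoglobin values is not reproduced here. Only the minimum (9.1) and maximum (15.7) come from the slide.

```python
import math

def class_lower_limits(minimum, maximum, width=1):
    """Steps 1-3: lower limits of the classes covering the data range."""
    lower = math.floor(minimum)
    limits = []
    while lower <= maximum:
        limits.append(lower)
        lower += width
    return limits

def class_label(lower, width=1):
    """Format a class as on the slide, e.g. '9.0 - 9.9'."""
    return f"{lower:.1f} - {lower + width - 0.1:.1f}"

def frequency_table(values, lower_limits, width=1):
    """Step 4 filled in: count the values falling in each class."""
    counts = [0] * len(lower_limits)
    for v in values:
        for i, lower in enumerate(lower_limits):
            if lower <= v < lower + width:
                counts[i] += 1
                break
    return counts

# Hypothetical sample values, for illustration only (not the slide's data).
sample_values = [9.1, 10.4, 12.3, 12.8, 15.7]

data_range = round(max(sample_values) - min(sample_values), 1)
limits = class_lower_limits(min(sample_values), max(sample_values))
labels = [class_label(lower) for lower in limits]
counts = frequency_table(sample_values, limits)

for label, count in zip(labels, counts):
    print(f"{label:<12} {'l' * count:<6} {count}")
print(f"{'Total':<12} {'':<6} {sum(counts)}")
```

With a minimum of 9.1 and a maximum of 15.7 this yields the same seven classes as the slide, from 9.0 - 9.9 to 15.0 - 15.9.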
DUMMY TABLE
Hb (g/dl)      Tally marks      No. of patients
9.0 - 9.9
10.0 - 10.9
11.0 - 11.9
12.0 - 12.9
13.0 - 13.9
14.0 - 14.9
15.0 - 15.9
Total

TALLY MARKS TABLE
Hb (g/dl)      Tally marks      No. of patients
9.0 - 9.9      l                1
10.0 - 10.9    lll              3
11.0 - 11.9    lllll l          6
12.0 - 12.9    lllll lllll      10
13.0 - 13.9    lllll            5
14.0 - 14.9    lll              3
15.0 - 15.9    ll               2
Total                           30
Table. Frequency distribution of 30 adult male patients by Hb
Hb (g/dl)      No. of patients
9.0 - 9.9      1
10.0 - 10.9    3
11.0 - 11.9    6
12.0 - 12.9    10
13.0 - 13.9    5
14.0 - 14.9    3
15.0 - 15.9    2
Total          30
Table. Frequency distribution of adult patients by Hb and gender
Hb (g/dl)      Male    Female    Total
<9.0           0       2         2
9.0 - 9.9      1       3         4
10.0 - 10.9    3       5         8
11.0 - 11.9    6       8         14
12.0 - 12.9    10      6         16
13.0 - 13.9    5       4         9
14.0 - 14.9    3       2         5
15.0 - 15.9    2       0         2
Total          30      30        60
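The row and column totals of a two-way table like this one can be checked mechanically. The short sketch below uses the male and female counts from the table. The variable names are illustrative and not from the slides.

```python
hb_classes = ["<9.0", "9.0 - 9.9", "10.0 - 10.9", "11.0 - 11.9",
              "12.0 - 12.9", "13.0 - 13.9", "14.0 - 14.9", "15.0 - 15.9"]
male = [0, 1, 3, 6, 10, 5, 3, 2]
female = [2, 3, 5, 8, 6, 4, 2, 0]

# Row totals: patients of both genders in each Hb class.
row_totals = [m + f for m, f in zip(male, female)]

# Column totals: patients of each gender across all classes.
column_totals = (sum(male), sum(female))

# The grand total must agree whether summed by rows or by columns.
grand_total = sum(row_totals)

for hb, m, f, t in zip(hb_classes, male, female, row_totals):
    print(f"{hb:<12} {m:>4} {f:>6} {t:>5}")
print(f"{'Total':<12} {column_totals[0]:>4} {column_totals[1]:>6} {grand_total:>5}")
```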
Elements of a Table
An ideal table should have: Number, Title, Column headings, Foot-notes
• Number - table number for identification in a report
• Title, place - describe the body of the table, variables, time period (what, how classified, where and when)
• Column heading - variable name, No., percentages (%), etc.
• Foot-note(s) - to describe some column/row headings, special cells, source, etc.
Table II. Distribution of 120 (Madras) Corporation divisions according to annual death rate based on registered deaths in 1975 and 1976
Figures in parentheses indicate percentages.
DIAGRAMS/GRAPHS
Discrete data
--- Bar charts (one or two groups)
Continuous data
--- Histogram
--- Frequency polygon (curve)
--- Stem-and-leaf plot
--- Box-and-whisker plot
Example data
68 63 42 27 30 36 28 32
79 27 22 28 24 25 44 65
43 25 74 51 36 42 28 31
28 25 45 12 57 51 12 32
49 38 42 27 31 50 38 21
16 24 64 47 23 22 43 27
49 28 23 19 11 52 46 31
30 43 49 12
Histogram
Figure 1. Histogram of ages of 60 subjects
Stem and leaf plot
Stem-and-leaf of Age   N = 60
Leaf Unit = 1.0
   6   1   122269
  19   2   1223344555777788888
 (11)  3   00111226688
  13   4   2223334567999
   5   5   01127
   4   6   3458
   2   7   49
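A stem-and-leaf plot of this kind can be rebuilt from the 60 example ages. The sketch below is one possible way to do it, assuming two-digit values with the tens digit as the stem and the units digit as the leaf. The function name is illustrative, and the first printed column is simply the number of leaves in each row.

```python
from collections import defaultdict

ages = [
    68, 63, 42, 27, 30, 36, 28, 32,
    79, 27, 22, 28, 24, 25, 44, 65,
    43, 25, 74, 51, 36, 42, 28, 31,
    28, 25, 45, 12, 57, 51, 12, 32,
    49, 38, 42, 27, 31, 50, 38, 21,
    16, 24, 64, 47, 23, 22, 43, 27,
    49, 28, 23, 19, 11, 52, 46, 31,
    30, 43, 49, 12,
]

def stem_and_leaf(values):
    """Map each stem (tens digit) to its sorted leaves (units digits) as a string."""
    stems = defaultdict(list)
    for v in sorted(values):
        stem, leaf = divmod(v, 10)
        stems[stem].append(leaf)
    return {stem: "".join(str(leaf) for leaf in leaves)
            for stem, leaves in sorted(stems.items())}

plot = stem_and_leaf(ages)
for stem, leaves in plot.items():
    print(f"{len(leaves):>4}  {stem}  {leaves}")
```

Because the leaves are kept in sorted order, the plot shows the shape of the distribution like a histogram while still preserving every individual value.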
Descriptive statistics report: Boxplot
• minimum score
• maximum score
• lower quartile
• upper quartile
• median
• mean
• the skew of the distribution:
  - positive skew: mean > median and the high-score whisker is longer
  - negative skew: mean < median and the low-score whisker is longer
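The quantities listed for a boxplot can be computed directly. The sketch below applies Python's standard `statistics` module to the 60 example ages. Note that quartile values depend on the interpolation method chosen, so the default method of `statistics.quantiles` is an assumption here and other software may report slightly different quartiles.

```python
import statistics

ages = [
    68, 63, 42, 27, 30, 36, 28, 32,
    79, 27, 22, 28, 24, 25, 44, 65,
    43, 25, 74, 51, 36, 42, 28, 31,
    28, 25, 45, 12, 57, 51, 12, 32,
    49, 38, 42, 27, 31, 50, 38, 21,
    16, 24, 64, 47, 23, 22, 43, 27,
    49, 28, 23, 19, 11, 52, 46, 31,
    30, 43, 49, 12,
]

minimum = min(ages)
maximum = max(ages)
median = statistics.median(ages)
mean = statistics.mean(ages)

# Three cut points dividing the data into four equal parts.
lower_quartile, _, upper_quartile = statistics.quantiles(ages, n=4)

# Direction of skew from the comparison of mean and median.
if mean > median:
    skew = "positive"
elif mean < median:
    skew = "negative"
else:
    skew = "none"

print(minimum, lower_quartile, median, upper_quartile, maximum, mean, skew)
```

For these ages the mean (36.45) exceeds the median (31.5), which indicates a positive skew.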
Pie Chart
• Circular diagram: the total is 100%
• Divided into segments, each representing a category
• Decide which categories should be adjacent
• The size of each slice is proportional to the amount for its category
The prevalence of different degrees of hypertension in the population
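Since each slice is proportional to its category's share of the total, the slice angle is the share multiplied by 360 degrees. The sketch below illustrates this. The category names and counts are hypothetical values made up for the example and are not data from the slides.

```python
import math

# Hypothetical counts, for illustration only (not from the slides).
counts = {"Normal": 50, "Mild": 30, "Moderate": 15, "Severe": 5}

total = sum(counts.values())

# Percentage share and slice angle (in degrees) for each category.
percentages = {name: 100 * n / total for name, n in counts.items()}
angles = {name: 360 * n / total for name, n in counts.items()}

for name in counts:
    print(f"{name:<10} {percentages[name]:>5.1f}%  {angles[name]:>6.1f} degrees")
```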
Bar Graphs
The height of each bar indicates frequency.
Frequency is on the Y axis and the categories of the variable are on the X axis.
The bars should be of equal width and should not touch each other.
The distribution of risk factors among cases with cardiovascular diseases
Bar chart: HIV cases enrolment in USA by gender
Stacked bar chart: HIV cases enrolment in USA by gender
Graphic Presentation of Data
the frequency polygon (quantitative data)
the histogram (quantitative data)
the bar graph (qualitative data)