
Simple tables: Just one variable - univariate data (qualitative/quantitative)


melissam

Presentation Transcript


  1. Simple tables: • Just one variable: univariate data (qualitative/quantitative) • Listing the values with variable description • Preparing frequency tables/distributions

  2. Frequency distribution of a categorical/discrete variable • It is a table of the frequencies of the different values of a categorical or discrete variable. • Ex: The data below give the number of children per family: 2, 3, 2, 2, 0, 1, 2, 2, 0, 2, 1, 1, 2, 0, 2, 3, 1, 2, 2, 4, 12, 1, 1, 2, 1, 2, 1, 2, 2, 1, 0, 2, 1, 1, 2, 1, 2, 2, 1, 1, 2, 0, 3, 2, 1, 1, 2, 3, 2, 2, 1, 2, 0
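
As a rough illustration of how such a frequency table can be prepared, here is a minimal Python sketch. The function name `frequency_table` is hypothetical, and the values are copied as they appear in the transcript (including the 12), so the counts are only as reliable as that listing.

```python
from collections import Counter

# Values as they appear in the transcript above (the 12 is kept as-is).
children = [
    2, 3, 2, 2, 0, 1, 2, 2, 0, 2, 1, 1, 2, 0, 2, 3, 1, 2, 2, 4, 12, 1, 1, 2,
    1, 2, 1, 2, 2, 1, 0, 2, 1, 1, 2, 1, 2, 2, 1, 1, 2, 0, 3, 2, 1, 1, 2, 3,
    2, 2, 1, 2, 0,
]

def frequency_table(values):
    """Return (value, frequency, percent) rows sorted by value."""
    counts = Counter(values)
    n = len(values)
    return [(v, f, 100.0 * f / n) for v, f in sorted(counts.items())]

rows = frequency_table(children)
print(f"{'Children':>8} {'Families':>8} {'%':>6}")
for value, freq, pct in rows:
    print(f"{value:>8} {freq:>8} {pct:>6.1f}")
# The total row lets the reader cross-check the individual entries.
print(f"{'Total':>8} {sum(f for _, f, _ in rows):>8} {100.0:>6.1f}")
```

Showing the percentage next to each frequency follows the advice given later in the slides for categorical data.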

  3. Distribution of number of children per family

  4. A table should be neatly drawn. • It should have a title, a table number, and row and/or column headings. Row/column totals should be shown (depending on the problem). • A footnote can be added to give details of codes, the source of the data, and any other special features noted.

  5. Usually, when we prepare a frequency distribution for categorical data (contingency tables), we show the % values along with the frequencies. • When a cross tab is prepared, the relevant marginal totals (% values) should be shown. • The marginal totals can be used to cross-check the entries and the grand total.
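
The cross-check with marginal totals might look like the sketch below. The records (sex and smoking status) are hypothetical and made up purely for illustration; they are not from the presentation.

```python
from collections import Counter

# Hypothetical (sex, smoker) records, made up purely for illustration.
records = [
    ("F", "yes"), ("F", "no"), ("F", "no"), ("M", "yes"),
    ("M", "yes"), ("M", "no"), ("F", "no"), ("M", "no"),
]

cells = Counter(records)
row_labels = sorted({r for r, _ in records})
col_labels = sorted({c for _, c in records})

# Marginal totals: sum each row and each column of the cross tab.
row_totals = {r: sum(cells[(r, c)] for c in col_labels) for r in row_labels}
col_totals = {c: sum(cells[(r, c)] for r in row_labels) for c in col_labels}
grand_total = len(records)

# Cross-check: both sets of marginal totals must add up to the grand total.
assert sum(row_totals.values()) == grand_total
assert sum(col_totals.values()) == grand_total

for r in row_labels:
    counts = "  ".join(f"{c}={cells[(r, c)]}" for c in col_labels)
    pct = 100.0 * row_totals[r] / grand_total
    print(f"{r}: {counts}  total={row_totals[r]} ({pct:.1f}%)")
print("column totals:", col_totals, "grand total:", grand_total)
```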

  6. A frequency table/distribution of continuous data shows the distribution of the data in bins (class intervals), usually of equal width. This is a technique for summarising or smoothing data. • The following formulae can be used to decide the number of class intervals (bins): • Determine the range of the sample data: R = max − min. • Square root formula: k = √n, where n is the number of observations and k is the number of class intervals. • k = R/h, where R is the range and h is the suggested bin width; k is rounded to the nearest integer (e.g., for R = 43 and h = 5, k = 8.6, rounded to 9).
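
The two formulae above can be sketched in a few lines of Python. The function names `bins_sqrt` and `bins_from_width` are hypothetical, chosen here for illustration.

```python
import math

def bins_sqrt(n):
    """Square root formula: k = sqrt(n), rounded to the nearest integer."""
    return round(math.sqrt(n))

def bins_from_width(data_range, width):
    """k = R / h, rounded to the nearest integer."""
    return round(data_range / width)

print(bins_sqrt(100))          # sqrt(100) = 10
print(bins_from_width(43, 5))  # 43 / 5 = 8.6, nearest integer is 9
```

Note that Python's `round` sends exact halves to the nearest even integer, so a result such as 8.5 would become 8; in that case the choice between 8 and 9 bins is a judgment call anyway.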

  7. Sturges' formula: k = 1 + log₂ n, or k = 1 + 3.322 log₁₀ n. This formula works well for n > 30. For n ≤ 30, it fails to reflect any trend. It performs poorly if the data are non-normal. Ex: If n = 100, then k = 1 + 3.322 × 2 = 7.644 ≈ 8. After finding the number of bins, we determine the class width (bin width) using the formula w = R/k, where w is the bin width and R is the range. Usually we adjust w to a convenient round figure.
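
A small sketch of the n = 100 example, showing that the two forms of the formula agree to within rounding of the constant 3.322. The function names and the range value used for the width are assumptions made for illustration.

```python
import math

def sturges_log2(n):
    """k = 1 + log2(n), returned unrounded."""
    return 1 + math.log2(n)

def sturges_log10(n):
    """k = 1 + 3.322 * log10(n), returned unrounded."""
    return 1 + 3.322 * math.log10(n)

def bin_width(data_range, k):
    """w = R / k; in practice w is then adjusted to a convenient round figure."""
    return data_range / k

k = sturges_log10(100)       # 1 + 3.322 * 2 = 7.644
print(round(k, 3), round(k))

# Hypothetical range R = 43 (the value used in the R/h example), with k = 8.
print(bin_width(43, round(k)))
```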

  8. In a data set the minimum value is 18.7 and the maximum value is 68.8. If there are 180 observations, determine the number of classes and the class intervals using the different formulae. • Using the square root formula, k = √180 = 13.42 ≈ 13. Class width w = R/k = (68.8 − 18.7)/13 = 50.1/13 = 3.85 ≈ 4. Hence the class intervals are 18-22, 22-26, 26-30, 30-34, 34-38, 38-42, 42-46, 46-50, 50-54, 54-58, 58-62, 62-66, 66-70.
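
The square root calculation above can be reproduced with the following sketch. Starting the first class at the whole number just below the minimum and rounding the width up are assumptions made here so that the classes span the data; the slide itself only says the width is taken as 4.

```python
import math

n, minimum, maximum = 180, 18.7, 68.8

k = round(math.sqrt(n))             # 13.42 -> 13 classes
data_range = maximum - minimum      # R = 50.1
width = math.ceil(data_range / k)   # 3.85 -> 4
start = math.floor(minimum)         # first lower limit: 18

# k classes of equal width: (18, 22), (22, 26), ..., (66, 70)
classes = [(start + i * width, start + (i + 1) * width) for i in range(k)]

# The last upper limit must lie above the maximum observation.
assert classes[-1][1] > maximum
print(classes)
```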

  9. Sturges' formula: k = 1 + 3.322 × log₁₀ 180 = 1 + 7.49 = 8.49 ≈ 8, and w = 50.1/8 = 6.26, taken as 7 so that the eight classes cover the maximum value. The classes are: 18-25, 25-32, 32-39, 39-46, 46-53, 53-60, 60-67, 67-74. You can also consider inclusive classes: 18-25, 26-33, 34-41, 42-49, 50-57, 58-65, 66-73. Taking bin width w = 6 instead: k = 50.1/6 = 8.35, rounded up to 9. The classes are: 18-24, 24-30, 30-36, 36-42, 42-48, 48-54, 54-60, 60-66, 66-72. Know these concepts: inclusive and exclusive classes, class limits, class boundaries, frequency, cumulative frequency, relative frequency.
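
The class lists on this slide come in two styles: exclusive classes, where neighbouring classes share a limit and the upper limit is not included, and inclusive classes, where each class has its own distinct limits. A minimal sketch that generates both, with hypothetical function names:

```python
def exclusive_classes(start, width, max_value):
    """Classes that share limits: 18-25, 25-32, ... (upper limit excluded)."""
    classes = []
    lower = start
    while lower <= max_value:
        classes.append((lower, lower + width))
        lower += width
    return classes

def inclusive_classes(start, span, max_value):
    """Classes with distinct limits: 18-25, 26-33, ... (both limits included)."""
    classes = []
    lower = start
    while lower <= max_value:
        classes.append((lower, lower + span))
        lower += span + 1
    return classes

print(exclusive_classes(18, 7, 68.8))  # 8 classes, 18-25 up to 67-74
print(inclusive_classes(18, 7, 68.8))  # 7 classes, 18-25 up to 66-73
print(exclusive_classes(18, 6, 68.8))  # 9 classes, 18-24 up to 66-72
```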

  10. Age distribution of patients

  11. A rule of thumb for deciding the number of class intervals is to use not fewer than six and not more than 15 classes. With fewer than six classes there is too much summarisation, and with more than 15 there is not enough. • The number of class intervals (k) given by the different formulae need not be taken as final, only as a guidance value. The actual number of class intervals may be chosen around that guidance value. • When appropriate, we can select classes with class width 5 or 10 and use class limits that begin and end with multiples of 5 (or multiples of 10), e.g., 0-5, 5-10, 10-20, etc.
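
One way to apply this rule of thumb is to compute the guidance values and flag any that fall outside the 6 to 15 range. The sample size n = 400 below is hypothetical, as are the function names.

```python
import math

def candidate_bins(n):
    """Guidance values for k from the square root and Sturges formulae."""
    return {
        "square root": round(math.sqrt(n)),
        "Sturges": round(1 + math.log2(n)),
    }

def within_rule_of_thumb(k, low=6, high=15):
    """True if k lies in the suggested 6-15 range."""
    return low <= k <= high

for rule, k in candidate_bins(400).items():
    verdict = "within" if within_rule_of_thumb(k) else "outside"
    print(f"{rule}: k = {k} ({verdict} the 6-15 range)")
```

For n = 400 the square root formula suggests 20 classes, which the rule of thumb would trim, while Sturges' formula suggests 10.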

  12. Decide on the type of class intervals: inclusive or exclusive. • In some cases, the first or the last class interval may be an open interval (why?).
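
One common reason for an open first or last class is that a few extreme values would otherwise need several nearly empty classes. The sketch below tallies hypothetical ages (made up for illustration, not taken from the presentation) into exclusive classes with open-ended first and last classes, and also shows cumulative and relative frequency.

```python
import math

# Hypothetical ages, made up for illustration only.
ages = [19, 23, 25, 31, 34, 40, 41, 47, 52, 58, 63, 71, 88]

# Exclusive-type classes [lower, upper); the first and last are open-ended.
classes = [(-math.inf, 30), (30, 40), (40, 50), (50, 60), (60, math.inf)]

def label(lower, upper):
    """Readable class label, e.g. 'under 30', '30-40', '60 and over'."""
    if lower == -math.inf:
        return f"under {upper}"
    if upper == math.inf:
        return f"{lower} and over"
    return f"{lower}-{upper}"

# Frequency of each class: count the ages with lower <= age < upper.
freqs = [sum(lower <= a < upper for a in ages) for lower, upper in classes]

cumulative = 0
table = []
for (lower, upper), f in zip(classes, freqs):
    cumulative += f  # running total = cumulative frequency
    table.append((label(lower, upper), f, cumulative, f / len(ages)))

for name, f, cum, rel in table:
    print(f"{name:<12} {f:>3} {cum:>4} {rel:>6.3f}")
```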

  13. Reference books • You can download the full text of the following books: • Olive Jean Dunn, Virginia A. Clark: Basic Statistics: A Primer for the Biomedical Sciences, 4th Edition, John Wiley & Sons, 2009. • T.D.V. Swinscow, M.J. Campbell: Statistics at Square One, 10th Edition, BMJ Books, 2002. • Jennifer Peat, Belinda Barton: Medical Statistics: A Guide to Data Analysis and Critical Appraisal, Blackwell Publishing (BMJ Books), 2005. • Chap T. Le: Introductory Biostatistics, Wiley-Interscience, 2003.

  14. Bernard Rosner: Fundamentals of Biostatistics, 7th Edition, Brooks/Cole, 2011.
