Statistics for Social Sciences I (E563)

Presentation Transcript


  1. Statistics for Social Sciences I (E563)
     Prof. Sudip Ranjan Basu, Ph.D.
     25 September 2008

  2. Think about these bar diagrams… « A statistical tie » Lecture 2-Sudip R. Basu

  3. Measurement in Statistics
     • Concepts of measurement:
       • Measurement: a specific process of assigning numbers to a variable
         • Assignment by category (categorical/qualitative attributes)
         • Assignment by amount
         • Assignment of a person to a particular category of a variable
       • Validity:
         • the measure accurately reflects the concept it is meant to describe
         • applies to measurement by a particular scale or index
         • the scale actually measures what it is supposed to measure
         • Types: face validity, content validity, criterion validity, construct validity
       • Reliability:
         • consistency of the data collected
         • freedom from measurement error
         • Types: split-half reliability, test-retest reliability

  4. Forms of ‘variable’
     • Variables: concepts that vary, or change, from one observation to another in a sample or population
     • Measurement scales differ from one variable to another
     • Different statistical methods apply to quantitative and qualitative variables

  5. Scales of measurement
     • Qualitative variable, unordered (nominal scale):
       • Primary mode of transportation (bus, tram, bicycle, walk)
     • Qualitative variable, ordered (ordinal scale):
       • Involves a rank order or other ordering
       • Political philosophy (liberal, moderate, conservative)

  6. Quantitative aspects of ordinal data
     • Interval scale:
       • Class interval: an interval that indicates the space between two end points
       • Quantitative
       • Values vary in magnitude
     • Nominal scale:
       • Qualitative
       • Values vary in quality, not in quantity
     • Ordinal scale:
       • Mixed quantitative-qualitative
       • Values vary in quality, not in quantity
       • Each level has a greater or smaller magnitude than another
       • Can be treated as a numerical scale by assigning numerical scores to categories
       • Resembles an interval scale more than a nominal scale
       • Sensitivity analysis: check whether conclusions change under different score assignments

  7. Discrete and Continuous
     • Discrete: the possible values form a set of separate numbers, such as 0, 1, 2, …
       • The unit of measurement cannot be subdivided
       • Number of siblings
       • Number of visits to a physician last year
     • Categorical variables (nominal or ordinal) are discrete
     • Quantitative variables can be discrete (number of siblings) or continuous (age)
     • Continuous: the possible values form an infinite continuum of real numbers
       • Any real number between two values is possible
       • Height
       • Weight

  8. Summarize types of variables

  9. Describing data
     • Categorical data:
       • Frequency: headcounts or tallies giving the number of cases in a particular category, or the total number of cases measured (the number of observations)
       • Scores: numbers used to represent amounts or rankings
       • Relative frequency:
         • The proportion (number of observations in a category divided by the total number of observations) or percentage (proportion multiplied by 100) of the observations that fall in that category
         • The proportions sum to 1.00
       • Frequency distribution:
         • A tabulation that lists the possible values of a variable, together with the number of observations at each value
       • Relative frequency distribution:
         • A listing of the possible values together with their proportions or percentages
     • Quantitative data:
       • Frequency distribution:
         • Intervals of values in frequency distributions are usually of equal width
         • Intervals are mutually exclusive
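
The definitions on this slide can be sketched in a few lines. The example below is written in Python rather than STATA, and the transportation responses and ages are hypothetical data made up for illustration. It tabulates frequencies, proportions, and percentages for a categorical variable, then counts a quantitative variable in equal-width, mutually exclusive intervals.

```python
from collections import Counter

# Hypothetical sample: primary mode of transportation for 10 respondents.
modes = ["bus", "tram", "bus", "walk", "bicycle",
         "bus", "tram", "walk", "bus", "tram"]

def frequency_distribution(values):
    """Return {category: (frequency, proportion, percentage)}."""
    counts = Counter(values)
    n = len(values)
    return {cat: (k, k / n, 100 * k / n) for cat, k in counts.items()}

def equal_width_counts(values, start, width, n_bins):
    """Count values in n_bins mutually exclusive intervals [lo, lo + width)."""
    counts = [0] * n_bins
    for v in values:
        idx = int((v - start) // width)
        if 0 <= idx < n_bins:
            counts[idx] += 1
    return counts

dist = frequency_distribution(modes)
for cat, (freq, prop, pct) in sorted(dist.items()):
    print(f"{cat:8s} {freq:3d} {prop:5.2f} {pct:5.1f}%")

# Hypothetical ages, grouped into four intervals of width 10 starting at 20.
ages = [21, 25, 29, 30, 34, 38, 41, 47, 52, 59]
age_counts = equal_width_counts(ages, start=20, width=10, n_bins=4)
print(age_counts)
```

Half-open intervals of the form [lo, lo + width) are one way to guarantee that every observation falls in exactly one interval.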

  10. Bar graphs

  11. Comparing groups
      • Compare: same variable, different groups
      • Relative frequency distributions
      • Histograms
      • Stem-and-leaf plots

  12. Population and sample distribution
      • The sample distribution is a ‘blurry’ picture of the population distribution
      • As the sample size increases, the sample proportion in any interval gets closer to the true population proportion
      • Sample distribution → population distribution
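
A small simulation can make the ‘blurry picture’ idea concrete. The sketch below is in Python rather than STATA; the population proportion of 0.30, the sample sizes, and the seed are all assumptions chosen for illustration. It draws samples of increasing size and reports how far each sample proportion lies from the population value.

```python
import random

def sample_proportion(n, p, rng):
    """Proportion of n simulated observations that fall in the category."""
    hits = sum(1 for _ in range(n) if rng.random() < p)
    return hits / n

rng = random.Random(2008)  # fixed seed so the run is repeatable
TRUE_P = 0.30              # hypothetical population proportion

results = {n: sample_proportion(n, TRUE_P, rng)
           for n in (10, 100, 1000, 100000)}

for n, p_hat in results.items():
    error = abs(p_hat - TRUE_P)
    print(f"n = {n:6d}  sample proportion = {p_hat:.4f}  error = {error:.4f}")
```

The error does not have to shrink at every step, since each sample is random, but it tends to become small once the sample is large.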

  13. Shape of a distribution
      • Shapes of distributions differ:
        • Symmetric
        • Skewed

  14. SESSION 2 of Lecture 2

  15. Working with STATA
      stata@stata.com
      http://www.stata.com

  16. Getting started with STATA
      • Four windows open automatically when you click the STATA icon:
        • The most visible window is the Results Window, which shows the results of commands you have typed in the Command Window.
        • The Command Window sits below the Results Window and is where all your commands are typed.
        • The Review Window lists all commands that have been entered in the Command Window. When you click on a command in the Review Window, it is pasted into the Command Window.
        • The Variables Window lists all working variables in the file. When you click on a variable, it appears in the Command Window.

  17. STATA window

  18. Simple commands
      • The Data Editor allows you to enter, view, or edit your working data file. Caution: this window must be closed before you can run commands in STATA.
      • The Do-file Editor allows you to write, edit, and save STATA commands. STATA commands can be run from the Do-file Editor. These files are called do-files because they have the file extension .do.
      • Note: STATA treats lines that begin with an asterisk (*), and text between a pair of /* and */, as comments.

  19. Save-Close files
      • Open, save, and close data files using the icons at the top of the screen (File menu) or via commands in the Command Window.
      • A STATA dataset is saved in the .dta format.
      • You can use a separate programme called Stat Transfer to translate a dataset from its current format into STATA format.
      • Researchers prefer this programme for large datasets. It retains any variable or value labels from the original file.

  20. Help-Search
      • Memory: allocating more memory allows you to handle large datasets. For example, you can set a memory size of 20m with the following command in the Command Window:
        . set memory 20m
      • The Help and Search facilities in STATA let you look up any command. You can use the help command by typing help in the Command Window or by using the drop-down Help menu, which opens a separate window. You can also type findit followed by a keyword for more information.
      • If you do not know the STATA command name, you can use the Search facility from the drop-down Help menu. If you do know the name, for example describe, type:
        . help describe
      • STATA uses a simple command syntax. Almost all commands follow the structure:
        . command variable (variable variable…) , options

  21. Creating a new dataset
      • The easy way to create a dataset is to type the values for each variable in the Data Editor, in columns that STATA automatically calls var1, var2, etc. Thus, var1 contains the names of students, var2 their statistics competency, and so forth.
      • Rename and label:
        . rename var1 students
        . label variable students "Students in Statistics, 2008-2009"
      • After typing in the information, close the window and save the data, say as stat2.dta:
        . save stat2

  22. Working with Sample
      • Specifying subsets of the data: you can restrict a command to a subset of the data by adding an in or if qualifier. For example, to use only the 1st through 25th observations, type
        . list in 1/25
        . sort origin
        . list origin program in 1/25
      • The if qualifier also has broad applications; it selects observations based on specific variable values, such as
        . summarize if stat==1
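
For readers who want to see what the two qualifiers select, the logic can be mimicked outside STATA. The sketch below is Python, not STATA syntax; the five records and the helper names in_range and where are hypothetical, while the variable names origin, program, and stat follow the slide. The in qualifier picks observations by position (1-based, inclusive) and the if qualifier picks them by value.

```python
# Hypothetical records; the values are invented for illustration.
students = [
    {"origin": "Switzerland", "program": "PhD", "stat": 1},
    {"origin": "India", "program": "MA", "stat": 0},
    {"origin": "Brazil", "program": "MA", "stat": 1},
    {"origin": "France", "program": "PhD", "stat": 0},
    {"origin": "Kenya", "program": "MA", "stat": 1},
]

def in_range(rows, first, last):
    """Rows first..last, 1-based and inclusive, like `in first/last`."""
    return rows[first - 1:last]

def where(rows, condition):
    """Rows for which condition(row) is true, like an `if` qualifier."""
    return [row for row in rows if condition(row)]

# Analogue of: sort origin, then list origin program in 1/3
by_origin = sorted(students, key=lambda row: row["origin"])
first_three = in_range(by_origin, 1, 3)
for row in first_three:
    print(row["origin"], row["program"])

# Analogue of restricting a command with: if stat==1
stat_one = where(students, lambda row: row["stat"] == 1)
print(len(stat_one), "observations with stat == 1")
```

Sorting before selecting by position matters: the in qualifier refers to the current order of the data, so the same range can return different observations after a sort.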

  23. Describing data
      • Frequency tables and two-way cross-tabulations: tabulations work on categorical variables. Use the dataset stat to tabulate the categorical variable programme:
        . tabulate programme
      • To cross-tabulate programme by stat:
        . tabulate programme stat
      • To get column percentages, type
        . tabulate programme stat, column
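
The arithmetic behind a cross-tabulation with column percentages is simple enough to sketch by hand. The example below is Python rather than STATA, and the eight (programme, stat) pairs are hypothetical. It counts each cell, then divides each count by its column total, so that the percentages within each stat column add up to 100.

```python
from collections import Counter

# Hypothetical (programme, stat) observations, invented for illustration.
pairs = [("MA", 1), ("MA", 0), ("MA", 1), ("PhD", 1),
         ("PhD", 0), ("MA", 0), ("PhD", 1), ("MA", 1)]

def crosstab(observations):
    """Cell counts keyed by (row value, column value)."""
    return Counter(observations)

def column_percentages(cells):
    """Each cell count as a percentage of its column total."""
    col_totals = Counter()
    for (_, col), count in cells.items():
        col_totals[col] += count
    return {(row, col): 100 * count / col_totals[col]
            for (row, col), count in cells.items()}

cells = crosstab(pairs)
pct = column_percentages(cells)

for (row, col) in sorted(cells):
    print(f"programme={row:4s} stat={col}  n={cells[(row, col)]}  "
          f"column %={pct[(row, col)]:.1f}")
```

Column percentages answer the question "among observations with a given value of stat, how are they split across programmes?"; row percentages would condition on programme instead.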

  24. Data tabulation
      • Multiple tables and multi-way cross-tabulations: to produce one-way tables for several variables at once, type
        . tab1 origin programme stat
        . tab1 programme-education
      • To get multiple two-way tables, that is, cross-tabulations of every two-way combination of the listed variables, type
        . tab2 origin programme stat
      • To produce multi-way tables when you do not need percentages or statistical tests, type
        . table programme, contents(freq)
      • To produce a two-way frequency table or cross-tabulation, type
        . table origin programme, contents(freq)
      • To produce a more complicated table, type
        . table origin programme, contents(freq) by(stat)

  25. GRAPHS with STATA
      • To draw bar charts, type:
        . graph bar stat, over(programme) blabel(bar) bar(1, bcolor(gs10))
        . graph bar stat, over(programme) legend(label(1 "Frequency")) ytitle("Native Language Speakers") title("Bar diagram of native language speakers, E563") subtitle("by languages") note("Source: Statistics Class 1, SRBasu")
        . graph bar stat word, over(programme) blabel(bar) bar(1, bcolor(gs10)) bar(2, bcolor(gs7))
      • To draw horizontal bar charts, type:
        . graph hbar stat, over(programme) blabel(bar) bar(1, bcolor(gs10))
        . graph hbar stat word, over(programme) blabel(bar)

  26. Working with datasets
      See Week 2 web-course material
      • Assignment_1 Datasets:
        2) Week2_Students Profile
        3) Week2_World Socio-economic data

  27. Note: Week 3, 2 October
      • Descriptive Statistics
      • Measures of Central Tendency and Dispersion, Moments, Skewness, and Kurtosis
      • Readings:
        • AF-Chapter 3 (p.39-60)
        • MS-Chapter 4, MS-Chapter 5
      • Assignment: Assignment 2
      • Each student should turn in his or her own paper in hard copy to the teaching assistant at Rigot Office No. 31, or in class on Thursday 9 October (Week 4).
