Understanding Data Data and Analysis
Agenda • A Modicum of Probability & Statistics • Review in GBA 412 • The Interview Riddle • Understanding Data Types • A Preview of Things to Come …
Probability • What is a Probability? • Dependence and Conditioning • Expected Values and Covariances • Probability Densities and Distributions • Sample measures • Mean, Median & Mode • Standard Deviation and Variance • Correlation
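The sample measures listed above can be computed in a few lines. The following is a minimal sketch using Python's standard library; the `sales` and `prices` numbers are made up for illustration and the helper `pearson_correlation` is a hypothetical name, not part of the course material.

```python
import statistics

# Hypothetical data: weekly unit sales and the price charged each week.
sales = [12, 15, 15, 18, 20, 22, 24]
prices = [5.0, 4.5, 4.6, 4.0, 3.8, 3.5, 3.2]

mean_sales = statistics.mean(sales)      # arithmetic average
median_sales = statistics.median(sales)  # middle value when sorted
mode_sales = statistics.mode(sales)      # most frequent value
var_sales = statistics.variance(sales)   # sample variance (divides by n - 1)
std_sales = statistics.stdev(sales)      # sample standard deviation


def pearson_correlation(xs, ys):
    """Pearson correlation coefficient between two equal-length samples."""
    mean_x = statistics.mean(xs)
    mean_y = statistics.mean(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5


corr = pearson_correlation(sales, prices)

print("mean:", mean_sales, "median:", median_sales, "mode:", mode_sales)
print("variance:", round(var_sales, 3), "std dev:", round(std_sales, 3))
print("correlation(sales, prices):", round(corr, 3))
```

In this made-up sample, sales rise as prices fall, so the correlation comes out negative.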
Statistics • Central Limit Theorem • Hypothesis Testing • P Values • Confidence Intervals • Statistical Significance • Estimation • Point Estimation • Interval Estimation
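The Central Limit Theorem and interval estimation can be illustrated with a small simulation. This is a sketch under stated assumptions: the population is Uniform(0, 1), the sample size and number of repetitions are arbitrary choices, and the 95% interval uses the normal approximation (1.96 standard errors) rather than a t-based multiplier.

```python
import random
import statistics

random.seed(412)  # arbitrary seed so the run is repeatable

SAMPLE_SIZE = 30    # assumed size of each sample
NUM_SAMPLES = 2000  # assumed number of repeated samples


def sample_mean(n):
    """Mean of n independent draws from Uniform(0, 1)."""
    return statistics.mean([random.random() for _ in range(n)])


# Central Limit Theorem: means of repeated samples cluster around the
# population mean (0.5) with spread sigma / sqrt(n).
sample_means = [sample_mean(SAMPLE_SIZE) for _ in range(NUM_SAMPLES)]
center = statistics.mean(sample_means)
spread = statistics.stdev(sample_means)
theoretical_spread = (1 / 12) ** 0.5 / SAMPLE_SIZE ** 0.5

# Point and interval estimation from a single sample.
one_sample = [random.random() for _ in range(SAMPLE_SIZE)]
point_estimate = statistics.mean(one_sample)
standard_error = statistics.stdev(one_sample) / SAMPLE_SIZE ** 0.5
ci_low = point_estimate - 1.96 * standard_error
ci_high = point_estimate + 1.96 * standard_error

print("mean of sample means:", round(center, 4))
print("spread of sample means:", round(spread, 4),
      "theory:", round(theoretical_spread, 4))
print("point estimate:", round(point_estimate, 4))
print("approx. 95% interval:", (round(ci_low, 4), round(ci_high, 4)))
```

Although the underlying population is flat rather than bell-shaped, the sample means concentrate near 0.5 with a spread close to the theoretical value.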
The Interview Riddle Reading between the lines…
Solving the Riddle… • Frame the Problem • Construct a Model • Methods? • Data? • Analysis • Results
Takeaways… • Data needs to be constructed • Data isn’t obvious • Data isn’t always numerical • Absence of data is also data • Obtaining relevant data may need expertise • Data is a function of human endeavor
Data Types • Types based on “purpose” and “source” • Primary Data • Data obtained by you for your own particular needs • Examples: • Data from Lab Experiments • Survey conducted by you • Secondary Data • Data collected by someone else for some other purpose • Examples: • Census Data • Syndicated Data (Scanner/Stock Market)
Types of Data • Types based on “action” • Survey Data • Data based on what people said • Examples: • How likely are you to buy? • How satisfied are you with your purchase? • Behavioral Data • Data reflecting what people did • Examples: • Supermarket Scanner Data • Stock Market Data
Types of Data • Types based on “mathematical properties” • Four types: • Nominal Data • Ordinal Data • Interval (Scale) Data • Ratio Data • The data type often determines the type of models and analysis to be used
Nominal Data • Nominal data is data that can be used only for identification purposes • It has no mathematical meaning • Consequently, mathematical operators such as addition, subtraction, division etc. cannot be applied • Analysis tools are usually those that aggregate, count or otherwise describe such data. • Easiest data type to collect • Examples: • Social Security Numbers, Phone Numbers
Rank Order (Ordinal) Data • Data that reveals the order of a set of items • Apart from the order there is no additional mathematical information in the numbers • Again, arithmetic operations such as addition and subtraction do not apply; only comparisons of order are meaningful • Analysis tools include descriptive methods and certain specific causal techniques • Examples: • Business School Ranks • Race Positions
Interval Scale Data • Data that seeks to put information on a defined interval scale • The data has limited mathematical properties • Differences in the numeric values have meaning but ratios do not, i.e., addition and subtraction are meaningful but multiplication and division are not • Analysis tools include descriptive, relational and causal methods • Examples: • Satisfaction Survey Answers • Mutual Fund Ratings
Ratio Scale Data • The most flexible data type • The continuous nature of the data allows all mathematical manipulations • Analysis tools are varied and include numerous descriptive and causal methods • Probably the most difficult data to collect • Examples: • Stock Price • Income
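The four measurement types differ in which operations give a meaningful answer. The sketch below pairs each type with one operation that suits it; all values are invented for illustration, and treating a 1-5 satisfaction score as interval data follows the slides' own example.

```python
from collections import Counter
import statistics

# Nominal: labels only, so counting is the natural summary.
area_codes = ["585", "212", "585", "415"]  # hypothetical phone area codes
code_counts = Counter(area_codes)
most_common_code = code_counts.most_common(1)[0]  # (label, count)

# Ordinal: order is meaningful, gaps between ranks are not.
race_positions = {"Ann": 1, "Bo": 2, "Cy": 3}  # hypothetical finishers
finish_order = sorted(race_positions, key=race_positions.get)

# Interval: differences are meaningful, ratios are not.
satisfaction = [2, 4, 5, 3, 4]  # hypothetical answers on a 1-5 scale
score_gap = satisfaction[2] - satisfaction[0]
mean_satisfaction = statistics.mean(satisfaction)

# Ratio: differences and ratios are both meaningful.
incomes = [40000, 80000]  # hypothetical annual incomes
income_ratio = incomes[1] / incomes[0]

print("most common area code:", most_common_code)
print("finish order:", finish_order)
print("score gap:", score_gap, "mean satisfaction:", mean_satisfaction)
print("income ratio:", income_ratio)
```

Saying the second income is twice the first is meaningful; saying a satisfaction score of 4 is "twice as satisfied" as a score of 2 is not.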
Types of Data • Types based on “unit(s) of analysis” • Cross-sectional Data • Data in which each unit of analysis only occurs once • Examples: • Satisfaction Survey across people • Sales/Prices of Pepsi across Markets • Time Series Data • Data for one entity but across many time periods • Examples: • US GDP for the past 100 years • Dow over time • Panel Data • Combination of cross-sectional and time series data where many entities have data over many time periods • Examples: • Stock prices of various firms over time • Market shares of competitors in many markets over time
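One way to see how the three structures relate is to store panel data as a list of records and slice it. This is an illustrative sketch only: the firms, years, and prices are invented, and the field names `firm`, `year`, and `price` are assumptions.

```python
# Hypothetical panel: stock prices of two firms over three years.
panel = [
    {"firm": "A", "year": 2001, "price": 10.0},
    {"firm": "A", "year": 2002, "price": 11.5},
    {"firm": "A", "year": 2003, "price": 12.1},
    {"firm": "B", "year": 2001, "price": 20.0},
    {"firm": "B", "year": 2002, "price": 19.0},
    {"firm": "B", "year": 2003, "price": 21.3},
]

# Cross-section: fix one time period, so each firm appears exactly once.
cross_section_2002 = [row for row in panel if row["year"] == 2002]

# Time series: fix one entity and follow it across time periods.
time_series_a = [row for row in panel if row["firm"] == "A"]

firms = sorted({row["firm"] for row in panel})
years = sorted({row["year"] for row in panel})

print("firms:", firms, "years:", years)
print("cross-section for 2002:", cross_section_2002)
print("time series for firm A:", time_series_a)
```

A panel with N entities and T periods contains N cross-sectional observations per period and T time-series observations per entity.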
What’s to come … • GBA 412 Lab • Introduction to probability and statistics (Weeks I and II) • GBA 412 • Basic Data Analysis • Pictures and Tables