EC339: Applied Econometrics Introduction
What is Econometrics? • Scope of application is large • Literal definition: measurement in economics • Working definition: application of statistical methods to problems that are of concern to economists • Econometrics has wide applications—beyond the scope of economics
What is Econometrics? • Econometrics is primarily interested in • Quantifying economic relationships • Testing competing hypotheses • Forecasting
Quantifying Economic Relationships • Outcomes of many policies are tied to the magnitude of the slopes of supply and demand curves • Often need to know elasticities before we can begin practical analysis • For example, if the minimum wage is raised, unemployment may rise as more workers enter the labor force and firms demand less labor • However, the size of this effect depends on the slopes of the labor supply and labor demand curves • Econometric analysis attempts to determine this answer • Allows us to quantify causal relationships when the luxury of a formal experiment is not available
Testing Competing Hypotheses • Econometrics helps fill the gap between the theoretical world and the real world • For instance, will a tax cut impact consumer spending? • Keynesian models relate consumer spending to annual disposable income, suggesting that a cut in taxes will change consumer spending • Other theories relate consumer spending to lifetime income, suggesting a tax cut (especially a “one-shot deal”) will have little impact on consumer spending
Forecasting • Econometrics attempts to provide the information needed to forecast future values • Such as inflation, unemployment, stock market levels, etc.
The Use of Models • Economists use models to describe real-world processes • Models are simplified depictions of reality • Usually an equation or set of equations • Economic theories are usually deterministic while the world is characterized by randomness • Empirical models include a random component known as the error term, or ε_i • Typically assume that the mean of the error term is zero
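A minimal sketch of the deterministic-plus-random idea, assuming a hypothetical linear model y_i = β0 + β1·x_i + ε_i. The coefficients (3 and 2), the normal distribution for the error term, and the function name `simulate` are illustrative assumptions, not taken from the slides.

```python
import random

def simulate(beta0, beta1, xs, sigma, seed=0):
    """Generate y_i = beta0 + beta1 * x_i + e_i, where e_i is drawn from N(0, sigma)."""
    rng = random.Random(seed)                       # fixed seed so the run is repeatable
    errors = [rng.gauss(0.0, sigma) for _ in xs]    # random component (error term)
    ys = [beta0 + beta1 * x + e for x, e in zip(xs, errors)]
    return ys, errors

xs = [i / 10 for i in range(10000)]
ys, errors = simulate(beta0=3.0, beta1=2.0, xs=xs, sigma=1.0)

# The sample mean of the errors is close to, but not exactly, zero.
mean_error = sum(errors) / len(errors)
print(round(mean_error, 3))
```

The deterministic part (3 + 2x) plays the role of the economic theory; the error term captures everything the theory leaves out.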
Types of Data • Data provide the raw material needed to • Quantify economic relationships • Test competing theories • Construct forecasts • Data can be described as a set of variables such as income, age, grade • Each recorded occurrence is called an observation • Data are in different formats • Cross-sectional • Time series • Panel data
Cross-Sectional Data • Provide information on a variety of entities at the same point in time
Time Series Data • Provide information for the same entity at different points in time
Panel (or Longitudinal) Data • Represent a combination of cross-sectional and time series data • Provide information on a variety of entities at different periods in time
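The three formats can be sketched with one small table. The states, years, and unemployment rates below are made-up values for illustration only; the point is that a panel contains both a cross-section (one period, many entities) and a time series (one entity, many periods).

```python
# Hypothetical panel: unemployment rate (percent) by state and year.
# All numbers are invented for illustration.
panel = [
    {"state": "IN", "year": 2000, "unemp": 3.0},
    {"state": "IN", "year": 2001, "unemp": 4.2},
    {"state": "IN", "year": 2002, "unemp": 5.1},
    {"state": "OH", "year": 2000, "unemp": 4.0},
    {"state": "OH", "year": 2001, "unemp": 4.4},
    {"state": "OH", "year": 2002, "unemp": 5.7},
]

# Cross-sectional data: several entities at the same point in time.
cross_section = [row for row in panel if row["year"] == 2001]

# Time series data: the same entity at different points in time.
time_series = [row for row in panel if row["state"] == "IN"]

print(len(cross_section), len(time_series), len(panel))  # 2 3 6
```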
Conducting an Empirical Project • How to Write an Empirical Paper • Select a topic • Textbooks, JSTOR, News sources (for ideas), “pop-econ” • Learn what others have learned about this topic • Spend time researching what others have done • Conduct extensive literature review
Conducting an Empirical Project • Theoretical Foundation • Have an empirical strategy • Existing literature may help • Would apply the methods you learn in this book • Gather data and apply appropriate econometric techniques • Interpret your results • Write it up… • Build like a court case or newspaper article
Where to obtain data • How to use DataFerrett CPS.doc • Files for course will be stored on datastor \\datastor\courses\economic\ec339 • You can download all files from book http://caleb.wabash.edu/econometrics/index.htm
Web Links • Resources for Economists on the Internet are available at • www.rfe.org • www.freelunch.com • www.bea.gov, www.census.gov, www.bls.gov
Math Review There is much more to it… but these are the basics you must know
Math Review Differentiation expresses the rate at which a quantity, y, changes with respect to a change in another quantity, x, on which it depends. The symbol Δ refers to the change in a quantity, so the slope is Δy/Δx. A linear relationship (i.e., a straight line) has a specific equation, here y = 3 + 2x. As x changes, how does y change? • Directly related (x increases, y increases) • Inversely related (x increases, y decreases) • Example points on the line: x=0, y=3 or (0,3); x=2, y=3+2(2)=7 or (2,7), so Δy/Δx = (7-3)/(2-0) = 2
Math Review Derivatives are essentially the same thing. Instead of looking at the difference in y as x goes from 0 to 2, look at very small intervals, say changing x from 0 to 0.0001; for a straight line, the slope does not change. The basic idea of a derivative is that the distance between the initial x and the new x approaches zero (in what is called the limit). Example: x=0, y=3 or (0,3); x=0.0001, y=3+2(0.0001) or (x,y)=(0.0001,3.0002)
Math Review Derivatives have a slightly different notation than Δy/Δx, namely dy/dx or f'(x). Constants, such as the y-intercept, do not change as x changes and thus drop out when taking derivatives. The derivative is the general formula for the slope of a function evaluated at a particular point. For straight lines, this value is fixed.
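The slope calculations can be checked numerically. This is a small sketch assuming the line implied by the example points, y = 3 + 2x; the helper name `slope` is illustrative.

```python
def f(x):
    """The straight line through (0, 3) and (2, 7): y = 3 + 2x."""
    return 3 + 2 * x

def slope(func, x0, x1):
    """Change in y divided by change in x between two points."""
    return (func(x1) - func(x0)) / (x1 - x0)

# Wide interval: x goes from 0 to 2.
print(slope(f, 0, 2))          # 2.0

# Very small interval: x goes from 0 to 0.0001.
# The result is 2 up to floating-point rounding.
print(slope(f, 0, 0.0001))
```

For a straight line the interval width does not matter, which is why the derivative dy/dx is the constant 2 and the intercept 3 drops out.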
Math Review Integration (or reverse differentiation) is the opposite of differentiation. You have to remember to add back C (a constant), since you may not know the “primitive” equation. There are indefinite integrals (over no specified region) and definite integrals (where the region of integration is specified). The result of integration should be the function whose derivative is the initial function. Example: the area under the line from y=3 at x=0 to y=23 at x=10 is a rectangle plus a triangle: Area = [3*(10-0)] + [1/2*(10-0)*(23-3)] = 30 + 100 = 130
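The area of 130 can be confirmed three ways. This sketch assumes the line is y = 3 + 2x (so y = 23 at x = 10); the function names and the trapezoid rule are illustrative choices, not part of the slides.

```python
def f(x):
    """The line y = 3 + 2x, which reaches 23 at x = 10."""
    return 3 + 2 * x

def antiderivative(x):
    """A function whose derivative is f: 3x + x^2 (the constant C cancels in a definite integral)."""
    return 3 * x + x ** 2

def trapezoid(func, a, b, n=1000):
    """Approximate the definite integral of func from a to b using n trapezoids."""
    h = (b - a) / n
    total = 0.5 * (func(a) + func(b))
    for i in range(1, n):
        total += func(a + i * h)
    return total * h

# Definite integral from 0 to 10 via the antiderivative.
area_exact = antiderivative(10) - antiderivative(0)

# Geometry: rectangle of height 3 plus triangle of height 23 - 3 = 20.
area_geometry = 3 * (10 - 0) + 0.5 * (10 - 0) * (f(10) - f(0))

print(area_exact, area_geometry)   # 130 130.0
print(trapezoid(f, 0, 10))         # 130 up to floating-point rounding
```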
Basic Definitions • Random variable • A function or rule that assigns a real number to each basic outcome in the sample space • The domain of random variable X is the sample space • The range of X is the real number line • Value changes from trial to trial • Uncertainty prevails in advance of the trial as to the outcome
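A small sketch of a random variable as a function on a sample space. The experiment (two coin tosses) and the choice of X as the number of heads are assumptions made for illustration.

```python
from collections import Counter
from itertools import product

# Sample space: every basic outcome of tossing a coin twice.
sample_space = list(product("HT", repeat=2))   # ('H','H'), ('H','T'), ('T','H'), ('T','T')

def X(outcome):
    """Random variable: assigns to each basic outcome the number of heads it contains."""
    return outcome.count("H")

# Apply X to every outcome, then count how often each real number occurs.
counts = Counter(X(outcome) for outcome in sample_space)

# With equally likely outcomes, each probability is count / size of the sample space.
distribution = {value: count / len(sample_space) for value, count in counts.items()}
print(distribution[0], distribution[1], distribution[2])  # 0.25 0.5 0.25
```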
Case Study Weight Data • Introductory Statistics class, Spring 1997 • Virginia Commonwealth University
Weight Data: Frequency Table sqrt(53) ≈ 7.3, rounded up to 8 intervals; range (260 - 100 = 160) / 8 = 20 = class width
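The interval arithmetic can be sketched as follows. The rule of rounding the square root of n up to the next whole number is read off the slide; the helper `interval_left_edge` is a hypothetical name, and it assumes each interval includes its left endpoint but not its right.

```python
import math

n = 53                 # number of students
low, high = 100, 260   # smallest and largest weights

num_intervals = math.ceil(math.sqrt(n))       # sqrt(53) is about 7.28, rounded up to 8
class_width = (high - low) / num_intervals    # 160 / 8 = 20.0

def interval_left_edge(weight):
    """Left edge of the interval containing weight (left endpoint included, right excluded)."""
    return low + math.floor((weight - low) / class_width) * class_width

print(num_intervals, class_width)   # 8 20.0
print(interval_left_edge(119.9))    # 100.0
print(interval_left_edge(120))      # 120.0
```

Under this convention a weight equal to the maximum, 260, starts a further interval beginning at 260.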
Weight Data: Histogram [Figure: histogram of the number of students (vertical axis) by weight (horizontal axis, 100 to 280 in steps of 20).] * Left endpoint is included in the group, right endpoint is not.
Numerical Summaries • Center of the data • mean • median • Variation • range • quartiles (interquartile range) • variance • standard deviation
Mean or Average • Traditional measure of center • Sum the values and divide by the number of values
Median (M) • A resistant measure of the data’s center • At least half of the ordered values are less than or equal to the median value • At least half of the ordered values are greater than or equal to the median value • If n is odd, the median is the middle ordered value • If n is even, the median is the average of the two middle ordered values
Median (M) Location of the median: L(M) = (n+1)/2 ,where n = sample size. Example: If 25 data values are recorded, the Median would be the (25+1)/2 = 13th ordered value.
Median • Example 1 data: 2 4 6 Median (M) = 4 • Example 2 data: 2 4 6 8 Median = 5 (ave. of 4 and 6) • Example 3 data: 6 2 4 Median ≠ 2 (order the values: 2 4 6, so Median = 4)
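The location rule and the three examples can be reproduced with a short sketch. The function names are illustrative; Python's standard `statistics.median` would give the same answers.

```python
def median_location(n):
    """Location of the median among n ordered values: (n + 1) / 2."""
    return (n + 1) / 2

def median(values):
    """Order the values first, then take the middle one (or average the two middle ones)."""
    ordered = sorted(values)
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:
        return ordered[mid]                       # n odd: the middle ordered value
    return (ordered[mid - 1] + ordered[mid]) / 2  # n even: average of the two middle values

print(median_location(25))   # 13.0
print(median([2, 4, 6]))     # 4
print(median([2, 4, 6, 8]))  # 5.0
print(median([6, 2, 4]))     # 4 (not 2: the values must be ordered first)
```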
Comparing the Mean & Median • The mean and median of data from a symmetric distribution should be close together. The actual (true) mean and median of a symmetric distribution are exactly the same. • In a skewed distribution, the mean is farther out in the long tail than is the median [the mean is ‘pulled’ in the direction of the possible outlier(s)].
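A quick sketch of how an outlier pulls the mean but not the median. Both data sets are hypothetical and chosen only to make the contrast visible.

```python
import statistics

# Hypothetical symmetric data: mean and median coincide.
symmetric = [2, 4, 6, 8, 10]

# Hypothetical skewed data: the largest value is replaced by an outlier.
skewed = [2, 4, 6, 8, 100]

print(statistics.mean(symmetric), statistics.median(symmetric))  # 6 6
print(statistics.mean(skewed), statistics.median(skewed))        # 24 6
```

The median stays at 6 in both cases, which is what makes it a resistant measure of center, while the mean moves toward the long tail.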
Quartiles • Three numbers which divide the ordered data into four equal sized groups. • Q1 has 25% of the data below it. • Q2 has 50% of the data below it. (Median) • Q3 has 75% of the data below it.
Weight Data: Sorted L(M)=(53+1)/2=27 L(Q1)=(26+1)/2=13.5
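The location arithmetic can be sketched in code. This assumes the convention implied by L(Q1) = (26+1)/2, namely that when n is odd the median itself is left out of both halves. The sorted weights are not reproduced here, so the integers 1 to 53 stand in as placeholder data.

```python
import math

def location(n):
    """Location of the middle value among n ordered values: (n + 1) / 2."""
    return (n + 1) / 2

def value_at(ordered, loc):
    """Value at a 1-based location; a location ending in .5 averages its two neighbours."""
    lower = ordered[math.floor(loc) - 1]
    upper = ordered[math.ceil(loc) - 1]
    return (lower + upper) / 2

def quartiles(values):
    """Return (Q1, median, Q3), taking Q1 and Q3 as medians of the lower and upper halves."""
    ordered = sorted(values)
    n = len(ordered)
    half = n // 2                      # size of each half (median excluded when n is odd)
    lower_half = ordered[:half]
    upper_half = ordered[n - half:]
    return (value_at(lower_half, location(half)),
            value_at(ordered, location(n)),
            value_at(upper_half, location(half)))

placeholder = list(range(1, 54))       # 53 stand-in values, not the actual weights
print(location(53), location(26))      # 27.0 13.5
print(quartiles(placeholder))          # (13.5, 27.0, 40.5)
```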
Variance and Standard Deviation • Recall that variability exists when some values are different from (above or below) the mean. • Each data value has an associated deviation from the mean: deviation = x_i - x̄
Deviations • What is a typical deviation from the mean? (standard deviation) • Small values of this typical deviation indicate small variability in the data • Large values of this typical deviation indicate large variability in the data
Variance • Find the mean • Find the deviation of each value from the mean • Square the deviations • Sum the squared deviations • Divide the sum by n-1 (gives typical squared deviation from mean)
Variance Formula s² = Σ(x_i - x̄)² / (n - 1) • Remember that you must find the deviations of EACH x, square the deviations, THEN add them up!
Standard Deviation Formula • Typical deviation from the mean: s = √(s²) [ standard deviation = square root of the variance ]
Variance and Standard Deviation: Example from Text • Metabolic rates of 7 men (cal./24hr.): 1792 1666 1362 1614 1460 1867 1439
Variance and Standard Deviation: Example • Notice the deviations add to zero, so each deviation must be squared
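The variance steps can be applied to the seven metabolic rates directly. This is a plain-Python sketch of those steps; the variable names are illustrative.

```python
import math

rates = [1792, 1666, 1362, 1614, 1460, 1867, 1439]   # cal./24hr., from the example
n = len(rates)

mean = sum(rates) / n                           # step 1: find the mean
deviations = [x - mean for x in rates]          # step 2: deviation of each value from the mean
squared = [d ** 2 for d in deviations]          # step 3: square the deviations
sum_of_squares = sum(squared)                   # step 4: sum the squared deviations
variance = sum_of_squares / (n - 1)             # step 5: divide by n - 1
std_dev = math.sqrt(variance)                   # standard deviation = square root of variance

print(mean, sum(deviations))                    # 1600.0 0.0
print(round(variance, 2), round(std_dev, 2))    # 35811.67 189.24
```

The deviations sum to zero, so without squaring them the "typical deviation" would always come out as zero.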
Variance versus Standard Deviation Note: Standard deviation is in the same units as the original data (cal/24 hours) while variance is in those units squared, (cal/24 hours)². Thus variance is not easily comparable to the original data.
Density Curves Example: here is a histogram of vocabulary scores of 947 seventh graders. The smooth curve drawn over the histogram is a mathematical model for the distribution. This is typically written as f(x), also known as the PROBABILITY DENSITY FUNCTION (PDF)
Density Curves Example: the areas of the shaded bars in this histogram represent the proportion of scores in the observed data that are less than or equal to 6.0. This proportion is equal to 0.303. The area underneath the curve to the left of a value x is given by the CUMULATIVE DISTRIBUTION FUNCTION (CDF), denoted F(x)
Density Curves Example: now the area under the smooth curve to the left of 6.0 is shaded. If the scale is adjusted so the total area under the curve is exactly 1, then this curve is called a density curve. The proportion of the area to the left of 6.0 is now equal to 0.293.
Density Curves • Always on or above the horizontal axis • Have area exactly 1 underneath curve • Area under the curve and above any range of values is the proportion of all observations that fall in that range
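These properties can be checked numerically for any density. The sketch below assumes a standard normal curve as a stand-in (the slides do not say which curve is drawn) and uses a simple trapezoid rule; the function names are illustrative.

```python
import math

def density(x):
    """Standard normal density, used here only as an example of a density curve."""
    return math.exp(-x * x / 2) / math.sqrt(2 * math.pi)

def area(func, a, b, n=10000):
    """Approximate the area under func between a and b using n trapezoids."""
    h = (b - a) / n
    total = 0.5 * (func(a) + func(b))
    for i in range(1, n):
        total += func(a + i * h)
    return total * h

# Total area under the curve (the tails beyond +/-8 are negligible).
total_area = area(density, -8, 8)

# Proportion of observations between -1 and 1 = area above that range.
middle = area(density, -1, 1)

# Area to the left of 0: half of the total, so 0 is the median of this curve.
left_of_zero = area(density, -8, 0)

print(round(total_area, 4), round(middle, 4), round(left_of_zero, 4))  # 1.0 0.6827 0.5
```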
Density Curves • The median of a density curve is the equal-areas point, the point that divides the area under the curve in half • The mean of a density curve is the balance point, at which the curve would balance if made of solid material
Density Curves • The mean and standard deviation computed from actual observations (data) are denoted by x̄ and s, respectively. • The mean and standard deviation of the actual distribution represented by the density curve are denoted by µ (“mu”) and σ (“sigma”), respectively.
Question Data sets consisting of physical measurements (heights, weights, lengths of bones, and so on) for adults of the same species and sex tend to follow a similar pattern. The pattern is that most individuals are clumped around the average, with numbers decreasing the farther values are from the average in either direction. Describe what shape a histogram (or density curve) of such measurements would have.