1 / 65

Welcome

Welcome. To. A Session On. Regression Analysis. Your Text Book. BUSINESS. STATISTICS. By. S.P. GUPTA & M.P.GUPTA. (Fourteenth Enlarged edition). Chapter 8. What does the term Regression literally mean?. The term “ regressions” literally mean “stepping back towards the average”.

Download Presentation

Welcome

An Image/Link below is provided (as is) to download presentation Download Policy: Content on the Website is provided to you AS IS for your information and personal use and may not be sold / licensed / shared on other websites without getting consent from its author. Content is provided to you AS IS for your information and personal use only. Download presentation by click this link. While downloading, if for some reason you are not able to download a presentation, the publisher may have deleted the file from their server. During download, if you can't get a presentation, the file might be deleted by the publisher.

E N D

Presentation Transcript


  1. Welcome To A Session On Regression Analysis

  2. Your Text Book BUSINESS STATISTICS By S.P. GUPTA & M.P.GUPTA (Fourteenth Enlarged edition) Chapter 8

  3. What does the term Regression literally mean? The term “ regressions” literally mean “stepping back towards the average”.

  4. What is Regression? Regression is a statistical tool to estimate (or predict) the unknown values of one variable from known values of another variable. Example: If we know that advertising and sales are correlated, we may find out the expected amount of sales for a given advertising expenditure or the amount of expenditure for achieving a fixed sales target.

  5. What is the utility of studying? The regression analysis is a branch of statistical theory that is widely used in almost all the scientific disciplines. In economics it is the basic technique for measuring or estimating the relationship among economic variables that constitute the essence of economic theory and economic life. Example: If two variables price (X) and demand (Y) are closely related one can find out the most probable value of X for a given value of Y or the most probable value of Y for a given value of X. Similarly, if the amount of tax and the rise in the price of a commodity are closely related, it is possible find out the expected price for a certain amount of tax levy.

  6. Distinguish between correlation and regression analysis? • The points of difference between correlation and regression analysis are: • While correlation co-efficient is a measure of degree of relationship between X and Y, the regression analysis helps study the ‘nature of relationship’ between the variables. • The cause and effect relation is more clearly indicated through regression analysis than by correlation. • Correlation is merely a tool of ascertaining the degree of relationship between two variables and, therefore, one can not say that one variable is the cause and the other the effect, while regression shows the extent of dependence of one variable on another.

  7. What are Regression Lines? If we take the case of two variables X and Y, we shall have two regression lines as regression line of X on Y and the regression line of Y on X. The regression line of Y on X gives most probable values of Y for given values of X and the regression line of X on Y gives most probable values of X for given values of Y. Thus we have two regression lines.

  8. Under what conditions can there be one regression line? When there is either perfect positive or perfect negative correlation between the two variables, the two correlation lines will coincide i.e. we will have one line. The further the two regression lines are from each other, the lesser is the degree of correlation and the nearer the two regression lines to each other, the higher is the degree of correlation. If variables are independent, r is zero and the lines of regression are at right angles i.e. parallel to X – axis and Y – axis.

  9. What are regression equations? Regression Equations are algebraic expressions of regression lines. Since there are two regression lines there are two regression equations–the regression equation of X on Y is used to describe the variations in the values of X for given change in Y and the regression equation of Y on X is used to describe the variation in the values of Y for given changes in X.

  10. Why is the line of ‘best fit’? The line of regression is the line which gives the best estimate to the value of one variable for any specific value of other variable. Thus the line of regression is the line of “best fit” and is obtained by the principles of least squares.

  11. What is the general form of the regression equation of Y on X? The general form of the linear regression equation of Y on X is expressed as follows: Ye = a+bX, where Ye = dependent variable to be estimated X = independent variable. In this equation a and b are two unknown constants (fixed numerical values) which determine the position of the line completely. The constants are called the parameters of the line. If the value of either one or both of them are changed, another line is determined.

  12. What is the general form of the regression equation of Y on X? The parameter ‘a’ determines the level of the fitted line (i.e. the distance of the line directly, above or below the origin). The parameter ‘b’ determines the slope of the line i.e. the change in Y for unit change in X. To determine the values of ‘a’ and ‘b’, the following two normal equations are to be solved simultaneously. Y=Na+bX, XY=aX +bX2.

  13. What is the general form of the regression equation of X on Y? The general form of the regression equation of X on Y is expressed as follows: X = a+bY To determine the values of ‘a’ and ‘b’, the following two normal equations are to be solved simultaneously. X = Na+bY, XY = aY+bY2. Continued……

  14. What is the general form of the regression equation of X on Y? These equations are usually called the normal equations. In the equations X, Y, XY, X2, indicate totals which are computed from the observed pairs of values of the two variables X and Y to which the least squares estimating line is to be fitted and N is the total number of observed pairs of values. The geometrical presentation of the linear equation , Y = a+bX is shown in the diagram below:

  15. Y Y=a+bX b 1 UNIT IN X o X

  16. It is clear from this diagram, the height of the line tells the average value of Y at a fixed value of X. When X=0, the average value of Y is equal to a . The value of a is called the Y- intercept since it is the point at which the straight line crosses the Y- axis. The slope of the line is measured by b, which gives the average amount of change of Y per unit change in the value of X. The sign of b also indicates the type of relationship between Y and X.

  17. How are the values of ‘a’ and ‘b’ obtained to determine a regression completely? The values of ‘a’ and ‘b’ are obtained by “the method of least squares” which states that the line should be drawn through the plotted points in such a manner that the sum of the squares of the vertical deviations of the actual Y values from the estimated Y values is the least, or in other words, in order to obtain a line which fits the points best, (Y–Ye)2 should be minimum. Such a line is known as line of best fit.

  18. How can the normal equations be arrived at? Continued……

  19. Let S = (Y –Ye)2 = (Y –a – bX)2 ( Ye = a+bX) Differentiating partially with respect to a and b, Continued…………

  20. Example: The following data give the hardness (X) and tensile strength (Y) of 7 samples of metal in certain units. Find the linear regression equation of Y on X. Solution: Regression equation of Y on X is given by Y = a+bX The normal equations are: Y = Na + bX…………………………….. (1) XY = aX+ bX2………………………… (2) Continued…………

  21. Calculation of regression equations Continued……..

  22. Here, N = 7 Substituting the values in equations (1) and (2), we get 572 =7a +1148 b ………………………. (5) 94120 = 1148 a +189280 b ……………… (4) Multiplying the equation (3) by 164, we get 93808 =1148 a +188272 b ………..(5) Subtracting this equation from (4), we get b = 0.31 Putting this value of b in equation (3), we have Continued……..

  23.  The linear regression equation of Y on X is Y = 30.87 + 0.31X

  24. Calculate the regression equations of X on Y and Y on X from the following data Solution: Continued……..

  25. Regression equation of X on Y is given by X= a +bY. The normal equations are X = Na+bY and XY=aY+bY2 Substituting the values, we get 15 =5a+25b …….. (1) 88= 25a+151b ……(2) Solving (1) and (2), we get =0.5 and b =0.5

  26. Hence, the regression equation of X on Y is given by • X = 0.5+ 0.5Y • The Regression equation of Y on X is : Y = a+bX • The normal equations are • Y= Na+ bX • XY = aX + bX2 • Substituting the values, we get • 25 = 5a +15b …………………… (iii) • 88 = 15 a + 55b ……………… (iv) • Solving (iii) and (iv), we get • a = 1.10 and b = 1.3 • Hence, the regression equation of Y on X is given by • Y =1.10 +1.30X Continued……..

  27. What will be the forms of regression equation of X on Y and regression equation of Y on X on the basis of deviations taken from arithmetic Means of X and Y? If we take the deviations of X and Y series from their respective means, the regression equation of Y on X will take the form The value byx can be easily obtained as follows: The two normal equations in terms of x and y will then become

  28. Since x =y =0 (deviations being taken from means) • Equation (1) reduces to Na = 0  a = 0 Equation (2) reduces to

  29. After obtaining the value of byx , the regression equation can easily be written in terms of X and Y by substituting for y, and for x, . Similarly, the regression equations X= a+bY is reduced to = and the value of bxy can be similarly obtained as

  30. Example: The following data give the hardness (X) and tensile strength (Y) 7 samples of metal in certain units. Find the linear regression equation of Y on X. Continued……..

  31. Continued……..

  32. What are regression co–efficients? The quantity b in the regression equations (Y= a+bX and X = a+bY) is called the “regression co–efficient” or “slope co-efficient”. Since there are two regression equations, therefore, there are two regression co-efficients regression co–efficient of X on Y and regression co–efficient of Y on X.

  33. What is regression co–efficient of X on Y? The regression co–efficient of X on Y is represented by the symbol bxy or b1 . It measures the amount of change in X corresponding to a unit change in Y. The regression co–efficient of X on Y is given by When deviations are taken from the means of X and Y, the regression co–efficient is obtained by Continued…..

  34. What is regression co–efficient of X on Y? When deviations are taken from assumed means, the value of bxyis obtained as follows:

  35. What are the assumptions of constructing a regression modal? • The value of the dependent variable, Y, is dependent in some degree upon the value of the independent variable, X. The dependent variable is assumed to be a random variable, but the values of X are assumed to be fixed quantities that are selected and controlled by the experimenter. The requirement that the independent variables assumes fixed values, however, is not a critical one. Useful results can still be obtained by regression analysis in the case where both X and Y are random variables. • The average relationship between X and Y can be adequately described by a linear equation Y= a+bX whose geometrical presentation is a straight line.

  36. The height of the line tells the average value of Y at a fixed value of X. When X= 0, the average value of Y is equal to a. The value of a is called the Y intercept, since it is the point at which the straight line crosses the Y–-axis. The slope of the line is measured by b, which gives the average amount of change of Y per unit change in the value of X. The sign of b also indicates the type of relationship between Y and X. • Associated with each value of X there is a sub– population of Y. The distribution of the sub– population may be assumed to be normal or non– specified in the sense that it is unknown. In any event, the distribution of each population Y is conditional to the value of X.

  37. The mean of each sub- population Y is called the expected value of Y for a given X: . • Furthermore, under the assumption of a linear relationship between X and Y, all values of must fall on a straight line. This is • Which is the population regression equation for our bivariate linear model. In this equation a and b are called the population regression co– efficient. • An individual value in each sub-population Y, may be expressed as:

  38. Where e is the deviation of a particular value of Y from and is called the error term or the stochastic disturbance term. The errors are assumed to be independent random variables because Y’s are random variables and independent. The expectations of these errors are zero; E(e) = 0. Moreover, if Y’s are normal variables, the error can also be assumed to be normal. • It is assumed that the variances of all sub–populations, called variances of the regression, are identical.

  39. What is regression co–efficient of Y on X ? The regression Co–efficient of Y on X is represented by byx or b2. It measures the amount of change in Y corresponding to a unit change in X. The value of byx is given When deviations are taken from the means of X on Y, Continued……..

  40. When deviations are taken from assumed mean

  41. What are the properties of regression co–efficients? • The co–efficient of correlation is the geometric mean of the two regression co–efficients . Symbolically • If one of the regression co–efficients is greater than unity, the other must be less than unity, as the value of the co–efficient correlation cannot exceed unity. • Example: if bxy =1.2 and byx =1.4, Continued……

  42. What are the properties of regression co–efficient? • Both the regression co–efficients will have the same sign i.e. they will be either positive or negative. • The co–efficient of correlation will have the same sign as that of regression co–efficient. • Example: Continued……

  43. What are the properties of regression co-efficient? • The average value of the two regression co–efficients would be greater than the value of co–efficient of correlation. Symbolically, • Example:  r Continued……

  44. Regression co–efficients are independent of change of origin but not scale.

  45. Example: Prove that the co–efficient of correlation is the geometric mean of the regression co–efficient Proof: Let bxy be co–efficient of X on Y and byx be co–efficient of Y on X.  The co–efficient of correlation is the geometric mean of the two regression co–efficients.

  46. Prove that Regression co–efficients are independent of change of origin but not scale.

  47. Example: The following figures relate to advertisement expenditure and corresponding sales Estimate i) The sales for advertisement expenditure of Tk. 80 lakhs and ii) The advertisement expenditure for a sales target of Tk. 25 crores

  48. Let the advertisement expenditure be denoted by X and sales by y calculation of regression equations

More Related