
Computing in Archaeology


Presentation Transcript


  1. Computing in Archaeology Session 11. Correlation and regression analysis © Richard Haddlesey www.medievalarchitecture.net

  2. Lecture aims • To introduce correlation and regression techniques

  3. The scattergram • In correlation, we are always dealing with paired scores, and so values of the two variables taken together will be used to make a scattergram

  4. example • Quantities of New Forest pottery recovered from sites at varying distances from the kilns
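
As a rough sketch of how such a scattergram might be drawn (the distances and sherd counts below are invented for illustration, not the actual New Forest figures):

```python
# Scattergram of paired scores: each site contributes one (x, y) pair.
# Hypothetical figures for illustration only.
import matplotlib.pyplot as plt

distance_km = [5, 10, 15, 20, 25, 30, 35, 40]    # independent variable (x)
sherd_count = [95, 84, 70, 66, 50, 41, 30, 22]   # dependent variable (y)

plt.scatter(distance_km, sherd_count)
plt.xlabel("Distance from kilns (km)")
plt.ylabel("Quantity of pottery (sherd count)")
plt.title("Scattergram of paired scores")
plt.show()
```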

  5. Negative correlation Here we can see that the quantity of pottery decreases as distance from the source increases

  6. Positive correlation Here we see that the taller a pot, the wider the rim

  7. Curvilinear monotonic relation Again, the further from the source, the smaller the quantity of artefacts

  8. Arched relationship (non-monotonic) Here we see that the first molar increases in size with age and is then worn down as the animal gets older

  9. scattergram • This shows us that scattergrams are the most important means of studying relationships between two variables

  10. REGRESSION • Regression differs from other techniques we have looked at so far in that it is concerned not just with whether or not a relationship exists, or the strength of that relationship, but with its nature • In regression analysis we use an independent variable to estimate (or predict) the values of a dependent variable

  11. Regression equation y = f(x) • y = y axis (in this case the dependent variable) • f = function (of x) • x = x axis (the independent variable)

  12. y = f(x), e.g. y = x, y = 2x, y = x²

  13. General linear equations • y = a + bx • Where y is the dependent variable, x is the independent variable, and the coefficients a and b are constants, i.e. they are fixed for a given data set

  14. Therefore: • If x = 0 then the equation reduces to y = a, so a represents the point where the regression line crosses the y axis (the intercept) • The b constant defines the slope or gradient of the regression line • Thus for the pottery quantity in relation to distance from source, b represents the amount by which pottery quantity decreases per unit of distance from the source

  15. y = a + bx

  16.–19. least-squares (figures)
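
Slides 16–19 illustrate the least-squares fit graphically. As a minimal sketch of the arithmetic, using the standard least-squares formulas and the same invented distance/quantity figures as above:

```python
# Least-squares estimates for the line y = a + bx, using
#   b = (nΣxy - ΣxΣy) / (nΣx² - (Σx)²)  and  a = (Σy - bΣx) / n
# Hypothetical figures for illustration only.
distance_km = [5, 10, 15, 20, 25, 30, 35, 40]    # x
sherd_count = [95, 84, 70, 66, 50, 41, 30, 22]   # y

n      = len(distance_km)
sum_x  = sum(distance_km)
sum_y  = sum(sherd_count)
sum_xy = sum(x * y for x, y in zip(distance_km, sherd_count))
sum_x2 = sum(x * x for x in distance_km)

b = (n * sum_xy - sum_x * sum_y) / (n * sum_x2 - sum_x ** 2)   # slope (gradient)
a = (sum_y - b * sum_x) / n                                    # intercept
print(f"fitted line: y = {a:.2f} + ({b:.2f})x")
```

For data like the pottery example, b comes out negative, matching the downward slope of the scattergram.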

  20.–21. y = a + bx (figures)

  22. y = 102.64 – 1.8x
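
Taking the fitted equation on slide 22 at face value, prediction is then just substitution of a distance value for x (the distance units are not stated on the slide, so treat them as whatever the original data used):

```python
# Estimate pottery quantity from distance using the fitted line y = 102.64 - 1.8x.
def predicted_quantity(distance):
    return 102.64 - 1.8 * distance

print(predicted_quantity(20))   # estimated quantity at a distance of 20 units
```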

  23. CORRELATION

  24. CORRELATION 1 correlation coefficient

  25. CORRELATION 1 correlation coefficient 2 significance

  26. CORRELATION • 1 correlation coefficient • r • 2 significance

  27. CORRELATION • 1 correlation coefficient • r • -1 to +1 • 2 significance

  28. Levels of measurement: • nominal – in name only • ordinal – forming a sequence • interval – a sequence with fixed distances • ratio – fixed distances with a datum point

  29. Levels of measurement: • nominal • ordinal • interval • ratio

  30. Levels of measurement: • nominal • ordinal • interval • ratio (the Product-Moment Correlation Coefficient applies to interval and ratio data)

  31. Levels of measurement: • nominal • ordinal • interval • ratio (Spearman's Rank Correlation Coefficient applies to ordinal data)
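
In practice both coefficients are computed from paired scores; which one is appropriate depends on the level of measurement. A minimal sketch using scipy (hypothetical measurements, not the spearhead data of the following slides):

```python
# Pearson's product-moment r suits interval/ratio data;
# Spearman's rank coefficient works from ranks, so it suits ordinal data
# (or interval/ratio data that do not meet Pearson's assumptions).
from scipy.stats import pearsonr, spearmanr

length_cm = [30.1, 28.4, 35.0, 22.7, 40.3, 31.8, 27.5, 33.2]
width_cm  = [4.2, 3.9, 4.8, 3.1, 5.5, 4.4, 3.6, 4.6]

r, p = pearsonr(length_cm, width_cm)          # product-moment coefficient
rho, p_rho = spearmanr(length_cm, width_cm)   # Spearman's rank coefficient
print(f"Pearson r = {r:.2f}, Spearman rho = {rho:.2f}")
```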

  32. The Product-Moment Correlation Coefficient

  33. sample – 20 bronze spearheads, measured for length (cm) and width (cm); n = 20

  34.–36. r = [nΣxy – (Σx)(Σy)] / √([nΣx² – (Σx)²][nΣy² – (Σy)²]), applied to the spearhead data: length (cm), width (cm), n = 20

  37. r = [nΣxy – (Σx)(Σy)] / √([nΣx² – (Σx)²][nΣy² – (Σy)²]) = +0.67   (n = 20)
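
The same coefficient can be computed directly from the formula on slides 34–37. The figures below are invented stand-ins for the spearhead measurements (the slides report r = +0.67 for their 20 real spearheads):

```python
import math

# Hypothetical length/width pairs standing in for the spearhead data.
x = [30.1, 28.4, 35.0, 22.7, 40.3, 31.8, 27.5, 33.2]   # length (cm)
y = [4.2, 3.9, 4.8, 3.1, 5.5, 4.4, 3.6, 4.6]           # width (cm)
n = len(x)

numerator   = n * sum(xi * yi for xi, yi in zip(x, y)) - sum(x) * sum(y)
denominator = math.sqrt((n * sum(xi ** 2 for xi in x) - sum(x) ** 2) *
                        (n * sum(yi ** 2 for yi in y) - sum(y) ** 2))
r = numerator / denominator
print(f"r = {r:+.2f}")
```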

  38. Test of product moment correlation coefficient

  39. Test of product moment correlation coefficient H0 : true correlation coefficient = 0

  40. Test of product moment correlation coefficient H0 : true correlation coefficient = 0 H1 : true correlation coefficient ≠ 0

  41. Test of product moment correlation coefficient H0 : true correlation coefficient = 0 H1 : true correlation coefficient ≠ 0 Assumptions: both variables approximately normally distributed

  42. Test of product moment correlation coefficient H0 : true correlation coefficient = 0 H1 : true correlation coefficient ≠ 0 Assumptions: both variables approximately normally distributed Sample statistics needed: n and r
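
The slides list the hypotheses, the assumptions, and the sample statistics (n and r), but not the test statistic itself. One common choice, assumed here rather than taken from the session, is the t-test on r with n − 2 degrees of freedom:

```python
# t-test of H0: true correlation coefficient = 0, using
#   t = r * sqrt((n - 2) / (1 - r**2)),  df = n - 2
# n and r are the sample statistics from the slides (20 spearheads, r = +0.67).
import math
from scipy.stats import t as t_dist

n, r = 20, 0.67
t_stat = r * math.sqrt((n - 2) / (1 - r ** 2))
p_two_sided = 2 * t_dist.sf(abs(t_stat), df=n - 2)

print(f"t = {t_stat:.2f}, two-sided p = {p_two_sided:.4f}")
# A p-value below the chosen significance level would lead us to reject H0.
```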
