
An Introduction to Statistical Problem Solving in Geography - 2nd Edition

An Introduction to Statistical Problem Solving in Geography - 2nd Edition. Chapter 2 - Summary. Cathy Walker, February 13, 2010. GEOG 3000 - Advanced Geographic Statistics, Winter Quarter 2010; P. Sutton.


Presentation Transcript


  1. An Introduction to Statistical Problem Solving in Geography - 2nd Edition Chapter 2 - Summary Cathy Walker February 13, 2010 GEOG: 3000- Advanced Geographic Statistics Winter Qrtr. 2010; P. Sutton

  2. Definitions
     • Individual Level Data Sets - each data value represents an individual element or unit of the phenomenon under study.
     • Spatially Aggregated Data Sets - each value entered into the statistical analysis is a summary or spatial aggregation of individual units of information for a particular place or area.
     • Ecological Fallacy - the invalid transfer of conclusions from spatially aggregated analysis to smaller areas or to the individual.
     • Discrete Variable - a variable that has some restrictions placed on the values it can assume.
     • Continuous Variable - a variable that has an infinite number of possible values along some interval of the real number line.
     • In general, discrete data are the result of counting or tabulating the number of items, and potential values are limited to whole integers.
     • Continuous data are the result of measurements, and values can be expressed as decimals.
     • Quantitative - observations or responses are expressed numerically; units of data are assigned numerical values.
     • Qualitative - each observation or response is assigned to one of two or more categories.

  3. Four Levels of Measurement

  4. #1 Nominal Scale
     • Each category is given a name or title, but no assumptions are made about any relationships between categories.
     • Problems based on a nominal scale are considered categorical (qualitative).
     • Two Necessary Conditions for Nominal Scale Classifications:
       • Exhaustive: every value or unit of data can be assigned to a category.
       • Mutually exclusive: it is not possible to assign a value to more than one category because the categories do not overlap.
     • Examples:
       • Religious Affiliation Classifications - Baptist, Catholic, Methodist, Presbyterian, Mormon, Jewish, etc.
       • Political Party Affiliation - Democrat, Republican, Independent

  5. #2 Ordinal Scale
     • Values are placed in rank order.
     • More quantitative distinctions are possible than with nominal scale variables.
     • Strongly Ordered
       • Each value or unit of data is given a particular position in a rank-order sequence.
     • Weakly Ordered
       • The values are placed in categories, and the categories themselves are rank-ordered.
     • Example: Top 10 best places to live in the U.S.
       • No. 10: Des Moines, Iowa
       • No. 9: Charlotte, N.C.
       • No. 8: Austin, Texas
       • No. 7: San Antonio, Texas
       • No. 6: Fort Collins, Colorado
       • No. 5: Omaha, Neb.
       • No. 4: Houston, Texas
       • No. 3: Colorado Springs, Colorado
       • No. 2: Boise, Idaho
       • No. 1: Raleigh, N.C.

  6. #3 Interval Scale
     • Each value or unit is based on a measurement scale, and the interval between any two units of data on this scale can be measured.
     • The origin or zero starting point is assigned arbitrarily (i.e., the origin does not have a "natural" or "real" meaning).
     • Example:
       • The placement of the zero-degree point on temperature scales such as Celsius and Fahrenheit is arbitrary; zero does not mean a complete lack of heat.

  7. #4 Ratio Scale
     • Each value or unit is based on a measurement scale, and the interval between any two units of data on this scale can be measured.
     • The origin or zero starting point is "natural" or non-arbitrary, making it possible to determine the ratio between values.
     • Example:
       • The measurement of precipitation from a rain gauge; the ratio between 10 inches of rain and 5 inches of rain is precisely 2.
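
The difference between interval and ratio scales can be checked with simple arithmetic. The sketch below is illustrative only: the unit conversions are standard, but the temperature readings and all function and variable names are assumptions made for this example, not part of the chapter. A ratio on a ratio scale survives a change of units, while a "ratio" on an interval scale does not.

```python
def celsius_to_fahrenheit(celsius):
    """Convert a Celsius temperature to Fahrenheit."""
    return celsius * 9 / 5 + 32


def inches_to_mm(inches):
    """Convert a length in inches to millimetres."""
    return inches * 25.4


# Ratio scale: zero rainfall is a natural zero, so the ratio is unit-independent.
rain_ratio_inches = 10 / 5
rain_ratio_mm = inches_to_mm(10) / inches_to_mm(5)

# Interval scale: zero degrees is arbitrary, so the "ratio" depends on the unit.
# The readings of 10 and 5 degrees Celsius are hypothetical.
temp_ratio_celsius = 10 / 5
temp_ratio_fahrenheit = celsius_to_fahrenheit(10) / celsius_to_fahrenheit(5)

print("Rain ratio (in):", rain_ratio_inches)
print("Rain ratio (mm):", round(rain_ratio_mm, 6))
print("Temperature ratio (C):", temp_ratio_celsius)
print("Temperature ratio (F):", round(temp_ratio_fahrenheit, 4))
```

The two rainfall ratios agree, whereas the two temperature ratios differ, which is why statements such as "twice as warm" are not meaningful on an interval scale.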

  8. Measurement Concepts

  9. Precision & Accuracy
     • Precision - refers to the level of exactness associated with measurement.
     • Accuracy - refers to the extent of system-wide bias in the measurement process.
     • It is possible for a measurement to be very precise yet inaccurate.
     • The Target Analogy:
       • Case 1: Precise, Accurate
       • Case 2: Precise, Inaccurate
       • Case 3: Imprecise, Accurate
       • Case 4: Imprecise, Inaccurate
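
The four target cases can be mimicked numerically. This is a minimal sketch under stated assumptions: the measurements and the true value are invented, the function name is hypothetical, and the spread of repeated measurements (population standard deviation) is used here as a stand-in for precision, which is one possible reading of the slide rather than the chapter's definition. Bias is taken as the mean measurement minus the true value.

```python
from statistics import mean, pstdev


def bias_and_spread(measurements, true_value):
    """Return (bias, spread): systematic offset and scatter of the readings."""
    bias = mean(measurements) - true_value
    spread = pstdev(measurements)
    return bias, spread


TRUE_VALUE = 10.0  # hypothetical true value being measured

# Hypothetical repeated measurements for each case of the target analogy.
cases = {
    "Case 1: precise, accurate": [10.0, 10.1, 9.9, 10.0],
    "Case 2: precise, inaccurate": [12.0, 12.1, 11.9, 12.0],
    "Case 3: imprecise, accurate": [8.0, 12.0, 9.0, 11.0],
    "Case 4: imprecise, inaccurate": [10.0, 14.0, 11.0, 13.0],
}

results = {
    label: bias_and_spread(values, TRUE_VALUE) for label, values in cases.items()
}

for label, (bias, spread) in results.items():
    print(f"{label}: bias={bias:+.2f}, spread={spread:.2f}")
```

Small spread with large bias corresponds to Case 2, the tight cluster that misses the centre of the target.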

  10. Validity
     Validity addresses measurement issues concerning the nature, meaning, or definition of a concept or variable. Expressing the true meaning of multi-faceted concepts is often too difficult, so geographers often find it necessary to create operational definitions that can serve as indirect or surrogate measures for these variables.

  11. Reliability
     • Reliability problems often occur when using international data, since fully comparable and totally consistent methods of collecting data rarely exist from country to country.
     • One way to assess the degree of reliability of a measurement instrument is to compare at least two applications of the data collection method used at different times. When data are collected over time or when changes in spatial pattern are analyzed over time, the geographer must question the consistency and stability of the data.
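
One simple way to compare two applications of a data collection method is to look at the differences between them, place by place. The sketch below is only an illustration: the place names, values, and function name are hypothetical, and the mean absolute difference is an assumed summary, not a measure named in the chapter.

```python
def compare_applications(first, second):
    """Compare two applications of the same data collection method.

    Both arguments map place names to measured values and are assumed
    to cover the same places. Returns the per-place differences and
    the mean absolute difference.
    """
    differences = {place: second[place] - first[place] for place in first}
    mean_abs_diff = sum(abs(d) for d in differences.values()) / len(differences)
    return differences, mean_abs_diff


# Hypothetical values collected for the same places at two different times.
first_application = {"Area A": 100, "Area B": 250, "Area C": 80}
second_application = {"Area A": 104, "Area B": 241, "Area C": 80}

differences, mean_abs_diff = compare_applications(
    first_application, second_application
)

print(differences)
print(round(mean_abs_diff, 2))
```

Large or erratic differences would be a prompt to question the consistency of the method, although real change between the two dates can produce the same signal.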

  12. Basic Classification Methods

  13. Equal Intervals Based on Range
     To determine class breaks, the range is divided into the desired number of equal-width class intervals. The range is simply the difference in magnitude between the smallest and largest values in an interval/ratio set of data.
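
The procedure above can be sketched in a few lines. The data values and the function name are hypothetical, chosen only to show the arithmetic: class width is the range divided by the number of classes.

```python
def equal_interval_breaks(values, n_classes):
    """Return the class breaks for equal-width classes spanning the range."""
    lowest, highest = min(values), max(values)
    width = (highest - lowest) / n_classes  # range / number of classes
    return [lowest + i * width for i in range(n_classes + 1)]


# Hypothetical interval/ratio data.
values = [2, 5, 7, 11, 14, 18, 22]

breaks = equal_interval_breaks(values, n_classes=4)
print(breaks)
```

Here the range is 22 - 2 = 20, so four classes are each 5 units wide and the breaks fall at 2, 7, 12, 17, and 22.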

  14. Equal Intervals Not Based on Range
     This classification method also designates class breaks to create equal-interval classes, but the exact range is not used to select the class breaks. A convenient and practical interval width is selected arbitrarily, based on rounded-off class-break values. This method of classification is preferred for constructing a frequency distribution, histogram, or ogive to represent the data graphically.
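
A minimal sketch of this idea, assuming the analyst has already picked a convenient width: the breaks start at the multiple of that width at or below the minimum and continue until the maximum is covered. The data, the width of 5, and the function name are hypothetical, and classes are assumed to include their lower break and exclude their upper break.

```python
import math


def rounded_breaks(values, width):
    """Return rounded class breaks of a chosen width that cover all values."""
    start = math.floor(min(values) / width) * width
    breaks = [start]
    # Keep adding breaks until the last one lies strictly above the maximum,
    # so every value falls in some class [lower, upper).
    while breaks[-1] <= max(values):
        breaks.append(breaks[-1] + width)
    return breaks


# Hypothetical data and an arbitrarily chosen convenient width.
values = [2, 5, 7, 11, 14, 18, 22]

breaks = rounded_breaks(values, width=5)
print(breaks)
```

For the same data, breaks based on the exact range would start at 2 and end at 22; choosing a width of 5 gives the rounder breaks 0, 5, 10, 15, 20, and 25.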

  15. Quantile Breaks
     The total number of values is divided as equally as possible into the desired number of classes. The allocation of an equal number of values to each category is often an advantage in choropleth mapping, particularly if an approximately equal area on the map is desired for each category. The possible disadvantages of quantile breaks should also be evaluated before deciding to use this method.
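
Dividing the values "as equally as possible" can be sketched as follows. The data and the function name are hypothetical, and giving any leftover values to the lowest classes is an assumption made for this example; the chapter does not say where the extras should go.

```python
def quantile_classes(values, n_classes):
    """Split sorted values into classes whose sizes differ by at most one."""
    ordered = sorted(values)
    base_size, extra = divmod(len(ordered), n_classes)
    classes, start = [], 0
    for i in range(n_classes):
        # The first `extra` classes each take one additional value.
        size = base_size + (1 if i < extra else 0)
        classes.append(ordered[start:start + size])
        start += size
    return classes


# Hypothetical data: seven values split into three classes.
values = [22, 5, 14, 2, 18, 7, 11]

classes = quantile_classes(values, n_classes=3)
print(classes)
```

With seven values and three classes, the class sizes are 3, 2, and 2.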

  16. Natural Breaks
     The most elementary natural-breaks method is known as the single-linkage approach. The logic is to identify natural breaks in the data and separate values into different classes based on these breaks. Similar values are kept together in the same category, dissimilar values are separated into different categories, and the gaps in the data are incorporated directly in the grouping procedure. This method will highlight extreme values, placing unusual outliers of data into their own unique categories.
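
The slide does not spell out the procedure, so the following is one plausible reading rather than the textbook's algorithm: sort the values, measure the gap between each pair of neighbours, and cut at the largest gaps. The data and the function name are hypothetical.

```python
def largest_gap_classes(values, n_classes):
    """Group sorted values by cutting at the (n_classes - 1) largest gaps."""
    ordered = sorted(values)
    # Pair each gap with the index of the value it follows.
    gaps = [(ordered[i + 1] - ordered[i], i) for i in range(len(ordered) - 1)]
    largest = sorted(gaps, reverse=True)[:n_classes - 1]
    cut_after = sorted(index for _, index in largest)

    classes, start = [], 0
    for index in cut_after:
        classes.append(ordered[start:index + 1])
        start = index + 1
    classes.append(ordered[start:])
    return classes


# Hypothetical data with two clusters and one outlier.
values = [2, 3, 4, 10, 11, 12, 30]

classes = largest_gap_classes(values, n_classes=3)
print(classes)
```

The outlier 30 ends up in a class of its own, which matches the behaviour described above for extreme values.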

  17. What Can Be Concluded About the Disparities Among Classification Methods?
     Depending on the classification method used, outcomes can be quite different, even though the same data are used and the same number of classes is created. The logical conclusion is that any observed spatial pattern (map) is a function of the specific classification method applied, and that using a different method of classification will likely result in a visually distinctive map.

  18. Graphic Procedures

  19. Definitions
     • Histogram - the frequency of values is shown as a series of vertical bars, one for each value or class of values.
       • When using categories instead of actual values along the horizontal scale of a histogram, classification by equal intervals not based on range is usually the best technique.
     • Frequency Polygon - very similar to a histogram, except that the vertical position of each data value or class is shown as a point rather than a bar.
     • Cumulative Frequency Diagram (or Ogive) - instead of showing actual frequencies for each value or class, this graphic aggregates frequencies from value to value or class to class and displays the cumulative frequencies at each position.
       • The cumulative absolute frequencies can be divided by the sum of all frequencies to obtain cumulative relative values or proportions.
     • Scattergram (or Scatterplot) - shows the pattern of association or relationship between two variables (a bivariate relationship).
       • If a set of observations is plotted, analysis of the scatter of points suggests the amount and nature of association or relationship that exists between the two graphed variables.
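
The numbers behind a histogram and an ogive can be tabulated before any plotting. This is a sketch with hypothetical data, class breaks, and function name; each class is assumed to include its lower break and exclude its upper break. Cumulative relative frequencies are the cumulative counts divided by the sum of all frequencies, as described above.

```python
from itertools import accumulate


def frequency_table(values, breaks):
    """Return (counts, cumulative counts, cumulative proportions) per class."""
    counts = [0] * (len(breaks) - 1)
    for value in values:
        for i in range(len(counts)):
            if breaks[i] <= value < breaks[i + 1]:
                counts[i] += 1
                break
    cumulative = list(accumulate(counts))
    total = sum(counts)
    relative = [c / total for c in cumulative]
    return counts, cumulative, relative


# Hypothetical data and rounded class breaks.
values = [2, 5, 7, 11, 14, 18, 22, 23]
breaks = [0, 5, 10, 15, 20, 25]

counts, cumulative, relative = frequency_table(values, breaks)
print(counts)      # bar heights for a histogram
print(cumulative)  # cumulative absolute frequencies for an ogive
print(relative)    # cumulative relative frequencies (proportions)
```

The counts give the bar heights of a histogram (or the point heights of a frequency polygon), and the cumulative lists give the heights of an ogive.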

  20. [Figure panels: Histogram; Frequency Polygon; Cumulative Frequency Diagram; Scattergram]

  21. ?? Questions ??
