Quantitative Analysis
Quantitative / Formal Methods • objective measurement systems • graphical methods • statistical procedures
why bother? • description • esp. of populations • ex: average height of people in room • inference • describe populations on the basis of samples • test hypotheses about populations • estimate levels of uncertainty associated with inferential description
exploratory analysis • pattern searching/recognition • “data mining” • evaluate strength of patterning…
“Patterning” • patterning = departures from randomness • strength of patterning = ? degree of departure from randomness…
“how likely is it that observed patterning could have occurred by chance??” • this is a statistical question…
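One way to make the "could it have occurred by chance?" question concrete is a permutation test. The sketch below is an illustration only, not a method named in the slides: it shuffles group labels many times and counts how often a difference in means at least as large as the observed one appears. The function name, the two groups, and their values are hypothetical.

```python
import random
from statistics import mean

def permutation_p_value(group_a, group_b, n_permutations=10_000, seed=0):
    """Estimate how often random relabelling of cases produces a difference
    in group means at least as large as the one actually observed."""
    rng = random.Random(seed)              # fixed seed so the run is repeatable
    observed = abs(mean(group_a) - mean(group_b))
    pooled = list(group_a) + list(group_b)
    n_a = len(group_a)
    hits = 0
    for _ in range(n_permutations):
        rng.shuffle(pooled)                # randomness as the reference point
        diff = abs(mean(pooled[:n_a]) - mean(pooled[n_a:]))
        if diff >= observed:
            hits += 1
    return hits / n_permutations

# Hypothetical measurements for two groups of cases.
site_a = [10.2, 11.1, 9.8, 10.5, 10.9]
site_b = [12.4, 13.0, 12.1, 12.8, 13.3]
p = permutation_p_value(site_a, site_b)
print(f"estimated probability of this patterning arising by chance: {p:.4f}")
```

A small value suggests the patterning is a real departure from randomness; whether it is strong enough to need an explanation remains a separate question.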
“is the patterning strong enough to either require or support an explanatory argument??” • this is usually an anthropological question…
basic vocabulary • case • variable • data matrix • attribute • aggregation • stratification • accuracy • precision
case • equivalent to ‘record’ • something about which we want to make/record observations… • variable • kinds of observations we want to make/record • measurements of variability among cases…
(data matrix) cases and variables
attribute • the intersection between cases and variables • i.e., an observation about a specific case with reference to a specific variable • ex. • “elk” • “strongly agree” • “plain-ware” • also called ‘value’, or ‘variable state’
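As a rough sketch of these terms, a data matrix can be held as a list of records, one per case, keyed by variable name. The case identifiers and variable names below are hypothetical; the attribute values reuse the slide's examples.

```python
# Each dictionary is one case (record); each key is a variable.
data_matrix = [
    {"case": "c01", "fauna": "elk",  "response": "strongly agree", "ware": "plain-ware"},
    {"case": "c02", "fauna": "deer", "response": "agree",          "ware": "painted"},
]

def attribute(matrix, case_id, variable):
    """Return the attribute (value) where one case meets one variable."""
    for row in matrix:
        if row["case"] == case_id:
            return row[variable]
    raise KeyError(case_id)

print(attribute(data_matrix, "c01", "fauna"))
```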
aggregation • grouping cases, usually on the basis of a shared attribute • spatial proximity, temporal proximity • gender of interment associated with grave lots • stratification • dividing cases into sub-groups • usually to carry out parallel analyses that relate to different control conditions
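A minimal sketch of stratification followed by a parallel analysis, assuming a hypothetical set of interments with a `gender` variable and a made-up `grave_goods` count: cases are split into sub-groups by a shared attribute, then the same summary is computed for each sub-group.

```python
from collections import defaultdict
from statistics import mean

def stratify(cases, variable):
    """Divide cases into sub-groups that share the same attribute."""
    groups = defaultdict(list)
    for case in cases:
        groups[case[variable]].append(case)
    return dict(groups)

# Hypothetical cases; the values are invented for illustration.
interments = [
    {"id": 1, "gender": "F", "grave_goods": 4},
    {"id": 2, "gender": "M", "grave_goods": 2},
    {"id": 3, "gender": "F", "grave_goods": 6},
    {"id": 4, "gender": "M", "grave_goods": 3},
]

strata = stratify(interments, "gender")
# Parallel analysis: the same statistic, once per stratum.
mean_goods = {g: mean(c["grave_goods"] for c in cases) for g, cases in strata.items()}
print(mean_goods)
```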
accuracy • an expression of the closeness between a measured (or computed) value and the true value • frequently confused with precision • precision • has to do with replicability • the closeness of repeated measures to the same value (not necessarily the true value)
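The distinction can be illustrated numerically. In this sketch (one possible operationalization, not the only one), accuracy is summarized as the bias of the mean relative to a known true value, and precision as the standard deviation of repeated measures. The readings and the true value of 10.0 are hypothetical.

```python
from statistics import mean, stdev

def accuracy_and_precision(measurements, true_value):
    """Return (bias, spread): bias reflects accuracy, spread reflects precision."""
    bias = mean(measurements) - true_value   # closeness to the true value
    spread = stdev(measurements)             # closeness of repeats to each other
    return bias, spread

true_value = 10.0
precise_but_inaccurate = [10.5, 10.5, 10.6, 10.4]   # tight cluster, wrong place
accurate_but_imprecise = [9.0, 11.0, 10.5, 9.5]     # right place, loose cluster

bias_1, spread_1 = accuracy_and_precision(precise_but_inaccurate, true_value)
bias_2, spread_2 = accuracy_and_precision(accurate_but_imprecise, true_value)
print(bias_1, spread_1)
print(bias_2, spread_2)
```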
scales of measurement • presence / absence data • simply whether or not the case exhibits a specific state • nominal data • contrasting groups, usually mutually exclusive • sometimes referred to as ‘discrete’ or ‘categorical’ data
scales of measurement • ordinal data • a logical order or ranking exists among the various categories • no assumptions implied about the ‘measurement space’ occupied by categories • ratio data • also metric, continuous • has a non-arbitrary zero • can meaningfully compare measurements as ratios
scales of measurement • interval data • distances between categories of measurement are fixed and even (unlike ordinal data) • scale lacks a non-arbitrary ‘zero’ (unlike ratio data) • count data • derived from nominal data • really a kind of ratio data created by aggregation
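To illustrate how count data arise from nominal data by aggregation, the sketch below tallies a hypothetical list of nominal ware categories. Once tallied, the counts behave like ratio data: they have a true zero and can be compared as proportions.

```python
from collections import Counter

# Hypothetical nominal observations, one per sherd.
wares = ["plain-ware", "painted", "plain-ware", "incised", "plain-ware"]

counts = Counter(wares)                      # aggregation: nominal -> counts
proportion_plain = counts["plain-ware"] / sum(counts.values())
print(counts)
print(proportion_plain)
```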
Drennan • distinctions are inconsistent and not too important… • measurements vs. categories • measurements: quantities measured along a scale • categories: more or less equivalent to nominal data • counts: discrete enumeration • but, confusion does occur… • ex. can’t use ‘goodness of fit’ tests on nominal data!
data coding • presence / absence data • can use 0 / 1 (but analyze with care!) • nominal data • OK to use integers (1, 2, 3, etc.) • but don’t subject them to arithmetic operations • don’t assume rules of numerical distance
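A short sketch of why integer codes for nominal data should not be treated as numbers. The code table below is hypothetical. The mean of the codes is computable but meaningless, whereas the mode (the most frequent category) is a legitimate summary.

```python
from statistics import mean, mode

# Hypothetical code table: the integers are labels, not quantities.
WARE_CODES = {1: "plain-ware", 2: "painted", 3: "incised"}

coded = [1, 3, 1, 2, 1, 3]

meaningless = mean(coded)                    # implies a distance between categories
most_common_label = WARE_CODES[mode(coded)]  # valid: most frequent category
print(meaningless, most_common_label)
```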
data coding • ordinal data • use integers… • ratio / metric data • use integer or decimal notation • don’t record spurious levels of accuracy or precision • note: x = 10.2 means 10.15 < x < 10.25
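The note about x = 10.2 can be sketched in code: the number of decimal places written down implies an interval half a unit wide on either side of the last recorded digit. The helper name `implied_interval` is hypothetical. Values are passed as strings so that trailing zeros (10.2 vs. 10.20) are preserved.

```python
from decimal import Decimal

def implied_interval(recorded: str):
    """Return the (low, high) range implied by the digits actually recorded."""
    value = Decimal(recorded)
    exponent = value.as_tuple().exponent      # -1 for "10.2", -2 for "10.20"
    half_unit = Decimal(1).scaleb(exponent) / 2
    return value - half_unit, value + half_unit

print(implied_interval("10.2"))    # one decimal place recorded
print(implied_interval("10.20"))   # two decimal places: a much narrower claim
```

Recording 10.20 when the instrument only resolves 0.1 claims ten times more precision than the measurement supports.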
coding “missing data” • MD more problematic than most realize… • may want more than one code: • variable state is uncertain, vs. • variable doesn’t apply, vs. • variable state is not present (not really MD) • R gives you one coding option (“NA”)
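One possible way to keep the kinds of missing data distinct is shown below as a sketch, with hypothetical names and values. Two separate sentinel codes cover "uncertain" and "does not apply", while an observed absence is stored as a real value (here 0) because it is not missing data at all.

```python
from collections import Counter
from enum import Enum
from statistics import mean

class Missing(Enum):
    UNCERTAIN = "variable state is uncertain"
    NOT_APPLICABLE = "variable does not apply to this case"

# Hypothetical variable: number of handles per vessel.
# 0 is a real observation ("not present"), not missing data.
handle_counts = [2, Missing.UNCERTAIN, 0, Missing.NOT_APPLICABLE, 1]

observed = [v for v in handle_counts if not isinstance(v, Missing)]
reasons = Counter(v for v in handle_counts if isinstance(v, Missing))

print(mean(observed))   # computed only from real observations
print(reasons)          # how many values are missing, and why
```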
recoding data • can readily recode “down” the scale (ex. ratio → ordinal) • implies a loss of information and a probably wasted recording effort • reporting apparently dubious counts as presence/absence data is not a good idea • moving ‘up’ the scale means redoing lab work…
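A small sketch of recoding "down" the scale, using hypothetical cut points and values. It shows the information loss directly: distinct ratio measurements collapse into the same ordinal category, and the original values cannot be recovered from the recoded ones.

```python
def to_ordinal(length_cm):
    """Recode a ratio measurement into an ordinal class (hypothetical cut points)."""
    if length_cm < 5:
        return "small"
    if length_cm < 10:
        return "medium"
    return "large"

def to_presence(count):
    """Recode a count into presence (1) / absence (0)."""
    return 1 if count > 0 else 0

lengths = [5.1, 9.8, 12.4]
ordinal = [to_ordinal(x) for x in lengths]
presence = [to_presence(n) for n in [0, 1, 37]]
print(ordinal)    # 5.1 and 9.8 are now indistinguishable
print(presence)   # 1 and 37 are now indistinguishable
```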
data management • three main options for electronic storage of data: • spreadsheet • statistics package • database
‘spreadsheet’ • organized by cells • no restrictions on cell content • most useful for short-term manipulation of small datasets • poor for long-term storage of complex data structures
‘stat-pac’ • data forms offer less versatility than spreadsheets • organized by case & variable • powerful analytical tools • poor management tools
pottery design elements: ‘reptile eye’, ‘obsidian knife’, ‘cloud motif’, etc.…
“structured query language” (SQL)
SELECT artifacts.catNum, [design elements].abbrev
FROM [design elements] INNER JOIN (artifacts INNER JOIN [design element link] ON artifacts.ID = [design element link].artID) ON [design elements].ID = [design element link].deID;
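The query above joins artifacts to design elements through a link table, which is how a database handles one artifact carrying several design elements. The sketch below reproduces that structure with Python's built-in `sqlite3` module rather than the Access-style database the slide appears to use. The table and column names follow the query; the rows, catalogue numbers, and abbreviations are hypothetical, and the join is written in a flat form intended to be equivalent to the nested one.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE artifacts (ID INTEGER PRIMARY KEY, catNum TEXT);
    CREATE TABLE [design elements] (ID INTEGER PRIMARY KEY, abbrev TEXT);
    CREATE TABLE [design element link] (artID INTEGER, deID INTEGER);

    -- Hypothetical rows for illustration only.
    INSERT INTO artifacts VALUES (1, 'A-001'), (2, 'A-002');
    INSERT INTO [design elements] VALUES (1, 'RE'), (2, 'OK'), (3, 'CM');
    INSERT INTO [design element link] VALUES (1, 1), (1, 3), (2, 2);
""")

# One output row per (artifact, design element) pairing in the link table.
rows = conn.execute("""
    SELECT artifacts.catNum, [design elements].abbrev
    FROM artifacts
    INNER JOIN [design element link]
        ON artifacts.ID = [design element link].artID
    INNER JOIN [design elements]
        ON [design elements].ID = [design element link].deID
    ORDER BY artifacts.catNum, [design elements].abbrev
""").fetchall()
conn.close()

for cat_num, abbrev in rows:
    print(cat_num, abbrev)
```

Keeping the pairings in a separate link table means an artifact can carry any number of design elements without adding columns, something a spreadsheet or case-by-variable layout handles poorly.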