Multilevel networks and world ethnography. Doug White and UC team UCI human complexity seminar 1:30 Fri, Sept 24, 2010 UCI grads and undergrads can sign up for 1.33 credits: SOC SCI 240A SEM A (72100). UCI_imbs+UCSD_econ+UCLA_cs, project team. Scott D. White, UCI , One Spot
Multilevel networks and world ethnography Doug White and UC team UCI human complexity seminar 1:30 Fri, Sept 24, 2010 UCI grads and undergrads can sign up for 1.33 credits: SOC SCI 240A SEM A (72100)
Project team (UCI_imbs + UCSD_econ + UCLA_cs)
• Scott D. White, UCI, One Spot
• B. Tolga Oztan, MBS
• Ren Feng, Xi’an Jiaotong Univ.
• Doug White, MBS
Assistance from
• Judea Pearl, UCLA
• Karim Chalak, BU, Econ
• Halbert White, UCSD
• Tony Eff, MTSU, Econ
Comparing ethnographic data
• Early, best-described ethnographies give us our best chance of understanding the evolution of societies and cultures.
• We now have large samples: N = 186, 390, 1400, ….
• And many variables: V = 2000+ (SCCS), 500+ (foragers), 150 (EthnoAtlas), ….
• The problem is that correlations of variables have no meaning on their own.
• Historical interactions have huge effects: splitting, branching, merging, borrowing, migrating, colonizing, conquering.
• We still have, among these ethnographic cases, living societies like the Hadza with the genetic stock of humanity’s common ancestors of 150,000 years ago, or the San with the next split of 120,000 years ago, etc.
• The issue is not whether to split the historical network interaction and regional similarities from functional and causal relations (Harold Driver’s struggle with Kroeber).
• It’s that the statistics for doing so have been too weak to support these inferences, and the concept of inference itself has been too weak.
Inference, Statistics, Causality
• Statistical inference has to do with replicability, or change, given changing conditions. In survey data, it’s hard to get replication because of changing
  • Composition of the sample: location, composition
  • Peer effects and interactions within the sample
  • Time period
• So results will vary with different (e.g., cross-cultural) samples.
• Looking for invariance is like Norm Schofield’s question: “What are the causal variables for how people vote” other than the names and parties of the candidates? I.e., no proper nouns in the variables for causality.
• When there are peer effects operating in the sample, significance tests are exaggerated (type 1 error). Cross-cultural studies are full of type 1 error.
• Random sampling (the Ember & Bernard solution) does not solve this problem, although means and correlations may be estimated correctly.
• But correlations will vary widely with changes in the sample or with time.
• Replication is thwarted by finding differences in replication when none exist (type 2 error).
• A stronger concept of inference would deal with causation, not correlation.
• This began with structural equation models (SEM; Sewall Wright 1921, 1935) and continues with Judea Pearl et al.’s extensions of graphical methods.
Where we are now: *Rccs*
• New program package *Rccs* takes into account network effect matrices (distance, language, etc.)
• Computes regression coefficients
• R code for 2SLS implemented for classroom use
  • (see pdfs attached to intersci wiki talk)
• Computes causal graph models (in development)
  • E.g., computes total effects (adding indirect paths)
  • Chalak and H. White extension: reciprocal effects x → y plus y → x → y, included as the indirect path
Solving Galton’s problem with 2SLS: 2-stage ordinary least squares regression with peer effects
• Stage 1 OLS: calculate the “Instruments”
• Stage 2 OLS: include the “Instruments”
[Diagram: independent variables X1, X2, X3 and peer effects (labels “II”, “WI” in the slide) point to the dependent variable Y]
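The two stages can be sketched in a few lines of Python. This is a minimal illustration, not the *Rccs* code: it assumes the common spatial-lag form y = ρ·Wy + Xβ + ε, uses X, WX and W²X as instruments for the peer term Wy, and runs on simulated data. The ring-shaped weight matrix, the function names and all parameter values are hypothetical.

```python
import numpy as np

def ring_weights(n, k=2):
    """Row-normalized W linking each unit to k neighbors on each side of a ring."""
    W = np.zeros((n, n))
    idx = np.arange(n)
    for step in range(1, k + 1):
        W[idx, (idx + step) % n] = 1.0
        W[idx, (idx - step) % n] = 1.0
    return W / W.sum(axis=1, keepdims=True)

def ols(design, target):
    """Ordinary least squares coefficients."""
    return np.linalg.lstsq(design, target, rcond=None)[0]

def two_stage_peer_regression(y, X, W):
    """Return [intercept, betas..., rho] for y = rho*Wy + X*beta + error."""
    const = np.ones((len(y), 1))
    Wy = W @ y                                    # endogenous peer term
    WX = W @ X
    instruments = np.hstack([const, X, WX, W @ WX])
    Wy_hat = instruments @ ols(instruments, Wy)   # stage 1: fitted peer term
    design = np.hstack([const, X, Wy_hat[:, None]])
    return ols(design, y)                         # stage 2: use fitted peer term

# Simulated data with known (hypothetical) parameters.
rng = np.random.default_rng(0)
n, rho_true = 1000, 0.5
beta_true = np.array([2.0, -1.0])
W = ring_weights(n)
X = rng.normal(size=(n, 2))
eps = rng.normal(size=n)
y = np.linalg.solve(np.eye(n) - rho_true * W, 1.0 + X @ beta_true + eps)

coef = two_stage_peer_regression(y, X, W)
beta_hat, rho_hat = coef[1:-1], coef[-1]
print("beta estimates:", beta_hat, "peer-effect estimate:", rho_hat)
```

With several network matrices (distance, language, and so on), each one would contribute its own peer term and its own set of instruments in the same way.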
Causal graph, Pearl’s regression method
[Diagram: nodes X, Y, Z; edge labels a, b, c]
“Say for three variables you are trying to estimate the direct effect c of X on Z given an indirect effect of Y. The causal diagram model gives you a license to do it by the regression method, where, for example, (for reference on the pdf)

$$c = \frac{E(y \mid x, z) - E(y \mid x', z)}{x - x'} \tag{1}$$

Controlling for the change from x to x´, E(y|x, z) and E(y|x´, z) are the changes in variable Z due to unit changes in X controlling for Y.” (email from Pearl; see Pearl 2000:151, 368; Chalak and White 2010).
Because the x,z in (y|x,z) is a joint distribution, eqn (1) means that x→x´ changes y, which through the x-y-z path, considered as a joint distribution, changes z. From this it follows, given the single door criterion (Pearl 2000:150), that c + a•b = rxy.z, the coefficient for the total effect of X on Z.
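The relation between direct, indirect and total effects can be checked numerically. The sketch below assumes a linear model in which a is the X → Y coefficient, b the Y → Z coefficient and c the direct X → Z coefficient (this labeling of the diagram is an assumption), and simulates data with made-up values. For OLS the decomposition total = c + a·b holds exactly in the sample, not just approximately.

```python
import numpy as np

def slopes(target, *regressors):
    """OLS slopes of target on the regressors (an intercept is fitted and dropped)."""
    design = np.column_stack([np.ones(len(target)), *regressors])
    return np.linalg.lstsq(design, target, rcond=None)[0][1:]

# Hypothetical path coefficients for the simulation.
a_true, b_true, c_true = 0.8, 0.5, 0.3

rng = np.random.default_rng(1)
n = 5000
x = rng.normal(size=n)
y = a_true * x + rng.normal(size=n)               # X -> Y
z = c_true * x + b_true * y + rng.normal(size=n)  # X -> Z and Y -> Z

a_hat = slopes(y, x)[0]           # effect of X on Y
c_hat, b_hat = slopes(z, x, y)    # direct effect of X on Z, controlling for Y
total_hat = slopes(z, x)[0]       # total effect of X on Z, Y left uncontrolled

print("direct:", c_hat, "indirect:", a_hat * b_hat, "total:", total_hat)
```

Regressing Z on X while controlling for Y recovers the direct effect; dropping Y from the regression folds the indirect path a·b into the slope.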
Comparing causality in ethnographic data
• R package *Rccs* takes into account any number of peer effects as Instruments in the previous equations, which allows further causal analysis
• Regressions change with time periods
• Correlate total effects to X → Y (time-lagged correlations)
• Regressions that yield results for causality can be identified using Pearl’s (2000) single door, back door, and front door causal graph criteria. Some graphical structures may require that some potential confounder variables be blocked, and that direct, indirect and total effects be computed from regression coefficients without those confounders.
Language families as an Instrument for measuring peer effects Multilevel effect language tree
Spatial distances as Instrument for measuring peer effects Standard Cross-Cultural Sample (SCCS wikipedia maps by Tony Eff) Afro-Eurasia drawn to a slightly smaller scale Multilevel effect spatial
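One way a spatial network-effect matrix could be built from site coordinates is sketched below. The slides do not state the weighting scheme, so the inverse great-circle-distance weights with row normalization are an assumption, and the three sites are invented coordinates rather than SCCS societies.

```python
import numpy as np

EARTH_RADIUS_KM = 6371.0

def great_circle_km(coords_deg):
    """Pairwise haversine distances for rows of (latitude, longitude) in degrees."""
    lat = np.radians(coords_deg[:, 0])
    lon = np.radians(coords_deg[:, 1])
    dlat = lat[:, None] - lat[None, :]
    dlon = lon[:, None] - lon[None, :]
    a = (np.sin(dlat / 2) ** 2
         + np.cos(lat)[:, None] * np.cos(lat)[None, :] * np.sin(dlon / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * np.arcsin(np.sqrt(np.clip(a, 0.0, 1.0)))

def inverse_distance_weights(dist):
    """Zero diagonal, weights proportional to 1/distance, each row summing to 1."""
    inv = np.zeros_like(dist)
    off_diag = dist > 0
    inv[off_diag] = 1.0 / dist[off_diag]
    return inv / inv.sum(axis=1, keepdims=True)

# Three hypothetical sites on the equator at longitudes 0, 1 and 10 degrees.
sites = np.array([[0.0, 0.0], [0.0, 1.0], [0.0, 10.0]])
dist_km = great_circle_km(sites)
W_space = inverse_distance_weights(dist_km)
print(np.round(W_space, 3))
```

Multiplying such a matrix by a variable gives each society the distance-weighted average of that variable among its neighbors, which is the quantity a peer-effect regression instruments.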
World system peer effects (of exchange) as Instruments. Folded image: Core, Semi-periphery 1, Semi-periphery 2, Periphery 1, Periphery 2. Multilevel effects: world system
A structurally endogamous kinship network core of a Turkish nomad clan (White and Johansen 2005: 379; 76-79). Multilevel effects internal networks Up and down effects
13 linked regressions out of 2000+ SCCS variables http://eclectic.ss.uci.edu/~drwhite/courses/SCCCodes.htm Nodes are variables in regression analyses of variables from the Standard Cross-Cultural Sample of 186 societies (SCCS). Lines represent independent variables. They point down to 13 dependent variables in successive colored layers. Black lines are positive effects, red lines negative effects from regression results. Colors of nodes for variables show depth in a causal hierarchy with net effects estimated as causal graphs (Pearl 2000). At level 4 the Evil eye dependent variable has a triangular relationship with money and milked domestic animals. The regressions control for peer effects of spatial transmission (distance) and cultural transmission (language phylogeny), incorporated as Instrumental Variables in a second-stage regression, with the IVs estimated in a first-stage regression. Node sizes reflect the significance of spatial transmission peer effects. Language effects are sometimes negative.
Paired visual comparison of spatial distributions v1189 Belief in evil eye
v238 Moral gods == 4
238. HIGH GODS
Count  Code  Meaning
18     .     Missing data
68     1     Absent or not reported
47     2     Present but not active in human affairs
13     3     Present and active in human affairs but not supportive of human morality
40     4     Present, active, and specifically supportive of human morality
NOTE the circum-Mediterranean overlap with Evil eye (previous slide)
Paired visual comparison of spatial distributions v1189 Belief in evil eye (dichotomy) Large nodes red Small nodes orange
v155 True money == 5
155. SCALE 7 - MONEY (here, an independent variable)
Count  Code  Meaning
77     1     None
14     2     Domestically usable articles
43     3     Alien currency
27     4     Elementary forms
25     5     True money
NOTE the circum-Mediterranean overlap with Evil eye (previous slide)
Paired visual comparison of spatial distributions v1189 Belief in evil eye
v272 Caste stratification
272. CASTE STRATIFICATION (ENDOGAMY) (two cases have secondary castes)
Count  Code  Meaning
5      .     Missing data
154    0     Absent or insignificant (omitted from map)
17     1     Despised occupational group(s)
3      2     Ethnic stratification
7      3     Complex
NOTE the circum-Mediterranean overlap with Evil eye (previous slide)
Paired visual comparison of spatial distributions v1189 Belief in evil eye
v245 Milked animals NOTE the circum-Mediterranean overlap with Evil eye (previous slide)
v1189 Belief in evil eye: R2 = 0.513; N = 186; 10 imputations; standard errors and R2 adjusted for two-stage least squares. Language non-significant (p > .33). No effect of Islam or Christianity.
v1189 Belief in evil eye
• Some nonlinear relationships
• No additional variables
• Error terms homoskedastic
• Error terms not normally distributed
• No cultural lag in error terms
• No spatial lag in error terms
v155 Money: R2 = 0.490; N = 186; 10 imputations; standard errors and R2 adjusted for two-stage least squares. Distance (p < .00002) and language (p < .003) significant.
v155 Money
• No nonlinear relationships
• Some additional variables
• Error terms homoskedastic
• Error terms normally distributed
• No cultural lag in error terms
• No spatial lag in error terms
v238 Moral gods: R2 = 0.504; N = 186; 10 imputations; standard errors and R2 adjusted for two-stage least squares. Distance significant (p < .00001); language insignificant (p > .15).
v238 Moral gods
• No nonlinear relationships
• No additional variables
• Error terms approximately homoskedastic
• Error terms not normally distributed
• No cultural lag in error terms
• No spatial lag in error terms
Transmission effects (Galton’s problem): Spatial and cultural
Negative peer effects for language indicate that each of these dependent variables tends NOT to be the result of cultural tradition but of innovation that differentiates the societies with Money, Moral gods and Evil eye from the norms in their respective language families. The tendency is strong for Money and weak for the other two variables. It is nearly significant (p < 0.15) for societies with Moral gods.
Excluding peer effects: Causal graph with multiple triangular regression coefficients (numbers are the regression coefficients). Causal graph total effects and regression slopes
THESE CAUSALITIES A-E-D-B-C ARE TRANSITIVE, all significant or nearly so, and completely ordered, but the arrow from A to B is NEGATIVE.
A = Milked domestic animals
E = Caste stratification
D = Moral gods (to money only p < .15)
B = Money
C = Evil eye
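Total effects in an ordered causal graph like this one can be obtained from the direct coefficients by summing the products along every directed path. The sketch below does this with a matrix identity, (I − B)⁻¹ − I, which is valid for acyclic graphs. The coefficient values are invented for illustration and are not the slide’s estimates, and the edge list (the chain A → E → D → B → C plus a negative A → B arrow) is an assumed simplification of the figure.

```python
import numpy as np

nodes = ["A", "E", "D", "B", "C"]
index = {name: i for i, name in enumerate(nodes)}

# direct[i, j] holds the direct effect of node i on node j (hypothetical values).
direct = np.zeros((len(nodes), len(nodes)))
hypothetical_edges = {
    ("A", "E"): 0.4,
    ("E", "D"): 0.5,
    ("D", "B"): 0.3,
    ("B", "C"): 0.6,
    ("A", "B"): -0.2,   # the negative arrow from A to B
}
for (src, dst), coef in hypothetical_edges.items():
    direct[index[src], index[dst]] = coef

# For an acyclic graph, I + B + B^2 + ... = (I - B)^-1, so removing the
# identity leaves the sum of path products of every length >= 1.
identity = np.eye(len(nodes))
total = np.linalg.inv(identity - direct) - identity

def total_effect(src, dst):
    return total[index[src], index[dst]]

print("A -> B total:", total_effect("A", "B"))   # direct -0.2 plus 0.4*0.5*0.3
print("A -> C total:", total_effect("A", "C"))
```

In this made-up example the positive indirect path through E and D only partly offsets the negative direct arrow, so the total effect of A on B stays negative.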
A 2-slide example for two time periods is next, if time allows (package *Rccs* applies to time series and includes multiplicative interactions as well)
Causal analysis: Transformation predictions from Indian Jajmani to market system
Data source: Martin Orans. 1968. Maximizing in Jajmaniland: A Model of Caste Relations. American Anthropologist 70(5): 875–897.
• Peer effect regression, time 1
• Correct time 2 predictions match causal inferences
• (Temporal predictions about changes are even stronger)
[Figure annotations: R2 = .672, .623, .747; p = .067, .055, .05, .067]
Causal graphs may incorporate multiplicative or interaction effects, which are used by Martin Orans in his 1968 article. These are diagrammed:
[Diagram: interaction models linking Power concentration, Isolation, and Ritual-secular correlation to the Jajmani system]
None of these models was significant, however, compared to the simple linear additive effects that we tested and found significant (Ren Feng, T. Oztan, D. White)
Further slides, if time allows, show different kinds of analysis than that of *Rccs*
Other kinds of cross-cultural data structures and analyses:
• Statistical Entailment Analyses:
  • Society sets for variables tend to form chains of sets A ⊂ B ⊂ C ⊂ D
• Galois duality lattice (Concept lattices):
  • Society sets for variables tend to form chains of sets A ⊂ B ⊂ C ⊂ D and intersections, and opposite ordering of
  • Sets of variables that tend to form chains of sets VS1 ⊂ VS2 ⊂ VS3 ⊂ VS4
• Intrasocietal network structure overlays on genealogy
  • For each society these will define new variables such as
    1) sidedness, reciprocal marriage to opposites
    2) structurally endogamous groups
    3) marriage-type census as against random simulation
    4) distribution of structural features over generations
• Multilevel analysis, e.g. regional or world system effects on local societies.
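The idea of society sets forming chains can be illustrated with a small exact check: one variable entails another when every society that has the first also has the second. This sketch tests strict inclusion only, whereas statistical entailment analysis tolerates exceptions. The function names, trait labels and society IDs are hypothetical.

```python
def entails(narrower, broader):
    """True when every society in `narrower` is also in `broader`."""
    return set(narrower) <= set(broader)

def inclusion_chain(trait_sets):
    """Return trait names ordered from narrowest to broadest if the society
    sets are nested in a single chain, otherwise None."""
    ordered = sorted(trait_sets.items(), key=lambda item: len(item[1]))
    for (_, smaller), (_, larger) in zip(ordered, ordered[1:]):
        if not entails(smaller, larger):
            return None
    return [name for name, _ in ordered]

# Hypothetical society IDs possessing each trait.
nested = {"A": {1}, "B": {1, 2}, "C": {1, 2, 3}, "D": {1, 2, 3, 4}}
not_nested = {"A": {1}, "B": {1, 2}, "C": {1, 3, 5}, "D": {1, 2, 3, 4}}

chain = inclusion_chain(nested)
broken = inclusion_chain(not_nested)
print(chain, broken)
```

A concept lattice generalizes this check by also tracking the intersections of society sets, so that variables which do not fall on a single chain still get a place in the partial order.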
Fig. 3: An exact world entailment digraph for the sexual division of labor
[Figure labels: Late Task A, Early Task B; Female, Male]
Fig. 3: An exact world ethnographic lattice of kin avoidances has a four-dimensional partial ordering of distributions: 1) parents of husband and wife (opposite/same sex, within circles), 2) siblings and siblings-in-law of husband and wife (opposite/same sex, in parallelograms), 3) opposite-sex siblings, parents’ siblings and parallel cousins (White 1995). Lower types of avoidance entail the upper ones in perfect inclusion relations, found by statistical entailment analysis (White 1999b). For the 250 societies, names attached to each node show each subset of avoidance relations.
1 http://kinsource.net/kinsrc/bin/view/KinSources archives kinship network data contributed by anthropologists. Only three KS ethnographies remain for conversion from paper-based genealogies to e-networks for analysis with Pajek, but others will be added.
2,5 Binford’s (2001) Constructing Frames of Reference forager database has been spreadsheeted by Boehm and Hill. Non-foragers from the SCCS will be analyzed separately. Extensive testing of “peer effects” methods has established their validity.
3 Smith and White (1992) have postwar WS commodity flow time series in 5-year intervals; capital and migration flow will be added.
4 Murdock’s Ethnographic Atlas (EA) in SPSS format has been supplemented by newly authored installments 30-31.
5 Murdock and White’s (1969) Standard Cross-Cultural Sample dataset on 186 societies in SPSS and R formats has coded data contributions from 80+ different authors on 2008+ variables. Citations to SCCS are now 95+/year and growing.
6 109 missing codes for 28 SCCS variables 1006-1115 will be coded for 28 SCCS societies on the world-system impacts variables partially coded in White and Burton’s (1985-1988) NSF 8507685 funded research on “World-Systems and Ethnological Theory.”
7 To bring the SCCS societies up to date for post-1965 societies, 30 well-described post-1965 ethnographic cases will be added to an (expanded) eSCCS and coded for EA variables and the CDC Cultural Diversity Codebook of 180 SCCS variables.
8 Given that the SCC Sample was published in 1969, the eSCCS additions to the sample will bring it up to date temporally. This will allow study of world-system impacts on 37 well-described ethnographic cases in the contemporary post-war period.
Similarly, Wolf (1982) drills down into several hundred ethnographic data points to analyze how commodity exchange affected indigenous societies in the 1500-1980 period of overseas conquest and modern world-systems. Interactive maps provide for drilling down from a network at one level (network spread of disease not shown here) by clicking a node to see a more detailed map or a network within that node. The upper-level nodes can be societies, with organizational networks reached by clicking a given node.
Fig. 1.A. Gmap of Cultural Survival (2010) 100+ recent trouble spot study cases: Gmaps extend to networks at the global level, clicking into cases at the local level. Live: http://bit.ly/c1funC
Fig. 1.B. This Google map tracks cases of swine flu in 2009. Types of cases are color coded, fatal cases have no dot, and clicking a region gives a more detailed map of cases within the region.