Factor Analysis

Factor Analysis: Introduction to the Concept
Reading: Individual Differences by Colin Cooper

Common Terms Used

Factor Analysis - Intro
• Data reduction: identifies parts of a data set that potentially measure the same thing.
• Commonly encountered through the identification of personality dimensions.
• Hundreds of questions relating to components of personality are compiled, for example:
  • Do you enjoy socialising with different people at parties?
  • Do you worry a lot?
  • Do you enjoy trying out new things?
  • Do you get upset very easily?
• Questions that different respondents consistently answer in a similar manner supposedly address the same underlying construct or ‘Common Factor’, e.g., Extraversion-Introversion or Neuroticism.

Factor Analysis - Intro
• Most data generated from responses can be subjected to FA, so it is not limited to questionnaires; e.g., a series of physical tests may have one or two core skills as their essence.
• It is arguably the most abused statistical technique in use. It generates much controversy, and the treatment here is very simplified.
• There are many different types. The simplest form is described here; others are identified in due course.

Identifying No. of factors - inspection of responses

Sample questions (answer 1 for strongly agree and 5 for strongly disagree):

Q1  I enjoy socialising                    1  2  3  4  5
Q2  I often act on impulse                 1  2  3  4  5
Q3  I am a cheerful sort of person         1  2  3  4  5
Q4  I often feel depressed                 1  2  3  4  5
Q5  I have difficulty getting to sleep     1  2  3  4  5
Q6  Large crowds make me feel anxious      1  2  3  4  5

Sample responses:

            Q1  Q2  Q3  Q4  Q5  Q6
Stephen      5   5   4   1   1   2
Ann          1   2   1   1   1   2
Paul         3   4   3   4   5   4
Janette      4   4   3   1   2   1
Michael      3   3   4   1   2   2
Christine    3   3   3   5   4   5

Tentative inferences
• Responses to Q1-3 were very similar to one another, as were responses to Q4-6. This suggests that the questions within each group address the same common factor.
• Spotting this by eye was made easy by the fact that
  • related question items were positioned side by side
  • there were very few participants
• In normal situations this would be impossible.
• Usually a correlation matrix is required to identify which items are related to each other.

Correlation matrix

       Q1     Q2     Q3     Q4     Q5     Q6
Q1    1
Q2    .933   1
Q3    .824   .696   1
Q4   -.096  -.052   0      1
Q5   -.005   .058   .111   .896   1
Q6   -.167  -.127   0      .965   .808   1

Correlation matrix depicting the correlations between the six items, given the responses previously documented.
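As a rough check, the correlation matrix can be recomputed from the six sample responses. The sketch below uses only the Python standard library; the names `responses`, `pearson`, `items` and `corr` are illustrative and not part of the original slides.

```python
import math

# Responses copied from the sample response table (one list per respondent, Q1..Q6).
responses = {
    "Stephen":   [5, 5, 4, 1, 1, 2],
    "Ann":       [1, 2, 1, 1, 1, 2],
    "Paul":      [3, 4, 3, 4, 5, 4],
    "Janette":   [4, 4, 3, 1, 2, 1],
    "Michael":   [3, 3, 4, 1, 2, 2],
    "Christine": [3, 3, 3, 5, 4, 5],
}

def pearson(x, y):
    """Pearson correlation between two equal-length sequences of scores."""
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cross = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    ss_x = sum((a - mean_x) ** 2 for a in x)
    ss_y = sum((b - mean_y) ** 2 for b in y)
    return cross / math.sqrt(ss_x * ss_y)

# Transpose the table so that there is one sequence of scores per question.
items = list(zip(*responses.values()))

# corr[i][j] is the correlation between Q(i+1) and Q(j+1).
corr = [[pearson(a, b) for b in items] for a in items]

# Print only the lower triangle, in the same layout as the slide.
for i, row in enumerate(corr, start=1):
    print(f"Q{i}", " ".join(f"{r:6.3f}" for r in row[:i]))
```

With only six respondents the correlations are very unstable, so this is an illustration of the arithmetic rather than a usable analysis.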

Interpretation
• Q1-3 correlate strongly with each other and hardly at all with Q4-6, indicating 2 common factors.
• This would not be typical.
  • The correlations here are artificially large. In real life they would rarely exceed .5 and would typically be between .2 and .3, which would make it very difficult to establish a pattern by eye.
  • With more items there would be a greater number of correlations to inspect. 6 items produce 15 correlations; 40 items would produce 780 correlations, from N(N-1)/2.
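The N(N-1)/2 count is simply the number of distinct pairs of items. A minimal sketch (the function name `n_correlations` is made up for illustration):

```python
from itertools import combinations

def n_correlations(n_items):
    """Number of distinct item pairs, i.e. N(N-1)/2."""
    return n_items * (n_items - 1) // 2

# Cross-check the formula against an explicit enumeration of pairs.
assert n_correlations(6) == len(list(combinations(range(6), 2)))

print(n_correlations(6))   # 15
print(n_correlations(40))  # 780
```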

Representing FA through geometry
[Figure: lines for Q4 and Q6 separated by an angle, with the adjacent side and hypotenuse labelled]
• Items or factors can be represented by straight lines of equal length.
• Lines are positioned such that the correlation between the items = cosine of the angle between them.
• cosine of angle = adjacent / hypotenuse
• correlation = .97, cosine = .97, angle = 15°

Interpreting angles
[Figure: factors F1, F2, F3, ... drawn as lines at different angles to F1]
• F1 - F2 = 15°, r = .97
• F1 - F3 = 105°, r = -.26
• F1 - F4 = 165°, r = -.97
• F1 - F5 = 285°, r = .26
• Factors/items above the horizontal (that is, within 90° of F1) are positively correlated with F1
• Factors/items at right angles to F1 have zero correlation with it
• Factors/items at 180° to F1 have a perfect negative correlation with it
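The angle examples can be reproduced by converting between angles and correlations with the cosine. This is a small sketch using Python's `math` module; the helper names are illustrative.

```python
import math

def correlation_from_angle(degrees):
    """Correlation implied by the angle (in degrees) between two lines."""
    return math.cos(math.radians(degrees))

def angle_from_correlation(r):
    """Angle in degrees (between 0 and 180) implied by a correlation r."""
    return math.degrees(math.acos(r))

# Angles taken from the slide, plus the 90 and 180 degree special cases.
for angle in (15, 90, 105, 165, 180, 285):
    print(f"{angle:3d} degrees -> r = {correlation_from_angle(angle):+.2f}")

# Going the other way: the angle for a correlation of .97.
print(f"r = .97 -> {angle_from_correlation(0.97):.1f} degrees")
```

Working backwards from r = .97 gives an angle of roughly 14°, so the 15° quoted on the slide is a rounded figure.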

Combining Factors and Items
[Figure: items I1-I6 drawn as lines together with factors F1 and F2]
• Roughly orthogonal solution for the items described previously.

Possible relationships between Common Factors
• Orthogonal solution: the common factors extracted are not themselves correlated, i.e., they are at right angles to each other. This is preferable, since uncorrelated common factors truly represent independent factors.
• Oblique solution: the common factors extracted may themselves be correlated.

Essential FA output & associated statistical concepts
• Factor (structure) matrix: table showing the correlations between all the items and the factors. By convention, factors are shown as columns.
• Factor loading: the correlation between an item and a factor.
• NB: the factor matrix is different from the correlation matrix, which holds the correlations between the items themselves.

Factor matrix shows 3 things / 1 & 2
• Which items make up which common factor
  • By convention, an item only contributes to a factor if its loading exceeds .3 in absolute value (i.e., is above .3 or below -.3).
• Reveals the amount of overlap between each item and all the factors
  • The square of a correlation indicates the common variance between item and factor. The sum of an item's squared correlations across the factors is the communality of the item.
  • For I1: .9² + .1² = .82
  • The communality for an item may be low because the item
    • measures something conceptually different from all the other items
    • has excessive measurement error
    • shows few individual differences in the way it is responded to (it may be very easy or very difficult)
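The communality arithmetic for I1 can be sketched as follows. Only I1's two loadings (.9 and .1) come from the slide; the function names and the `threshold` parameter are illustrative assumptions.

```python
def communality(loadings):
    """Sum of an item's squared loadings across all factors."""
    return sum(loading ** 2 for loading in loadings)

def salient_factors(loadings, threshold=0.3):
    """1-based indices of factors whose loading exceeds the cut-off in absolute value."""
    return [i for i, loading in enumerate(loadings, start=1) if abs(loading) > threshold]

# Loadings of item I1 on factor 1 and factor 2, as given on the slide.
i1_loadings = [0.9, 0.1]

print(round(communality(i1_loadings), 2))  # 0.82
print(salient_factors(i1_loadings))        # [1]
```

Under the .3 convention, I1 would be read as contributing to factor 1 only.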

Factor matrix shows 3 things / 3
• Indicates the relative importance of each common factor; e.g., a factor that explains 40% of the overlap between the items is more important than one that explains only 25%.
• Importance is calculated through an eigenvalue.
  • Square the factor loadings for a single factor and add them up; the total is the eigenvalue.
  • Divide the eigenvalue by the number of items to obtain the proportion of variance explained by that factor.

Calculating for Factor 1
• eigenvalue = .9² + .98² + .9² + .1² + 0² + (-.1)² = 2.6
• Variance explained by factor 1 = 2.6 / 6 = 43%
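The same calculation as a short sketch, using the six factor 1 loadings quoted on the slide (variable names are illustrative):

```python
# Loadings of items I1..I6 on factor 1, as quoted on the slide.
factor1_loadings = [0.9, 0.98, 0.9, 0.1, 0.0, -0.1]

# Eigenvalue: the sum of the squared loadings on this factor.
eigenvalue = sum(loading ** 2 for loading in factor1_loadings)

# Proportion of variance explained: eigenvalue divided by the number of items.
proportion = eigenvalue / len(factor1_loadings)

print(f"eigenvalue = {eigenvalue:.2f}")          # eigenvalue = 2.60
print(f"variance explained = {proportion:.0%}")  # variance explained = 43%
```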

Additional observation
• These calculations indicate the possibility that some of the variance may be unexplained by the factors. Possible explanation:
  • Factors are an approximation, so some of the original information is sacrificed during this process.
• The 2 different methods of exploratory FA (EFA) make different assumptions about the possibility of unexplained variance.

Principal Components Analysis vs. Principal Axis Factoring
• Both are examples of Exploratory FA but are distinguished by their assumptions regarding the possibility of unexplained variance.
• PCA: all item variance can be explained by the factors. All items will have a communality of 1 and the factors will, between them, account for 100% of the variation among the items.
  • Total variance = common factor variance + measurement error
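The PCA claim can be illustrated on the six sample responses. The sketch below assumes NumPy is available and extracts components by eigen-decomposition of the correlation matrix, which is one common way of carrying out PCA and not necessarily the routine behind the slides. When all six components are retained, every communality comes out as 1; keeping only the first two leaves some variance unaccounted for.

```python
import numpy as np

# Sample responses: rows are respondents, columns are Q1..Q6.
data = np.array([
    [5, 5, 4, 1, 1, 2],
    [1, 2, 1, 1, 1, 2],
    [3, 4, 3, 4, 5, 4],
    [4, 4, 3, 1, 2, 1],
    [3, 3, 4, 1, 2, 2],
    [3, 3, 3, 5, 4, 5],
], dtype=float)

# Item-by-item correlation matrix.
R = np.corrcoef(data, rowvar=False)

# eigh returns eigenvalues in ascending order; reorder to largest first.
eigenvalues, eigenvectors = np.linalg.eigh(R)
order = np.argsort(eigenvalues)[::-1]
# Clip tiny negative values caused by floating-point rounding.
eigenvalues = np.clip(eigenvalues[order], 0.0, None)
eigenvectors = eigenvectors[:, order]

# Component loadings: each eigenvector scaled by the square root of its eigenvalue.
loadings = eigenvectors * np.sqrt(eigenvalues)

# Communalities with all components kept, and with only the first two kept.
communalities_all = (loadings ** 2).sum(axis=1)
communalities_two = (loadings[:, :2] ** 2).sum(axis=1)

print("proportion of variance per component:", np.round(eigenvalues / eigenvalues.sum(), 3))
print("communalities, all components:", np.round(communalities_all, 3))
print("communalities, first two only:", np.round(communalities_two, 3))
```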

• PAF: items may have ‘unique variance’, i.e., variance which cannot be explained by the factors.
  • Suppose there are two test items: What is the capital of Italy? What is the capital of Spain?
  • Let's assume that they are of the same level of difficulty and therefore test the underlying factor (geographical knowledge) to the same degree. Will these items always be responded to in exactly the same way?
  • Someone may have a poor level of geographical knowledge but just happen to know the capital of Spain. It is therefore not possible to consider the two items as being completely equivalent.
  • A correct response depends on knowledge relating to
    • the common factor (geographical knowledge)
    • something unique to the individual item, its Specific Variance, which cannot be predicted from the common factors

  • Total variance = common factor variance + specific item variance + measurement error
• PAF is more complicated because it must determine how much of the variance relating to an item is ‘common-factor’ variance and how much is ‘specific variance’.
• PCA does not allow for the possibility of item-specific variance.

PCA or PAF?
• The two methods seem to produce very similar results, so much so that some researchers do not identify which one they are carrying out.
• Since PAF allows for specific variance, an item's communality is necessarily going to be less than one.
• Factor loadings for items are therefore going to appear less impressive with PAF than with PCA.

Used for 4 basic purposes / 1
• Shows how many distinct common factors are measured by a set of test items
  • Are the supposedly different constructs (neuroticism, anxiety, hysteria, ego strength, self-actualisation, and locus of control) 6 independent entities, or would they be better described as only 2 factors: ‘Elements of Pathology’ (neuroticism, anxiety, hysteria) and ‘Healthy mechanisms’ (ego strength, self-actualisation, locus of control)?

Used for 4 basic purposes / 2 & 3
• Shows which items relate to which common factors
  • In the previous example, neuroticism belonged to the factor ‘Elements of Pathology’.
• Determines whether tests that purportedly measure the same thing in fact do so
  • Take 3 tests that claim to measure anxiety: FA may produce more than one factor, indicating that something in addition to anxiety is being measured.

Used for 4 basic purposes / 4
• Checks the psychometric properties of a questionnaire: with a different sample, do the same factors materialise?
  • Would a different population, made up of Native American Indians, yield the constructs of Extraversion-Introversion and Neuroticism which have been found in European cultures?
