Multidimensional Scaling
Agenda • Multidimensional Scaling • Goodness of fit measures • Nosofsky, 1986
Proximities • $p_{\text{Amherst, Hadley}}$
Formal MDS Definition • $f: p_{ij} \to d_{ij}(X)$ • MDS is a mapping from proximities to corresponding distances in MDS space. • After a transformation $f$, the proximities are equal to distances in $X$.
Distances, $d_{ij}$ • $d_{\text{Amherst, Hadley}}(X) = 4.32$
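As a rough illustration of how a distance $d_{ij}(X)$ is read off a configuration, here is a minimal sketch that assumes Euclidean distance (the slides do not name the metric). The coordinates and the third point are hypothetical, chosen only to make the arithmetic easy to follow; they are not the configuration behind the 4.32 value.

```python
import math

def pairwise_distances(X):
    """Return the matrix of Euclidean distances d_ij(X) between the rows of X."""
    n = len(X)
    return [[math.dist(X[i], X[j]) for j in range(n)] for i in range(n)]

# Hypothetical 2-D coordinates (assumption: not taken from the slides).
config = {
    "Amherst": (0.0, 0.0),
    "Hadley": (3.0, 4.0),
    "Town C": (6.0, 0.0),  # made-up third point
}
names = list(config)
D = pairwise_distances([config[name] for name in names])

# Distance between the first two points of the hypothetical configuration.
d_amherst_hadley = D[names.index("Amherst")][names.index("Hadley")]
print(d_amherst_hadley)
```

The matrix is symmetric with zeros on the diagonal, which is why MDS programs usually work with only the lower or upper triangle.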
Proximities and Distances • [Figure: proximities shown beside the corresponding distances]
The Role of f • $f$ relates the proximities to the distances. • $f(p_{ij}) = d_{ij}(X)$
The Role of f • $f$ can be linear, exponential, etc. • In psychological data, $f$ is usually assumed to be some monotonic function. • That is, if $p_{ij} < p_{kl}$ then $d_{ij}(X) \le d_{kl}(X)$. • Most psychological data are on an ordinal scale, e.g., rating scales.
Looking at Ordinal Relations • [Figure: rank order of the proximities compared with rank order of the distances]
Stress • It is not always possible to perfectly satisfy this mapping. • Stress is a measure of how closely the model came. • Stress is essentially the scaled sum of squared error between $f(p_{ij})$ and $d_{ij}(X)$.
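One common way to make "scaled sum of squared error" concrete is Kruskal's Stress-1, which divides the squared error by the sum of squared distances and takes the square root. The slides do not say which scaling they use, so treat this as one plausible reading rather than the deck's definition. The function name and the sample values are hypothetical.

```python
import math

def stress1(transformed_proximities, distances):
    """Kruskal's Stress-1: sqrt( sum (f(p_ij) - d_ij)^2 / sum d_ij^2 ).

    Both arguments are flat sequences over the same pairs (i, j).
    """
    squared_error = sum((f - d) ** 2
                        for f, d in zip(transformed_proximities, distances))
    scale = sum(d ** 2 for d in distances)
    return math.sqrt(squared_error / scale)

# Hypothetical values for three pairs of objects.
perfect = stress1([1.0, 2.0, 3.0], [1.0, 2.0, 3.0])    # mapping satisfied exactly
imperfect = stress1([1.0, 2.0, 4.0], [1.0, 2.0, 3.0])  # one pair is off by 1
print(perfect, imperfect)
```

A stress of zero means the transformed proximities match the distances exactly; larger values mean a worse fit.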
Stress • “Correct” Dimensionality • [Figure: stress plotted against number of dimensions]
Distance Invariant Transformations • Scaling (all of X doubled in size, or flipped) • Rotation (X rotated 20 degrees left) • Translation (X moved 2 to the right)
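The transformations listed above can be checked numerically. The sketch below assumes a 2-D configuration and Euclidean distances; the points and helper names are hypothetical. It shows that rotation, translation, and flipping leave every pairwise distance unchanged, while doubling the configuration multiplies every distance by the same factor, so the rank order of the distances is unaffected.

```python
import math

def dists(X):
    """Flat list of Euclidean distances over all pairs i < j."""
    return [math.dist(a, b) for i, a in enumerate(X) for b in X[i + 1:]]

def rotate(X, degrees):
    """Rotate every point counter-clockwise about the origin."""
    t = math.radians(degrees)
    c, s = math.cos(t), math.sin(t)
    return [(c * x - s * y, s * x + c * y) for x, y in X]

def translate(X, dx, dy):
    return [(x + dx, y + dy) for x, y in X]

def scale(X, k):
    return [(k * x, k * y) for x, y in X]

def flip(X):
    """Reflect the configuration across the vertical axis."""
    return [(-x, y) for x, y in X]

X = [(0.0, 0.0), (3.0, 4.0), (1.0, -2.0)]  # hypothetical configuration
base = dists(X)

rotated = dists(rotate(X, 20))         # X rotated 20 degrees left
moved = dists(translate(X, 2.0, 0.0))  # X moved 2 to the right
flipped = dists(flip(X))
doubled = dists(scale(X, 2.0))         # every distance becomes 2x

def same(a, b):
    return all(math.isclose(p, q) for p, q in zip(a, b))

print(same(base, rotated), same(base, moved), same(base, flipped))
print(same([2 * d for d in base], doubled))
```

Because of these invariances, the orientation and origin of an MDS solution are arbitrary, which matters when interpreting the axes as dimensions.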
Uses of MDS • Visually look for structure in data. • Discover the dimensions that underlie data. • Psychological model that explains similarity judgments in terms of distance in MDS space.
Simple Goodness of Fit Measures • Sum-of-squared error (SSE) • Chi-Square • Proportion of variance accounted for (PVAF) • $R^2$ • Maximum likelihood (ML)
Proportion of Variance Accounted For • $(SST - SSE)/SST = (34 - 7.96)/34 = .77$
$R^2$ • $R^2$ is PVAF, but… $(SST - SSE)/SST = (34 - 44.03)/34 = -0.295$
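The two PVAF calculations above can be reproduced with a few lines. This is a minimal sketch using the SST and SSE values from the slides; the function names are hypothetical. The second case comes out negative because the model's SSE is larger than SST, meaning the model predicts worse than simply using the mean.

```python
def pvaf(sst, sse):
    """Proportion of variance accounted for: (SST - SSE) / SST."""
    return (sst - sse) / sst

def pvaf_from_data(observed, predicted):
    """Same measure computed from raw observations and model predictions."""
    mean = sum(observed) / len(observed)
    sst = sum((y - mean) ** 2 for y in observed)
    sse = sum((y - p) ** 2 for y, p in zip(observed, predicted))
    return pvaf(sst, sse)

good_fit = pvaf(34, 7.96)    # values from the PVAF slide
poor_fit = pvaf(34, 44.03)   # values from the R^2 slide, SSE > SST
print(round(good_fit, 2), round(poor_fit, 3))
```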
Maximum Likelihood • Assume we are sampling from a population with probability $f(Y; \theta)$. • $Y$ is an observation and $\theta$ are the model parameters. • With $\theta = [0]$: $N(Y = -1.7;\ \theta = [\mu = 0]) = 0.094$
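Maximum Likelihood • With independent observations, $Y_1 \ldots Y_n$, the joint probability of the sample observations is: $f(Y_1, \ldots, Y_n; \theta) = \prod_{i=1}^{n} f(Y_i; \theta)$ • With $\theta = [0]$ and observations $Y_1, Y_2, Y_3$: $0.094 \times 0.2661 \times 0.3605 = 0.0090$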
Maximum Likelihood • Expressed as a function of the parameters, we have the likelihood function: $L(\theta) = \prod_{i=1}^{n} f(Y_i; \theta)$ • The goal is to maximize $L$ with respect to the parameters, $\theta$.
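Maximum Likelihood • $\theta = [0]$, observations $Y_1, Y_2, Y_3$: $0.094 \times 0.2661 \times 0.3605 = 0.0090$ (assuming $\sigma = 1$) • $\theta = [-1.0167]$, observations $Y_1, Y_2, Y_3$: $0.3159 \times 0.3962 \times 0.3398 = 0.0425$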
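The likelihood values on this slide can be reproduced with a short sketch, assuming a normal density with $\sigma = 1$. Only $Y_1 = -1.7$ is stated in the slides; the values $-0.9$ and $-0.45$ used for $Y_2$ and $Y_3$ are assumptions, back-solved from the densities shown, and should be read as an illustration rather than the original data.

```python
import math

def normal_pdf(y, mu, sigma=1.0):
    """Density of a normal distribution with mean mu and sd sigma at y."""
    z = (y - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))

def likelihood(ys, mu, sigma=1.0):
    """Joint density of independent observations: product of the pdfs."""
    return math.prod(normal_pdf(y, mu, sigma) for y in ys)

# Y1 is from the slides; Y2 and Y3 are back-solved assumptions.
ys = [-1.7, -0.9, -0.45]

L_at_zero = likelihood(ys, mu=0.0)    # theta = [0]
mu_hat = sum(ys) / len(ys)            # sample mean
L_at_mean = likelihood(ys, mu=mu_hat)

print(round(L_at_zero, 4), round(mu_hat, 4), round(L_at_mean, 4))
```

Under these assumed observations the sample mean matches the second parameter value on the slide, and the likelihood there is larger than at $\theta = [0]$.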
Maximum Likelihood • Preferred to other methods: has very nice mathematical properties and is easier to interpret. We’ll see specifics in a few weeks. • Often harder (or impossible?) to calculate than other methods. • Often presented as log likelihood, ln(L): easier to compute (sums, not products) and better numerical resolution. • Sometimes equivalent to other methods, e.g., same as SSE when calculating the mean of a distribution.
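Two of the points above can be seen in a small sketch, assuming a normal density with $\sigma = 1$ and made-up observations. First, a product of many small densities underflows to zero in floating point while the sum of their logs stays finite. Second, over a grid of candidate means, the value that maximizes the log likelihood is the same value that minimizes SSE. All names and data here are hypothetical.

```python
import math

def log_likelihood(ys, mu, sigma=1.0):
    """Sum of normal log densities (a sum instead of a product)."""
    const = math.log(sigma * math.sqrt(2.0 * math.pi))
    return sum(-0.5 * ((y - mu) / sigma) ** 2 - const for y in ys)

def sse(ys, mu):
    """Sum of squared error of the observations around mu."""
    return sum((y - mu) ** 2 for y in ys)

# Numerical resolution: 100 densities of 1e-5 each.
densities = [1e-5] * 100
product = math.prod(densities)                # underflows to 0.0
log_sum = sum(math.log(d) for d in densities)  # stays finite

# Equivalence with SSE when estimating a mean (hypothetical data).
ys = [-1.7, -0.9, -0.45]
grid = [i / 1000 for i in range(-3000, 1)]     # candidate means in [-3, 0]
best_by_ll = max(grid, key=lambda m: log_likelihood(ys, m))
best_by_sse = min(grid, key=lambda m: sse(ys, m))

print(product, log_sum)
print(best_by_ll, best_by_sse)
```

The grid search is only for illustration; for this model the maximizing value is the sample mean, which can be computed directly.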