CSCI N207 Data Analysis Using Spreadsheet
11. Multivariate Analysis
Lingma Acheson
linglu@iupui.edu
Department of Computer and Information Science, IUPUI
Multivariate Data Analysis
• Univariate data analysis concerned itself with describing an entity using a single variable.
• Multivariate data analysis tries to establish a mathematical relationship between multiple data sets.
  • smoking/cancer
  • salary/productivity
  • temperature/chirps in 15 seconds
Correlation
• Multivariate data analysis depends largely on correlation.
• Correlation is a mathematical tool used to quantify the degree to which two variables vary together.
• Researchers use Pearson's Correlation Coefficient, signified by R, to represent correlation.
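The formula from the original slide is not available here. As a reference point, the standard textbook definition of Pearson's R for paired values $(x_i, y_i)$ is sketched below; the symbols $\bar{x}$, $\bar{y}$, $\sigma_X$ and $\sigma_Y$ (the means and standard deviations of the two data sets) are notation assumed for this sketch, not taken from the slide.

```latex
R = \frac{\operatorname{Cov}(X, Y)}{\sigma_X \, \sigma_Y}
  = \frac{\sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y})}
         {\sqrt{\sum_{i=1}^{n} (x_i - \bar{x})^2} \; \sqrt{\sum_{i=1}^{n} (y_i - \bar{y})^2}}
```

The factors of $1/n$ in the covariance and in the two standard deviations cancel, which is why the right-hand form contains only sums of deviations.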
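Review
• Variance: One measure of dispersion (deviation from the mean) of a data set. The larger the variance, the further the data lie, on average, from the average value.
• Standard deviation: Square root of the variance. Its magnitude is more in line with the values in the data set because it is expressed in the same units as the data.
$\text{Variance} = \frac{1}{n}\sum_{i=1}^{n}(x_i - \bar{x})^2$, where $\bar{x}$ = average value of the data set (divide by $n - 1$ instead of $n$ for a sample variance)
$\text{StandardDeviation} = \sqrt{\text{Variance}}$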
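The two review quantities can be checked with a few lines of Python. This is a minimal sketch using the standard library's `statistics` module; the data set is made up for illustration, and the population forms (dividing by n) are assumed, comparable to Excel's VARP and STDEVP.

```python
import statistics

# Hypothetical data set, chosen only so the results come out as round numbers.
data = [2, 4, 4, 4, 5, 5, 7, 9]

mean = statistics.mean(data)            # average value of the data set
variance = statistics.pvariance(data)   # population variance: divides by n
std_dev = statistics.pstdev(data)       # square root of the population variance

print("mean:", mean)
print("variance:", variance)
print("standard deviation:", std_dev)
```

For a sample rather than a whole population, `statistics.variance` and `statistics.stdev` divide by n - 1 instead.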
Measuring Correlation
• R: Values range between -1 (perfect negative, or inverse, correlation) and +1 (perfect positive correlation).
• A positive correlation (+) reflects a situation where an increase in the value of one variable accompanies an increase in the value of the second variable. An R value of +1 is called "perfect positive."
• A negative correlation (-) reflects a situation where an increase in the value of one variable accompanies a decrease in the value of the second variable (inverse correlation). An R value of -1 is called "perfect negative."
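The two extremes can be illustrated with a short Python sketch. The function name `pearson_r` and the data are hypothetical, chosen so that one series rises exactly linearly with the input and the other falls exactly linearly; the calculation follows the standard definition of Pearson's R.

```python
import math

def pearson_r(xs, ys):
    """Pearson's correlation coefficient R for two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of products of paired deviations from the means.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    # Sums of squared deviations for each variable.
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

# Hypothetical data: both series are exact straight lines in `hours`.
hours = [1, 2, 3, 4, 5]
rising = [2 * h + 1 for h in hours]     # increases as hours increases
falling = [10 - 3 * h for h in hours]   # decreases as hours increases

r_up = pearson_r(hours, rising)      # perfect positive correlation, R close to +1
r_down = pearson_r(hours, falling)   # perfect negative correlation, R close to -1
print(r_up, r_down)
```

Real data rarely fall exactly on a line, so R usually lands strictly between these two extremes.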
Measuring Correlation
• This measurement applies only to linear relationships; a strong nonlinear relationship can still produce an R near 0.
• Excel covariance function: =COVAR(Range1, Range2)
• Excel correlation function: =CORREL(Range1, Range2)
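For readers who want to see what the two Excel functions compute, here is a rough Python sketch. The function names `covar` and `correl` simply mirror the Excel names, and all data values are invented for illustration. It assumes the population form of covariance (dividing by n), which is what COVAR returns.

```python
import math

def covar(xs, ys):
    """Population covariance (divides by n), comparable to Excel's COVAR."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    return sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / n

def correl(xs, ys):
    """Pearson's R: covariance divided by the product of the standard deviations."""
    return covar(xs, ys) / math.sqrt(covar(xs, xs) * covar(ys, ys))

# Hypothetical temperature / cricket-chirp readings (roughly linear).
temps = [60, 65, 70, 75, 80]
chirps = [20, 24, 29, 33, 39]
print("covariance:", covar(temps, chirps))
print("correlation:", correl(temps, chirps))

# A perfectly predictable but nonlinear relationship (y = x squared).
xs = [-2, -1, 0, 1, 2]
ys = [x ** 2 for x in xs]
print("nonlinear correlation:", correl(xs, ys))
```

The second data set shows the limit noted above: y is completely determined by x, yet the positive and negative deviations cancel and R comes out at zero because the relationship is not linear.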
Magnitude of Association
• Although interpretation is discipline-specific, we can generally assign the following strengths to |R|, where -1 <= R <= 1: