260 likes | 624 Views
Statistical Dependence. Petra Petrovics PhD Student. Statistical Dependence. Definition: Statistical dependence exists when the value of some variable is dependent upon or affected by the value of some other variable. Statistical Dependence Independent Functional
E N D
Statistical Dependence Petra Petrovics PhD Student
Statistical Dependence Definition: Statistical dependence exists when the value of some variable is dependent upon or affected by the value of some other variable. Statistical Dependence Independent Functional variables relation
Types of dependence • association – between two nominal data • Yule (Y) • Csuprov (T) • mixed – between a nominal and a ratio data • H; H2 • correlation – among ratio data
I. Association a) Yule-measure Where:f11, f10, f01, f00 the observed frequencies f1. , f0. , f.1 , f.0the marginal frequencies • Y = 0 the variables are independent • 0 Y 1statistical dependence • Y = 1 functional relation Only when the number of categories of both variables is two!
In case of statistical dependence: • If the variables are independent:
Example: • Suppose that a certain elective is offered to freshmen and sophomores on a pass-fail basis only. An advisor is interested in determining whether there is a relationship between the student’s grade and class standings. • Data for the test were obtained from last semester’s classes: Medium-strong dependence
b) Contingency table • there are s categories of the row/column variable: A1, A2, … , As • there are t categories of the row/column variable: B1, B2, … , Bt where s < t
The measure for statistical dependence in case of contingency table • T – measure, when s = t • C – measure, when s < t 0 C0,3 weak dependence 0,3 C 0,7 medium-strong dependence 0,7 C 1 strong dependence
Example A manufacturer of printed circuit boards has determined that boards classified as nonconforming nearly always have one of three defects: a component on the board is either missing, damaged or raised (installed improperly). The boards are produced on three machines (A, B and C). To determine whether there is a relationship between the type of nonconformity and the machine, a sample of 500 nonconforming boards was obtained:
Question: • Is the type of nonconformity related to the machine used for production? s=3 t=3 T-measure
Solution Medium-strong dependence
Mixed dependence Analysis of Variance • One-way analysis of variance is a technique used to compare means of two or more samples. • In case of a qualitative and a quantitative variable.
Differences - variances • djitotal difference: difference between an employee’s production and the grand mean • Bjiwithin-column difference: difference between an employee’s production and his group’s mean • Kjibetween-column difference: difference between the group’s mean and the grand mean
dji = Bji + Kj SS = SSB + SSK 2 = 2B + 2K
Measures of mixed dependence or Where: • H = H2 = 0 the variables are independent • H = H2 = 1 functional relation • 0 H 1 0 H 0,3 weak dependence 0,3 H 0,7 medium-strong dependence 0,7 H 1 strong dependence • 0 H2 1 Statistical dependence
Example Is thereanydependencebetweentheaveragemarks and faculties?
or or
Exercise 1 • The workers of a company are grouped according to their position and sex: • Is there a relationship between position and sex?
Exercise 2 • In a town doctors are grouped in the following way: • Find the type of statistical dependence and determine the strength of the relationship.
Exercise 3 Calculate H and H2-measures.
Exercise 4 • In a supermarket there was a survey among those who buy chips. 33 of the 100 persons who were asked bought Chio chips, one quarter of them bought Pom Bär, one tenth of them bought Lay’s. They spent 104 HUF on the average on chips. They spent 98 HUF on Chio chips and 74 HUF on Cerbona on the average. The 20 persons who bought Chee-tos all chose the same chips on sale which cost 120 HUF. The standard deviation of the money spent on Chio is 23 HUF, spent on Pom-Bär is 30 HUF, spent on Lay’s is 8 HUF, i.e. 13.56% and in case of Cerbona it’s 14.86%. • Create a table using these data and fill in the gaps. • Determine the strength of relationship between the type of the chips and the money spent on them.
Exercise 5 • In a shoe-factory the relationship between the sex and the education of the 2,500 employees was examined. • 60% of the workers is man, 16% of the men has university degree and 24% of them has primary qualification. Half of those who has primary education is man, for those who has secondary education the principle of indipendence is realized. • Fill in the following table and determine the relationship between sex and qualification!
Exercise 6 • The following table shows the distribution of workers in a company: • What can you say about the strength of the relationship between sex and position?
Exercise 7 • The tourists of hotels in 1996 were groupped in the following way: • Is the type of tourists related to the type of hotels?