A new family of regular semivalues and applications. Roberto Lucchetti, Politecnico di Milano, Italy
Main goal: • To rank genes from DNA data provided by microarray analysis. Tools: • Cooperative game theory, in particular power indices. • Power indices rank players according to their “strength” in the game. In the EU Council the strongest states (GE, FR, IT, UK) have roughly 10 times the power of the weakest state (MT). In the UN the veto players have roughly 100 (10) times the power of the non-permanent players, according to the Shapley (Banzhaf) index. R.Lucchetti Politecnico di Milano 2
A (TU) game is a pair $(N,v)$ with $v: 2^N \to \mathbb{R}$ and $v(\emptyset)=0$, where $N=\{1,\dots,n\}$ is the set of players and $v$ is the characteristic function of the game. A set $A \subseteq N$ is called a coalition; $v(A)$ is the utility (or cost) for the coalition $A$. $G^N$ represents the set of all games having $N$ as set of players. Remark: $G^N \cong \mathbb{R}^{2^n-1}$.
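A basis for $G^N$: unanimity games. For each nonempty $S \subseteq N$, $u_S(T)=1$ if $S \subseteq T$ and $u_S(T)=0$ otherwise. Subclass of games: • Simple games. Among them the weighted majority games $v=[q;w_1,\dots,w_n]$: $v(A)=1$ if $\sum_{i\in A} w_i \ge q$, and $v(A)=0$ otherwise.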
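As a small illustration of the two notions on this slide, the sketch below builds a weighted majority game and expands it on the basis of unanimity games. The function names and the dictionary representation of $v$ are assumptions made for the example; the coefficients are obtained with the standard Möbius inversion $c_S=\sum_{T\subseteq S}(-1)^{|S|-|T|}v(T)$.

```python
from itertools import combinations

def coalitions(players):
    """Yield every subset of `players` as a frozenset, the empty set included."""
    players = list(players)
    for r in range(len(players) + 1):
        for combo in combinations(players, r):
            yield frozenset(combo)

def weighted_majority(q, weights):
    """Game [q; w_1, ..., w_n] as a dict coalition -> 0/1 (players are 1..n, q > 0)."""
    players = range(1, len(weights) + 1)
    return {A: int(sum(weights[i - 1] for i in A) >= q) for A in coalitions(players)}

def unanimity_coefficients(v):
    """Coefficients c_S such that v = sum over nonempty S of c_S * u_S."""
    return {
        S: sum((-1) ** (len(S) - len(T)) * v[T] for T in coalitions(S))
        for S in v
        if S  # skip the empty coalition
    }

# Simple majority among three players: [2; 1, 1, 1]
v = weighted_majority(2, [1, 1, 1])
c = unanimity_coefficients(v)

def rebuilt(A):
    """Evaluate sum_S c_S * u_S(A), which should reproduce v(A)."""
    return sum(coeff for S, coeff in c.items() if S <= A)
```

Here the pairs get coefficient 1 and the grand coalition gets coefficient -2, so the game equals $u_{12}+u_{13}+u_{23}-2u_{123}$.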
Introduction: how a microarray works. A chip can contain millions of DNA probes.
Introduction: how a microarray works Hybridization When a single DNA helix meets a single mRNA helix, if they are complementary they will stick to each other. Hybridization helps researchers to identify what RNA sequences are present in a sample and this tells them what genes are being expressed by the organism and how much they are being expressed.
Introduction: how a microarray works. DNA/RNA base pairing: Adenine (A) pairs with Thymine (T)/Uracil (U); Guanine (G) pairs with Cytosine (C). GeneChip microarrays use the natural chemical attraction between the RNA target (from the sample preparation) and the DNA on the array to determine the expression level of a given gene.
Introduction: how a microarray works. The RNA extracted from a sample is copied into cRNA (through a process known as PCR). Copying the RNA allows it to be more easily detected on the array. At the same time as the RNA is copied, a chemical fluorescent molecule called biotin is attached to the strand. This molecule will show where the sample RNA has stuck to the DNA probe on the array.
Introduction: how a microarray works. If the gene is highly expressed, many RNA molecules will stick to the probe and the probe location will shine brightly when the laser hits it. If the sample RNA doesn’t match, it will be rejected by the probe on the array, and when the laser hits the probe nothing glows.
Introduction: how a microarray works The whole point of microarray gene expression analysis is to compare expression levels among different samples. Let’s simplify the situation with an example in which we have four genes and two samples. Gene1: 2RUDE Gene2: 2LOUD Gene3: GETOUT Gene4: FATMET Gene4 is not glowing.
[Figure: expression matrix with one column per array (…, Array1, Array2, Array3); the highlighted cell is the expression level of gene 4 in array 2.]
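The Microarray Game: An $m \times n$ Boolean matrix $M=(m_{ij})$, with one row per gene and one column per sample, such that no column is identically zero. Given the column $m_{\cdot j}$, $\operatorname{supp} m_{\cdot j}=\{i : m_{ij}=1\}$. The game, whose players are the $m$ genes, is the average of the unanimity games on the column supports: $v=\frac{1}{n}\sum_{j=1}^{n} u_{\operatorname{supp} m_{\cdot j}}$.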
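A minimal sketch of how such a game can be built from a Boolean matrix, assuming the usual definition from the microarray games literature: rows are genes, columns are samples, and $v(T)$ is the fraction of samples whose support is contained in $T$. The function names and the toy matrix are made up for the example. Because the game is an average of unanimity games, the Shapley value of a gene reduces to the average over samples of $1/|\operatorname{supp}|$ for the samples whose support contains it.

```python
from fractions import Fraction

def column_supports(matrix):
    """Support of each column: the set of row indices (genes) holding a 1."""
    n_genes, n_samples = len(matrix), len(matrix[0])
    supports = [
        frozenset(i for i in range(n_genes) if matrix[i][j])
        for j in range(n_samples)
    ]
    if any(not s for s in supports):
        raise ValueError("every column must contain at least one 1")
    return supports

def microarray_game(matrix):
    """Return v, where v(T) is the fraction of columns whose support lies in T."""
    supports = column_supports(matrix)
    def v(T):
        T = frozenset(T)
        return Fraction(sum(1 for s in supports if s <= T), len(supports))
    return v

def shapley_microarray(matrix):
    """Shapley value via linearity: each column shares 1/n equally on its support."""
    supports = column_supports(matrix)
    n_samples = len(supports)
    return [
        sum((Fraction(1, len(s)) for s in supports if i in s), Fraction(0)) / n_samples
        for i in range(len(matrix))
    ]

# Toy data: 3 genes (rows) x 3 samples (columns), genes indexed from 0
M = [[1, 0, 1],
     [1, 1, 0],
     [0, 0, 1]]
v = microarray_game(M)
sigma = shapley_microarray(M)
```

In the toy matrix gene 1 is the only abnormal gene in the second sample, which is why it receives the largest share.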
A power index for the game $(N,v)$ is a vector $(x_1,\dots,x_n)$ such that $x_i$ represents the power of player $i$ in the game $v$. • Weighted voting does not work… • The most famous: Shapley ($\sigma$) and Banzhaf ($\beta$).
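Shapley ($\sigma$) and Banzhaf ($\beta$): $\sigma_i(v)=\sum_{S\subseteq N\setminus\{i\}}\frac{s!\,(n-s-1)!}{n!}\,[v(S\cup\{i\})-v(S)]$ with $s=|S|$, and $\beta_i(v)=\frac{1}{2^{n-1}}\sum_{S\subseteq N\setminus\{i\}}[v(S\cup\{i\})-v(S)]$, where $v(S\cup\{i\})-v(S)$ is the marginal contribution of $i$ to $S\cup\{i\}$.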
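The two indices can be computed directly from their standard definitions as weighted sums of marginal contributions. The sketch below is a brute-force enumeration, exponential in the number of players and meant only for small examples; the function name and the example game $[3;2,1,1]$ are chosen for illustration.

```python
from fractions import Fraction
from itertools import combinations
from math import factorial

def shapley_banzhaf(n, v):
    """Return (shapley, banzhaf) as dicts player -> value for players 1..n.

    `v` maps a frozenset of players to a number, with v(frozenset()) == 0.
    """
    players = range(1, n + 1)
    shapley, banzhaf = {}, {}
    for i in players:
        others = [j for j in players if j != i]
        sh, bz = Fraction(0), Fraction(0)
        for s in range(n):  # s = size of the coalition S that i joins
            weight = Fraction(factorial(s) * factorial(n - s - 1), factorial(n))
            for combo in combinations(others, s):
                S = frozenset(combo)
                marginal = v(S | {i}) - v(S)
                sh += weight * marginal
                bz += Fraction(1, 2 ** (n - 1)) * marginal
        shapley[i], banzhaf[i] = sh, bz
    return shapley, banzhaf

# Weighted majority game [3; 2, 1, 1]
w = {1: 2, 2: 1, 3: 1}
def v(A):
    return int(sum(w[i] for i in A) >= 3)

sigma, beta = shapley_banzhaf(3, v)
```

For this game the Shapley value is (2/3, 1/6, 1/6) and the Banzhaf value is (3/4, 1/4, 1/4): both rank player 1 first, but only the Shapley value sums to $v(N)=1$.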
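$\phi$ is a probabilistic value if, for each player $i$, there is a probability $p^i$ on $2^{N\setminus\{i\}}$ such that $\phi_i(v)=\sum_{S\subseteq N\setminus\{i\}} p^i(S)\,[v(S\cup\{i\})-v(S)]$. • Shapley: $p^i(S)=\frac{s!\,(n-s-1)!}{n!}$, with $s=|S|$. • Banzhaf: $p^i(S)=\frac{1}{2^{n-1}}$.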
If $p^i(S)=p(|S|)>0$, the probabilistic value is called a regular semivalue. Examples: Banzhaf, Shapley, p-binomial. Regular semivalues are points in the simplex: $\{(p_0,\dots,p_{n-1}) : p_s>0,\ \sum_{s=0}^{n-1}\binom{n-1}{s}\,p_s=1\}$, where $p_s=p(s)$.
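The simplex condition can be checked numerically for the three examples named on the slide. This is a sketch under the standard normalization $\sum_s \binom{n-1}{s} p_s = 1$ and the usual form $p_s=q^s(1-q)^{n-1-s}$ of the binomial weights; the function names and the parameter name `q` are assumptions of the example.

```python
from fractions import Fraction
from math import comb, factorial

def shapley_weights(n):
    """p_s = s!(n-s-1)!/n! for s = 0..n-1."""
    return [Fraction(factorial(s) * factorial(n - s - 1), factorial(n)) for s in range(n)]

def banzhaf_weights(n):
    """p_s = 1/2^(n-1) for every s."""
    return [Fraction(1, 2 ** (n - 1))] * n

def binomial_weights(n, q):
    """p_s = q^s (1-q)^(n-1-s), for 0 < q < 1."""
    return [q ** s * (1 - q) ** (n - 1 - s) for s in range(n)]

def is_regular_semivalue(p):
    """Positive weights that sum to 1 once each p_s is counted C(n-1, s) times."""
    n = len(p)
    return all(x > 0 for x in p) and sum(comb(n - 1, s) * p[s] for s in range(n)) == 1

n = 5
checks = [
    is_regular_semivalue(shapley_weights(n)),
    is_regular_semivalue(banzhaf_weights(n)),
    is_regular_semivalue(binomial_weights(n, Fraction(1, 3))),
]
```

With $q=1/2$ the binomial weights coincide with the Banzhaf weights, so the Banzhaf value is itself a member of the binomial family.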
Properties for power indices. Let $\phi: G^N \to \mathbb{R}^n$. The solution $\phi$ has the dummy player (DP) property if, for each player $i$ such that $v(A\cup\{i\})=v(A)+v(\{i\})$ for all coalitions $A$ not containing $i$, $\phi_i(v)=v(\{i\})$.
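Let $\theta: N \to N$ be a permutation. Given the game $v$, denote by $\theta^*v$ the game $(\theta^*v)(A)=v(\theta(A))$ and by $\theta^*x=(x_{\theta(1)},\dots,x_{\theta(n)})$. The solution $\phi$ has the symmetry (S) property if, for each permutation $\theta$ as above, $\phi(\theta^*v)=\theta^*(\phi(v))$.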
The new family of power indices. Let [formula]. Define the value on the unanimity game $u_S$ as [formula] and extend it by linearity to a generic $v \in G^N$.
Theorem 1. There exists one and only one value fulfilling the symmetry, linearity and dummy player properties, and assigning $a_s$ to all non-null players in the unanimity game $u_S$ (with $s=|S|$), where $a_1=1$ and $a_s>0$ for $s=2,\dots,n$. It fulfills the formula: [formula]
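The construction in Theorem 1 can be sketched in code: expand $v$ on the unanimity basis and, by linearity, give each player $i$ the sum of $a_{|S|}\,c_S$ over the coalitions $S$ containing $i$. The function names are assumptions of the example. The choices $a_s=1/s$ and $a_s=1/2^{s-1}$ are the well-known coefficients of the Shapley and Banzhaf values on unanimity games; the third choice, $a_s=1/s^2$, is purely hypothetical, satisfies $a_1=1$ and $a_s>0$, and is not taken from the slides.

```python
from fractions import Fraction
from itertools import combinations

def subsets(items):
    """Yield every subset of `items` as a frozenset, the empty set included."""
    items = list(items)
    for r in range(len(items) + 1):
        for combo in combinations(items, r):
            yield frozenset(combo)

def value_from_coefficients(n, v, a):
    """phi_i(v) = sum over coalitions S containing i of a(|S|) * c_S.

    c_S is the unanimity coefficient of v, obtained by Moebius inversion.
    """
    players = range(1, n + 1)
    phi = {i: Fraction(0) for i in players}
    for S in subsets(players):
        if not S:
            continue
        c_S = sum((-1) ** (len(S) - len(T)) * v(T) for T in subsets(S))
        for i in S:
            phi[i] += a(len(S)) * c_S
    return phi

# Weighted majority game [3; 2, 1, 1]
w = {1: 2, 2: 1, 3: 1}
def v(A):
    return int(sum(w[i] for i in A) >= 3)

shapley = value_from_coefficients(3, v, lambda s: Fraction(1, s))
banzhaf = value_from_coefficients(3, v, lambda s: Fraction(1, 2 ** (s - 1)))
# Hypothetical coefficients, chosen only to illustrate the construction
other = value_from_coefficients(3, v, lambda s: Fraction(1, s ** 2))
```

The first two results agree with the marginal-contribution definitions of Shapley and Banzhaf, which is a convenient sanity check for any other sequence $a_s$.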
Theorem 2. $\sigma^a$ is a regular semivalue for all $a>0$. $\sigma^2$ fulfills the formula: [formula] • Corollary. The family of the weighting coefficients of the values $\sigma^a$, $a>0$, is an open curve in the simplex of the regular semivalues, containing the Shapley value. The addition of the Banzhaf value to the curve provides a one-point compactification of the curve.
Theorem 3: study of the term [formula]. • Key tool: Let [formula], let [formula]. Then [formula]. Moreover, for all natural $l$ and positive real $a, x$: [formula]. Finally, for each natural $m$, the following formula holds: [formula]
Calculating the indices in weighted majority games. Let the coefficients [formula] count in how many ways the sum of the weights of $j$ players different from $i$ can give $k$. Then the following proposition holds. Let [formula] be the value defined in the theorem above. Let $q>0$ be a positive integer, and let $w_1,\dots,w_n$ be non-negative integers. Let $v=[q;w_1,\dots,w_n]$ be the associated weighted majority game. Then the following formula holds: [formula] An efficient algorithm based on generating functions and formal series allows for a fast calculation of the coefficients.
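The counting coefficients can be obtained with a standard dynamic program that expands the generating function $\prod_{l\neq i}(1+x^{w_l}y)$: the coefficient of $x^k y^j$ is the number of sets of $j$ players other than $i$ with total weight $k$. The sketch below is not necessarily the authors' algorithm; the names `count_table` and `semivalue_in_wmg` are hypothetical, and the value is computed for any semivalue with weights $p_0,\dots,p_{n-1}$, using the fact that player $i$ swings a coalition of weight $k$ exactly when $q-w_i\le k\le q-1$.

```python
from fractions import Fraction
from math import factorial

def count_table(weights, i):
    """c[j][k] = number of sets of j players other than i with total weight k."""
    others = [w for l, w in enumerate(weights) if l != i]
    total = sum(others)
    c = [[0] * (total + 1) for _ in range(len(others) + 1)]
    c[0][0] = 1
    for w in others:
        # Go through sizes downwards so that each player is used at most once
        for j in range(len(others) - 1, -1, -1):
            for k in range(total - w, -1, -1):
                if c[j][k]:
                    c[j + 1][k + w] += c[j][k]
    return c

def semivalue_in_wmg(q, weights, p):
    """Semivalue with weights p[0..n-1] in the game [q; w_1, ..., w_n] (players 0-indexed)."""
    n = len(weights)
    values = []
    for i in range(n):
        c = count_table(weights, i)
        lo = max(q - weights[i], 0)       # smallest coalition weight that i can swing
        hi = min(q - 1, len(c[0]) - 1)    # largest losing weight
        values.append(sum(p[j] * sum(c[j][lo:hi + 1]) for j in range(n)))
    return values

weights, q = [2, 1, 1], 3
n = len(weights)
shapley_p = [Fraction(factorial(s) * factorial(n - s - 1), factorial(n)) for s in range(n)]
banzhaf_p = [Fraction(1, 2 ** (n - 1))] * n
shapley = semivalue_in_wmg(q, weights, shapley_p)
banzhaf = semivalue_in_wmg(q, weights, banzhaf_p)
```

The table for each player has at most $n \times (\sum_l w_l + 1)$ entries, so the cost is polynomial in $n$ and in the total weight, in contrast with the exponential enumeration of all coalitions.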
Applications: The EU
The power indices, restricted to the 56 genes that are common to all indices among the first 100 of each ranking. Data from 40 tumor samples vs 22 normal, 2000 genes.
Data from colorectal cancer: 10 healthy and 12 tumoral tissues. • An extended microarray game also considers how much the genes are abnormally expressed w.r.t. a normality interval. • Given the normality interval $[m_i,M_i]$ of gene $i$ and its standard deviation $s_i$, let $N_i^k=[m_i-ks_i,\,M_i+ks_i]$; assign $k$ to cell $ij$ of the matrix if the value of gene $i$ in patient $j$ falls in $N_i^k\setminus N_i^{k-1}$. • A weighted Shapley value is used to rank genes, which differentiates the genes better. Taking the first 100 genes in the ranking, the game is formed as an average of weighted majority games. • Then we calculate the Shapley, Banzhaf and $\sigma^2$ indices.
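The discretization step of the extended microarray game can be sketched as follows, assuming the widened intervals are $N_i^k=[m_i-ks_i,\,M_i+ks_i]$ with $N_i^0=[m_i,M_i]$, so that values inside the normality interval get level 0. The function names and the toy numbers are made up for the example; they are not data from the study.

```python
from math import ceil

def abnormality_level(x, m, M, s):
    """Smallest k such that x lies in [m - k*s, M + k*s]; 0 inside [m, M]."""
    if m <= x <= M:
        return 0
    distance = (m - x) if x < m else (x - M)
    return ceil(distance / s)

def extended_matrix(expression, intervals):
    """expression[i][j]: value of gene i in patient j; intervals[i] = (m_i, M_i, s_i)."""
    return [
        [abnormality_level(x, *intervals[i]) for x in row]
        for i, row in enumerate(expression)
    ]

# Toy data: 2 genes x 3 patients (hypothetical numbers)
expression = [[3.0, 1.5, 6.5],
              [0.0, 2.0, 4.0]]
intervals = [(2.0, 4.0, 1.0),   # gene 0: m = 2, M = 4, s = 1
             (1.0, 3.0, 0.5)]   # gene 1: m = 1, M = 3, s = 0.5
levels = extended_matrix(expression, intervals)
```

Replacing every positive level by 1 gives back a Boolean matrix of the kind used in the plain microarray game, so the extended matrix only adds information about how far outside the normality interval each value falls.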
Gene expression analysis was performed using Human Genome U133A-Plus 2.0 GeneChip arrays (Affymetrix, Inc., Calif). • The following 7 genes are quoted in the medical literature as having great importance in the onset of the disease: CYR61, UCHL1, FOS, FOSB, EGR1, VIP, KRT24. • One of them was ranked around the 100th position by the weighted Shapley value. All the others are among the first 50 and took part in the subsequent game.
References • R. Lucchetti, P. Radrizzani, E. Munarini, A new family of regular semivalues and applications, Int. J. of Game Theory, DOI 10.1007/s00182-010-0263-5. • R. Lucchetti, S. Moretti, F. Patrone, P. Radrizzani, The Shapley and Banzhaf indices in microarray games, Computers and Operations Research 37 (2010), p. 1406-1412. • R. Lucchetti, P. Radrizzani, Microarray Data Analysis Via Weighted Indices and Weighted Majority Games, Computational Intelligent Methods for Bioinformatics and Biostatistics II, Masulli, Peterson, Tagliaferri (Eds), Lecture Notes in Computer Science, Springer (2010), p. 179-190. • S. Moretti, F. Patrone, S. Bonassi, The class of microarray games and the relevance index for genes, TOP 15 (2007), p. 256-280. • D. Albino, P. Scaruffi, S. Moretti, S. Coco, C. Di Cristofano, A. Cavazzana, M. Truini, S. Stigliani, S. Bonassi, G.P. Tonini, Stroma poor and stroma rich gene signatures show a low intratumoral gene expression heterogeneity in Neuroblastic tumors, Cancer 113 (2008), p. 1412-1422.