Liquid Association (LA). Two examples A challenge. Liquid Association (LA). LA is a generalized notion of association for describing certain kind of ternary relationship between variables in a system. (Li 2002 PNAS ). Green points represent four conditions for cellular state 1.
Liquid Association (LA) • Two examples • A challenge
Liquid Association (LA) • LA is a generalized notion of association for describing a certain kind of ternary relationship between variables in a system (Li 2002, PNAS). • Green points represent four conditions for cellular state 1. • Red points represent four conditions for cellular state 2. • Blue points represent the transit state between cellular states 1 and 2. • (X,Y) forms an LA pair; the profiles of genes X and Y are displayed in the scatter plot. • Important: the overall correlation between X and Y is 0.
Mathematical Statistics on LA • EX=0, EY=0, SD(X)=SD(Y)=1 • LA is defined by LA(X,Y|Z) = E g'(Z), where g(z) = E(XY | Z = z). • g(z) is the conditional expectation of XY given Z = z, i.e. the correlation between X and Y at that level of Z. • LA(X,Y|Z) is the expected rate of change of the correlation between X and Y as Z increases.
Stein Lemma • Computing E g'(Z) directly is not easy. • By Stein's lemma from mathematical statistics, LA(X,Y|Z) simplifies to E(XYZ) when Z follows a standard normal distribution.
Human Genome Program, U.S. Department of Energy, Genomics and Its Impact on Medicine and Society: A 2001 Primer, 2001
gene-expression data
          cond1   cond2   ……   condp
gene 1    x11     x12     ……   x1p
gene 2    x21     x22     ……   x2p
…         …       …
gene n
Correlation Coefficient • Has been used by Gauss, Bravais, Edgeworth, … • Its sweeping impact in data analysis is due to Galton (1822-1911), “Typical laws of heredity in man”. • Karl Pearson modified and popularized its use. • A building block in multivariate analysis, of which clustering, classification, and dimension reduction are recurrent themes.
An application • Two-class problem: ALL (acute lymphoblastic leukemia) vs. AML (acute myeloid leukemia)
Why does clustering make sense biologically? • The rationale: genes with a high degree of expression similarity are likely to be functionally related. • They may form a structural complex. • They may participate in common pathways. • They may be co-regulated by common upstream regulatory elements. • Simply put, profile similarity implies functional association.
However, the converse is not true • The expression profiles of the majority of functionally associated genes are in fact uncorrelated. • Microarray is too noisy • Biology is complex
Why no correlation? • Proteins rarely work alone • Proteins have multiple functions • Different biological processes or pathways have to be synchronized • Competing use of finite resources: metabolites, hormones • Protein modification: phosphorylation, proteolysis, shuttle, … • Transcription factors serve both as activators and repressors
Transcription factors: proteins that bind to DNA • Activators; repressors
Going subtle: protein modification • Histone inhibits transcription. • To activate transcription, the lysine side chain must be acetylated. • Weaver (2001)
Corepressor : histone deacetylase Thyroid hormone Coactivator: Histone acetyltransferase
Math. modeling: a nightmare • [Diagram: observed mRNA (current and next state) versus hidden variables: protein, kinase; ATP, GTP, cAMP, etc.; localization (cytoplasm, nucleus, mitochondria, vacuole); DNA methylation, chromatin structure; nutrients (carbon, nitrogen sources); temperature; water; function; fitness] • Statistical methods become useful
What is LA? PLA? Concept of “mediator”
Example 1. Positive-to-negative • X=ARP4, Y=LAS17, Z=MCM1 • Corr=0 in each plot • (A) For low Z (marked points in A), X and Y are co-expressed • (B) For high Z (marked points in B), X and Y are contra-expressed • Arp4: protein that interacts with core histones, member of the NuA4 histone acetyltransferase complex; actin-related protein • Las17: component of the cortical actin cytoskeleton
Example 2. Negative-to-positive • X=QCR9, Y=ROX1, Z=MCM1 • Corr=0 in each plot • (A) For low Z (marked points in A), X and Y are contra-expressed • (B) For high Z (marked points in B), X and Y are co-expressed • Rox1: heme-dependent transcriptional repressor of hypoxic genes including CYC7 (iso-2-cytochrome c) and ANB1 (translation initiation, ribosome) • Qcr9: ubiquinol cytochrome c reductase subunit 9
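The pattern in both examples (zero overall correlation, opposite correlations at low and high Z) can be illustrated on simulated data. The sketch below is not the yeast data: the function names, the median split, and the noise level 0.3 are assumptions chosen only for illustration, and the simulated switch follows the positive-to-negative pattern of Example 1.

```python
import math
import random


def pearson(a, b):
    """Plain Pearson correlation coefficient of two equal-length lists."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((u - mean_a) * (v - mean_b) for u, v in zip(a, b))
    var_a = sum((u - mean_a) ** 2 for u in a)
    var_b = sum((v - mean_b) ** 2 for v in b)
    return cov / math.sqrt(var_a * var_b)


def simulate_switch(n=20000, seed=0):
    """Toy data (assumption): Y follows X when Z < 0 and -X when Z >= 0."""
    rng = random.Random(seed)
    xs, ys, zs = [], [], []
    for _ in range(n):
        z = rng.gauss(0, 1)
        x = rng.gauss(0, 1)
        sign = 1.0 if z < 0 else -1.0
        xs.append(x)
        ys.append(sign * x + 0.3 * rng.gauss(0, 1))
        zs.append(z)
    return xs, ys, zs


def split_correlations(xs, ys, zs):
    """Correlation of X and Y overall, for low Z, and for high Z (median split)."""
    cut = sorted(zs)[len(zs) // 2]
    low = [(x, y) for x, y, z in zip(xs, ys, zs) if z < cut]
    high = [(x, y) for x, y, z in zip(xs, ys, zs) if z >= cut]
    r_all = pearson(xs, ys)
    r_low = pearson([p[0] for p in low], [p[1] for p in low])
    r_high = pearson([p[0] for p in high], [p[1] for p in high])
    return r_all, r_low, r_high


if __name__ == "__main__":
    r_all, r_low, r_high = split_correlations(*simulate_switch())
    print(f"overall {r_all:+.2f}, low Z {r_low:+.2f}, high Z {r_high:+.2f}")
```

On data like this, an ordinary correlation screen would miss the X-Y pair entirely, which is the situation LA is meant to detect.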
A Challenge • What genes behave like that ? • Can we identify all of them ? • N=5878 ORFs • N choose 3 = 33.8 billion triplets to inspect
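The triplet count on this slide can be checked directly. A minimal sketch, where `count_triplets` is a name introduced here for illustration:

```python
import math


def count_triplets(n_genes):
    """Number of unordered gene triplets, i.e. n choose 3."""
    return math.comb(n_genes, 3)


if __name__ == "__main__":
    total = count_triplets(5878)
    print(f"{total:,} triplets (about {total / 1e9:.1f} billion)")
```

The count is of unordered triplets. Which gene plays the role of Z matters for the definition E g'(Z), but once LA is estimated as a triplet product the score is the same for any ordering, so one evaluation per triplet suffices.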
Statistical theory for LA • X, Y, Z: random variables with mean 0 and variance 1 • Corr(X,Y) = E(XY) = E(E(XY|Z)) = E g(Z), where g(z) = E(XY | Z = z) • g(z) is an ideal summary of the association pattern between X and Y when Z = z • g'(z) = derivative of g(z) • Definition. The LA of X and Y with respect to Z is LA(X,Y|Z) = E g'(Z)
Statistical theory-LA • Theorem. If Z is standard normal, then LA(X,Y|Z) = E(XYZ) • Proof. By Stein's lemma, E g'(Z) = E(Z g(Z)) = E(E(XY|Z) Z) = E(XYZ) • Additional math. properties: • bounded by third moment • =0, if jointly normal • transformation
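The theorem can be checked numerically on a model where g is known in closed form. The sketch below assumes a toy construction that is not from the slides: Z and X are independent standard normals and Y = rho(Z) X + sqrt(1 - rho(Z)^2) e with rho(z) = tanh(z), so that X and Y have mean 0 and variance 1, g(z) = tanh(z), and g'(z) = 1 - tanh(z)^2.

```python
import math
import random


def stein_check(n=200000, seed=0):
    """Monte Carlo estimates of E(XY), E(XYZ) and E g'(Z) for the toy model."""
    rng = random.Random(seed)
    sum_xy = sum_xyz = sum_gprime = 0.0
    for _ in range(n):
        z = rng.gauss(0, 1)
        x = rng.gauss(0, 1)
        e = rng.gauss(0, 1)
        rho = math.tanh(z)  # conditional correlation g(z), an assumed form
        y = rho * x + math.sqrt(1.0 - rho * rho) * e
        sum_xy += x * y
        sum_xyz += x * y * z
        sum_gprime += 1.0 - rho * rho  # g'(z) for g(z) = tanh(z)
    return sum_xy / n, sum_xyz / n, sum_gprime / n


if __name__ == "__main__":
    e_xy, e_xyz, e_gprime = stein_check()
    print(f"E(XY)   ~ {e_xy:+.3f}")
    print(f"E(XYZ)  ~ {e_xyz:+.3f}")
    print(f"E g'(Z) ~ {e_gprime:+.3f}")
```

In this model the overall correlation E(XY) is close to 0 while E(XYZ) and E g'(Z) agree up to sampling error, as the theorem states.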
Normality? • Convert each gene expression profile by a normal score transformation • LA(X,Y|Z) is then estimated by the average triplet product of the three gene profiles: (x1y1z1 + x2y2z2 + …) / n
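The two steps on this slide can be sketched as follows. The rank-to-score rule used here, inverse normal CDF of rank/(n+1) with ties broken by position, is one common convention and an assumption; the slides do not say which variant was used. The function names are also introduced here for illustration.

```python
from statistics import NormalDist


def normal_scores(profile):
    """Replace each value by the normal quantile of its rank / (n + 1)."""
    n = len(profile)
    order = sorted(range(n), key=lambda i: profile[i])  # indices, smallest first
    standard_normal = NormalDist()
    scores = [0.0] * n
    for rank, index in enumerate(order, start=1):
        scores[index] = standard_normal.inv_cdf(rank / (n + 1))
    return scores


def liquid_association(x, y, z):
    """Average triplet product of the normal-score transformed profiles."""
    xs, ys, zs = normal_scores(x), normal_scores(y), normal_scores(z)
    return sum(a * b * c for a, b, c in zip(xs, ys, zs)) / len(xs)


if __name__ == "__main__":
    # Made-up profiles: Y mirrors X for low Z and follows X for high Z.
    z = [1, 2, 3, 4, 5, 6, 7, 8]
    x = [-3.5, 1.5, -2.5, 3.0, -4.0, 2.0, -1.0, 4.5]
    y = [3.5, -1.5, 2.5, -3.0, -4.0, 2.0, -1.0, 4.5]
    print(liquid_association(x, y, z))  # positive: negative-to-positive switch
```

Because the transformation depends only on ranks, the estimate is unchanged by any increasing rescaling of an individual profile.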
How does LA work in yeast? Urea cycle/arginine biosynthesis
Yeast Cell Cycle (adapted from Molecular Cell Biology, Darnell et al.) • Most visible event
ARG1 Glutamate ARG2
[Urea cycle pathway diagram, adapted from KEGG; labels: ARG1, X, Y, head, backdoor; annotation: 8th place negative] • Compute LA(X,Y|Z) for all Z • Rank and find leading genes
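The search described here, fixing X and Y, scoring every candidate Z, and reading off both ends of the ranking, can be sketched as below. The sketch assumes all profiles have already been normal-score transformed; `rank_mediators` and the gene names in the demo are hypothetical.

```python
def la_score(x, y, z):
    """Average triplet product; assumes profiles are already normal scores."""
    return sum(a * b * c for a, b, c in zip(x, y, z)) / len(x)


def rank_mediators(x, y, candidates, top=10):
    """Score every candidate Z and return the negative and positive ends.

    candidates maps a gene name to its profile (hypothetical data layout).
    """
    scored = sorted((la_score(x, y, z), name) for name, z in candidates.items())
    negative_end = scored[:top]        # most negative LA first
    positive_end = scored[::-1][:top]  # most positive LA first
    return negative_end, positive_end


if __name__ == "__main__":
    # Toy profiles with made-up names, for illustration only.
    x = [-1.0, -1.0, 1.0, 1.0]
    y = [-1.0, 1.0, -1.0, 1.0]
    candidates = {
        "geneA": [1.0, -1.0, -1.0, 1.0],
        "geneB": [-1.0, 1.0, 1.0, -1.0],
        "geneC": [0.0, 0.0, 0.0, 0.0],
    }
    negative_end, positive_end = rank_mediators(x, y, candidates, top=1)
    print("negative end:", negative_end)
    print("positive end:", positive_end)
```

Both ends of the list are of interest: a large negative score marks a Z for which X and Y move from co-expression to contra-expression as Z rises, and a large positive score marks the reverse.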
Why negative LA? • High CPA2: signal for arginine demand. • Up-regulation of ARG2 concomitant with down-regulation of CAR2 prevents ornithine from leaving the urea cycle. • When the demand is relieved, CPA2 is lowered and CAR2 is up-regulated, opening up the channel for ornithine to leave the urea cycle.
Other examples (see Li 2002) • X=GLN3 (transcription factor), Y=CAR1, Z=ARG4 (8th place negative end) • Electron transport: X=CYT1 (cytochrome c1), gives ATP1 (11 times), ATP5 (subunits of ATPase) • Calmodulin CMD1, NUF1 (binding target of CMD1), CMK1 (calmodulin-regulated kinase), YGL149W • Glycolysis genes PFK1, PFK2 (6-phospho-fructokinase) • CYR1 (adenylate cyclase), GSY1 (glycogen synthase), GLC2 (glucan branching), SCH9 (serine/threonine protein kinase; longevity)
SCH9 • Protein kinase that regulates signal transduction activity and G1 progression, controls cAPK activity, required for nitrogen activation of the FGM pathway, involved in life span regulation, homologous to mammalian Akt/PKB (SGD summary) • Fabrizio P, Pozza F, Pletcher SD, Gendron CM, Longo VD. Regulation of longevity and stress resistance by Sch9 in yeast. Science. 2001 Apr 13;292(5515):288-90. • The protein kinase Akt/protein kinase B (PKB) is implicated in insulin signaling in mammals and functions in a pathway that regulates longevity and stress resistance in Caenorhabditis elegans. We screened for long-lived mutants in nondividing yeast Saccharomyces cerevisiae and identified mutations in adenylate cyclase and SCH9, which is homologous to Akt/PKB, that increase resistance to oxidants and extend life-span by up to threefold. Stress-resistance transcription factors Msn2/Msn4 and protein kinase Rim15 were required for this life-span extension. These results indicate that longevity is associated with increased investment in maintenance and show that highly conserved genes play similar roles in life-span regulation in S. cerevisiae and higher eukaryotes.
Blue : low SCH9 • Red: high SCH9