Disease Gene Candidate Prioritization by Integrative Biology
Table of contents:
Background
Networks – deducing functional relationships from PPI data networks
  Protein interaction networks
  Functional modules / network clusters
Phenotype association
  Grouping disorders based on their phenotype
  Biological implications of phenotype clusters
Method and examples
  Integrating protein interaction data and phenotype associations in an automated large scale disease gene finding platform
Background Finding genes responsible for major genetic disorders can lead to diagnostics, potential drug targets, treatments and large amounts of information about molecular cell biology in general.
1q21-1q23.1 chr1:141,600,00-155,900,000 Background Methods for disease gene finding post genome era (>2001): Microdeletions, Translocations, Linkage analysis. http://www.rscbayarea.com/images/reciprocal_translocation.gif Fagerheim et al 1996. http://www.med.cmu.ac.th/dept/pediatrics/06-interest-cases/ic-39/case39.html
Background Automated methods for disease gene finding in the post genome era (>2001): ? Grouping: Tissues, Gene Ontology, Gene Expression, MeSH terms ……. (Perez-Iratxeta, Bork et al. 2002) (Freudenberg and Propping 2002) (van Driel, Cuelenaere et al. 2005) (Hristovski, Peterlin et al. 2005)
Disease Gene Finding. Summary
Background
  Why do we want to find disease genes, and how has it been done until now?
Networks – deducing functional relationships from network theory
  Protein interaction networks
  Functional modules / network clusters
Phenotype association
  Grouping disorders based on their phenotype
  Biological implications of phenotype clusters
Method and examples
  Combining network theory and phenotype associations in an automated large scale disease gene finding platform: proof of concept
  Status of pipeline / infrastructure
Networks and functional modules Deducing functional relationships from protein interaction networks
Networks: Social networks. The CBS interactome (de Lichtenberg et al.). Legend: daily, weekly, monthly.
Networks Protein interaction networks of physical interactions. (Barabasi and Oltvai 2004).
Extracting functional data from protein interaction networks (InWeb Homo sapiens). Dynamic functional module, e.g. cell cycle regulation, metabolism. Example: the ACh receptor involved in myasthenic syndrome.
Trans-organism protein interaction network. Orthologs? Orthologous genes are direct descendants of a gene in a common ancestor: S. cerevisiae, D. melanogaster, H. sapiens (O'Brien K, Remm et al. 2005)
Trans-organism protein interaction network: H. sapiens (mosaic); D. melanogaster (experimental); C. elegans (experimental); S. cerevisiae (experimental).
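The idea of building a human network from model-organism data via orthologs can be illustrated with a minimal sketch. The slides do not describe the actual pipeline, so the function name, the data layout and the toy identifiers below are all hypothetical; the sketch only shows the general principle of mapping an interaction between two model-organism genes onto every pair of their human orthologs.

```python
from itertools import product


def transfer_interactions(interactions, orthologs):
    """Map model-organism interaction pairs onto human genes via orthologs.

    interactions: iterable of (gene_a, gene_b) pairs in the model organism.
    orthologs: dict mapping a model-organism gene to a list of human genes.
    Both structures are assumptions made for this illustration.
    """
    inferred = set()
    for a, b in interactions:
        # Every combination of human orthologs of a and b becomes a candidate pair.
        for ha, hb in product(orthologs.get(a, ()), orthologs.get(b, ())):
            if ha != hb:  # skip pairs that collapse onto a single human gene
                inferred.add(frozenset((ha, hb)))
    return inferred


# Hypothetical toy data: yC has no human ortholog, so yB-yC is not transferred.
yeast_ppi = [("yA", "yB"), ("yB", "yC")]
yeast_to_human = {"yA": ["hA"], "yB": ["hB1", "hB2"]}
human_ppi = transfer_interactions(yeast_ppi, yeast_to_human)
```

Pairs are stored as `frozenset` objects so that A-B and B-A count as the same undirected interaction.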
Infrastructure status
Source databases: BIND, IntAct, DIP, MINT, HPRD, GRID, hand-curated sets, PPI – pred.
CBS Datawarehouse: download/reformat db's
Trans-organism ppi pipeline
InWeb Homo sapiens: >122.000 int., >22.000 genes
Scoring: topological, No publ.
Access: web server (Opis), command line (Inweb.pl), extraction perl modules, direct SQL access, XML or SIF output
Protein interaction networks: scoring the interactions. Number of methods that have shown the same interaction; number of independent studies that have shown the same interaction; number of common interaction partners. Cluster issues. Large scale / small scale issues.
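The three counts named on this slide can be computed from a flat list of interaction records, as in the sketch below. The record layout (protein A, protein B, detection method, study identifier) is an assumption made for illustration, and the slides do not say how the counts are combined into a final score, so the sketch stops at the raw features.

```python
from collections import defaultdict


def interaction_features(records):
    """Count supporting evidence per interaction.

    records: iterable of (protein_a, protein_b, method, study_id) tuples.
    This layout is hypothetical; real databases differ.
    """
    methods = defaultdict(set)
    studies = defaultdict(set)
    neighbours = defaultdict(set)
    for a, b, method, study in records:
        if a == b:
            continue  # self-interactions are ignored in this sketch
        pair = frozenset((a, b))
        methods[pair].add(method)
        studies[pair].add(study)
        neighbours[a].add(b)
        neighbours[b].add(a)

    features = {}
    for pair in methods:
        a, b = tuple(pair)
        # Partners that interact with both proteins, excluding the pair itself.
        shared = (neighbours[a] & neighbours[b]) - pair
        features[pair] = {
            "methods": len(methods[pair]),
            "studies": len(studies[pair]),
            "common_partners": len(shared),
        }
    return features


# Hypothetical toy records.
records = [
    ("P1", "P2", "two-hybrid", "s1"),
    ("P1", "P2", "co-IP", "s2"),
    ("P1", "P3", "two-hybrid", "s1"),
    ("P2", "P3", "co-IP", "s3"),
]
features = interaction_features(records)
```

Here the P1-P2 interaction is supported by two methods and two studies, and the two proteins share one partner (P3).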
Phenotype association: Zellweger syndrome. Epicanthal folds; Flat facies; Flat occiput; Glaucoma; High arched palate; High forehead; Hypertelorism; Large fontanelles; Macrocephaly; Micrognathia; Nystagmus; Pale optic disk; Pigmentary retinopathy; Posteriorly rotated ears; Protruding tongue; Redundant skin folds of neck; Round facies; Sensorineural deafness; Turribrachycephaly; Upward slanting palpebral fissures; Hyporeflexia or areflexia; Hypotonia; Autosomal recessive; Albuminuria; Aminoaciduria; Decreased dihydroxyacetone phosphate acyltransferase (DHAP-AT) activity; Decreased plasmalogen; Elevated long chain fatty acids; Elevated serum iron and iron binding capacity; Increased phytanic acid; Pipecolic acidemia; Breech presentation; Death usually in first year of life; Genetic heterogeneity; Infants occasionally mistaken as having Down syndrome; Agenesis/hypoplastic corpus callosum; Polymicrogyria; Seizures; Severe mental retardation; Subependymal cysts; Pulmonary hypoplasia; Cubitus valgus; Delayed bone age; Metatarsus adductus; Rocker-bottom feet; Stippled epiphyses (especially patellar and acetabular regions); Talipes equinovarus; Transverse palmar crease; Ulnar deviation of hands; Wide cranial sutures; Heterotopias/abnormal migration; Hypoplastic olfactory lobes; Absent liver peroxisomes; Hepatomegaly; Intrahepatic biliary dysgenesis; Prolonged neonatal jaundice; Pyloric hypertrophy; Patent ductus arteriosus; Ventricular septal defects; Bell-shaped thorax; Small adrenal glands; Absent renal peroxisomes; Clitoromegaly; Cryptorchidism; Hydronephrosis; Hypospadias; Renal cortical microcysts; Failure to thrive; Abnormal electroretinogram; Abnormal helices; Anteverted nares; Brushfield spots; Cataracts; Corneal clouding.
Phenotype association: word vectors. Reference: Zellweger syndrome (214100). 214100 202370. A relationship between the infantile form of Refsum disease and Zellweger syndrome was suggested by the observations of Poulos et al. (1984) in 2 patients. In the infantile form of Refsum disease, as in Zellweger syndrome, peroxisomes are deficient and peroxisomal functions are impaired (Schram et al., 1986). Clinically, infantile Refsum disease, ZWS, and adrenoleukodystrophy have several overlapping features (Stokke et al., 1984). (http://www.ncbi.nlm.nih.gov/entrez/dispomim.cgi?id=266510)
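A word-vector comparison of phenotype descriptions could look like the following minimal sketch. The slides do not state how the vectors are weighted or compared, so raw word counts and cosine similarity are assumptions here, and the three short descriptions are hypothetical abbreviations built from terms that appear on these slides rather than real database records.

```python
import math
import re
from collections import Counter


def word_vector(text):
    """Turn a free-text phenotype description into a bag-of-words count vector."""
    return Counter(re.findall(r"[a-z]+", text.lower()))


def cosine(u, v):
    """Cosine similarity between two count vectors (0.0 if either is empty)."""
    dot = sum(u[w] * v[w] for w in u.keys() & v.keys())
    norm = math.sqrt(sum(c * c for c in u.values())) * math.sqrt(
        sum(c * c for c in v.values())
    )
    return dot / norm if norm else 0.0


# Hypothetical, heavily shortened descriptions for illustration only.
zellweger = word_vector("hypotonia seizures hepatomegaly absent liver peroxisomes")
refsum = word_vector("hypotonia hepatomegaly peroxisomes deficient retinopathy")
other = word_vector("xeroderma dwarfism gonadal hypoplasia")

sim_related = cosine(zellweger, refsum)   # three shared terms
sim_unrelated = cosine(zellweger, other)  # no shared terms
```

Descriptions that share many terms score close to 1 and descriptions with no terms in common score 0, which is the property needed to group disorders by phenotype.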
Phenotype association: word vectors, phenotype association network. Zellweger; Adrenoleukodystrophy; Cerebrohepatorenal; Refsum.
Method – Proof of concept
Method: word vectors, phenotype clustering, InWeb Homo sapiens.
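One way to combine the two data sources is sketched below: each candidate gene is scored by how similar the query phenotype is to the diseases already linked to its interaction partners. This is an illustrative assumption, not the scoring scheme behind the results in these slides (which is not described); the function, the summation rule and all identifiers are hypothetical.

```python
def rank_candidates(candidates, partners, gene_diseases, similarity):
    """Rank candidate genes by phenotype evidence from their interaction partners.

    candidates: genes in the linkage interval.
    partners: dict gene -> list of interacting genes.
    gene_diseases: dict gene -> list of diseases the gene is known to cause.
    similarity: dict disease -> phenotype similarity to the query disease.
    All four inputs are hypothetical structures for this sketch.
    """
    scores = {}
    for gene in candidates:
        total = 0.0
        for partner in partners.get(gene, ()):
            for disease in gene_diseases.get(partner, ()):
                total += similarity.get(disease, 0.0)
        scores[gene] = total
    # Highest-scoring candidate first.
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


# Hypothetical toy data.
candidates = ["geneA", "geneB"]
partners = {"geneA": ["p1", "p2"], "geneB": ["p3"]}
gene_diseases = {"p1": ["d1"], "p2": ["d2"], "p3": ["d3"]}
similarity = {"d1": 0.5, "d2": 0.25, "d3": 0.125}
ranking = rank_candidates(candidates, partners, gene_diseases, similarity)
```

In the toy data, geneA ranks first because its partners are linked to the two diseases most similar to the query phenotype.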
Results - Benchmark
MIM     RANK  GENE             Probability           TRUE
278800  1     ENSG00000032514  0.300326793109544     *
278800  2     ENSG00000188611  0.0125655342047565
278800  2     ENSG00000138297  0.0125655342047565
278800  2     ENSG00000165406  0.0125655342047565
278800  3     ENSG00000196693  0.0121357313793756
278800  3     ENSG00000185532  0.0121357313793756
278800  4     ENSG00000197910  0.00680983722337082
278800  4     ENSG00000165383  0.00680983722337082
278800  4     ENSG00000172538  0.00680983722337082
. . . . . . . . . . . . . . . . . . . .
278800  4     ENSG00000165511  0.00680983722337082
278800  4     ENSG00000182354  0.00680983722337082
278800  4     ENSG00000172661  0.00680983722337082
278800  4     ENSG00000165507  0.00680983722337082
278800  4     ENSG00000178440  0.00680983722337082
278800  4     ENSG00000138299  0.00680983722337082
278800  4     ENSG00000197704  0.00680983722337082
278800  4     ENSG00000012779  0.00680983722337082
278800  4     ENSG00000197354  0.00680983722337082
278800  4     ENSG00000189090  0.00680983722337082
278800  4     ENSG00000107551  0.00680983722337082
278800  4     ENSG00000126542  0.00680983722337082
278800  4     ENSG00000198364  0.00680983722337082
278800  4     ENSG00000185849  0.00680983722337082
278800  4     ENSG00000150165  0.00680983722337082
278800  4     ENSG00000128815  0.00680983722337082
278800  4     ENSG00000178645  0.00680983722337082
278800  4     ENSG00000138293  0.00680983722337082
278800  4     ENSG00000176833  0.00680983722337082
278800  4     ENSG00000179251  0.00680983722337082
278800  4     ENSG00000169826  0.00680983722337082
278800  4     ENSG00000172678  0.00680983722337082
278800  4     ENSG00000197752  0.00680983722337082
278800  5     ENSG00000107643  0.00412573091718715
278800  6     ENSG00000165733  0.000263885640603109
278800  7     ENSG00000169813  6,63E+07
DE SANCTIS-CACCHIONE SYNDROME
Gene map locus 10q11
>12MB area, 103 ranked genes
CLINICAL FEATURES
De Sanctis and Cacchione (1932) reported a condition, which they called 'xerodermic idiocy,' in which patients had xeroderma pigmentosum, mental deficiency, progressive neurologic deterioration, dwarfism, and gonadal hypoplasia.
http://www.ncbi.nlm.nih.gov/entrez/dispomim.cgi?id=278800
Results – Benchmarking: DE SANCTIS-CACCHIONE SYNDROME. Ranked 1. Probability: 0.300326793109544
Proteins shown: DNA excision repair protein ERCC-6; DNA excision repair protein ERCC-2; Eukaryotic initiation factor 4A-I (eIF4A-I); Eukaryotic translation initiation factor 4E (eIF4E)
OMIM entries shown: #278800 DE SANCTIS-CACCHIONE SYNDROME; *126340 DNA REPAIR DEFECT EM9 OF CHINESE HAMSTER OVARY CELLS, COMPLEMENTATION OF; EM9; #133540 COCKAYNE SYNDROME CKN2; #278730 XERODERMA PIGMENTOSUM, COMPLEMENTATION GROUP D; #601675 TRICHOTHIODYSTROPHY
Results – Benchmarking DE SANCTIS-CACCHIONE SYNDROME Ranked 2 Probability 0.0125655342047565
Acknowledgments. Disease Gene Finding: Olga Rigina, Olof Karlberg, Zenia M. Størling, Páll Ísólfur Ólason, Kasper Lage, Anders Gorm, Anders Hinsby, Yves Moreau, Niels Tommerup, Søren Brunak