The FlyBase Consortium: Harvard University, University of Bloomington-Indiana, University of California-Berkeley, Cambridge University
Phenotype Ontology Meeting, Cold Spring Harbor, November 19-20th, 2005
FlyBase includes 78,804 named alleles of 19,053 Drosophilid genes
CLASSICAL ALLELES: 61,286, e.g. styS73 (EMS)
GENETICALLY ENGINEERED ALLELES: 17,518, e.g. styScer\UAS.cHa
Core vocabularies:
Phenotypic class (fbcv.obo)
Anatomy (fly_anatomy.obo, GO)
Phenotypes are associated with alleles using controlled vocabularies in conjunction with controlled syntax.
The FlyBase record for styD5 (FBal0086316) includes:
Phenotypic class: neuroanatomy defective | recessive
• Phenotype manifest in: chordotonal organ
• Phenotype manifest in: midline glial cell
Genetic interactions… for example, Casci et al., 1999 used double mutant analysis to place sty in the EGF receptor signaling pathway (GO:0007173).
Genetic interaction: suppressor | dominant, visible { argosGMR.PC }
Genetic interaction: suppressor | dominant, eye { argosGMR.PC }
Genetic interaction: enhancer | dominant, visible { Shs.sev }
Genetic interaction: enhancer | dominant, eye { Shs.sev }
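The controlled syntax used in these statements is regular enough to parse mechanically. The following is a minimal Python sketch, not FlyBase's actual parser: the `Statement` class, the `parse_statement` function, and the assumption that several interacting alleles inside the braces would be comma-separated are all hypothetical, inferred only from the examples on these slides.

```python
import re
from dataclasses import dataclass
from typing import List

@dataclass
class Statement:
    kind: str               # e.g. "Phenotypic class", "Genetic interaction"
    term: str               # the controlled-vocabulary term before the first "|"
    qualifiers: List[str]   # everything after the first "|", split on "|" and ","
    interactors: List[str]  # alleles listed inside { ... }, if any

# Hypothetical grammar: "<kind>: <term> | <qualifiers> { <interacting alleles> }"
_PATTERN = re.compile(
    r"^(?P<kind>[^:]+):\s*(?P<body>[^{]*?)\s*"
    r"(?:\{\s*(?P<inter>[^}]*?)\s*\})?\s*$"
)

def parse_statement(line: str) -> Statement:
    match = _PATTERN.match(line.strip())
    if match is None:
        raise ValueError(f"not a controlled statement: {line!r}")
    parts = [p.strip() for p in match.group("body").split("|")]
    # Only the parts after the term are split on commas, so a term that
    # itself contains a comma is left intact.
    qualifiers = [q.strip() for p in parts[1:] for q in p.split(",") if q.strip()]
    inter = match.group("inter")
    # Assumption: multiple interacting alleles would be comma-separated.
    interactors = [a.strip() for a in inter.split(",") if a.strip()] if inter else []
    return Statement(match.group("kind").strip(), parts[0], qualifiers, interactors)

example = parse_statement(
    "Genetic interaction: suppressor | dominant, visible { argosGMR.PC }"
)
print(example)
```

Splitting qualifiers on both `|` and `,` is a simplification; a production parser would need to know which positions carry which vocabulary.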
Numbers …
Single mutant phenotype statements:
Phenotypic class: 86,153
Anatomy: 81,438
Total phenotypic data lines including free text: 397,821
more …
Genetic interaction statements:
Phenotypic class: 35,825
Anatomy: 39,686
Total multiple mutant phenotype lines including free text: 131,208
The Phenotype Ontology idea so far:
Observable
  Anatomies
  GO: Process, Function, Cell component
  Cell Type
  Chemical metabolite
  Behavior
  Temporal/developmental/life cycle
Attribute
Value
The Experiment:
“9) Now each MOD should try to annotate 50 objects using PATO, and swop those around … . . . Everyone do “50 units” … include aspects of assay, etc/”
Judy Blake, May 2003 meeting’s minutes.
The Questions:
1: Can the Phenotype Ontology represent all the controlled data that FlyBase allele records include?
2: Can the Phenotype Ontology reduce the amount of free text information in FlyBase allele records?
3: Is the proposed system efficient enough for curators to use in a production database?
4: Is the end product of this type of curation useful to biologists?
The Methods:
• Identify candidate data-rich alleles; select annotation from one reference
• Re-annotate with Phenotype Ontology System
• Identify
  a. Structural problems?
  b. Vocabulary problems?
  c. Practical problems?
How much can we pack into one statement?
cac1 currently has
Genetic interaction: song defective | recessive, enhanceable { mlenap-ts1 }
New system would be
Genetic interaction: male courtship behavior (sensu Insecta)\, song production | process | abnormal | recessive, enhanceable { mlenap-ts1 }
Point: if values were linguistically distinct, there would be no need to state the ‘attribute’ in the statement, as the attribute would be implicit from the attribute:value relationship.
Another example
Allele combinations, as in asl1:
Phenotype manifest in: (with asl2) spermatocyte & aster
New system would be
Phenotype: (with asl2) spermatocyte & aster | quantitative | absent
Again, it would be better if the ‘quantitative’ attribute were implicit and unstated:
(with asl2) spermatocyte & aster | absent
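The point about implicit attributes can be checked mechanically: an attribute can be dropped from a statement only if its value belongs to exactly one attribute. The sketch below is illustrative only. The attribute/value table is a toy built from the three attribute:value pairs that appear on these slides, not the real PATO vocabulary, and the function names are hypothetical.

```python
from collections import defaultdict

# Toy table: only the attribute/value pairs seen in the slide examples.
ATTRIBUTE_VALUES = {
    "process": {"abnormal"},
    "qualitative": {"abnormal"},
    "quantitative": {"absent"},
}

def build_value_index(attribute_values):
    """Invert attribute -> values into value -> set of attributes."""
    index = defaultdict(set)
    for attribute, values in attribute_values.items():
        for value in values:
            index[value].add(attribute)
    return dict(index)

def infer_attribute(value, index):
    """Return the attribute implied by a value, or None if ambiguous or unknown."""
    attributes = index.get(value, set())
    if len(attributes) == 1:
        return next(iter(attributes))
    return None

index = build_value_index(ATTRIBUTE_VALUES)
print(infer_attribute("absent", index))    # unique, so the attribute can stay implicit
print(infer_attribute("abnormal", index))  # shared by two attributes in this toy table
```

In this toy table "absent" implies its attribute, while "abnormal" does not, which is the situation the slide describes: the attribute can be left unstated only where values are linguistically distinct.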
Tricky to express …
Free text things:
asl1: “some nebenkern are associated with two centrioles instead of one as in wild-type”
  abnormal relationship between two observables
Bsb2: “The laterals split near the distal tip as they elongate and at later stages they appear to be split over more of their length”
  observables showing progressive defect
Caa1DAR66: “A few homozygotes exhibit neurons in the longitudinal tracts that appear to stall at some of the commissures, forming nodular growths”
  behaviour of one observable in relation to another; ‘forming nodular growth’?
egh27: “Follicle cells at the posterior of egg chamber become mesenchymal-like”
  transformation of one observable to another
Equivalent alternatives:
motor_neuron ; CL:0000100
motor neuron ; FBbt:00005123

cell ; GO:0005623
cell_in_vivo ; CL:0000003

adult | male_fertility | male sterile | recessive
reproduction | male_fertility | male sterile | recessive

Real issue: if redundancy exists, then the utility of the PO for providing comprehensive access to all phenotypes of the same class is compromised.
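One way to keep searches comprehensive despite equivalent terms is to map every member of an equivalence group to a single representative before comparing annotations. This is a sketch under assumptions, not a description of any existing FlyBase tool: the groups are the two term pairs listed on this slide, the choice of the first-listed ID as representative is arbitrary, and the function names are hypothetical.

```python
# Equivalence groups taken from the slide; the first ID in each group is
# (arbitrarily) used as the representative.
EQUIVALENCE_GROUPS = [
    ("CL:0000100", "FBbt:00005123"),  # motor_neuron / motor neuron
    ("GO:0005623", "CL:0000003"),     # cell / cell_in_vivo
]

def build_canonical_map(groups):
    """Map every term ID in a group to that group's representative ID."""
    canonical = {}
    for group in groups:
        representative = group[0]
        for term_id in group:
            canonical[term_id] = representative
    return canonical

def canonicalize(term_id, canonical):
    """Return the representative ID, or the ID itself if it has no group."""
    return canonical.get(term_id, term_id)

def same_class(term_a, term_b, canonical):
    """True if two term IDs are identical or declared equivalent."""
    return canonicalize(term_a, canonical) == canonicalize(term_b, canonical)

canonical = build_canonical_map(EQUIVALENCE_GROUPS)
print(same_class("CL:0000100", "FBbt:00005123", canonical))
```

This handles only term-level equivalence. The redundancy between whole statements, as in the two male sterile lines, would need rules over combinations of terms.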
Structural issue:
Sense of compound observable/attribute combination:
Observation: “phenotype is in dorsal epithelium that makes the embryonic head”
Phenotype: morphogenesis of embryonic epithelium | process | abnormal
Phenotype: embryonic head epidermis | dorsal | qualitative | abnormal
but combined (better)
Phenotype: morphogenesis of embryonic epithelium & embryonic head epidermis | process | abnormal
Issue: the attribute is meaningful for one observable but not the other; this is a different kind of &.
Curation issues:
1: Need tools:
  a browser where only the relevant values for a specific attribute are presented
  a “statement builder” to assemble compound phenotype terms sequentially from vocabs.
2: Redundancy solver: would certain combinations be disallowed in the statement builder?
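A “statement builder” of the kind asked for here could be sketched as follows. This is a hypothetical illustration, not an existing curation tool: the `StatementBuilder` class and its methods are invented for this example, and the table of allowed values is a toy built from the attribute:value pairs that appear in the examples on these slides.

```python
# Toy vocabulary: which values may follow which attribute (assumed, not PATO).
ALLOWED_VALUES = {
    "process": ["abnormal"],
    "qualitative": ["abnormal"],
    "quantitative": ["absent"],
}

class StatementBuilder:
    """Assemble 'observable & observable | attribute | value' step by step."""

    def __init__(self, allowed_values):
        self.allowed_values = allowed_values
        self.observables = []
        self.attribute = None
        self.value = None

    def add_observable(self, term):
        self.observables.append(term)
        return self

    def choices(self):
        """Offer attributes first, then only the values valid for the chosen attribute."""
        if self.attribute is None:
            return sorted(self.allowed_values)
        return list(self.allowed_values[self.attribute])

    def set_attribute(self, attribute):
        if attribute not in self.allowed_values:
            raise ValueError(f"unknown attribute: {attribute!r}")
        self.attribute = attribute
        return self

    def set_value(self, value):
        if self.attribute is None:
            raise ValueError("choose an attribute before a value")
        if value not in self.allowed_values[self.attribute]:
            raise ValueError(f"{value!r} is not a value of {self.attribute!r}")
        self.value = value
        return self

    def build(self):
        if not self.observables or self.attribute is None or self.value is None:
            raise ValueError("statement is incomplete")
        return " | ".join([" & ".join(self.observables), self.attribute, self.value])

builder = StatementBuilder(ALLOWED_VALUES)
builder.add_observable("spermatocyte").add_observable("aster")
builder.set_attribute("quantitative").set_value("absent")
print(builder.build())
```

Rejecting invalid attribute/value pairs at entry time is one possible answer to the second question: disallowed combinations would simply never be offered by `choices()`.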
The Answers:
• 1: Can the Phenotype Ontology represent all the controlled data that FlyBase allele records include?
  - potentially yes
• 2: Can the Phenotype Ontology reduce the amount of free text information in FlyBase allele records?
  - yes, but probably not eliminate it
• 3: Is the proposed system efficient enough for curators to use in a production database?
  - no
• 4: Is the end product of this type of curation useful to biologists?
  - in searches, yes, though redundancy needs resolution; for browsing? have found users to be tolerant