Annotation EPP 245/298 Statistical Analysis of Laboratory Data
Annotation
Given that one has found one or more genes that are differentially expressed, there are a number of useful things to know:
• What is the putative function?
• What pathways are known to contain this gene?
• What other proteins interact with the given protein?
• etc.
Two-color array example
> alldata[1,]
 [1]  473  888  170 1137   86  290  109  226  370  659  359  484  102  293  174
[16]  324  196  638  102  293
> geneID[1,]
       Name                                       ID
1 NM_006182 discoidin domain receptor family, member
http://www.ncbi.nlm.nih.gov/genome/guide/human/resources.shtml
Affy Example
> library(annaffy)
Loading required package: GO
Loading required package: KEGG
Loading required package: annotate
> probeids <- geneNames(eset.rma)[allp1adj < .05]
> symbols <- aafSymbol(probeids,"hgu95av2")
Loading required package: hgu95av2
> symbols[[1]]
An object of class "aafSymbol"
[1] "DDR1"
> getText(symbols[[1]])
[1] "DDR1"
> gos <- aafGO(probeids,"hgu95av2")
> gos[[1]]
An object of class "aafGO"
[[1]]
An object of class "aafGOItem"
@id   "GO:0005524"
@name "ATP binding"
@type "Molecular Function"
@evid "IEA"
[[2]]
An object of class "aafGOItem"
@id   "GO:0007155"
@name "cell adhesion"
@type "Biological Process"
@evid "IEA"
[[3]]
An object of class "aafGOItem"
@id   "GO:0007155"
@name "cell adhesion"
@type "Biological Process"
@evid "TAS"
[[4]]
An object of class "aafGOItem"
@id   "GO:0005887"
@name "integral to plasma membrane"
@type "Cellular Component"
@evid "TAS"
[[5]]
An object of class "aafGOItem"
@id   "GO:0016020"
@name "membrane"
@type "Cellular Component"
@evid "IEA"
[[6]]
An object of class "aafGOItem"
@id   "GO:0006468"
@name "protein amino acid phosphorylation"
@type "Biological Process"
@evid "IEA"
[[7]]
An object of class "aafGOItem"
@id   "GO:0004674"
@name "protein serine/threonine kinase activity"
@type "Molecular Function"
@evid "IEA"
[[8]]
An object of class "aafGOItem"
@id   "GO:0004872"
@name "receptor activity"
@type "Molecular Function"
@evid "IEA"
[[9]]
An object of class "aafGOItem"
@id   "GO:0016740"
@name "transferase activity"
@type "Molecular Function"
@evid "IEA"
[[10]]
An object of class "aafGOItem"
@id   "GO:0004714"
@name "transmembrane receptor protein tyrosine kinase activity"
@type "Molecular Function"
@evid "IEA"
[[11]]
An object of class "aafGOItem"
@id   "GO:0004714"
@name "transmembrane receptor protein tyrosine kinase activity"
@type "Molecular Function"
@evid "TAS"
[[12]]
An object of class "aafGOItem"
@id   "GO:0007169"
@name "transmembrane receptor protein tyrosine kinase signaling pathway"
@type "Biological Process"
@evid "IEA"
GO Evidence Codes
• IEA = inferred from electronic annotation (e.g., BLAST). Uncurated.
• TAS = traceable author statement (i.e., someone said so).
• IDA = inferred from direct assay
• IEP = inferred from expression pattern
• IGI = inferred from genetic interaction
• IMP = inferred from mutant phenotype
• IPI = inferred from physical interaction
• ISS = inferred from sequence similarity
• NAS = non-traceable author statement
• ND = no biological data available
• NR = not recorded
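Evidence codes are useful for deciding how much weight an annotation deserves, for example by setting aside the uncurated IEA entries. The following is a minimal base-R sketch of that idea, not something annaffy provides: the `go_items` data frame is assumed to have been built by hand from four of the DDR1 GO items listed earlier, and `curated_only` is a hypothetical helper introduced only for illustration.

```r
# Lookup table of GO evidence codes, taken from the lists above.
evidence_codes <- c(
  IEA = "inferred from electronic annotation",
  TAS = "traceable author statement",
  IDA = "inferred from direct assay",
  IEP = "inferred from expression pattern",
  IGI = "inferred from genetic interaction",
  IMP = "inferred from mutant phenotype",
  IPI = "inferred from physical interaction",
  ISS = "inferred from sequence similarity",
  NAS = "non-traceable author statement",
  ND  = "no biological data available",
  NR  = "not recorded"
)

# Assumption: a hand-built table holding four of the DDR1 GO items shown earlier.
go_items <- data.frame(
  id   = c("GO:0005524", "GO:0007155", "GO:0007155", "GO:0005887"),
  name = c("ATP binding", "cell adhesion", "cell adhesion",
           "integral to plasma membrane"),
  evid = c("IEA", "IEA", "TAS", "TAS"),
  stringsAsFactors = FALSE
)

# Hypothetical helper: drop uncurated (IEA) rows and spell out each evidence code.
curated_only <- function(items, codes = evidence_codes) {
  kept <- items[items$evid != "IEA", , drop = FALSE]
  kept$evidence <- unname(codes[kept$evid])
  kept
}

curated <- curated_only(go_items)
print(curated)
```

Whether to discard IEA annotations outright is a judgment call; they are uncurated but often the only annotation available, so flagging them rather than dropping them may be the safer choice.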
Online Access
> gbs <- aafGenBank(probeids,"hgu95av2")
> getURL(gbs[[1]])
[1] "http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?cmd=search&db=nucleotide&term=U48705%5BACCN%5D&doptcmdl=GenBank"
> lls <- aafLocusLink(probeids,"hgu95av2")
> getURL(lls[[1]])
[1] "http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?db=gene&cmd=Retrieve&dopt=Graphics&list_uids=780"
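The URLs printed above are ordinary Entrez query strings, so the GenBank one can be read as a base address plus a percent-encoded search term. The base-R sketch below shows how such a string might be assembled from an accession number. It is an illustration only and is not claimed to be how annaffy builds its URLs: `genbank_url` is a hypothetical helper, and the address and query parameters are simply copied from the output above.

```r
# Hypothetical helper: assemble a GenBank search URL shaped like the one above.
genbank_url <- function(accession) {
  base <- "http://www.ncbi.nlm.nih.gov/entrez/query.fcgi"
  # Percent-encode the search term, including the reserved "[" and "]".
  term <- URLencode(paste0(accession, "[ACCN]"), reserved = TRUE)
  paste0(base, "?cmd=search&db=nucleotide&term=", term, "&doptcmdl=GenBank")
}

url <- genbank_url("U48705")
print(url)
```

The result can be passed to `browseURL()` in an interactive session to open the record in a web browser.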
Abstracts
> pmids <- aafPubMed(probeids,"hgu95av2")
> pmids[[1]]
An object of class "aafPubMed"
 [1] 15111304 14764702 14500648 12935821 12477932 9659899 8977099 8796349 8682498 8622863 8390675 8302582
[13] 8226977 7848919 7789998 7774938
> getURL(pmids[[1]])
[1] "http://www.ncbi.nih.gov/entrez/query.fcgi?tool=bioconductor&cmd=Retrieve&db=PubMed&list_uids=15111304%2c14764702%2c14500648%2c12935821%2c12477932%2c9659899%2c8977099%2c8796349%2c8682498%2c8622863%2c8390675%2c8302582%2c8226977%2c7848919%2c7789998%2c7774938"
> browseURL(getURL(lls[[1]]))