
Production Priorities


magnar


Presentation Transcript


  1. Production Priorities

  2. Agenda
     • Genome protein sets
     • User support
     • Production systems change
     • Database changes
     • On-the-fly species gene associations

  3. Genome protein sets (gp2protein)
     • FASTA files of all proteins believed to be encoded by a genome, not just those that are curated
     • Provide a standard defline format
     • These datasets would be the input for Inparanoid and likely many other analysis projects
     • [*Future*] Include ID mappings: UniProt, IPI, CCD, MOD IDs, GI, Protein_ID, RefSeq
     • [*Future*] Mapping from proteins to genes
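The slide asks for a standard defline format but does not define one. As a minimal sketch, assuming a purely hypothetical layout of `>DB:ID symbol taxon:N`, a writer and parser pair could look like the following. The field order, the function names and the placeholder values are all assumptions for illustration, not the format the project adopted.

```python
def format_defline(db, object_id, symbol, taxon_id):
    """Build one FASTA header line in the hypothetical 'DB:ID symbol taxon:N' layout."""
    return f">{db}:{object_id} {symbol} taxon:{taxon_id}"


def parse_defline(line):
    """Split a header written by format_defline back into its fields."""
    if not line.startswith(">"):
        raise ValueError("not a FASTA header line")
    # Three space-separated fields: 'DB:ID', symbol, 'taxon:N' (assumed layout).
    ident, symbol, taxon = line[1:].rstrip("\n").split(" ", 2)
    db, object_id = ident.split(":", 1)
    return {
        "db": db,
        "id": object_id,
        "symbol": symbol,
        "taxon_id": int(taxon.split(":", 1)[1]),
    }


# Placeholder values only; they do not refer to real database records.
header = format_defline("MOD", "0000001", "GENE1", 9999)
fields = parse_defline(header)
```

Whatever layout is finally chosen, keeping the writer and parser next to each other makes it easy to check that every contributing group produces headers that round-trip.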

  4. Only annotations with IMP, IDA, IPI, IGI and IEP
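One way to apply this restriction is sketched below, assuming the annotations come from a tab-delimited gene association (GAF) file, where the evidence code is the seventh column and comment lines start with `!`. The function name `filter_by_evidence` is hypothetical, and the slide does not say which tool actually did the filtering.

```python
# Evidence codes listed on the slide.
EXPERIMENTAL_CODES = {"IMP", "IDA", "IPI", "IGI", "IEP"}


def filter_by_evidence(lines, allowed=EXPERIMENTAL_CODES):
    """Yield only the annotation lines whose evidence code is in `allowed`.

    Assumes GAF input: tab-separated columns, evidence code in column 7,
    header/comment lines prefixed with '!'.
    """
    for line in lines:
        if line.startswith("!"):
            continue  # skip header and comment lines
        cols = line.rstrip("\n").split("\t")
        if len(cols) > 6 and cols[6] in allowed:
            yield line
```

Because the function works on any iterable of lines, it can be fed an open file handle directly and streamed without loading the whole file into memory.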

  5. User Support
     Email lists to report problems:
     • GO
     • GO-DATABASE
     • GO-WEBMASTER
     • GOFRIENDS
     • GO-IN
     • GO-TOP
     • …
     To which list do we want users to send questions, bug reports, …?

  6. Proposal
     • Define specific email addresses for support
     • Supported annotation staff would be in a rotation to monitor email queries
     • Answer questions that can be answered immediately
     • Forward questions to the appropriate person or group as necessary
     • Track resolution of each query

  7. Production systems change
     • Currently GODB & AmiGO run on the main SGD database server
     • Moving GO to a cluster environment where there will be multiple GO DB servers and GO HTML servers
     • Last year started building GO Lite DB three times a week. This means AmiGO is always using 2-4 day old data.
     • More cluster nodes are on order that may allow us to do daily updates (no need for GOTerm DB)
     • CVS via HTTP

  8. Potential Production DB Changes
     • Build an updating script; currently we can only rebuild from scratch
     • Switch to Chado
     • This would also allow the incorporation of new data types
     • GOID & Term history tracking
     • Need to consider what files need to be archived
     • For some file types that are just derivative we can just provide a script

  9. Survey of GO File Downloads (43 replying)

  10. On-the-fly species gene associations
     • Mainly useful for multi-species gene association data sets, e.g. GOA UniProt
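A per-species subset can be pulled out of a multi-species file on request rather than stored as a separate download. The sketch below assumes GAF input, where column 13 holds the taxon as `taxon:N` (optionally followed by `|taxon:M` for a second, interacting organism). The function name `associations_for_taxon` is hypothetical, and nothing here reflects how the production service was actually built.

```python
def associations_for_taxon(lines, taxon_id):
    """Yield annotation lines whose primary taxon matches `taxon_id`.

    Assumes GAF input: tab-separated columns, taxon in column 13,
    header/comment lines prefixed with '!'.
    """
    wanted = f"taxon:{taxon_id}"
    for line in lines:
        if line.startswith("!"):
            continue  # skip header and comment lines
        cols = line.rstrip("\n").split("\t")
        # The first taxon entry is the organism that encodes the gene product.
        if len(cols) > 12 and cols[12].split("|")[0] == wanted:
            yield line
```

Streaming the source file this way trades some request-time work for not having to build and archive one file per species.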
