
The European Genome-phenome Archive

The European Genome-phenome Archive: accessing and submitting controlled access data. Jeff Almeida-King, User Support & Outreach. EGA session summary: introduction to the EGA; accessing data (+hands on); submitting sequence (+hands on). EBI: open and controlled access archives.


Presentation Transcript


  1. The European Genome-phenome Archive: Accessing and submitting controlled access data. Jeff Almeida-King, User Support & Outreach

  2. EGA Session summary • Introduction to the EGA • Accessing data (+hands on) • Submitting sequence (+hands on)

  3. EBI: Open and Controlled access archives [Diagram labels: Structural variants; Phenotypes; Sequence; Variants; EVA; Open & public archives; Controlled access archive]

  4. What is EGA controlled access data anyway? • Usually human, personally identifiable (Genetic, phenotypic) • Affiliated to medical research or consortium projects • Requires secure storage and distribution • Access determined by formal application procedure • Informed consents specifying controlled release requirements

  5. EGA Overview • Launched 14th July 2008 • 2500+ data access accounts; 125+ submission accounts • 700Tb+ data archived (200,000 samples); 600 datasets in distribution • Data access decisions made by a Data Access Committee (DAC)

  6. Consortium data archived at the EGA

  7. Guide to accessing data

  8. EGA Distribution model for controlled access

  9. Obtaining an EGA Account

  10. Using Your EGA Account • Distribution requires data preparation • Download accounts require manual set-ups • Download accounts expire in 14 days • Data must be decrypted using keys supplied offline

  11. Secure EGA Downloader (Pilot stage) • Log in/select transfer protocol • Create key • Select destination directory • Filter and select files • Download!

  12. Secure EGA Downloader (Pilot phase) • Specify key • Select directories • Select files • Decrypt!

  13. Exercises: Accessing data (Hands-on) www.ebi.ac.uk/ega/tutorials/accessing_data • Navigating to your study of interest • Using your EGA account • Downloading data • Directory: [Shared_folder]/ESGI/EGA/downloading_data/<token_name> (exercise 3)

  14. Submitting sequence

  15. Why Submit to the EGA? Funders and journals increasingly require researchers to have a data sharing plan. 1) Wellcome Trust "Policy on data management and sharing" http://www.wellcome.ac.uk/About-us/Policy/Policy-and-position-statements/WTX035043.html 2) Nature "Availability of data and materials" http://www.nature.com/authors/editorial_policies/availability.html 3) Public Library of Science (PLoS) "Sharing of Materials, Methods, and Data" http://www.plosone.org/static/policies.action#sharing

  16. Who is submitting to the EGA?

  17. Additional requirements for an EGA submission • Submission statements & DAC access policy • File preparation: Data encryption • Register DAC, Policy and Dataset

  18. How to submit to the EGA? 1) Contact ega-helpdesk@ebi.ac.uk 2) EGA submission account details 3) Using EGA uploader (encrypts, generates md5sums and uploads) 4) Provide submission metadata
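
The slide notes that the EGA uploader generates md5sums for the files it sends. As a rough illustration of what such a checksum is, the sketch below computes an MD5 digest for a local file with Python's standard library. It is not the uploader's own code; the function name `md5sum` and the chunk size are assumptions made for this example.

```python
import hashlib


def md5sum(path, chunk_size=1024 * 1024):
    """Return the hex MD5 digest of the file at `path`.

    Hypothetical helper for illustration only; the file is read in
    chunks so that large sequence files do not need to fit in memory.
    """
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        # iter() with a sentinel keeps reading until read() returns b"".
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()
```

A checksum computed this way before upload could be compared with one computed on the received copy to detect a corrupted transfer.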

  19. EGA Webin data uploader • Log in/select transfer protocol • Select source directory • Select submission files • Encrypt and upload!

  20. EGA sequencing metadata EGA Webin submission objects paired with accessioned XML objects: • Study: STUDY (EGAS) • Samples: SAMPLES (EGAN) • Experiment: EXPERIMENT (EGAX) • Run: RUNS (EGAR) • Dataset: DATASET (EGAD) • Policy: POLICY (EGAP) • DAC: DATA ACCESS COMMITTEE (EGAC)
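
The accession prefixes on this slide can be treated as a simple lookup table. The sketch below maps a prefix to its object type, using the prefixes named in the slides (EGAZ comes from the analysis submissions slide). The function name `classify_accession` and the assumption that a prefix is followed only by digits are illustrative choices, not a statement of the official accession format.

```python
import re

# Prefixes as they appear in the slides.
EGA_PREFIXES = {
    "EGAS": "study",
    "EGAN": "sample",
    "EGAX": "experiment",
    "EGAR": "run",
    "EGAZ": "analysis",
    "EGAD": "dataset",
    "EGAP": "policy",
    "EGAC": "data access committee",
}

# Assumption: an accession is a four-letter prefix followed by digits only.
_ACCESSION_RE = re.compile(r"^(EGA[A-Z])(\d+)$")


def classify_accession(accession):
    """Return the object type for an EGA-style accession string.

    Hypothetical helper; raises ValueError for anything that does not
    match one of the prefixes listed above.
    """
    match = _ACCESSION_RE.match(accession.strip().upper())
    if match is None or match.group(1) not in EGA_PREFIXES:
        raise ValueError(f"not a recognised EGA accession: {accession!r}")
    return EGA_PREFIXES[match.group(1)]
```

For example, `classify_accession("EGAD00001000001")` returns `"dataset"` under these assumptions.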

  21. EGA Webin

  22. EGA sequencing analysis submissions • Aligned BAM, Variant Call Files (VCF) and phenotype files • Relatively small data size • Submission of both raw and analysis files encouraged • Data upload process remains the same EGA Webin XML objects paired with accessioned XML objects: • Study: STUDY (EGAS) • Samples: SAMPLES (EGAN) • Analysis: ANALYSIS (EGAZ) • Dataset: DATASET (EGAD) • Policy: POLICY (EGAP) • DAC: DATA ACCESS COMMITTEE (EGAC)

  23. Acknowledgments • Paul Flicek, Team Leader, Vertebrate Genomics • Justin Paschall, Team Leader, Variation • Ilkka Lappalainen, Variation Archive Project Leader • Vasudev Kumanduri, Scientific Programmer • Alexander Senf, Scientific Programmer • Saif Ur-Rehman, Scientific Programmer • Jag Kandasamy, Web Developer

  24. Exercises: Submitting sequence www.ebi.ac.uk/ega/tutorials/submitting_sequence • Uploading your files using the EGA Webin Uploader • Registering study, samples, experiments and runs • Registering your Data Access Committee, Policy and creating your dataset • Directory: /[Shared_folder]/ESGI/EGA/submitting_sequence/<token_name>
