The European Genome-phenome Archive. Accessing and submitting controlled access data. Jeff Almeida-King, User Support & Outreach. EGA session summary: Introduction to the EGA; Accessing data (+hands on); Submitting sequence (+hands on). EBI: Open and Controlled access archives.
The European Genome-phenome Archive
Accessing and submitting controlled access data
Jeff Almeida-King, User Support & Outreach
EGA Session summary
• Introduction to the EGA
• Accessing data (+hands on)
• Submitting sequence (+hands on)
EBI: Open and Controlled access archives
[Diagram. Data types: Structural variants, Phenotypes, Sequence, Variants. Labels: EVA; Open & public archives; Controlled access archive]
What is EGA controlled access data anyway?
• Usually human and personally identifiable (genetic, phenotypic)
• Affiliated with medical research or consortium projects
• Requires secure storage and distribution
• Access determined by a formal application procedure
• Informed consents specify controlled release requirements
EGA Overview
• Launched 14th July 2008
• 2500+ data access accounts; 125+ submission accounts
• 700+ TB of data archived (200,000 samples); 600 datasets in distribution
• Data access decisions made by a Data Access Committee (DAC)
Using Your EGA Account
• Distribution requires data preparation
• Download accounts require manual set-up
• Download accounts expire after 14 days
• Data must be decrypted using keys supplied offline
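The 14-day lifetime of a download account can be sketched as simple date arithmetic. This is only an illustration, not part of any EGA tool: the function names are hypothetical, and treating the account as still valid on day 14 itself is an assumption.

```python
from datetime import date, timedelta

# Lifetime stated on the slide; the exact cut-off time is an assumption.
DOWNLOAD_ACCOUNT_LIFETIME = timedelta(days=14)


def expiry_date(created: date) -> date:
    """Return the last day on which the download account is assumed valid."""
    return created + DOWNLOAD_ACCOUNT_LIFETIME


def is_expired(created: date, today: date) -> bool:
    """Return True once `today` is past the assumed expiry date."""
    return today > expiry_date(created)
```

In practice this means a download should be planned and completed within two weeks of the account being set up.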
Secure EGA Downloader (Pilot phase)
• Log in/select transfer protocol
• Create key
• Select destination directory
• Filter and select files
• Download!
Secure EGA Downloader (Pilot phase)
• Specify key
• Select directories
• Select files
• Decrypt!
Exercises: Accessing data (Hands-on)
www.ebi.ac.uk/ega/tutorials/accessing_data
• Navigating to your study of interest
• Using your EGA account
• Downloading data
• Directory: [Shared_folder]/ESGI/EGA/downloading_data/<token_name> (exercise 3)
Why Submit to the EGA?
Funders and journals increasingly require researchers to have a data sharing plan.
1) Wellcome Trust "Policy on data management and sharing" http://www.wellcome.ac.uk/About-us/Policy/Policy-and-position-statements/WTX035043.html
2) Nature "Availability of data and materials" http://www.nature.com/authors/editorial_policies/availability.html
3) Public Library of Science (PLoS) "Sharing of Materials, Methods, and Data" http://www.plosone.org/static/policies.action#sharing
Additional requirements for an EGA submission
• Submission statements & DAC access policy
• File preparation: Data encryption
• Register DAC, Policy and Dataset
How to submit to the EGA?
1. Contact ega-helpdesk@ebi.ac.uk
2. EGA submission account details
3. Using EGA uploader (encrypts, generates md5sums and uploads)
4. Provide submission metadata
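The slide says the EGA uploader generates md5sums as part of the upload. To illustrate what such a checksum is, here is a minimal Python sketch that computes the MD5 digest of a file with the standard library. It is not the uploader's own code, and the function name `md5sum` and the chunk size are choices made for this example.

```python
import hashlib


def md5sum(path, chunk_size=1024 * 1024):
    """Return the hexadecimal MD5 digest of the file at `path`.

    The file is read in chunks so that large sequence files do not
    have to fit in memory at once.
    """
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        # iter() with a sentinel keeps calling read() until it returns b"".
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()
```

Comparing the digest computed before upload with one computed on the received copy is a common way to confirm that a file arrived intact.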
EGA Webin data uploader
• Log in/select transfer protocol
• Select source directory
• Select submission files
• Encrypt and upload!
EGA sequencing metadata (EGA Webin)
Submission objects (accessioned XML objects) and their accession prefixes:
• Study: EGAS
• Samples: EGAN
• Experiment: EGAX
• Runs: EGAR
• Dataset: EGAD
• Policy: EGAP
• Data Access Committee (DAC): EGAC
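The accession prefixes above can be used to tell at a glance what kind of object an accession refers to. The following Python sketch maps a prefix to its object type. The helper name `accession_type` is hypothetical, and the assumed accession shape (a four-letter prefix followed only by digits) is an assumption made for this example rather than something stated on the slide.

```python
import re

# Prefix-to-object mapping taken from the slide.
ACCESSION_TYPES = {
    "EGAS": "study",
    "EGAN": "sample",
    "EGAX": "experiment",
    "EGAR": "run",
    "EGAD": "dataset",
    "EGAP": "policy",
    "EGAC": "data access committee",
}

# Assumed shape: "EGA" plus one letter, then one or more digits.
_ACCESSION_PATTERN = re.compile(r"^(EGA[A-Z])(\d+)$")


def accession_type(accession):
    """Return the object type for an accession, or None if unrecognised."""
    match = _ACCESSION_PATTERN.match(accession.strip())
    if match is None:
        return None
    return ACCESSION_TYPES.get(match.group(1))
```

For example, `accession_type("EGAD00000000001")` returns `"dataset"`, while a string that does not match the assumed shape returns `None`.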
EGA sequencing analysis submissions (EGA Webin)
• Aligned BAM, Variant Call Format (VCF) and phenotype files
• Relatively small data size
• Submission of both raw and analysis files encouraged
• Data upload process remains the same
Submission objects (accessioned XML objects) and their accession prefixes:
• Study: EGAS
• Samples: EGAN
• Analysis: EGAZ
• Dataset: EGAD
• Policy: EGAP
• Data Access Committee (DAC): EGAC
Acknowledgments
• Paul Flicek, Team Leader, Vertebrate Genomics
• Justin Paschall, Team Leader, Variation
• Ilkka Lappalainen, Variation Archive Project Leader
• Vasudev Kumanduri, Scientific Programmer
• Alexander Senf, Scientific Programmer
• Saif Ur-Rehman, Scientific Programmer
• Jag Kandasamy, Web Developer
Exercises: Submitting sequence
www.ebi.ac.uk/ega/tutorials/submitting_sequence
• Uploading your files using the EGA Webin Uploader
• Registering study, samples, experiments and runs
• Registering your Data Access Committee, Policy and creating your dataset
• Directory: /[Shared_folder]/ESGI/EGA/submitting_sequence/<token_name>