
Welcome - webinar instructions


Presentation Transcript


  1. Welcome - webinar instructions • The webinar will start soon • GoToTraining works best in Chrome or, on Linux, Firefox • All microphones will be muted while the trainer is speaking • If you have a question, please use the chat box at the bottom of the GoToTraining box • Please complete the feedback survey, which will launch at the end of the webinar • The webinar will be recorded and added to Train online

  2. Variant submission and accessioning at the European Variation Archive Baron Koylass www.ebi.ac.uk/eva eva-helpdesk@ebi.ac.uk

  3. Webinar roadmap • European Variation Archive overview & demo • Variant representation and submission • dbSNP data exchange and release • Take away message and additional information Baron Koylass @evarchive www.ebi.ac.uk/eva eva-helpdesk@ebi.ac.uk

  4. European Variation Archive overview & demo https://www.ebi.ac.uk/ena/about/data-repositories

  5. European Variation Archive overview & demo • 900+ million variants, 63 species • Metadata: Study; Experiment/Analyses pipeline; Samples • Annotation: Effect on genes/transcripts; Functional consequence; Population frequencies • Variants: Genetic variants from any species; Within/across population, including subspecies

  6. Variant submission and accessioning at the European Variation Archive EVA WEBSITE DEMO

  7. European Variation Archive overview & demo • Programmatic access through web services API • Results provided in JSON, easily parsed by Python, R, Java • Web services for files, segments (regions), studies, variants, genes • Full documentation at EVA website https://www.ebi.ac.uk/eva/?API
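As a small illustration of the "easily parsed" point on slide 7, here is a minimal Python sketch that reads a JSON payload with the standard library. The payload and its field names (`results`, `id`, `chromosome`, `start`) are hypothetical placeholders, not the EVA response schema; the API documentation linked on the slide is the place to check the real structure.

```python
import json

# Illustrative payload only: the field names below are hypothetical,
# not the actual EVA web services schema.
SAMPLE_RESPONSE = """
{"results": [
  {"id": "rs123456789", "chromosome": "Chr9", "start": 2453564}
]}
"""


def summarise_variants(payload: str) -> list[str]:
    """Return one 'id chromosome:start' string per record in the payload."""
    data = json.loads(payload)
    return [
        f'{record["id"]} {record["chromosome"]}:{record["start"]}'
        for record in data.get("results", [])
    ]


if __name__ == "__main__":
    for line in summarise_variants(SAMPLE_RESPONSE):
        print(line)
```

The same pattern applies to a payload fetched over HTTP: decode the body to text, pass it to `json.loads`, then walk the resulting dictionaries and lists.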

  8. Variant representation and submission [Diagram labels: Submission metadata, VCF files, Data validation, BioSamples, ENA, EVA] • Project accession: PRJEB27789

  9. Variant submission and accessioning at the European Variation Archive Metadata and Variant Call Format demo

  10. Variant representation and submission [Figure panels: Human, Non-human]

  11. Variant representation and submission [Diagram labels: Initial submitted variant, Remapped to assembly 2, Remapped to assembly 1] • Chr9 2548987 T A (Btau_5.0.1 - GCA_000003205.6) • Chr9 2609956 A G (Bos_taurus_UMD_3.1.1 - GCA_000003055.5) • Chr9 2453564 A G (ARS-UCD1.2 - GCA_002263795.2) • IDs shown: ss333, ss111, ss222, rs123456789

  12. dbSNP data exchange and release • Import of all non-human variant data from dbSNP • Variants will be available in the Variant Browser if they satisfy the EVA submission requirements • dbSNP variants that don't satisfy these requirements will still be imported, and searchable via a separate web view and API • We will work to make this experience as intuitive as possible, while keeping our commitment to only make high-quality variants part of the core EVA database • First release consists of a variety of files for each species/assembly…

  13. dbSNP data exchange and release • File: GCA_000002285.2_current_ids.vcf • Contains the active variants that satisfy the EVA submission requirements • These RS IDs can be browsed from the EVA website • The RS IDs were originally assigned by dbSNP; the variants have been validated and can be mapped back to the associated assembly • Contig/chromosome name provided in header column
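A minimal Python sketch of how the RS IDs in a `_current_ids.vcf` file could be listed. It relies only on the standard VCF column order (CHROM, POS, ID, ...), with tab-separated fields and a semicolon-separated ID column. The sample record is illustrative: it borrows the coordinates and placeholder ID shown on slide 11 and is not taken from a real release file.

```python
from typing import Iterable, Iterator

# Illustrative VCF content only, not copied from an actual EVA release file.
SAMPLE_VCF = [
    "##fileformat=VCFv4.3\n",
    "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO\n",
    "Chr9\t2453564\trs123456789\tA\tG\t.\t.\t.\n",
    "Chr9\t2453600\t.\tC\tT\t.\t.\t.\n",  # record without an ID
]


def iter_rs_ids(lines: Iterable[str]) -> Iterator[tuple[str, int, str]]:
    """Yield (chromosome, position, rs_id) for every RS ID in the VCF lines."""
    for line in lines:
        # Skip meta-information lines, the column header and blank lines.
        if line.startswith("#") or not line.strip():
            continue
        fields = line.rstrip("\n").split("\t")
        chrom, pos, ids = fields[0], int(fields[1]), fields[2]
        # The ID column may hold several identifiers separated by ';'
        # or '.' when no identifier is available.
        for identifier in ids.split(";"):
            if identifier.startswith("rs"):
                yield chrom, pos, identifier


if __name__ == "__main__":
    for chrom, pos, rs_id in iter_rs_ids(SAMPLE_VCF):
        print(chrom, pos, rs_id)
```

To read a file from disk, pass an open text file handle instead of the list, since a file object is also an iterable of lines.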

  14. dbSNP data exchange and release • File: *species*_*taxID*_unmapped_ids.txt • Contains RS IDs that couldn't be mapped against an assembly by dbSNP • Flanking sequences are provided when possible

  15. dbSNP data exchange and release • File: GCA_000002285.2_merged_ids.vcf • Contains RS IDs that should NOT be used • They have been merged into other active RS IDs that represent the same variants • Still searchable on the EVA website, where a link to the parent RS ID will be provided

  16. dbSNP data exchange and release • File: GCA_000002285.2_deprecated_ids.txt • Contains a list of RS IDs that should also NOT be used • These RS IDs were deprecated (e.g. due to missing information)

  17. dbSNP data exchange and release • File: GCA_000002285.2_merged_deprecated_ids.txt • Contains RS IDs that should NOT be used • They have been merged into an RS ID that was deprecated later on

  18. dbSNP data exchange and release • GCA_000002285.2_current_ids.vcf - Contains the RS IDs which can be browsed from the EVA website. • *species*_*taxID*_unmapped_ids.txt - Contains RS IDs that couldn't be mapped against an assembly by dbSNP. Flanking sequences are provided when possible. • GCA_000002285.2_merged_ids.vcf - Contains RS IDs that should NOT be used because they have been merged into other active RS IDs that represent the same variants. • GCA_000002285.2_deprecated_ids.txt - Contains a list of RS IDs that should NOT be used since these RS IDs were deprecated (e.g. due to missing information). • GCA_000002285.2_merged_deprecated_ids.txt - Contains RS IDs that should NOT be used because they have been merged into an RS ID that was deprecated later on.
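The five file types summarised on slide 18 can be told apart by file name suffix. The Python sketch below is one hypothetical way to do that. The status labels and the function names are illustrative choices, not EVA terminology, and the sketch assumes release file names end exactly with the suffixes shown on the slides.

```python
from typing import Optional

# Suffix -> status label (labels are illustrative, not EVA terminology).
# "_merged_deprecated_ids.txt" must come before "_deprecated_ids.txt",
# because the longer name also ends with the shorter suffix.
SUFFIX_TO_STATUS = (
    ("_merged_deprecated_ids.txt", "merged_deprecated"),
    ("_merged_ids.vcf", "merged"),
    ("_deprecated_ids.txt", "deprecated"),
    ("_unmapped_ids.txt", "unmapped"),
    ("_current_ids.vcf", "current"),
)

# Per the slides, RS IDs from these file types should NOT be used.
DO_NOT_USE = {"merged", "deprecated", "merged_deprecated"}


def classify_release_file(filename: str) -> Optional[str]:
    """Return the status label for a release file name, or None if unknown."""
    for suffix, status in SUFFIX_TO_STATUS:
        if filename.endswith(suffix):
            return status
    return None


def ids_should_not_be_used(filename: str) -> bool:
    """True when the slides say the RS IDs in this file should not be used."""
    return classify_release_file(filename) in DO_NOT_USE


if __name__ == "__main__":
    for name in (
        "GCA_000002285.2_current_ids.vcf",
        "GCA_000002285.2_merged_deprecated_ids.txt",
    ):
        print(name, classify_release_file(name), ids_should_not_be_used(name))
```

Checking the longest suffix first is the one design point worth noting: a plain test for `_deprecated_ids.txt` alone would also match the merged-deprecated file.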

  19. Take away message • The EVA provides permanent archival and accessioning for genetic variation from any species; the data can be consumed via the study/variant browser and the API • Variants are accepted in VCF format; various online tools help with generating VCF files, such as the EVA validation suite, the EVA VCF template and other third-party tools • dbSNP non-human species RS release: useful for previous dbSNP users and those working with non-human species

  20. The EVA team • Thomas Keane • Cristina Yenyxe Gonzalez • Andres Silva • Kirill Tsukanov • Sundararaman Venkataraman • Jose Miguel Mut Lopez • Baron Koylass • Funding [logos not shown]

  21. Additional information • Validation suite for VCF files - https://github.com/EBIvariation/vcf-validator • EVA help page for VCF file generation, including VCF file template - https://wwwdev.ebi.ac.uk/eva/?Help • Official VCF specification - https://samtools.github.io/hts-specs/VCFv4.3.pdf • EVA dbSNP import progress page - https://www.ebi.ac.uk/eva/?dbSNP-Import-Progress

  22. Upcoming webinars • See the full list of upcoming webinars at https://www.ebi.ac.uk/training/webinars • Don’t forget! Please fill in the survey that launches after the webinar. Thanks!
