Welcome - webinar instructions

• The webinar will start soon • GoToTraining works best in Chrome or, on Linux, Firefox • All microphones will be muted while the trainer is speaking • If you have a question, please use the chat box at the bottom of the GoToTraining box • Please complete the feedback survey, which will launch at the end of the webinar • The webinar will be recorded and added to Train online

An Introductory Webinar • Wojtek Bazant & Faye Rodgers • https://parasite.wormbase.org • parasite-help@sanger.ac.uk

Outline • Why WormBase ParaSite? • Our genomes • Data available • BioMart • Questions

Why WormBase ParaSite? • Helminths (parasitic roundworms and flatworms) are the causative agents of many diseases of humans, animals and plants • Increasing amounts of genomic data are becoming available to the helminth research community • WormBase ParaSite processes and presents that data in a consistent and accessible way

Genomes and primary annotation (from the community) • Analyses run for all genomes: protein domain prediction, GO term annotation, repeat annotation, ncRNA annotation, alignment of publicly available RNA-seq data, linking IDs to external databases • Comparative analysis: build gene trees incorporating all genomes in the release (plus comparators) to predict orthologues and paralogues • Website - browsing: gene and species pages, JBrowse • REST API • Website - tools: BLAST, BioMart

Structure and features of the front page

Our genomes

Genome and species descriptions

Finding information related to your scientific question: If you know the gene name or ID, it’s just a search task! Otherwise, it’s more like research. Common avenues: • BLAST the sequence • Text search to try to match a gene description • Search through a protein feature or GO term • Navigate through an orthologous gene in other species

Data available for each gene

Transcript and protein pages

“Region in detail” - embedded genome browser

Alternative genome browser – JBrowse: better for a workbench view with multiple tracks

Links and references - UniProt etc.

Literature

Comparative Genomics: Gene trees are computed with every release, classifying genes into families. These are reconciled with the species tree to infer orthologous and paralogous relationships. Each internal node of a tree is marked as either a speciation node or a duplication node. Tree views can be configured for exploring the gene family. Method details: https://www.ensembl.org/info/genome/compara/homology_method.html
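To make the speciation/duplication distinction concrete, here is a minimal sketch in Python. It is a simplified illustration of the general idea (two genes are orthologues when their last common ancestor in the reconciled tree is a speciation node, and paralogues when it is a duplication node), not the actual pipeline. The tree layout and all gene names are made up for this example.

```python
# A toy reconciled gene tree. Internal nodes are (kind, left, right) tuples,
# where kind is "speciation" or "duplication"; leaves are gene-name strings.
# All gene names here are invented for illustration.
TOY_TREE = (
    "speciation",
    ("duplication", "speciesA_gene1", "speciesA_gene2"),
    "speciesB_gene1",
)


def leaves(node):
    """Return all leaf gene names below a node."""
    if isinstance(node, str):
        return [node]
    _, left, right = node
    return leaves(left) + leaves(right)


def classify_pairs(node, out=None):
    """Label each gene pair by the event at their last common ancestor."""
    if out is None:
        out = {}
    if isinstance(node, str):
        return out
    kind, left, right = node
    label = "orthologue" if kind == "speciation" else "paralogue"
    # Pairs split across the two children have this node as their
    # last common ancestor, so they take this node's label.
    for a in leaves(left):
        for b in leaves(right):
            out[frozenset((a, b))] = label
    classify_pairs(left, out)
    classify_pairs(right, out)
    return out


if __name__ == "__main__":
    for pair, label in classify_pairs(TOY_TREE).items():
        print(sorted(pair), label)
```

In the toy tree, the two species A genes come out as paralogues, and each of them is an orthologue of the species B gene.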

Comparative Genomics • For example, highlight all of the paralogues in the tree view

Comparative Genomics Orthologues and paralogues are also available in tabular format: • Lists can be exported from BioMart • Full gene trees can be accessed programmatically via the API
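As a rough sketch of programmatic access to gene trees, the Python below builds a request URL and fetches JSON using only the standard library. The base URL and the `genetree/member/id/` path are assumptions modelled on Ensembl-style REST services and are not taken from this webinar. Check the API documentation on the website for the real endpoints before relying on them.

```python
import json
import urllib.parse
import urllib.request

# Assumption: base URL of the REST service; confirm in the site's API docs.
REST_BASE = "https://parasite.wormbase.org/rest"


def gene_tree_url(gene_id, base=REST_BASE):
    """Build the URL for the gene tree containing a gene (assumed endpoint path)."""
    path = "genetree/member/id/" + urllib.parse.quote(gene_id, safe="")
    return f"{base}/{path}?content-type=application/json"


def fetch_json(url, timeout=30):
    """Download and decode a JSON document (performs a network request)."""
    request = urllib.request.Request(url, headers={"Accept": "application/json"})
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.load(response)


if __name__ == "__main__":
    # Smp_035270 is one of the example gene IDs used elsewhere in this webinar.
    print(gene_tree_url("Smp_035270"))
```

Calling `fetch_json(gene_tree_url("Smp_035270"))` would then return the tree as nested Python dictionaries, provided the assumed endpoint exists in that form.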

BioMart: A very powerful tool for accessing data in bulk without any programming knowledge. Filters can be combined to build more complex queries.
Filters (the data type you’re basing your query on), e.g.: • Genome • Genomic region • A list of gene IDs • All genes annotated with a protein domain or a GO term • All genes that have an orthologue in a species
Values (the actual data you’re basing your query on), e.g.: • Schistosoma mansoni PRJEA36577 • Schistosoma mansoni Sm_V7_1 • Smp_035270, Smp_010250, Smp_244010… • SignalP • Genes with an orthologue in Schistosoma haematobium
Attributes (the data you want), e.g.: • Protein stable IDs • cDNA sequences • UniProt IDs • Protein domains • Orthologue names, % identity
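The filter / value / attribute model maps directly onto the XML query format that BioMart services accept. The Python sketch below builds such a query with the standard library. The schema name `parasite_mart`, dataset `wbps_gene`, filter `species_id_1010`, value `scmansprjea36577` and attribute `wbps_gene_id` are recalled from the site's BioMart documentation rather than from this webinar, so treat them as assumptions and check the names offered in the BioMart interface.

```python
import xml.etree.ElementTree as ET


def build_biomart_query(dataset, filters, attributes, schema="parasite_mart"):
    """Return a BioMart XML query string.

    filters:    dict mapping filter name -> value (what you query on)
    attributes: list of attribute names (the columns you want back)
    """
    query = ET.Element("Query", {
        "virtualSchemaName": schema,
        "formatter": "TSV",   # tab-separated output
        "header": "1",        # include a header row
        "uniqueRows": "1",    # drop duplicate rows
    })
    ds = ET.SubElement(query, "Dataset", {"name": dataset, "interface": "default"})
    for name, value in filters.items():
        ET.SubElement(ds, "Filter", {"name": name, "value": value})
    for name in attributes:
        ET.SubElement(ds, "Attribute", {"name": name})
    return ET.tostring(query, encoding="unicode")


# Assumed names (see the note above): one genome filter, one ID attribute.
xml_query = build_biomart_query(
    dataset="wbps_gene",
    filters={"species_id_1010": "scmansprjea36577"},
    attributes=["wbps_gene_id"],
)

if __name__ == "__main__":
    print(xml_query)
```

Combining filters, as the slide describes, amounts to adding more entries to the `filters` dictionary. The resulting XML is what gets submitted to the BioMart service.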

BioMart walk-through example: using BioMart to retrieve S. mansoni genes from the ZW chromosome that have an orthologue in both S. japonicum and S. haematobium. We want to return the S. mansoni, S. haematobium and S. japonicum gene IDs.

To access BioMart from the home page

Add a species filter

Add a region filter

Add homology filters

Count how many genes fulfil our filter criteria

Select output attributes

Previewing the results we get by default

Add orthologues to output attributes

Scroll down to find the species that we’re interested in

View a preview of your output, and download full results.
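Once the full results are downloaded, they can be tidied with a few lines of Python. The sketch below assumes a tab-separated file with one column per species. The column headers and gene IDs are hypothetical placeholders, so substitute the headers from your own download. It removes repeated rows and skips any row with an empty orthologue column.

```python
import csv
import io

# Hypothetical column headers; use the ones in your own download.
MANSONI = "mansoni_gene"
HAEMATOBIUM = "haematobium_gene"
JAPONICUM = "japonicum_gene"


def orthologue_triples(tsv_text):
    """Return sorted unique (mansoni, haematobium, japonicum) ID triples,
    skipping rows where any of the three columns is empty."""
    rows = csv.DictReader(io.StringIO(tsv_text), delimiter="\t")
    triples = set()
    for row in rows:
        ids = (row[MANSONI], row[HAEMATOBIUM], row[JAPONICUM])
        if all(ids):
            triples.add(ids)
    return sorted(triples)


# Placeholder IDs, not real genes: one repeated row and one incomplete row.
EXAMPLE = (
    "mansoni_gene\thaematobium_gene\tjaponicum_gene\n"
    "gene_m1\tgene_h1\tgene_j1\n"
    "gene_m1\tgene_h1\tgene_j1\n"
    "gene_m2\t\tgene_j2\n"
)

if __name__ == "__main__":
    for triple in orthologue_triples(EXAMPLE):
        print("\t".join(triple))
```

For a real download, read the file contents into a string and pass it to `orthologue_triples`.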

BioMart
Other examples of questions that can be answered with BioMart:
• For a list of gene IDs:
  • Convert to other types of identifier (UniProt, RefSeq, NCBI)
  • Retrieve associated protein domains, GO terms
  • Retrieve their genomic coordinates
  • Generate FASTA files of protein, cDNA, UTR, flanking region sequences etc.
• Retrieve a list of genes that:
  • Have a given protein domain/GO term
  • Have/do not have orthologues in species X, Y, Z
  • Are on genomic region X
For R users, WormBase ParaSite BioMart supports the biomaRt R package: see our help and documentation pages to get started.


Outline • Why WormBase ParaSite? • Our genomes • Data available • BioMart • Questions (if we don’t get to your question, email parasite-help@sanger.ac.uk)

Sample question: I need the sequences for a set of Schistosoma mansoni genes. I have the chromosome, start, and stop for each.
The suggested option
Other, more creative approaches?
• download the GFF and the sequence files from the FTP, and write a program
• check the cases one by one
• use the API: first the “region” endpoint to get gene IDs, then the “sequence” endpoint
• email the helpdesk (it might work)
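For the "write a program" route, a minimal Python sketch might look like the following. It assumes the genome FASTA has already been downloaded, that coordinates are 1-based and inclusive (as in GFF), and that only forward-strand sequence is wanted. The sequence name and bases are toy data, not real S. mansoni sequence.

```python
def parse_fasta(text):
    """Parse FASTA text into {sequence_name: sequence}."""
    sequences, name, chunks = {}, None, []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if name is not None:
                sequences[name] = "".join(chunks)
            # The name is the first word after ">"; the rest is description.
            name, chunks = line[1:].split()[0], []
        else:
            chunks.append(line)
    if name is not None:
        sequences[name] = "".join(chunks)
    return sequences


def extract_region(sequences, chromosome, start, stop):
    """Return the forward-strand subsequence for 1-based, inclusive coordinates."""
    seq = sequences[chromosome]
    if not 1 <= start <= stop <= len(seq):
        raise ValueError(f"{chromosome}:{start}-{stop} is outside the sequence")
    return seq[start - 1:stop]


# Toy data: a made-up 12-base "chromosome" split over two lines.
TOY_FASTA = ">chr_toy description\nACGTAC\nGTTGCA\n"
REGIONS = [("chr_toy", 1, 4), ("chr_toy", 7, 10)]

if __name__ == "__main__":
    seqs = parse_fasta(TOY_FASTA)
    for chrom, start, stop in REGIONS:
        print(f">{chrom}:{start}-{stop}")
        print(extract_region(seqs, chrom, start, stop))
```

Genes on the reverse strand would additionally need reverse-complementing, which this sketch leaves out.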

BioMart example 2: using BioMart to generate a protein FASTA file from a list of gene IDs

Select filter(s).

Paste in gene IDs.

In output attributes, select “Retrieve sequence”

Select the type of sequence we’re interested in. Select the information we’d like in the FASTA header.

Preview and download output.

Upcoming webinars: see the full list at https://www.ebi.ac.uk/training/webinars • Don’t forget! Please fill in the survey that launches after the webinar. Thanks!
