
Data Mining in Ensembl with BioMart

Giulietta Spudich, EBI, 2007. BioMart is available at http://www.biomart.org/biomart/martview and http://www.ensembl.org/biomart/martview, or by clicking ‘BioMart’ from Ensembl.


Presentation Transcript


  1. Data Mining in Ensembl with BioMart • Giulietta Spudich, EBI, 2007

  2. BioMart • http://www.biomart.org/biomart/martview • http://www.ensembl.org/biomart/martview • Or click on ‘BioMart’ from Ensembl

  3. BioMart: Data mining • BioMart filters the data in the Ensembl databases, combines multiple terms and puts them into a table format. • For example: human genes (HGNC IDs) with their chromosome and base pair position • No programming required!

  4. General or Specific Data-Tables • All the genes for one species • Or… only genes on one specific region of a chromosome • Or… only genes on one specific region of a chromosome that have homologues

  5. Web Interface • Three main stages: Dataset, Filters and Attributes. • Dataset: the species whose genes are searched • Filters: define the gene set • Attributes: output information
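The three stages can be pictured as operations on a table: the dataset is the full set of rows, filters decide which rows stay, and attributes decide which columns are shown. The sketch below is only a toy model of that idea in plain Python. The records, field names and the `run_query` helper are invented for illustration and have nothing to do with how BioMart is implemented.

```python
# Toy model of Dataset -> Filters -> Attributes.
# All records and field names are made-up placeholders, not Ensembl data.

DATASET = [
    {"gene_id": "GENE_A", "chromosome": "X", "start": 100, "has_homologue": True},
    {"gene_id": "GENE_B", "chromosome": "X", "start": 900, "has_homologue": False},
    {"gene_id": "GENE_C", "chromosome": "1", "start": 150, "has_homologue": True},
]


def run_query(dataset, filters, attributes):
    """Keep rows that pass every filter, then keep only the requested columns."""
    rows = [row for row in dataset if all(test(row) for test in filters)]
    return [[row[name] for name in attributes] for row in rows]


# Filters: what we know (chromosome X, has a homologue).
filters = [
    lambda row: row["chromosome"] == "X",
    lambda row: row["has_homologue"],
]

# Attributes: what we want in the output table.
table = run_query(DATASET, filters, ["gene_id", "start"])

for line in table:
    print("\t".join(str(value) for value in line))
```

Passing an empty filter list returns every row, which corresponds to the "all the genes for one species" case; each added filter narrows the gene set further.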

  6. Results • Tables or sequences

  7. Export tables as… • Microsoft Excel (xls) • Text (csv, tsv) • HTML • GFF • XML • Or export sequences in FASTA format
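An exported text table can be read back with standard tools. The sketch below assumes a tab-separated export with a header row; the column names and values in `SAMPLE_TSV` are invented placeholders, and a real export will use whatever attribute names were selected.

```python
import csv
import io

# Placeholder text standing in for a downloaded TSV export.
# Column names and values are invented for illustration.
SAMPLE_TSV = (
    "Gene ID\tChromosome\tStart (bp)\n"
    "GENE_A\tX\t100\n"
    "GENE_B\tX\t900\n"
)


def read_export(handle, delimiter="\t"):
    """Read a delimited export with a header row into a list of dicts."""
    return list(csv.DictReader(handle, delimiter=delimiter))


rows = read_export(io.StringIO(SAMPLE_TSV))

# Values arrive as strings, so numeric columns need converting.
starts = [int(row["Start (bp)"]) for row in rows]
print(len(rows), min(starts), max(starts))
```

For a csv export the same function can be called with `delimiter=","`; for a file on disk, pass `open(path, newline="")` instead of the `io.StringIO` wrapper.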

  8. FASTA sequences • Gene (unspliced) • Transcript (cDNA) • Translation (coding) • UTR (5’ or 3’) • Flanking sequence
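Sequences exported in FASTA format are plain text: a header line starting with `>` followed by one or more lines of sequence. A minimal reader, assuming well-formed input, might look like the sketch below. The sample records are invented, and the `read_fasta` helper is hypothetical rather than part of any BioMart tooling.

```python
import io

# Invented sample records; real headers depend on the attributes chosen at export.
SAMPLE_FASTA = ">seq1 placeholder header\nATGC\nGGTA\n>seq2\nTTAA\n"


def read_fasta(handle):
    """Return a dict mapping each header (without '>') to its joined sequence."""
    records = {}
    header = None
    chunks = []
    for line in handle:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        if line.startswith(">"):
            if header is not None:
                records[header] = "".join(chunks)
            header = line[1:]
            chunks = []
        else:
            chunks.append(line)
    if header is not None:
        records[header] = "".join(chunks)  # store the final record
    return records


records = read_fasta(io.StringIO(SAMPLE_FASTA))
for name, sequence in records.items():
    print(name, len(sequence))
```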

  9. BioMart – Other Installations • Find more at www.biomart.org

  10. The Flow • Choose Dataset (All genes for a species) • Choose Filters (narrows the gene set) • Choose Attributes (output options)

  11. Query: • The human gene encoding Glucose-6-phosphate dehydrogenase (G6PD) is located on chromosome X in cytogenetic band q28. What other genes related to human disease map to the same region? Do they have InterPro domains? • Filters: what we know • Attributes: what we want to know

  12. Query: • The human gene encoding Glucose-6-phosphate dehydrogenase (G6PD) is located on chromosome X in cytogenetic band q28. What other genes related to human disease map to the same region? Do they have InterPro domains? • Filters: what we know • Attributes: what we want to know
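The web interface needs no programming, but the same query can also be written down as a BioMart XML query document. The sketch below only builds that XML text and does not contact any server. It rests on several assumptions: the dataset name `hsapiens_gene_ensembl`, the filter names `chromosome_name`, `band_start` and `band_end`, and the attribute names `ensembl_gene_id` and `interpro` follow common Ensembl mart naming and may differ between releases. The disease-association filter is left out because its exact name is not given in the slides.

```python
import xml.etree.ElementTree as ET


def build_query(dataset, filters, attributes, formatter="TSV"):
    """Build a BioMart-style XML query string (nothing is sent anywhere)."""
    query = ET.Element("Query", {
        "virtualSchemaName": "default",
        "formatter": formatter,
        "header": "1",
        "uniqueRows": "1",
        "datasetConfigVersion": "0.6",
    })
    ds = ET.SubElement(query, "Dataset", {"name": dataset, "interface": "default"})
    # Filters: what we know.
    for name, value in filters.items():
        ET.SubElement(ds, "Filter", {"name": name, "value": value})
    # Attributes: what we want to know.
    for name in attributes:
        ET.SubElement(ds, "Attribute", {"name": name})
    return ET.tostring(query, encoding="unicode")


# All names below are assumptions; check them against the mart being queried.
xml_query = build_query(
    dataset="hsapiens_gene_ensembl",
    filters={"chromosome_name": "X", "band_start": "q28", "band_end": "q28"},
    attributes=["ensembl_gene_id", "interpro"],
)
print(xml_query)
```

Keeping filters and attributes as separate arguments mirrors the slide: the filters dictionary holds what is already known (chromosome X, band q28), and the attributes list names the columns wanted in the result.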

  13. BioMart team • Arek Kasprzyk • Syed Haider • Richard Holland • Damian Smedley
