
Data Import / Export

Overview of tools for importing and exporting data in bioinformatics research, including Java & Perl APIs, BioPAX export, multiple file formats, and connections to databases like BioWarehouse.

pjacobson

Presentation Transcript


  1. Data Import / Export
     Markus Krummenacker, Bioinformatics Research Group, SRI International, Q3 2012

  2. Data Exchange Overview
     • Java API and Perl API: read & modify
     • BioPAX Export: since Pathway Tools 9.0
       • Biopax.org
     • Export of entire PGDB as a set of Flatfiles
     • Export of Reactions as SBML -- sbml.org
     • Import/Export of Pathways: between PGDBs
     • Import/Export of Selected Frames, for Spreadsheets
     • Import/Export of Compounds as Molfile, CML
     • Registering/Publishing PGDBs on WWW
     • Export PGDB as GenBank
     • BioWarehouse: Loader for Flatfiles, SQL access
       • http://bioinformatics.ai.sri.com/biowarehouse/
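
As an illustration of consuming the SBML export mentioned above, here is a minimal sketch that lists the reactions in an SBML document using only the Python standard library. The sample document, its identifiers, and the function names are hypothetical and hand-written for this example; they are not actual Pathway Tools output. Because the SBML namespace differs between levels and versions, the sketch matches elements by local name only.

```python
import xml.etree.ElementTree as ET


def local_name(tag):
    """Strip the '{namespace}' prefix that ElementTree puts on tags."""
    return tag.rsplit("}", 1)[-1]


def list_reactions(sbml_text):
    """Return (id, name) pairs for every <reaction> element in an SBML string."""
    root = ET.fromstring(sbml_text)
    return [
        (elem.get("id"), elem.get("name"))
        for elem in root.iter()
        if local_name(elem.tag) == "reaction"
    ]


# Hypothetical hand-written sample, not real exported data.
SAMPLE = """<sbml xmlns="http://www.sbml.org/sbml/level2" level="2" version="1">
  <model id="example">
    <listOfReactions>
      <reaction id="RXN_1" name="first reaction"/>
      <reaction id="RXN_2" name="second reaction"/>
    </listOfReactions>
  </model>
</sbml>
"""

if __name__ == "__main__":
    for reaction_id, reaction_name in list_reactions(SAMPLE):
        print(reaction_id, reaction_name)
```

For a real export, the file contents would be read from disk and passed to `list_reactions`; a dedicated SBML library is likely a better choice for anything beyond listing elements.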

  3. Import/Export of Pathways, etc.
     • Export selected pathways (and related objects) as a file
     • Import this file into a different PGDB
     • Can be used for submitting pathways to MetaCyc. See http://metacyc.org/MetaCycPosting.shtml
     • Visit the page of a pathway (or object), right-click, and choose Edit->Add Object to File Export List
     • File->Export->Selected Objects to Lisp-Format File
     • File->Import->Frames from Lisp-Format File

  4. Dump PGDB into Flatfiles
     • Export of entire PGDB as Flatfiles
     • Format Description: http://bioinformatics.ai.sri.com/ptools/flatfile-format.html
       • Column delimited: 1 line per frame
       • Attribute-value: 1 record per frame
     • Multiple slot values:
       • Column delimited: several values per column
       • Attribute-value: several lines for several values
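
To make the attribute-value layout concrete (one record per frame, one line per value), here is a minimal parsing sketch. It assumes the following conventions, which should be checked against the format description linked above: each line has the form `ATTRIBUTE - value`, a line containing only `//` ends a record, lines starting with `#` are comments, and a line starting with `/` continues the previous value. The sample data and the function name are hypothetical.

```python
def parse_attribute_value(text):
    """Parse attribute-value text into a list of {attribute: [values]} dicts."""
    records = []
    current = {}
    last_values = None  # value list of the most recently seen attribute
    for line in text.splitlines():
        if line.startswith("#") or not line.strip():
            continue  # assumed comment syntax; blank lines are ignored
        if line.strip() == "//":
            # Assumed record terminator: close the current frame.
            if current:
                records.append(current)
            current = {}
            last_values = None
            continue
        if line.startswith("/") and last_values:
            # Assumed continuation line: append to the previous value.
            last_values[-1] += "\n" + line[1:]
            continue
        attribute, separator, value = line.partition(" - ")
        if not separator:
            continue  # not an 'ATTRIBUTE - value' line; skipped in this sketch
        # Repeated attributes accumulate: several lines for several values.
        last_values = current.setdefault(attribute, [])
        last_values.append(value)
    if current:
        records.append(current)
    return records


# Hypothetical sample, not taken from a real PGDB.
SAMPLE = """# hypothetical sample
UNIQUE-ID - GENE-1
COMMON-NAME - abcA
SYNONYMS - abc1
SYNONYMS - abc2
//
UNIQUE-ID - GENE-2
COMMON-NAME - xyzB
//
"""

if __name__ == "__main__":
    for record in parse_attribute_value(SAMPLE):
        print(record["UNIQUE-ID"][0], record.get("SYNONYMS", []))
```

Storing every attribute as a list keeps single-valued and multi-valued slots uniform, at the cost of an extra `[0]` when only one value is expected.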

  5. Frame Import/Export
     • Import/Export of Selected Frames, for Spreadsheets
     • Allows external editing of frames, and also frame creation
     • Detailed Description: UG section 5.6
     • Export: GUI for Frame selection, Slot selection
       • Slots depend on selected class
       • Caveat: value annotations in slots are lost!
       • Either direct instances or all instances under a class can be exported
     • Import: Many choices for merging or replacing data values
     • File Format Choices like the Flatfiles:
       • Column delimited: 1 line per frame
       • Attribute-value: 1 record per frame
     • Multiple slot values:
       • Column delimited: several values per column
       • Attribute-value: several lines for several values
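
The column-delimited layout (one line per frame) can be read back with a few lines of Python. This sketch rests on assumptions that are not stated in the slides: the file is tab-delimited, the first non-comment line is a header of slot names, and a multi-valued slot is represented by repeating its column name. The sample data and function name are hypothetical; UG section 5.6 is the place to confirm the actual layout.

```python
import csv


def parse_column_delimited(text):
    """Parse tab-delimited text into a list of {slot: [values]} dicts."""
    # Drop blank lines and (assumed) '#' comment lines before parsing.
    lines = [ln for ln in text.splitlines() if ln.strip() and not ln.startswith("#")]
    reader = csv.reader(lines, delimiter="\t", quoting=csv.QUOTE_NONE)
    header = next(reader, None)
    if header is None:
        return []
    frames = []
    for row in reader:
        frame = {}
        for slot, value in zip(header, row):
            if value:
                # Repeated column names collect into one multi-valued slot.
                frame.setdefault(slot, []).append(value)
        frames.append(frame)
    return frames


# Hypothetical sample, not taken from a real PGDB.
SAMPLE = "\n".join([
    "# hypothetical sample",
    "\t".join(["UNIQUE-ID", "NAME", "SYNONYMS", "SYNONYMS"]),
    "\t".join(["GENE-1", "abcA", "abc1", "abc2"]),
    "\t".join(["GENE-2", "xyzB", "", ""]),
])

if __name__ == "__main__":
    for frame in parse_column_delimited(SAMPLE):
        print(frame)
```

Empty cells are skipped, so a frame without synonyms simply has no `SYNONYMS` key rather than a list of empty strings.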

  6. Misc.
     • Export of a replicon as a GenBank file
       • PathoLogic is the inverse, “Import”
       • But: information loss, e.g. gene product comments have no feature qualifier in GenBank
     • Importing protein features from UniProt
       • Connection to MySQL BioWarehouse needed
       • See UG section 5.8
     • Importing Citations from PubMed
