Tengcha – generic middleware for retrieving data from Chado Justin Reese GMOD Meeting April 5, 2012
Summary
• Tengcha is a plug-in to the Trellis framework that allows data to be read from Chado
• Written in Java
• Tengcha is used by WebApollo to read data from our Chado databases to help people manually annotate
• Tengcha can be used as generic middleware to:
  • read data from Chado databases
  • output it as DAS or JBrowse-style JSON
• Source code lives on Google Code: https://genomancer.googlecode.com/svn/trunk
Reading data into WebApollo
[Diagram: WebApollo sends a DAS HTTP request with a format modifier to the Trellis framework servlet. Plug-ins shown: Poka plug-in (UCSC MySQL database) and Ivy plug-in (Ensembl), each with "DAS request to SQL" and "DB response to DAS" steps feeding a DAS data model. A "model to JBrowse JSON" step returns JBrowse-flavored JSON (NClists). BAM alignment files go through a separate "BAM to JBrowse JSON NClist" step.]
Problem: lots of data in Chado databases
• Much of our (and others’) data lives in Chado databases:
  • protein alignments
  • gene calls
  • RNAseq data/expression data
  • etc.
• We could convert the data to JSON and let JBrowse handle it, but it would be easier to pull it directly from the Chado database
Reading data into WebApollo
[Diagram: the same Trellis framework servlet, now with a Tengcha plug-in added alongside the Poka plug-in (UCSC MySQL database) and the Ivy plug-in (Ensembl). The Tengcha plug-in box carries the labels Chado, GBOL, "DAS request to SQL" and "DB response to DAS". As before, WebApollo sends a DAS HTTP request with a format modifier, the plug-ins feed a DAS data model, a "model to JBrowse JSON" step returns JBrowse-flavored JSON (NClists), and BAM alignment files go through a separate "BAM to JBrowse JSON NClist" step.]
Tengcha
• Tengcha is a Java-based plug-in to the Trellis framework
• Trellis can read data from many places:
  • UCSC (via Poka plug-in)
  • DAS servers (via Ivy plug-in)
  • previously no plug-in to read data from Chado
• Trellis can output data in a few formats:
  • DAS2
  • JSON (JBrowse-flavored JSON)
  • Possibly DAS1 in the future?
• Design goals:
  • should read data from all standard Chado databases (not just our Chado databases) with data loaded using the GMOD bulk loader, with very minimal configuration
  • should be easily configurable to read data from non-standard Chado databases
  • should be reasonably fast (Chado is normalized, can be slow…)
  • should be thoroughly unit-tested
Configuring Tengcha
• Configurable items:
  • how to connect to Chado (database host, id/pw, port): genomancer/tengcha/src/hibernate_cfg.xml
  • cv and cvterm of reference sequence features (default: scaffold): genomancer/tengcha/Config.java
  • cvterm for parent/child relationships in featurerelationshipcvterms (default: part_of, derived_from): genomancer/tengcha/Config.java
• Configuration for non-standard Chado:
  • edit Hibernate XML mappings for Chado tables,
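As a rough illustration of the connection settings listed above, the sketch below collects host, port and id/pw into standard Hibernate connection properties. It assumes Chado runs on PostgreSQL (the usual case, but not stated on the slide). The class and method names are hypothetical and do not reflect what Tengcha's hibernate_cfg.xml actually contains.

```java
import java.util.Properties;

/** Hypothetical sketch of Chado connection settings; not Tengcha's own code. */
final class ChadoConnectionSketch {

    /**
     * Builds the standard Hibernate connection properties for a Chado database.
     * Assumption: the database is PostgreSQL, so the PostgreSQL driver and
     * dialect class names are used.
     */
    static Properties hibernateProperties(String host, int port, String database,
                                          String user, String password) {
        Properties props = new Properties();
        props.setProperty("hibernate.connection.driver_class", "org.postgresql.Driver");
        props.setProperty("hibernate.connection.url",
                "jdbc:postgresql://" + host + ":" + port + "/" + database);
        props.setProperty("hibernate.connection.username", user);
        props.setProperty("hibernate.connection.password", password);
        props.setProperty("hibernate.dialect", "org.hibernate.dialect.PostgreSQLDialect");
        return props;
    }

    public static void main(String[] args) {
        // Placeholder values; substitute your own host, port, database and credentials.
        Properties props = hibernateProperties("localhost", 5432, "chado", "reader", "secret");
        System.out.println(props.getProperty("hibernate.connection.url"));
    }
}
```

The same property names can be written as `<property name="...">` elements in a Hibernate XML configuration file.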
Tengcha as a generic tool for reading from Chado
• Easy interoperability between Chado and anything that speaks DAS
• Output Chado features in:
  • DAS (XML)
  • Nested-containment lists (JSON)
• Caching of painful reads (highly configurable caching through Hibernate)
• Java-based, if you like that sort of thing
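To make the nested-containment list idea concrete, here is a minimal sketch of how features can be nested by containment before being written out as JSON. It is an illustration only: the class names are hypothetical, the JSON layout is simplified rather than the exact array layout JBrowse expects, and it is not taken from the Tengcha source.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical sketch of nested-containment list construction; not Tengcha's code. */
final class NcListSketch {

    /** A feature interval [start, end) plus the intervals fully contained in it. */
    static final class Node {
        final int start;
        final int end;
        final List<Node> children = new ArrayList<>();

        Node(int start, int end) {
            this.start = start;
            this.end = end;
        }

        boolean contains(Node other) {
            return start <= other.start && other.end <= end;
        }
    }

    /** Builds the top-level list; contained intervals are nested under a container. */
    static List<Node> build(List<int[]> intervals) {
        List<Node> nodes = new ArrayList<>();
        for (int[] iv : intervals) {
            nodes.add(new Node(iv[0], iv[1]));
        }
        // Sort by start ascending, then end descending, so containers come first.
        nodes.sort((a, b) -> a.start != b.start
                ? Integer.compare(a.start, b.start)
                : Integer.compare(b.end, a.end));

        List<Node> top = new ArrayList<>();
        List<Node> open = new ArrayList<>(); // chain of containers still "open"
        for (Node n : nodes) {
            // Close containers that do not contain the current interval.
            while (!open.isEmpty() && !open.get(open.size() - 1).contains(n)) {
                open.remove(open.size() - 1);
            }
            if (open.isEmpty()) {
                top.add(n);
            } else {
                open.get(open.size() - 1).children.add(n);
            }
            open.add(n);
        }
        return top;
    }

    /** Renders [start,end] or [start,end,[children...]] arrays (simplified layout). */
    static String toJson(List<Node> list) {
        StringBuilder sb = new StringBuilder("[");
        for (int i = 0; i < list.size(); i++) {
            Node n = list.get(i);
            if (i > 0) {
                sb.append(',');
            }
            sb.append('[').append(n.start).append(',').append(n.end);
            if (!n.children.isEmpty()) {
                sb.append(',').append(toJson(n.children));
            }
            sb.append(']');
        }
        return sb.append(']').toString();
    }

    public static void main(String[] args) {
        List<int[]> features = new ArrayList<>();
        features.add(new int[] {100, 500});
        features.add(new int[] {120, 200});
        features.add(new int[] {150, 180});
        features.add(new int[] {600, 900});
        features.add(new int[] {300, 400});
        System.out.println(toJson(build(features)));
    }
}
```

For the five sample intervals in `main`, this prints `[[100,500,[[120,200,[[150,180]]],[300,400]]],[600,900]]`. No list at any level holds an interval that contains one of its siblings, which is the property that lets a range query skip whole sublists.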
For the Chado mavens
• Relevant tables:
  • feature
  • featureloc
  • feature_relationship
  • analysis
  • analysisfeature
  • cv
  • cvterm
• If you haven’t altered these, your non-standard Chados should work out of the box…
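To show how some of these tables fit together, the sketch below fetches features that overlap a region of a reference sequence with plain JDBC. It assumes the standard Chado column names (`featureloc.srcfeature_id`, `fmin`, `fmax`, `feature.type_id`, and so on). Tengcha itself goes through Hibernate mappings, so this is not its actual query, and the class and method names are hypothetical.

```java
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.util.ArrayList;
import java.util.List;

/** Hypothetical sketch of a Chado range query; not Tengcha's code. */
final class ChadoRangeQuerySketch {

    /** Features located on a named reference sequence that overlap a query range. */
    static final String FEATURES_IN_RANGE =
            "SELECT f.feature_id, f.uniquename, t.name AS type, fl.fmin, fl.fmax, fl.strand "
          + "FROM feature f "
          + "JOIN featureloc fl ON fl.feature_id = f.feature_id "
          + "JOIN feature src ON src.feature_id = fl.srcfeature_id "
          + "JOIN cvterm t ON t.cvterm_id = f.type_id "
          + "WHERE src.uniquename = ? AND fl.fmin < ? AND fl.fmax > ? "
          + "ORDER BY fl.fmin";

    /** Same overlap test as the WHERE clause; Chado coordinates are interbase (half-open). */
    static boolean overlaps(int fmin, int fmax, int queryMin, int queryMax) {
        return fmin < queryMax && fmax > queryMin;
    }

    /** Runs the query and returns the uniquenames of overlapping features. */
    static List<String> fetchUniquenames(Connection conn, String refName,
                                         int queryMin, int queryMax) throws SQLException {
        List<String> names = new ArrayList<>();
        try (PreparedStatement ps = conn.prepareStatement(FEATURES_IN_RANGE)) {
            ps.setString(1, refName);
            ps.setInt(2, queryMax); // bound for fl.fmin < ?
            ps.setInt(3, queryMin); // bound for fl.fmax > ?
            try (ResultSet rs = ps.executeQuery()) {
                while (rs.next()) {
                    names.add(rs.getString("uniquename"));
                }
            }
        }
        return names;
    }

    public static void main(String[] args) {
        // Prints the SQL only; running it needs a live Chado connection.
        System.out.println(FEATURES_IN_RANGE);
    }
}
```

Parent/child structure (for example exons that are part_of a transcript) would need an extra join through feature_relationship, which is omitted here to keep the sketch short.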
Live demo
• Source code lives on Google Code: https://genomancer.googlecode.com/svn/trunk
• We’d be glad to help you hook it up to your Chado
Thank you
• LBNL
  • Ed Lee
  • Gregg Helt
  • Nomi Harris
  • Suzanna Lewis
• UC Berkeley
  • Mitch Skinner
  • Rob Buels
  • Ian Holmes
• Georgetown University
  • Chris Childers
  • Justin Reese
  • Mónica Muñoz-Torres
  • Jay Sundaram
  • Christine Elsik