
XML for Model Specification: Introduction and Workshop

Learn the basics of XML and its applications in the field of neuroscience. Discover the benefits and potential liabilities of using XML schema for data validation and communication. Explore relevant XML applications such as NeuroML, BrainML, SBML, and CellML.

cporter

Presentation Transcript


  1. XML for Model Specification: Introduction and Workshop

  2. XML for Model Specification: An Introduction and Workshop • An Introduction to XML in the Neurosciences (Sharon Crook, Arizona State University) • An Introduction to NeuroML (Fred Howell, University of Edinburgh) • NeuroML for Model Specification in ChannelDB and GENESIS (Dave Beeman, University of Colorado, Boulder) • MorphML: An XML Application for Neuronal Morphology Data (Sharon Crook, Arizona State University) • Building 3-D Network Models with neuroConstruct (Padraig Gleeson, University College London) • Discussion: Current Issues and Future Development

  3. Introduction to XML in the Neurosciences • What is an eXtensible Markup Language (XML) application? • Portable format for computer documents. • Data are surrounded by text descriptions called tags. • Due to the self-describing representation, programs can parse the data easily. • Tags are ordinary text and should be clear, concise, and make sense to humans. • Language elements that provide the structure make up an XML schema. • Each language element may also be equivalent to an object class.

  4. <!-- Segment: mainDend2, ID: 2-->
     <segment>
       <id>2</id>
       <proximal>4</proximal>
       <distal>5</distal>
       <parent>0</parent>
     </segment>
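To illustrate the earlier point that self-describing data is easy for programs to parse, here is a minimal sketch that reads the segment above with the DOM parser that ships with Java. The class name `SegmentReader` and the choice to return a tag-to-integer map are assumptions made for this example; they are not part of MorphML or the NeuroML kit.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;

// Hypothetical helper for this example only.
class SegmentReader {
    // The segment from the slide, held as a string for the demo.
    static final String SAMPLE =
        "<segment><id>2</id><proximal>4</proximal>"
      + "<distal>5</distal><parent>0</parent></segment>";

    /** Maps each child tag of the root element to its integer value. */
    static Map<String, Integer> parse(String xml) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
            Map<String, Integer> fields = new LinkedHashMap<>();
            NodeList children = doc.getDocumentElement().getChildNodes();
            for (int i = 0; i < children.getLength(); i++) {
                Node n = children.item(i);
                if (n instanceof Element) { // skip whitespace and comments
                    fields.put(n.getNodeName(),
                               Integer.parseInt(n.getTextContent().trim()));
                }
            }
            return fields;
        } catch (Exception e) {
            // Malformed XML or a non-integer value ends up here.
            throw new IllegalArgumentException("cannot read segment", e);
        }
    }

    public static void main(String[] args) {
        System.out.println(parse(SAMPLE));
    }
}
```

The tag names carry the meaning, so the reader needs no knowledge of column positions or field order.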

  5. Additional Potential Benefits of XML Schema: • Validate documents/data. • Generate instructions for creating database tables for data element storage and access. • Easily generate data structures and code for reading and writing valid XML documents. • Facilitates communication and collaboration!!!
     Potential Liabilities of XML: • May not be clear, concise, easy to read. • Most of the advantages of XML can be accomplished with good discourse, good design, and good documentation. • Extremely verbose so performance will suffer. • Private data accessed more easily.
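The "validate documents/data" benefit can be sketched with the `javax.xml.validation` API in the Java standard library. The small schema below is an assumption written for this example to match the segment element shown earlier; it is not the MorphML schema, and `SegmentValidator` is a hypothetical class name.

```java
import java.io.IOException;
import java.io.StringReader;
import javax.xml.XMLConstants;
import javax.xml.transform.stream.StreamSource;
import javax.xml.validation.Schema;
import javax.xml.validation.SchemaFactory;
import org.xml.sax.SAXException;

// Hypothetical helper for this example only.
class SegmentValidator {
    // Illustrative schema (an assumption, not MorphML): three required
    // integer children and an optional parent, in a fixed order.
    static final String XSD =
        "<xs:schema xmlns:xs='http://www.w3.org/2001/XMLSchema'>"
      + "<xs:element name='segment'><xs:complexType><xs:sequence>"
      + "<xs:element name='id' type='xs:nonNegativeInteger'/>"
      + "<xs:element name='proximal' type='xs:nonNegativeInteger'/>"
      + "<xs:element name='distal' type='xs:nonNegativeInteger'/>"
      + "<xs:element name='parent' type='xs:nonNegativeInteger' minOccurs='0'/>"
      + "</xs:sequence></xs:complexType></xs:element>"
      + "</xs:schema>";

    /** Returns true when the document satisfies the schema above. */
    static boolean isValid(String xml) {
        try {
            Schema schema = SchemaFactory
                .newInstance(XMLConstants.W3C_XML_SCHEMA_NS_URI)
                .newSchema(new StreamSource(new StringReader(XSD)));
            schema.newValidator().validate(new StreamSource(new StringReader(xml)));
            return true;
        } catch (SAXException e) {
            return false; // malformed XML or a schema rule was broken
        } catch (IOException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        String good = "<segment><id>2</id><proximal>4</proximal>"
                    + "<distal>5</distal><parent>0</parent></segment>";
        String bad = "<segment><id>two</id><proximal>4</proximal>"
                   + "<distal>5</distal></segment>";
        System.out.println(isValid(good) + " " + isValid(bad));
    }
}
```

A reader or simulator can run this kind of check before touching the data, so errors surface as schema violations rather than as failures deep inside a simulation.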

  6. Additional Benefits of XML: • Neuroinformatics infrastructure (NeuroML Schema and NeuroML Development Kit, MorphML, BrainML, CellML, SBML) • Commercial/free development software available • Schema development, validation, documentation and more: Altova XMLSpy (http://www.altova.com) • XML schema to Java object classes and more: Java Architecture for XML Binding, JAXB (http://java.sun.com/xml/jaxb)

  7. Relevant XML Applications: • BrainML (http://www.brainml.org), Laboratory of Neuroinformatics, Weill Medical College of Cornell University. Examples: time series data, spike trains, experimental protocols, recording sites, bibliographic citations, taxonomy, vital statistics of subject, training statistics, some attributes of neurons for inheritance • SBML (http://www.sbml.org), Systems Biology Markup Language for modeling biochemical reaction networks. Examples: metabolic networks, cell-signaling pathways, regulatory networks • CellML (http://www.cellml.org), Bioengineering Institute, University of Auckland. Examples: models of cellular and subcellular processes such as calcium dynamics, metabolic pathways, signal transduction • MathML (http://www.w3.org/Math), W3C World Wide Web Consortium. Examples: mathematical notation with structure and content; serve, receive, and process mathematics on the web

  8. An introduction to NeuroML Fred Howell Adaptive and neural computation Informatics University of Edinburgh fwh@inf.ed.ac.uk

  9. Overview • What are the problems? • Object models, data binding and NeuroML • The next steps?

  10. Scratch pad • Ideas for new slides. Collins et al, J Biol Chem 280:7 2005

  11. Orig version by Ding Fan

  12. Models we’d like to build…

  13. Aim • Move model specifications from programs to a declarative XML format.

  14. Why XML? • Language independent way to store complex structured information. • Huge industry momentum. • Not a programming language – so encourages declarative specifications. • Possible to transform from one format to another – whereas programs have to be recoded by hand
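The claim that XML can be transformed from one format to another without hand recoding can be sketched with XSLT, using the `javax.xml.transform` API in the Java standard library. The stylesheet, the plain-text target format, and the class name `SegmentTransform` are all assumptions for illustration; no simulator's real input format is implied.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

// Hypothetical helper for this example only.
class SegmentTransform {
    // Illustrative stylesheet: turns a <segment> element into one line of text.
    static final String XSLT =
        "<xsl:stylesheet version='1.0' "
      + "xmlns:xsl='http://www.w3.org/1999/XSL/Transform'>"
      + "<xsl:output method='text'/>"
      + "<xsl:template match='/segment'>"
      + "<xsl:text>segment </xsl:text><xsl:value-of select='id'/>"
      + "<xsl:text> parent </xsl:text><xsl:value-of select='parent'/>"
      + "</xsl:template>"
      + "</xsl:stylesheet>";

    /** Applies the stylesheet above and returns the text result. */
    static String toText(String xml) {
        try {
            Transformer t = TransformerFactory.newInstance()
                .newTransformer(new StreamSource(new StringReader(XSLT)));
            StringWriter out = new StringWriter();
            t.transform(new StreamSource(new StringReader(xml)), new StreamResult(out));
            return out.toString();
        } catch (TransformerException e) {
            throw new IllegalArgumentException("transform failed", e);
        }
    }

    public static void main(String[] args) {
        System.out.println(toText(
            "<segment><id>2</id><proximal>4</proximal>"
          + "<distal>5</distal><parent>0</parent></segment>"));
    }
}
```

Supporting another target format means writing another stylesheet; the model document itself stays unchanged.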

  15. Why not XML? • Cumbersome to edit by hand • Large text files, need to be compressed • Harder to parse than ad hoc text formats • Not suitable for binary data

  16. [Diagram labels] Scripts • Parameter search • Simulation engine • Results + visualisations • Custom extensions of simulator

  17. [Diagram labels] Declarative model spec (in XML) • Simulation engine • Results + visualisations

  18. How would one publish a model? • Put XML model spec on your website, + links to code to run it. • Plus links back to any experimental data used to derive parameters / validate results. • See Robert Gentleman's campaign for “reproducible research” • (and also ModelDB)

  19. Why is this hard? • Lots of levels of scale and detail of models (from protein interactions to large scale networks of neurons) • Different simulators have different and changing capabilities – which creates a moving target for attempts to build any standards

  20. Union or intersection? • Should a model exchange format restrict itself to a standard subset of possible models, or cope with any possible model?

  21. What is “NeuroML”? • A way to map from object trees to XML • ... with a java development kit • ... and some suggestions for sample schemas • channel, cell and network levels • Emphasis on making it easy to define any object model and serialise it • ... create generic tools which work with any object model • ... and encourage developers to agree on common object models where it makes sense

  22. Other XML Languages • SBML : a standard for intracellular pathway models • CellML • MathML

  23. Practicalities • “I'm writing a simulator and I'd like to get the models into NeuroML – what do I do?” • (1) Separate out the declarative aspects of the model spec • (2) Serialise the model into XML, using the NeuroML development kit (in Java) or your own code • (3) If any other developers are creating similar models, see if you can agree on a common set of classes to describe the models by hand

  24. The NeuroML development kit • A “data binding” kit • Start with class definitions • Utilities to read / write model definition as readable XML

  25. Data binding • A technique for serialising data in objects as XML • Your program can read in an XML document using:
        Object o = XMLIn("file.xml");
      • And write one using:
        Object o = new MyComplexStructure();
        XMLOut(o, "file.xml");
      The XML tags can correspond to fields in the class:
        class MyComplexStructure {
            int position;
            String sequence;
            int pubmedID;
        }

        <neuroml class="MyComplexStructure">
          <position>1000</position>
          <sequence>ACGGTTCAG</sequence>
          <pubmedID>4321652</pubmedID>
        </neuroml>
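The mapping from class fields to tags can be sketched with Java reflection. This is not the NeuroML development kit's `XMLOut`; the method `toXml` and the class `BindingSketch` are hypothetical, the fields are made public so that `getFields()` finds them, and character escaping is left out for brevity.

```java
import java.lang.reflect.Field;

// Hypothetical sketch of field-to-tag data binding; not the NeuroML kit.
class BindingSketch {
    /** Same fields as the MyComplexStructure class on the slide. */
    static class MyComplexStructure {
        public int position;
        public String sequence;
        public int pubmedID;
    }

    /**
     * Writes every public field of o as a tag named after the field.
     * Assumes values contain no '<' or '&' (no escaping is done here).
     */
    static String toXml(Object o) {
        StringBuilder sb = new StringBuilder();
        sb.append("<neuroml class=\"")
          .append(o.getClass().getSimpleName())
          .append("\">");
        try {
            for (Field f : o.getClass().getFields()) {
                sb.append('<').append(f.getName()).append('>')
                  .append(f.get(o))
                  .append("</").append(f.getName()).append('>');
            }
        } catch (IllegalAccessException e) {
            throw new IllegalStateException(e);
        }
        return sb.append("</neuroml>").toString();
    }

    public static void main(String[] args) {
        MyComplexStructure s = new MyComplexStructure();
        s.position = 1000;
        s.sequence = "ACGGTTCAG";
        s.pubmedID = 4321652;
        System.out.println(toXml(s));
    }
}
```

Because the tag names come from the class definition, adding a field to the class extends the XML format with no change to the serialising code. The Java specification does not promise an order for the fields returned by `getFields()`, so a production binder would sort or annotate them.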

  26. Example network definition:
        package neuroml.model.network;

        import neuroml.core.*;

        public class Network extends Element {
            /** A network has a set of elements - can be populations or individual cells */
            public Set elements = new Set("ElementRef");

            /** A network also defines a set of projections between elements */
            public Set projections = new Set("Projection");
        }

  27. Example class definition:
        public class Grid3DStructure extends PopulationStructure {
            public int xsize=1;
            public int ysize=1;
            public int zsize=1;
        }
      ... and so on for all the classes / parameters of your model. Uses a restricted subset of Java as schema definition language: • int, double, String • Set, Ref, List • classes and inheritance • namespaces

  28. Do code modules / embedded scripts have a place in NeuroML? Useful for quickly coding loops for running simulations, ad hoc connectivity... but perhaps having any code in the model spec defeats the object?

  29. State of play • Simulators adopting own XML formats for serialising model descriptions • Common standards working where the domain is stable (SBML, MorphML)

  30. The next steps? • How much standardisation is useful? • Just XML in any format? • XML with uniform mapping from classes to <tags>? • A set of rigid standards for compartmental neurons, channels, receptors, networks, ...? • What features are needed from a development kit? • C++, python, java?

  31. NeuroML for Model Specification in ChannelDB and GENESIS Dave Beeman University of Colorado, Boulder WAM-BAMM*05

  32. The Problem: One neuronal model --> Many implementations. EXAMPLE: Hodgkin-Huxley K channel • Equations with parameter values describe the model. • Simulator scripts tell the simulator how to implement it. • Differences in simulator design --> NEURON and GENESIS scripts look very different --> Very difficult to convert a script to one for a different simulator.
      The Solution: Establish a standard format for a declarative representation, NOT a simulator-dependent procedural representation.

  33. Hodgkin-Huxley K Channel Model Possible Representations • Represent the equations in a form that can be parsed into Java • Store tabulated values of rate variables • Use parameterized form (A + BV) / (C + D exp((E + V)/F))

  34. The ChannelDB Solution (http://www.modelersworkspace.org/channeldb/ChannelDB.html) • XML representation of a Java Hodgkin-Huxley object with attributes for Gmax, and a set of gates and their exponents • Each gate object has an attribute telling whether it depends on voltage or concentration, and objects for the forward and backward rate parameters • NeuroML development parser (http://www.neuroml.org) converts between XML representation and Java objects • Use simple Java string manipulation commands to produce a simulation script from information in the fields of the DBChannel object • Prototype database and interface creates commented GENESIS scripts from stored XML channel descriptions

  35. NeuroML representation of the Hodgkin-Huxley K channel
        <neuroml class="DBChannel"
                 description="Hodgkin-Huxley squid K channel"
                 author="Dave Beeman"
                 keywords="Hodgkin-Huxley potassium squid delayed rectifier"
                 uniqueID="10262778758662F22@dogstar.colorado.edu"
                 notes="An implementation of the GENESIS K_squid_hh channel"
                 Erest="-0.07V">
          <channels>
            <channel name="K_squid_hh" class="HHChannel" permeantSpecie="K"
                     Erev="0.09V" Gmax="360.0S/m^2" ivlaw="ohmic">
              <gates>
                <gate name="X" class="HHVGate" timeUnit="sec" voltageUnit="V"
                      vmin="-0.1" vmax="0.05" instantCalculation="false"
                      useState="false" power="4">
                  <forwardRate class="ParameterizedHHRate" A="-600.0" B="-10000.0"
                               C="-1.0" D="1.0" E="0.060" F="-0.01"/>
                  <backwardRate class="ParameterizedHHRate" A="125.0" B="0.0"
                                C="0.0" D="1.0" E="0.07" F="-0.08"/>
                </gate>
              </gates>
              <log author="Dave Beeman" date="Jul 9, 2002 11:11:15 PM"
                   literatureReference="A.L. Hodgkin and A.F. Huxley, J. Physiol. (Lond) 117, pp 500-544 (1952)">
                <logEntries> </logEntries>
              </log>
            </channel>
          </channels>
        </neuroml>
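As a worked example of the parameterized form rate = (A + BV) / (C + D exp((E + V)/F)), the sketch below evaluates the forward and backward rates from this channel description at the resting potential given by Erest. The class `ParameterizedRate` is hypothetical and is not the ChannelDB `ParameterizedHHRate` class. The steady-state expression alpha / (alpha + beta) is the standard Hodgkin-Huxley relation rather than something stated on the slide. The sketch does not handle voltages where numerator and denominator both vanish (V = -0.06 for the forward rate).

```java
// Hypothetical class for this example; not the ChannelDB implementation.
class ParameterizedRate {
    final double a, b, c, d, e, f;

    ParameterizedRate(double a, double b, double c,
                      double d, double e, double f) {
        this.a = a; this.b = b; this.c = c;
        this.d = d; this.e = e; this.f = f;
    }

    /** rate = (A + B*V) / (C + D*exp((E + V)/F)), V in volts, rate in 1/sec. */
    double at(double v) {
        return (a + b * v) / (c + d * Math.exp((e + v) / f));
    }

    /** Steady-state gate value from the forward and backward rates. */
    static double steadyState(double alpha, double beta) {
        return alpha / (alpha + beta);
    }

    public static void main(String[] args) {
        // Parameter values copied from the forwardRate and backwardRate tags.
        ParameterizedRate forward =
            new ParameterizedRate(-600.0, -10000.0, -1.0, 1.0, 0.060, -0.01);
        ParameterizedRate backward =
            new ParameterizedRate(125.0, 0.0, 0.0, 1.0, 0.07, -0.08);

        double v = -0.07; // Erest from the channel description, in volts
        double alpha = forward.at(v);
        double beta = backward.at(v);
        System.out.printf("alpha=%.3f beta=%.3f steady=%.4f%n",
                          alpha, beta, steadyState(alpha, beta));
    }
}
```

At rest the forward rate reduces to 100 / (e - 1), about 58.2 per second, and the backward rate to 125 per second, giving a steady-state gate value near 0.318.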

  36. Some classes defined for ChannelDB • DBChannel: Wrapper class that is used to contain any channel model that is stored in ChannelDB, along with some descriptive information. • HHChannel: Class used for all the Hodgkin-Huxley type channels in the database. • HHVGate: Used as a member of the gates set of a HHChannel. It contains forward and backward rate objects that depend on voltage, as well as some additional fields to describe the gate. • HHCGate: An ionic concentration-dependent gate, analogous to the voltage-dependent HHVGate. It provides an additional field for a reference to the object that provides the source of the ionic concentration. • HHRate: The superclass for the specialized forms for the rate variables. • ParameterizedHHRate: A subclass of HHRate that expresses rate variables in a parameterized form typical of many Hodgkin-Huxley type rate equations, "rate = (A + BV) / (C + D exp((E + V)/F))". • EquationHHRate: A subclass of HHRate that expresses the rate variables as equations. • TabulatedHHRate: A subclass of HHRate that allows a gate's forwardRate or backwardRate to be specified by a table at equally spaced voltage (or concentration) points. • ConcenPool: Describes a single shell model for a concentration pool, with a buildup of concentration proportional to an incoming current and a time constant for decay. The object providing the source of concentration to a HHCGate is typically formed from this class. The source of currents is provided by a set of objects of class CurrentSource. • CurrentSource: Used by ionic concentration pools to provide information about the object that provides an ionic current.

  37. Unfinished Business and Open Questions • Extend NeuroML to provide representations for more detailed multi-shell models of calcium diffusion. • Implement a more sophisticated representation of literature references than the simple string that is currently used in the NeuroML software. (We have proposed a schema for the Modeler's Workspace based on BibTeX.) • Software to convert ChannelDB descriptions to NEURON and other simulators. • Implement the HHCVGate, a two-dimensional gate depending on both voltage and concentration. (Note that the Traub Ca-dependent K channel model uses a form that can be expressed as a product of a HHVGate and a HHCGate.) • Implement Borg-Graham or Lytton-Sejnowski temperature-dependent channel models with the NeuroML ThermodynamicHHVGate. • Is there a better way for a concentration-dependent channel model to reference the models that provide the source of ionic currents and concentrations? • How much standardization should there be for the format and the names of the independent variables and parameters in equation representations?

  38. Unbundling GENESIS

  39. GENESIS 3 Core – Based on MOOSE • MOOSE: the Messaging Object Oriented Simulation Environment, a reimplementation of GENESIS base code in C++ by U. S. Bhalla, NCBS, Bangalore.
      Provides: • Improved messaging between GENESIS objects • Faster, smaller, cleaner implementation • Portable to MS Windows and non-UNIX platforms • Improved equation solvers • Allows multiple parsers and interfaces
      GENESIS 3 will add: • Graphical interface • XML representation of models • Backwards compatibility with GENESIS 2 • Tutorials and educational applications

  40. Planned GENESIS 3 Interfaces

  41. WAM-BAMM*05 An XML Application for Neuronal Morphology Data http://www.morphml.org Sharon Crook Arizona State University Department of Mathematics and Statistics School of Life Sciences

  42. MorphML XMLSpy Documentation

  43. MorphML XMLSpy Documentation

  44. MorphML: A Simple Example from neuroConstruct
        <?xml version="1.0" encoding="UTF-8"?>
        <n:morphml xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
                   xmlns:n="http://morphml.org/morphml/schema/1.0.0"
                   xsi:schemaLocation="http://morphml.org/morphml/schema/1.0.0 http://math.la.asu.edu/~crook/morphml/MorphML.xsd">
          <n:name>SimpleCell</n:name>
          <n:notes>A Simple cell for testing purposes</n:notes>
          <n:lengthUnits>Micrometers</n:lengthUnits>
          <!--Converting cell: SimpleCell-->
          <n:points>
            <!-- Start point of segment: Soma, ID: 0-->
            <n:point> <n:id>0</n:id> <n:x>0.0</n:x> <n:y>0.0</n:y> <n:z>0.0</n:z> <n:diameter>16</n:diameter> </n:point>
            <!-- End point of segment: Soma, ID: 0-->
            <n:point> <n:id>1</n:id> <n:x>0.0</n:x> <n:y>0.0</n:y> <n:z>0</n:z> <n:diameter>16</n:diameter> </n:point>
            <!-- Start point of segment: mainDend1, ID: 1-->
            <n:point> <n:id>2</n:id> <n:x>0.0</n:x> <n:y>0.0</n:y> <n:z>0.0</n:z> <n:diameter>2</n:diameter> </n:point>

  45. MorphML: A Simple Example from neuroConstruct
            <!-- End point of segment: mainDend1, ID: 1-->
            <n:point> <n:id>3</n:id> <n:x>-10.0</n:x> <n:y>-30.0</n:y> <n:z>0</n:z> <n:diameter>2</n:diameter> </n:point>
            <!-- Start point of segment: mainDend2, ID: 2-->
            <n:point> <n:id>4</n:id> <n:x>0.0</n:x> <n:y>0.0</n:y> <n:z>0.0</n:z> <n:diameter>2</n:diameter> </n:point>
            <!-- End point of segment: mainDend2, ID: 2-->
            <n:point> <n:id>5</n:id> <n:x>10.0</n:x> <n:y>-30.0</n:y> <n:z>0</n:z> <n:diameter>2</n:diameter> </n:point>
            <!-- Start point of segment: mainAxon, ID: 3-->
            <n:point> <n:id>6</n:id> <n:x>0.0</n:x> <n:y>0.0</n:y> <n:z>0.0</n:z> <n:diameter>2</n:diameter> </n:point>

  46. MorphML: A Simple Example from neuroConstruct
          <n:cells>
            <n:cell>
              <n:name>SimpleCell</n:name>
              <!-- Segments of the cell -->
              <n:segments>
                <!-- Segment: Soma, ID: 0-->
                <n:segment> <n:id>0</n:id> <n:proximal>0</n:proximal> <n:distal>0</n:distal> </n:segment>
                <!-- Segment: mainDend1, ID: 1-->
                <n:segment> <n:id>1</n:id> <n:proximal>2</n:proximal> <n:distal>3</n:distal> <n:parent>0</n:parent> </n:segment>
                <!-- Segment: mainDend2, ID: 2-->
                <n:segment> <n:id>2</n:id> <n:proximal>4</n:proximal> <n:distal>5</n:distal> <n:parent>0</n:parent> </n:segment>
                <!-- Segment: mainAxon, ID: 3-->
                <n:segment> <n:id>3</n:id> <n:proximal>6</n:proximal> <n:distal>7</n:distal> <n:parent>0</n:parent> </n:segment>

  47. MorphML: A Simple Example from neuroConstruct
                <!-- Segment: subAxon1, ID: 4-->
                <n:segment> <n:id>4</n:id> <n:proximal>8</n:proximal> <n:distal>9</n:distal> <n:parent>3</n:parent> </n:segment>
                <!-- Segment: subAxon2, ID: 5-->
                <n:segment> <n:id>5</n:id> <n:proximal>10</n:proximal> <n:distal>11</n:distal> <n:parent>3</n:parent> </n:segment>
              </n:segments>
            </n:cell>
          </n:cells>
        </n:morphml>
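The parent references in the segment list are enough to rebuild the branching structure of the cell. The sketch below does this for the six segments of SimpleCell, with the id and parent values copied from the slides. The class `SegmentTree`, the pair-array input, and the use of -1 for "no parent" (the soma has no parent tag) are assumptions for this example.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical helper for this example only.
class SegmentTree {
    // {id, parent} pairs from the SimpleCell segments; -1 marks the soma,
    // which has no parent tag in the XML (a convention of this sketch).
    static final int[][] SIMPLE_CELL = {
        {0, -1}, {1, 0}, {2, 0}, {3, 0}, {4, 3}, {5, 3}
    };

    /** Maps each parent id to the ids of the segments attached to it. */
    static Map<Integer, List<Integer>> childrenOf(int[][] idParentPairs) {
        Map<Integer, List<Integer>> children = new TreeMap<>();
        for (int[] pair : idParentPairs) {
            int id = pair[0];
            int parent = pair[1];
            if (parent >= 0) {
                children.computeIfAbsent(parent, k -> new ArrayList<>()).add(id);
            }
        }
        return children;
    }

    public static void main(String[] args) {
        // Soma (0) carries mainDend1, mainDend2 and mainAxon;
        // mainAxon (3) carries subAxon1 and subAxon2.
        System.out.println(childrenOf(SIMPLE_CELL));
    }
}
```

Storing only a parent id per segment keeps the XML flat, and a reader can recover the tree in one pass over the list.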

  48. Virtual Ratbrain (http://www.ratbrain.org) Laszlo Zaborszky, Peter Varsanyi (Center for Molecular and Behavioral Neuroscience, Rutgers); Fred Howell, Nicola McDonnell (Institute of Adaptive and Neural Computation, University of Edinburgh) • Database for peer reviewed 3-D cellular anatomical data of the rat brain • Visualization and analysis tools including analysis of dendritic and axonal morphometry • Data stored in MorphML format

  49. Virtual Ratbrain (http://www.ratbrain.org) MorphML Viewer

  50. Padraig Gleeson University College London p.gleeson@ucl.ac.uk WAM-BAMM*05 31 March 2005 Building 3D Network Models with neuroConstruct (Summary of main presentation)
