XML for Model Specification: An Introduction and Workshop
• An Introduction to XML in the Neurosciences (Sharon Crook, Arizona State University)
• An Introduction to NeuroML (Fred Howell, University of Edinburgh)
• NeuroML for Model Specification in ChannelDB and GENESIS (Dave Beeman, University of Colorado, Boulder)
• MorphML: An XML Application for Neuronal Morphology Data (Sharon Crook, Arizona State University)
• Building 3-D Network Models with neuroConstruct (Padraig Gleeson, University College London)
• Discussion: Current Issues and Future Development
Introduction to XML in the Neurosciences
• What is an eXtensible Markup Language (XML) application?
• Portable format for computer documents.
• Data are surrounded by text descriptions called tags.
• Due to the self-describing representation, programs can parse the data easily.
• Tags are ordinary text and should be clear, concise, and make sense to humans.
• Language elements that provide the structure make up an XML schema.
• Each language element may also be equivalent to an object class.
<!-- Segment: mainDend2, ID: 2-->
<segment>
  <id>2</id>
  <proximal>4</proximal>
  <distal>5</distal>
  <parent>0</parent>
</segment>
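As a rough illustration of how easily a program can parse such a self-describing fragment, the sketch below reads the segment with Java's standard DOM parser. The names SegmentParseSketch, parseSegment and childInt are hypothetical and are not part of any NeuroML or MorphML tool.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;

// Hypothetical helper: parses a single <segment> element into four ints.
class SegmentParseSketch {

    // Read the text of the first child element named `tag` as an int.
    static int childInt(Element parent, String tag) {
        String text = parent.getElementsByTagName(tag).item(0).getTextContent();
        return Integer.parseInt(text.trim());
    }

    // Returns {id, proximal, distal, parent} for a <segment> document.
    static int[] parseSegment(String xml) {
        try {
            DocumentBuilder builder =
                DocumentBuilderFactory.newInstance().newDocumentBuilder();
            Document doc = builder.parse(
                new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
            Element seg = doc.getDocumentElement();
            return new int[] {
                childInt(seg, "id"),
                childInt(seg, "proximal"),
                childInt(seg, "distal"),
                childInt(seg, "parent")
            };
        } catch (Exception e) {
            throw new RuntimeException("could not parse segment", e);
        }
    }

    public static void main(String[] args) {
        String xml = "<segment><id>2</id><proximal>4</proximal>"
                   + "<distal>5</distal><parent>0</parent></segment>";
        int[] s = parseSegment(xml);
        System.out.println("id=" + s[0] + " proximal=" + s[1]
                         + " distal=" + s[2] + " parent=" + s[3]);
    }
}
```

The tag names double as field names, which is what makes the mapping to an object class straightforward.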
Additional Potential Benefits of XML Schema:
• Validate documents/data.
• Generate instructions for creating database tables for data element storage and access.
• Easily generate data structures and code for reading and writing valid XML documents.
• Facilitates communication and collaboration!
Potential Liabilities of XML:
• May not be clear, concise, or easy to read.
• Most of the advantages of XML can be accomplished with good discourse, good design, and good documentation.
• Extremely verbose, so performance will suffer.
• Private data accessed more easily.
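The first benefit listed above, validating documents against a schema, might look roughly like this with the standard javax.xml.validation API. The tiny inline schema is an assumption made up for this sketch (it is not the MorphML schema), and SchemaValidationSketch and isValid are hypothetical names.

```java
import java.io.IOException;
import java.io.StringReader;
import javax.xml.XMLConstants;
import javax.xml.transform.stream.StreamSource;
import javax.xml.validation.Schema;
import javax.xml.validation.SchemaFactory;
import javax.xml.validation.Validator;
import org.xml.sax.SAXException;

class SchemaValidationSketch {

    // Illustrative schema only: a <segment> with integer children,
    // where <parent> is optional (a root segment has none).
    static final String XSD =
        "<xs:schema xmlns:xs=\"http://www.w3.org/2001/XMLSchema\">"
      + "<xs:element name=\"segment\">"
      + "<xs:complexType><xs:sequence>"
      + "<xs:element name=\"id\" type=\"xs:int\"/>"
      + "<xs:element name=\"proximal\" type=\"xs:int\"/>"
      + "<xs:element name=\"distal\" type=\"xs:int\"/>"
      + "<xs:element name=\"parent\" type=\"xs:int\" minOccurs=\"0\"/>"
      + "</xs:sequence></xs:complexType>"
      + "</xs:element></xs:schema>";

    // True if `xml` conforms to the schema above, false otherwise.
    static boolean isValid(String xml) {
        try {
            SchemaFactory factory =
                SchemaFactory.newInstance(XMLConstants.W3C_XML_SCHEMA_NS_URI);
            Schema schema = factory.newSchema(new StreamSource(new StringReader(XSD)));
            Validator validator = schema.newValidator();
            validator.validate(new StreamSource(new StringReader(xml)));
            return true;
        } catch (SAXException e) {
            return false; // validation error (or malformed XML)
        } catch (IOException e) {
            throw new RuntimeException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(isValid(
            "<segment><id>2</id><proximal>4</proximal><distal>5</distal><parent>0</parent></segment>"));
        System.out.println(isValid(
            "<segment><id>two</id><proximal>4</proximal><distal>5</distal></segment>"));
    }
}
```

A non-integer id or a missing required element is rejected before any simulator-specific code sees the data.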
Additional Benefits of XML:
• Neuroinformatics infrastructure (NeuroML Schema and NeuroML Development Kit, MorphML, BrainML, CellML, SBML)
• Commercial/free development software available
  - Schema development, validation, documentation and more: Altova XMLSpy, http://www.altova.com
  - XML schema to Java object classes and more: Java Architecture for XML Binding (JAXB), http://java.sun.com/xml/jaxb
Relevant XML Applications:
• BrainML (http://www.brainml.org), Laboratory of Neuroinformatics, Weill Medical College of Cornell University
  Examples: time series data, spike trains, experimental protocols, recording sites, bibliographic citations, taxonomy, vital statistics of subject, training statistics, some attributes of neurons for inheritance
• SBML (http://www.sbml.org), Systems Biology Markup Language for modeling biochemical reaction networks
  Examples: metabolic networks, cell-signaling pathways, regulatory networks
• CellML (http://www.cellml.org), Bioengineering Institute, University of Auckland
  Examples: models of cellular and subcellular processes such as calcium dynamics, metabolic pathways, signal transduction
• MathML (http://www.w3.org/Math), W3C World Wide Web Consortium
  Examples: mathematical notation with structure and content; serve, receive, and process mathematics on the web
An introduction to NeuroML Fred Howell Adaptive and neural computation Informatics University of Edinburgh fwh@inf.ed.ac.uk
Overview • What are the problems? • Object models, data binding and NeuroML • The next steps?
Scratch pad • Ideas for new slides. Collins et al, J Biol Chem 280:7 2005
Aim • Move model specifications from programs to a declarative XML format.
Why XML? • Language independent way to store complex structured information. • Huge industry momentum. • Not a programming language – so encourages declarative specifications. • Possible to transform from one format to another – whereas programs have to be recoded by hand
Why not XML? • Cumbersome to edit by hand • Large text files, need to be compressed • Harder to parse than ad hoc text formats • Not suitable for binary data
[Diagram labels: Scripts; Parameter search; Simulation engine; Results + visualisations; Custom extensions of simulator]
[Diagram labels: Declarative model spec (in XML); Simulation engine; Results + visualisations]
How would one publish a model? • Put XML model spec on your website, + links to code to run it. • Plus links back to any experimental data used to derive parameters / validate results. • See Robert Gentleman's campaign for “reproducible research” • (and also ModelDB)
Why is this hard? • Lots of levels of scale and detail of models (from protein interactions to large scale networks of neurons) • Different simulators have different and changing capabilities – which creates a moving target for attempts to build any standards
Union or intersection? • Should a model exchange format restrict itself to a standard subset of possible models, or cope with any possible model?
What is “NeuroML”? • A way to map from object trees to XML • ... with a java development kit • ... and some suggestions for sample schemas • channel, cell and network levels • Emphasis on making it easy to define any object model and serialise it • ... create generic tools which work with any object model • ... and encourage developers to agree on common object models where it makes sense
Other XML Languages • SBML : a standard for intracellular pathway models • CellML • MathML
Practicalities • “I'm writing a simulator and I'd like to get the models into NeuroML – what do I do?” • (1) Separate out the declarative aspects of the model spec • (2) Serialise the model into XML, using the NeuroML development kit (in Java) or your own code • (3) If any other developers are creating similar models, see if you can agree on a common set of classes to describe the models by hand
The NeuroML development kit • A “data binding” kit • Start with class definitions • Utilities to read / write model definition as readable XML
Data binding
• A technique for serialising data in objects as XML
• Your program can read in an XML document using:
    Object o = XMLIn("file.xml");
• And write one using:
    Object o = new MyComplexStructure();
    XMLOut(o, "file.xml");
The XML tags can correspond to fields in the class:
    class MyComplexStructure {
        int position;
        String sequence;
        int pubmedID;
    }

    <neuroml class="MyComplexStructure">
      <position>1000</position>
      <sequence>ACGGTTCAG</sequence>
      <pubmedID>4321652</pubmedID>
    </neuroml>
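The slides do not show how XMLIn and XMLOut work internally. One plausible way to get this field-to-tag mapping is Java reflection, sketched below. The methods toXml and fromXml are hypothetical stand-ins that work on strings rather than files; they are not the NeuroML development kit's API, they handle only int and String fields, and they do no XML escaping.

```java
import java.lang.reflect.Field;
import java.lang.reflect.Modifier;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Same fields as the class on the slide.
class MyComplexStructure {
    int position;
    String sequence;
    int pubmedID;
}

// Hypothetical minimal data binding: one XML element per field.
class DataBindingSketch {

    // Skip static and compiler-generated fields.
    static boolean isDataField(Field f) {
        return !f.isSynthetic() && !Modifier.isStatic(f.getModifiers());
    }

    // Write each field as <name>value</name> inside a <neuroml> wrapper.
    static String toXml(Object o) {
        try {
            Class<?> c = o.getClass();
            StringBuilder sb = new StringBuilder();
            sb.append("<neuroml class=\"").append(c.getSimpleName()).append("\">\n");
            for (Field f : c.getDeclaredFields()) {
                if (!isDataField(f)) continue;
                sb.append("  <").append(f.getName()).append(">")
                  .append(f.get(o))
                  .append("</").append(f.getName()).append(">\n");
            }
            sb.append("</neuroml>\n");
            return sb.toString();
        } catch (IllegalAccessException e) {
            throw new RuntimeException(e);
        }
    }

    // Create an instance of c and fill its fields from matching elements.
    static <T> T fromXml(String xml, Class<T> c) {
        try {
            T o = c.getDeclaredConstructor().newInstance();
            for (Field f : c.getDeclaredFields()) {
                if (!isDataField(f)) continue;
                String tag = Pattern.quote(f.getName());
                Matcher m = Pattern.compile("<" + tag + ">(.*?)</" + tag + ">").matcher(xml);
                if (!m.find()) continue; // element absent: keep the default value
                if (f.getType() == int.class) {
                    f.setInt(o, Integer.parseInt(m.group(1).trim()));
                } else if (f.getType() == String.class) {
                    f.set(o, m.group(1));
                }
            }
            return o;
        } catch (ReflectiveOperationException e) {
            throw new RuntimeException(e);
        }
    }

    public static void main(String[] args) {
        MyComplexStructure s = new MyComplexStructure();
        s.position = 1000;
        s.sequence = "ACGGTTCAG";
        s.pubmedID = 4321652;
        String xml = toXml(s);
        System.out.print(xml);
        MyComplexStructure copy = fromXml(xml, MyComplexStructure.class);
        System.out.println(copy.sequence);
    }
}
```

Because the mapping is driven by the class definition, adding a field to the class changes the XML format without touching the reading or writing code.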
Example network definition:
    package neuroml.model.network;

    import neuroml.core.*;

    public class Network extends Element {
        /** A network has a set of elements - can be populations or individual cells */
        public Set elements = new Set("ElementRef");

        /** A network also defines a set of projections between elements */
        public Set projections = new Set("Projection");
    }
Example class definition:
    public class Grid3DStructure extends PopulationStructure {
        public int xsize = 1;
        public int ysize = 1;
        public int zsize = 1;
    }
... and so on for all the classes / parameters of your model.
Uses a restricted subset of Java as schema definition language:
• int, double, String
• Set, Ref, List
• classes and inheritance
• namespaces
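To make the idea of a "restricted subset of Java" concrete, here is a small sketch that uses reflection to flag fields whose types fall outside the subset. It is an assumption about how such a check could be written, not something the development kit is known to provide. It covers only int, double and String, because the kit's Set, Ref and List classes are not available here; SchemaSubsetCheck and NotInSubset are hypothetical names.

```java
import java.lang.reflect.Field;
import java.util.ArrayList;
import java.util.List;

// Stand-in base class so the slide's example compiles on its own.
class PopulationStructure { }

class Grid3DStructure extends PopulationStructure {
    public int xsize = 1;
    public int ysize = 1;
    public int zsize = 1;
}

// Hypothetical counter-example: long is outside the subset.
class NotInSubset {
    public long count;
}

class SchemaSubsetCheck {

    // Names of public fields (including inherited ones) whose type is
    // not int, double or String.
    static List<String> disallowedFields(Class<?> c) {
        List<String> bad = new ArrayList<>();
        for (Field f : c.getFields()) {
            Class<?> t = f.getType();
            boolean allowed = t == int.class || t == double.class || t == String.class;
            if (!allowed) {
                bad.add(f.getName());
            }
        }
        return bad;
    }

    public static void main(String[] args) {
        System.out.println(disallowedFields(Grid3DStructure.class));
        System.out.println(disallowedFields(NotInSubset.class));
    }
}
```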
Do code modules / embedded scripts have a place in NeuroML? Useful for quickly coding loops for running simulations, ad hoc connectivity... but perhaps having any code in the model spec defeats the object?
State of play
• Simulators are adopting their own XML formats for serialising model descriptions
• Common standards are working where the domain is stable (SBML, MorphML)
The next steps? • How much standardisation is useful? • Just XML in any format? • XML with uniform mapping from classes to <tags>? • A set of rigid standards for compartmental neurons, channels, receptors, networks, ...? • What features are needed from a development kit? • C++, python, java?
NeuroML for Model Specification in ChannelDB and GENESIS Dave Beeman University of Colorado, Boulder WAM-BAMM*05
The Problem: One neuronal model --> Many implementations
EXAMPLE: Hodgkin-Huxley K channel
Equations with parameter values describe the model.
Simulator scripts tell the simulator how to implement it.
Differences in simulator design --> NEURON and GENESIS scripts look very different --> Very difficult to convert a script to one for a different simulator
The Solution: Establish a standard format for a declarative representation, NOT a simulator-dependent procedural representation.
Hodgkin-Huxley K Channel Model Possible Representations • Represent the equations in a form that can be parsed into Java • Store tabulated values of rate variables • Use parameterized form (A + BV) / (C + D exp((E + V)/F))
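The parameterized form is simple to evaluate directly. The sketch below is a hypothetical helper, not ChannelDB code. It computes rate = (A + BV) / (C + D exp((E + V)/F)) and, as an example, uses the forward-rate parameters of the squid K channel given elsewhere in this talk (A = -600, B = -10000, C = -1, D = 1, E = 0.060, F = -0.01, in seconds and volts).

```java
// Hypothetical helper for the parameterized Hodgkin-Huxley rate form.
class ParameterizedRateSketch {

    // rate = (A + B*V) / (C + D * exp((E + V) / F))
    // Note: for some parameter sets the expression is 0/0 at one voltage
    // (here V = -0.06); this sketch does not treat that point specially.
    static double rate(double a, double b, double c, double d,
                       double e, double f, double v) {
        return (a + b * v) / (c + d * Math.exp((e + v) / f));
    }

    public static void main(String[] args) {
        double vRest = -0.07; // volts
        double alpha = rate(-600.0, -10000.0, -1.0, 1.0, 0.060, -0.01, vRest);
        // At rest the numerator is about 100 and the denominator about e - 1,
        // so the forward rate is roughly 58 per second.
        System.out.println("forward rate at rest (1/s): " + alpha);
    }
}
```

Six numbers per rate are enough to regenerate the curve in any simulator, which is the appeal of this representation over tabulated values.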
The ChannelDB Solution (http://www.modelersworkspace.org/channeldb/ChannelDB.html)
• XML representation of a Java Hodgkin-Huxley object with attributes for Gmax, and a set of gates and their exponents
• Each gate object has an attribute telling whether it depends on voltage or concentration, and objects for the forward and backward rate parameters
• NeuroML development parser (http://www.neuroml.org) converts between XML representation and Java objects
• Use simple Java string manipulation commands to produce a simulation script from information in the fields of the DBChannel object
• Prototype database and interface creates commented GENESIS scripts from stored XML channel descriptions
NeuroML representation of the Hodgkin-Huxley K channel

<neuroml class="DBChannel"
         description="Hodgkin-Huxley squid K channel"
         author="Dave Beeman"
         keywords="Hodgkin-Huxley potassium squid delayed rectifier"
         uniqueID="10262778758662F22@dogstar.colorado.edu"
         notes="An implementation of the GENESIS K_squid_hh channel"
         Erest="-0.07V">
  <channels>
    <channel name="K_squid_hh" class="HHChannel" permeantSpecie="K"
             Erev="0.09V" Gmax="360.0S/m^2" ivlaw="ohmic">
      <gates>
        <gate name="X" class="HHVGate" timeUnit="sec" voltageUnit="V"
              vmin="-0.1" vmax="0.05" instantCalculation="false"
              useState="false" power="4">
          <forwardRate class="ParameterizedHHRate" A="-600.0" B="-10000.0"
                       C="-1.0" D="1.0" E="0.060" F="-0.01"/>
          <backwardRate class="ParameterizedHHRate" A="125.0" B="0.0"
                        C="0.0" D="1.0" E="0.07" F="-0.08"/>
        </gate>
      </gates>
      <log author="Dave Beeman" date="Jul 9, 2002 11:11:15 PM"
           literatureReference="A.L. Hodgkin and A.F. Huxley, J. Physiol. (Lond) 117, pp 500-544 (1952)">
        <logEntries>
        </logEntries>
      </log>
    </channel>
  </channels>
</neuroml>
Some classes defined for ChannelDB
DBChannel: Wrapper class that is used to contain any channel model that is stored in ChannelDB, along with some descriptive information.
HHChannel: Class used for all the Hodgkin-Huxley type channels in the database.
HHVGate: Used as a member of the gates set of an HHChannel. It contains forward and backward rate objects that depend on voltage, as well as some additional fields to describe the gate.
HHCGate: An ionic concentration-dependent gate, analogous to the voltage-dependent HHVGate. It provides an additional field for a reference to the object that provides the source of the ionic concentration.
HHRate: The superclass for the specialized forms for the rate variables.
ParameterizedHHRate: A subclass of HHRate that expresses rate variables in a parameterized form typical of many Hodgkin-Huxley type rate equations, "rate = (A + BV) / (C + D exp((E + V)/F))"
EquationHHRate: A subclass of HHRate that expresses the rate variables as equations.
TabulatedHHRate: A subclass of HHRate that allows a gate's forwardRate or backwardRate to be specified by a table at equally spaced voltage (or concentration) points.
ConcenPool: Describes a single shell model for a concentration pool, with a buildup of concentration proportional to an incoming current and a time constant for decay. The object providing the source of concentration to an HHCGate is typically formed from this class. The source of currents is provided by a set of objects of class CurrentSource.
CurrentSource: Used by ionic concentration pools to provide information about the object that provides an ionic current.
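For the tabulated form, a lookup over equally spaced voltage points might be sketched as follows. This is not the ChannelDB TabulatedHHRate class. The name TabulatedRateSketch, the linear interpolation between table entries and the clamping outside the table range are all assumptions made for illustration.

```java
// Hypothetical table lookup for a rate sampled at equally spaced voltages.
class TabulatedRateSketch {
    final double vmin;    // voltage of the first table entry
    final double vmax;    // voltage of the last table entry
    final double[] table; // rate values, equally spaced from vmin to vmax

    TabulatedRateSketch(double vmin, double vmax, double[] table) {
        if (table.length < 2 || vmax <= vmin) {
            throw new IllegalArgumentException("need >= 2 points and vmax > vmin");
        }
        this.vmin = vmin;
        this.vmax = vmax;
        this.table = table.clone();
    }

    // Rate at voltage v: linear interpolation inside the table,
    // clamped to the end values outside it (both are assumptions).
    double at(double v) {
        double dv = (vmax - vmin) / (table.length - 1);
        double pos = (v - vmin) / dv;
        if (pos <= 0) return table[0];
        if (pos >= table.length - 1) return table[table.length - 1];
        int i = (int) Math.floor(pos);
        double frac = pos - i;
        return table[i] + frac * (table[i + 1] - table[i]);
    }

    public static void main(String[] args) {
        TabulatedRateSketch r =
            new TabulatedRateSketch(0.0, 2.0, new double[] {0.0, 10.0, 20.0});
        System.out.println(r.at(0.5));
    }
}
```

A table trades compactness for generality: it can hold any rate curve, including ones with no convenient closed form.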
Unfinished Business and Open Questions
• Extend NeuroML to provide representations for more detailed multi-shell models of calcium diffusion.
• Implement a more sophisticated representation of literature references than the simple string that is currently used in the NeuroML software. (We have proposed a schema for the Modeler's Workspace based on BibTeX.)
• Write software to convert ChannelDB descriptions into scripts for NEURON and other simulators.
• Implement the HHCVGate, a two-dimensional gate depending on both voltage and concentration. (Note that the Traub Ca-dependent K channel model uses a form that can be expressed as a product of an HHVGate and an HHCGate.)
• Implement Borg-Graham or Lytton-Sejnowski temperature-dependent channel models with the NeuroML ThermodynamicHHVGate.
• Is there a better way for a concentration-dependent channel model to reference the models that provide the source of ionic currents and concentrations?
• How much standardization should there be for the format and the names of the independent variables and parameters in equation representations?
GENESIS 3 Core – Based on MOOSE
The Messaging Object Oriented Simulation Environment: a reimplementation of GENESIS base code in C++ by U. S. Bhalla, NCBS, Bangalore
Provides:
• Improved Messaging between GENESIS objects
• Faster, smaller, cleaner implementation
• Portable to MS Windows and non-UNIX platforms
• Improved equation solvers
• Allows multiple parsers and interfaces
GENESIS 3 will add:
• Graphical interface
• XML representation of models
• Backwards compatibility with GENESIS 2
• Tutorials and educational applications
An XML Application for Neuronal Morphology Data
http://www.morphml.org
Sharon Crook, Arizona State University, Department of Mathematics and Statistics, School of Life Sciences
MorphML XMLSpy Documentation
MorphML: A Simple Example from neuroConstruct

<?xml version="1.0" encoding="UTF-8"?>
<n:morphml xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
           xmlns:n="http://morphml.org/morphml/schema/1.0.0"
           xsi:schemaLocation="http://morphml.org/morphml/schema/1.0.0 http://math.la.asu.edu/~crook/morphml/MorphML.xsd">
  <n:name>SimpleCell</n:name>
  <n:notes>A Simple cell for testing purposes</n:notes>
  <n:lengthUnits>Micrometers</n:lengthUnits>
  <!--Converting cell: SimpleCell-->
  <n:points>
    <!-- Start point of segment: Soma, ID: 0-->
    <n:point>
      <n:id>0</n:id>
      <n:x>0.0</n:x>
      <n:y>0.0</n:y>
      <n:z>0.0</n:z>
      <n:diameter>16</n:diameter>
    </n:point>
    <!-- End point of segment: Soma, ID: 0-->
    <n:point>
      <n:id>1</n:id>
      <n:x>0.0</n:x>
      <n:y>0.0</n:y>
      <n:z>0</n:z>
      <n:diameter>16</n:diameter>
    </n:point>
    <!-- Start point of segment: mainDend1, ID: 1-->
    <n:point>
      <n:id>2</n:id>
      <n:x>0.0</n:x>
      <n:y>0.0</n:y>
      <n:z>0.0</n:z>
      <n:diameter>2</n:diameter>
    </n:point>
    <!-- End point of segment: mainDend1, ID: 1-->
    <n:point>
      <n:id>3</n:id>
      <n:x>-10.0</n:x>
      <n:y>-30.0</n:y>
      <n:z>0</n:z>
      <n:diameter>2</n:diameter>
    </n:point>
    <!-- Start point of segment: mainDend2, ID: 2-->
    <n:point>
      <n:id>4</n:id>
      <n:x>0.0</n:x>
      <n:y>0.0</n:y>
      <n:z>0.0</n:z>
      <n:diameter>2</n:diameter>
    </n:point>
    <!-- End point of segment: mainDend2, ID: 2-->
    <n:point>
      <n:id>5</n:id>
      <n:x>10.0</n:x>
      <n:y>-30.0</n:y>
      <n:z>0</n:z>
      <n:diameter>2</n:diameter>
    </n:point>
    <!-- Start point of segment: mainAxon, ID: 3-->
    <n:point>
      <n:id>6</n:id>
      <n:x>0.0</n:x>
      <n:y>0.0</n:y>
      <n:z>0.0</n:z>
      <n:diameter>2</n:diameter>
    </n:point>
  <n:cells>
    <n:cell>
      <n:name>SimpleCell</n:name>
      <!-- Segments of the cell -->
      <n:segments>
        <!-- Segment: Soma, ID: 0-->
        <n:segment>
          <n:id>0</n:id>
          <n:proximal>0</n:proximal>
          <n:distal>0</n:distal>
        </n:segment>
        <!-- Segment: mainDend1, ID: 1-->
        <n:segment>
          <n:id>1</n:id>
          <n:proximal>2</n:proximal>
          <n:distal>3</n:distal>
          <n:parent>0</n:parent>
        </n:segment>
        <!-- Segment: mainDend2, ID: 2-->
        <n:segment>
          <n:id>2</n:id>
          <n:proximal>4</n:proximal>
          <n:distal>5</n:distal>
          <n:parent>0</n:parent>
        </n:segment>
        <!-- Segment: mainAxon, ID: 3-->
        <n:segment>
          <n:id>3</n:id>
          <n:proximal>6</n:proximal>
          <n:distal>7</n:distal>
          <n:parent>0</n:parent>
        </n:segment>
        <!-- Segment: subAxon1, ID: 4-->
        <n:segment>
          <n:id>4</n:id>
          <n:proximal>8</n:proximal>
          <n:distal>9</n:distal>
          <n:parent>3</n:parent>
        </n:segment>
        <!-- Segment: subAxon2, ID: 5-->
        <n:segment>
          <n:id>5</n:id>
          <n:proximal>10</n:proximal>
          <n:distal>11</n:distal>
          <n:parent>3</n:parent>
        </n:segment>
      </n:segments>
    </n:cell>
  </n:cells>
</n:morphml>
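Because segments refer to points by ID, a tool that has loaded the file can derive simple morphometrics with very little code. The sketch below is a hypothetical helper, not part of neuroConstruct or MorphML. It computes the length of the mainDend1 and mainDend2 segments from the point coordinates shown in the example (points 2 to 5), entered by hand rather than parsed from the XML.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical morphometry helper: segment length from point IDs.
class MorphologyLengthSketch {

    // Euclidean distance between the proximal and distal points,
    // in the file's length units (micrometers in the example).
    static double segmentLength(Map<Integer, double[]> points,
                                int proximal, int distal) {
        double[] p = points.get(proximal);
        double[] q = points.get(distal);
        double dx = q[0] - p[0];
        double dy = q[1] - p[1];
        double dz = q[2] - p[2];
        return Math.sqrt(dx * dx + dy * dy + dz * dz);
    }

    public static void main(String[] args) {
        // Point ID -> {x, y, z}, copied from the example above.
        Map<Integer, double[]> points = new HashMap<>();
        points.put(2, new double[] {0.0, 0.0, 0.0});
        points.put(3, new double[] {-10.0, -30.0, 0.0});
        points.put(4, new double[] {0.0, 0.0, 0.0});
        points.put(5, new double[] {10.0, -30.0, 0.0});

        // mainDend1: proximal 2, distal 3; mainDend2: proximal 4, distal 5.
        System.out.println("mainDend1 length: " + segmentLength(points, 2, 3));
        System.out.println("mainDend2 length: " + segmentLength(points, 4, 5));
    }
}
```

Both dendrites come out at the square root of 1000, about 31.6 micrometers, as expected from the mirrored coordinates.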
Virtual Ratbrain (http://www.ratbrain.org)
Laszlo Zaborszky, Peter Varsanyi: Center for Molecular and Behavioral Neuroscience, Rutgers
Fred Howell, Nicola McDonnell: Institute of Adaptive and Neural Computation, University of Edinburgh
• Database for peer reviewed 3-D cellular anatomical data of the rat brain
• Visualization and analysis tools including analysis of dendritic and axonal morphometry
• Data stored in MorphML format
Virtual Ratbrain (http://www.ratbrain.org): MorphML Viewer
Padraig Gleeson University College London p.gleeson@ucl.ac.uk WAM-BAMM*05 31 March 2005 Building 3D Network Models with neuroConstruct (Summary of main presentation)