MAML • XML-based solution for storing and transmitting microarray data • Capable of storing all data necessary to create a MIAME data set • Information available at www.sourceforge.net/projects/mged
Information Available online • Current DTD • MAML Primer (Alex Lash et al.) • A few sample data sets
MAML Key concepts • Biomaterial (sample) hierarchy • Image (hybridization, bioassay) definitions • Data matrices • Array patterns and array features
Biomaterials (Sample) Hierarchy • Directed acyclic graph (DAG) based system for describing the relationship between biological specimens
DAG Model of Biomaterials • [Diagram: Samples 1 to 4 connected through Treatments 1 to 4]
Treatments and Biomaterials • [Diagram: Sample 1 -> Treatment -> Sample 2]
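The sample hierarchy can be illustrated with a small sketch. This is not MAML itself: the class `BiomaterialGraph`, its method names, and the particular wiring of samples and treatments below are assumptions made for illustration, since the slide diagram does not survive in text form. The sketch only shows the DAG idea, where a treatment applied to one sample yields another and no sample may become its own ancestor.

```python
class BiomaterialGraph:
    """Hypothetical in-memory model of a biomaterial DAG (not taken from the MAML DTD)."""

    def __init__(self):
        # Maps a source sample to a list of (treatment, product sample) pairs.
        self.edges = {}

    def descendants(self, sample):
        """Return every sample derived, directly or indirectly, from `sample`."""
        seen = set()
        stack = [sample]
        while stack:
            node = stack.pop()
            for _treatment, product in self.edges.get(node, []):
                if product not in seen:
                    seen.add(product)
                    stack.append(product)
        return seen

    def add_treatment(self, source, treatment, product):
        """Record that `treatment` applied to `source` yields `product`."""
        # Reject edges that would make a sample its own ancestor (a cycle).
        if source == product or source in self.descendants(product):
            raise ValueError(f"{source} -> {product} would create a cycle")
        self.edges.setdefault(source, []).append((treatment, product))


# Illustrative wiring only; the real edges of the slide diagram are unknown.
graph = BiomaterialGraph()
graph.add_treatment("Sample 1", "Treatment 1", "Sample 2")
graph.add_treatment("Sample 1", "Treatment 2", "Sample 3")
graph.add_treatment("Sample 2", "Treatment 3", "Sample 4")
graph.add_treatment("Sample 3", "Treatment 4", "Sample 4")

print(sorted(graph.descendants("Sample 1")))
```

Because a sample can be produced from more than one parent (Sample 4 here), a tree would not be enough, which is presumably why a DAG is used.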
Array Pattern • Coordinates of each feature on an array • Reference to the biological sequence (potentially, in future, other compounds) on that element
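As a rough sketch of what an array pattern holds, each feature can be thought of as a position plus a reference to what is spotted there. The `Feature` class, its field names, and the sequence identifiers are hypothetical and do not come from the MAML DTD.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Feature:
    """Hypothetical record for one spot on an array."""
    row: int
    column: int
    sequence_ref: Optional[str]  # None for an empty or control spot


# A tiny made-up pattern with two spots sharing the same sequence.
pattern = [
    Feature(0, 0, "seq_A"),
    Feature(0, 1, "seq_B"),
    Feature(1, 0, "seq_A"),
    Feature(1, 1, None),
]


def features_for(pattern, sequence_ref):
    """Return all features that reference the given sequence."""
    return [f for f in pattern if f.sequence_ref == sequence_ref]


print(features_for(pattern, "seq_A"))
```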
Composite Features • Synthesis of two or more features into an entity that represents something else • Two features that contain identical DNA may be averaged • Several features whose DNA corresponds to the same biological entity may be combined (e.g. several cDNA clones that map to the same UniGene cluster)
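A minimal sketch of the averaging case, assuming feature values and a feature-to-entity mapping are already available as plain dictionaries. The function name `composite_values`, the feature names, and the cluster labels are invented for the example.

```python
from collections import defaultdict
from statistics import mean


def composite_values(feature_values, feature_to_entity):
    """Average the values of all features that map to the same entity."""
    grouped = defaultdict(list)
    for feature, value in feature_values.items():
        grouped[feature_to_entity[feature]].append(value)
    return {entity: mean(values) for entity, values in grouped.items()}


# Hypothetical data: f1 and f2 map to the same cluster, f3 stands alone.
values = {"f1": 2.0, "f2": 4.0, "f3": 1.5}
mapping = {"f1": "cluster_A", "f2": "cluster_A", "f3": "cluster_B"}

print(composite_values(values, mapping))
# {'cluster_A': 3.0, 'cluster_B': 1.5}
```

Averaging is only one possible way to combine features; the slide names it as an option, not as the required rule.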
Bioassay (Image) • Image-based results from the bioassay (single image) • To represent data from multiple images (e.g. ratios) a composite image is generated • Images can also be combined to store average values from multiple experiments
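To make the composite-image idea concrete, here is a sketch that derives per-feature ratios from two single-image intensity tables. The function `composite_ratio` and the sample intensities are assumptions for illustration; how MAML actually encodes a composite image is defined by the DTD, not by this code.

```python
def composite_ratio(image_a, image_b):
    """Return image_a / image_b for every feature measured in both images.

    Features with a zero denominator are skipped rather than guessed at.
    """
    shared = image_a.keys() & image_b.keys()
    return {
        feature: image_a[feature] / image_b[feature]
        for feature in sorted(shared)
        if image_b[feature] != 0
    }


# Hypothetical intensities from two single images of the same array.
channel_1 = {"f1": 200.0, "f2": 50.0, "f3": 10.0}
channel_2 = {"f1": 100.0, "f2": 100.0}

print(composite_ratio(channel_1, channel_2))
# {'f1': 2.0, 'f2': 0.5}
```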
MAML Proposals for Data Storage • Tagged data • Each data point is tagged with the feature, image, and data type it belongs to • Matrices • Internal or External representations • External data can be binary (XRD specification)
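The two storage proposals carry the same information in different shapes. The sketch below converts tagged data points into a matrix for one data type; the tuple layout `(feature, image, data_type, value)` and the function name `tagged_to_matrix` are assumptions chosen for the example, not MAML structures.

```python
def tagged_to_matrix(records, data_type):
    """Build a features x images matrix for one data type from tagged points.

    Missing combinations are filled with None.
    """
    selected = [r for r in records if r[2] == data_type]
    features = sorted({r[0] for r in selected})
    images = sorted({r[1] for r in selected})
    lookup = {(f, i): v for f, i, _t, v in selected}
    matrix = [[lookup.get((f, i)) for i in images] for f in features]
    return features, images, matrix


# Hypothetical tagged data points.
records = [
    ("feature1", "image1", "ratio", 1.5),
    ("feature1", "image2", "ratio", 0.8),
    ("feature2", "image1", "ratio", 2.0),
    ("feature2", "image1", "intensity", 350.0),
]

features, images, matrix = tagged_to_matrix(records, "ratio")
print(features, images, matrix)
```

Tagged data repeats the feature, image, and data type on every point, while the matrix form states them once per axis, which is the usual trade-off between self-description and size.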
Data Matrices • Microarray data can be considered a three-dimensional matrix • Axes are • Assays (arrays) • Features (spots) • Data types (e.g. ratio, intensity, flag value)
Data Matrices (continued) • Each data matrix stored is a 2-D slice of the 3-D matrix • The matrix slice is specified (e.g. ratio, image1, feature1) • The data values are recorded in one of several formats (e.g. whitespace-delimited)
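A small sketch of the slicing idea, under the assumption that each stored slice fixes one data type and lays out features as rows and assays as columns. The names `parse_slice`, `cube`, and `lookup` are hypothetical, and the numbers are made up.

```python
def parse_slice(text):
    """Parse whitespace-delimited rows of numbers into a list of lists."""
    return [
        [float(token) for token in line.split()]
        for line in text.splitlines()
        if line.strip()
    ]


# The 3-D matrix is modelled as: data type -> 2-D slice (features x assays).
cube = {
    "ratio": parse_slice("1.5 0.8\n2.0 1.1\n"),
    "intensity": parse_slice("350 410\n120 95\n"),
}


def lookup(cube, data_type, feature_index, assay_index):
    """Fetch one value by its position along the three axes."""
    return cube[data_type][feature_index][assay_index]


print(lookup(cube, "ratio", 1, 0))
# 2.0
```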
MAML History • First drafts discussed at MGED2 • Second drafts at MAML working group meeting (November 2000) • Submission of MAML to the OMG (December 2000)
The OMG • International organization dedicated to developing specifications • Issued a request for proposals for specifications for storing gene expression data
Why go through the OMG? • OMG provides a structure for developing consensus-based standards • Other organizations were using OMG (desire to have a single solution) • OMG submissions
OMG Submission History • Proposals were reviewed against OMG RFP guidelines • Submitters meet to go through each of the submissions, identify strengths, and work out how to produce a solution that satisfies all submitters
Results of OMG Discussions • Most areas are in agreement • Areas of disagreement discussed at MAML working group yesterday
Changes to MAML from the OMG Process • Decision to allow related data sets to be split across more than one document (more easily allows transport) • Requires removal of ID attributes (and IDREFs) • IDs will be replaced with names that the user must ensure are unique
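An XML parser validates ID and IDREF attributes only within a single document, so once related data is split across files, uniqueness of names has to be checked by the user or by tooling. The following is a sketch of such a check, assuming the names have already been extracted from each document; the function `find_duplicate_names` and the file names are hypothetical.

```python
from collections import Counter


def find_duplicate_names(documents):
    """Return, in sorted order, every name that occurs more than once.

    `documents` maps a document label to the list of names it declares.
    """
    counts = Counter(name for names in documents.values() for name in names)
    return sorted(name for name, count in counts.items() if count > 1)


# Hypothetical names declared in two separately transported documents.
documents = {
    "arrays.xml": ["array_1", "feature_1"],
    "samples.xml": ["sample_1", "array_1"],
}

print(find_duplicate_names(documents))
# ['array_1']
```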
MAML Tutorial • Session to go through MAML format (DTD) to understand how it operates • Will discuss likely changes to DTD that will affect understanding of the specification • Discuss ways to integrate MAML into your database/data analysis system
MAML Software Development • Certain pieces of software necessary to support MAML DTD • Annotation tool • MAML parsers • Peer to Peer communication tools?
Software Development Proposal • Crash programming session • Goal to build the necessary tools listed on the previous page in a short (one-week) time frame