MIAMExpress development • October 2002 • Mohammad Shojatalab • shoja@ebi.ac.uk
Talk structure • History • Underlying concepts • Design & Development • Current status • Future
History (March 2001 ~ present) • Need for a submission tool for ArrayExpress • Obviously it should be a web-based tool • It was supposed to be a quick and dirty prototype • Started about May 2001
Underlying concepts • Based on MIAME concepts and questionnaire • Submission of Experiment, Arrays, Protocols • Avoiding free text as much as possible • Using controlled vocabulary
[Diagram: submission workflow] Login • Pending/New Experiment (E1, E2, …, En) • Samples 1…n (sample protocol) • Extracts 1…n (extraction protocol) • Hybridisations on Arrays 1…n (hyb protocol) • Data 1…n (scanning protocol, image analysis protocol, transformation protocol) • Combined Experiment Data • Final free text comment • Submit
Design considerations • Complex submission structure • Long submission time (may be weeks) • So it needs a database • It is meant to be fast, open source and free • Usable as a lab notebook • Free database management system • Web-based submissions • MAGE-ML file as output
Technologies & Tools • MySQL as the DBMS • Perl CGI • DBI and DBD::MySQL to interact with the database • JavaScript • CVS as the source code repository, to keep track of changes and to merge changes made by other developers into every developer's working copy
[Diagram: layered architecture] MIAMExpress GUI, MAGExpress, … • Biology (MIAME) layer • Data access layer (data access functions) • Physical data layer (database, files, …)
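The layering on this slide can be sketched roughly as follows. This is a minimal illustration in Python, not the Perl used by MIAMExpress; every class, method, table name and data value below is hypothetical, and an in-memory dict stands in for the database.

```python
# Hypothetical three-layer sketch; none of these names come from MIAMExpress.

class PhysicalLayer:
    """Physical data layer: an in-memory dict stands in for the database."""

    def __init__(self):
        # Made-up demo row, for illustration only.
        self._rows = {"sample": {1: {"name": "liver", "organism": "Mus musculus"}}}

    def fetch(self, table, row_id):
        return self._rows[table][row_id]


class DataAccessLayer:
    """Data access functions: hide table names from the layers above."""

    def __init__(self, physical):
        self._physical = physical

    def get_sample_row(self, sample_id):
        return self._physical.fetch("sample", sample_id)


class BiologyLayer:
    """Biology (MIAME) layer: talks in biological terms, not tables."""

    def __init__(self, data_access):
        self._data_access = data_access

    def describe_sample(self, sample_id):
        row = self._data_access.get_sample_row(sample_id)
        return f"{row['name']} ({row['organism']})"


# A GUI or an export module would only ever talk to the top layer.
biology = BiologyLayer(DataAccessLayer(PhysicalLayer()))
description = biology.describe_sample(1)
```

With this split, a schema change should in principle only touch the data access layer, which is the motivation the later slides on new releases point at.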
Teamwork culture • MIAMExpress team • Biologists (Helen and the curation team, external people) • Development team: myself, Niran (Jan 2002), Sergio (Jun 2002) • “Everyone owns all the code so whenever something is busted everyone has a right and duty to fix it” • “Successful culture has to accept that mistakes will happen”
Development • Started in May 2001 • Simple data model; around 30 tables • Avoiding hard-coded values in the program • Debug tools to assist the developer • Logging that records the user's activities in a file, to help the developer find out what went wrong when a controlled error occurs • Readable and maintainable code
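The activity log mentioned above might look something like this. It is a sketch using Python's standard `logging` module, not the project's Perl code; the logger name, log format, user name and action names are all assumptions made for illustration.

```python
import logging
import os
import tempfile


def make_activity_logger(path):
    """Return a logger that appends one line per user action to `path`."""
    logger = logging.getLogger("submission_activity")  # hypothetical name
    logger.setLevel(logging.INFO)
    logger.handlers.clear()  # avoid duplicate handlers on repeated calls
    handler = logging.FileHandler(path)
    handler.setFormatter(logging.Formatter("%(asctime)s %(levelname)s %(message)s"))
    logger.addHandler(handler)
    return logger


def log_action(logger, user, action):
    """Record who did what, so a later error can be traced back."""
    logger.info("user=%s action=%s", user, action)


# Demo: write two actions to a temporary file and read them back.
log_path = os.path.join(tempfile.mkdtemp(), "activity.log")
activity_log = make_activity_logger(log_path)
log_action(activity_log, "demo_user", "create_experiment")
log_action(activity_log, "demo_user", "upload_data_file")

with open(log_path) as fh:
    log_lines = fh.read().splitlines()
```

Reading the file back shows the sequence of actions leading up to a failure, which is the use the slide describes.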
[Diagram: handling a new requirement] New requirement • Requirement analysis (essential information for development) • Impact analysis: impact on data model, impact on data access layer and APIs, impact on UI, impact on MAGExpress • Development • Test • Beta version • New release
[Diagram: MIAMExpress development states] Steady state • Requirement analysis • Impact analysis • Development • Test
Infrastructure • [Diagram] Developers • CVS • Test (Apache HTTP) • Production (Apache HTTP)
Submission types • Array submission: Array Description File (Excel sheet, tab-delimited files) • Experiment submission: • experiment design, samples, extractions, labelled extracts, protocols (web-based forms) • hybridization data files (Excel sheet, tab-delimited files) • combined data file (Excel sheet, tab-delimited files)
MAGE-ML Creation; MAGExpress module • MAGEstk: a set of APIs, created from the object model (OM), that can read a MAGE-ML file into an object structure and write an object structure back out as MAGE-ML • To create a MAGE-ML file from MIAMExpress we have to know which piece of data fits where in the model • That means we need to map the MIAMExpress data model (schema) to MAGE-OM • We still have a problem, because MAGE classes are quite abstract when it comes to working with physical data
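One way to picture the schema-to-model mapping described above is a lookup table from (table, column) to (class, attribute). The sketch below is in Python and is only an illustration: the table names, column names and attribute names are hypothetical and are not taken from the MIAMExpress schema or from MAGE-OM.

```python
# Hypothetical mapping from a MIAMExpress-like schema to a MAGE-OM-like model.
# Keys are (table, column); values are (class name, attribute name).
SCHEMA_TO_MODEL = {
    ("sample", "name"): ("BioSource", "name"),
    ("sample", "identifier"): ("BioSource", "identifier"),
    ("labelled_extract", "name"): ("LabeledExtract", "name"),
}


def map_row(table, row):
    """Group one database row's values by the model class they belong to."""
    mapped = {}
    for column, value in row.items():
        target = SCHEMA_TO_MODEL.get((table, column))
        if target is None:
            continue  # this column has no counterpart in the object model
        class_name, attribute = target
        mapped.setdefault(class_name, {})[attribute] = value
    return mapped


# Demo row with one column that the mapping does not cover.
mapped = map_row("sample", {"name": "S1", "identifier": "MX:1", "internal_flag": 0})
```

Keeping the mapping as data rather than code is one possible route to the automated schema-to-MAGE mapping that a later slide lists as still missing.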
MAGE-ML Creation (2) • To have an object-oriented design we need a new set of classes which are derived from the MAGE classes but are MIAMExpress-specific • Objects instantiated from these new classes know how to get their attributes from the MIAMExpress database • They also inherit all the properties of their parents
MAGE-ML Creation; Example • biosrc is instantiated from the original BioSource class • mx_biosrc is instantiated from the MXBioSource class • The MXBioSource class inherits from the original BioSource class • biosrc.go&Load_your_data -> ERROR!!! • mx_biosrc.go&Load_your_data -> you have a loaded object from MIAMExpress • The dirty work is encapsulated inside the object
How does go&Load_your_data work? • By calling one, or a sequence, of the appropriate biology (MIAME) APIs • Note that the method name is the same in all classes, but the behaviour is context-sensitive and is hard coded inside each class • For example, for BioSource we say: • mx_biosrc.go&Load_your_data • And for a labelled extract: • mx_label.go&Load_your_data • and likewise for all the others
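The inheritance pattern from the last three slides can be sketched as below. This is Python rather than the project's Perl, `load_your_data` stands in for the slides' `go&Load_your_data`, and the base class `MageObject`, the dict-based `FAKE_DB` and its contents are assumptions made purely for illustration.

```python
# Hypothetical stand-ins for generic MAGE classes and their MIAMExpress subclasses.

class MageObject:
    """Generic model class: it has no idea where its data is stored."""

    def __init__(self, identifier):
        self.identifier = identifier
        self.name = None

    def load_your_data(self, db):
        raise NotImplementedError("a generic object cannot load itself")


class BioSource(MageObject):
    pass


class LabeledExtract(MageObject):
    pass


class MXBioSource(BioSource):
    """Same method name, but it knows which table holds biosources."""

    def load_your_data(self, db):
        self.name = db["biosource"][self.identifier]["name"]


class MXLabeledExtract(LabeledExtract):
    """Same method name, different table: the context is coded inside."""

    def load_your_data(self, db):
        self.name = db["labelled_extract"][self.identifier]["name"]


# Made-up demo data; a dict stands in for the MIAMExpress database.
FAKE_DB = {
    "biosource": {"B1": {"name": "source one"}},
    "labelled_extract": {"L1": {"name": "extract one"}},
}


def try_load(obj, db):
    """Call load_your_data and report whether the object could load itself."""
    try:
        obj.load_your_data(db)
        return "loaded"
    except NotImplementedError:
        return "error"


biosrc = BioSource("B1")
mx_biosrc = MXBioSource("B1")
mx_label = MXLabeledExtract("L1")
results = [try_load(obj, FAKE_DB) for obj in (biosrc, mx_biosrc, mx_label)]
```

The caller uses one method name throughout; only the subclasses know which table to read, which is the encapsulation the example slide refers to.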
How we create the MAGE-ML file • We create the top-level object, experiment • experiment.go&create_all_your_associations, which creates the whole object structure • experiment.go&Load_your_data, which loads data into the whole structure • Write MAGE-ML, starting from the top-level experiment object
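The three steps above (build the structure, load the data, write from the top-level object) can be illustrated with a small Python sketch. The element names and attributes written here are simplified placeholders, not the real MAGE-ML format, and the classes, the `to_xml` method and the dict-based database are hypothetical.

```python
import xml.etree.ElementTree as ET

# Made-up demo data; a dict stands in for the MIAMExpress database.
FAKE_DB = {
    "experiment": {"E1": {"name": "demo experiment", "samples": ["S1", "S2"]}},
    "sample": {"S1": {"name": "sample one"}, "S2": {"name": "sample two"}},
}


class MXSample:
    def __init__(self, identifier):
        self.identifier = identifier
        self.name = None

    def load_your_data(self, db):
        self.name = db["sample"][self.identifier]["name"]

    def to_xml(self):
        # Placeholder element name, not real MAGE-ML.
        return ET.Element("BioSource", identifier=self.identifier, name=self.name)


class MXExperiment:
    def __init__(self, identifier):
        self.identifier = identifier
        self.name = None
        self.samples = []

    def create_all_your_associations(self, db):
        """Step 1: build the empty object structure under this experiment."""
        sample_ids = db["experiment"][self.identifier]["samples"]
        self.samples = [MXSample(sample_id) for sample_id in sample_ids]

    def load_your_data(self, db):
        """Step 2: load this object, then ask every child to load itself."""
        self.name = db["experiment"][self.identifier]["name"]
        for sample in self.samples:
            sample.load_your_data(db)

    def to_xml(self):
        """Step 3: serialise the tree, starting from the top-level object."""
        node = ET.Element("Experiment", identifier=self.identifier, name=self.name)
        for sample in self.samples:
            node.append(sample.to_xml())
        return node


experiment = MXExperiment("E1")
experiment.create_all_your_associations(FAKE_DB)
experiment.load_your_data(FAKE_DB)
xml_text = ET.tostring(experiment.to_xml(), encoding="unicode")
```

Each object is responsible for its own children, so the caller only ever touches the experiment.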
Problem with new releases • A new release usually means a new schema as well • That means we have to change the data access layer • That means we have to change the mapping model which maps the MIAMExpress schema to the MAGE model • That means we have to migrate the existing data • That introduces the data migration module (dmm) • The dmm has to be provided with each new release
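A data migration module of the kind described above is often built around a recorded schema version and an ordered list of migration steps. The sketch below shows that general technique in Python with an in-memory SQLite database; it is not the actual dmm, and the `schema_version` table, the migration steps and the sample data are all assumptions.

```python
import sqlite3

# Hypothetical migrations: schema version -> SQL statements that reach it.
MIGRATIONS = {
    1: ["CREATE TABLE sample (id INTEGER PRIMARY KEY, name TEXT)"],
    2: [
        "ALTER TABLE sample ADD COLUMN organism TEXT",
        "UPDATE sample SET organism = 'unknown'",  # migrate existing rows
    ],
}


def current_version(conn):
    """Return the highest applied schema version, or 0 for a fresh database."""
    conn.execute("CREATE TABLE IF NOT EXISTS schema_version (version INTEGER)")
    row = conn.execute("SELECT MAX(version) FROM schema_version").fetchone()
    return row[0] or 0


def migrate(conn, target):
    """Apply every missing migration, in order, up to `target`."""
    for version in range(current_version(conn) + 1, target + 1):
        for statement in MIGRATIONS[version]:
            conn.execute(statement)
        conn.execute("INSERT INTO schema_version (version) VALUES (?)", (version,))
    conn.commit()
    return current_version(conn)


# Demo: install release 1, add data, then upgrade to release 2.
conn = sqlite3.connect(":memory:")
migrate(conn, 1)
conn.execute("INSERT INTO sample (name) VALUES ('S1')")
final_version = migrate(conn, 2)
rows = conn.execute("SELECT name, organism FROM sample").fetchall()
```

Shipping the migration steps with each release lets existing installations move their data to the new schema without manual edits.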
Missing bits behind the scenes • We are using MySQL, which at least at the moment does not support constraints, foreign keys or subqueries • They have promised to solve these in upcoming releases • Persistent connection with the database • An automated schema-to-MAGE mapping
Current status • Release 1.0 will be ready soon (Dec 2002) • With it we accept experiment submissions and create the MAGE-ML file • We accept array submissions and create the MAGE-ML file • Related project: ILSI-specific MIAMExpress
Future • KeyLargoExpress? ;) • Organism-specific • Integrated with the curation tool?? • Able to work with a full MAGE-ML file? • Cover all the missing pieces behind the scenes • Implemented using Java-related technologies?!