Growing the Tree of Life at SDSC: Kick-Off Meeting
Mark A. Miller
San Diego Supercomputer Center
OTHER SDSC/MULTI-SITE PROJECTS in BIOLOGY
• Alliance for Cell Signaling
• Encyclopedia of Life
• Joint Center for Structural Genomics
• Protein Data Bank
• Protein Kinase Resource
• Lipid Maps (Lipid Metabolomics)
[Diagram; labels: Scientific Community; View Providers; Tool Providers; Domain Scientists; Disciplinary Interfaces; Molecular Biology; Structural Biology; Algorithm developers; Virtual Collections; Computational Resources; Local Resources; Linux Clusters; “The GRID”; SMPs; Discovery Portal; Informatics Center; Integrated Views; Federated Data Collections]
TOL Project functions:
• Algorithm development
• Model development
• Methodologies for model evaluation
• Database methodologies
• Software development
• Scalability/High end computing
• Integration technology
• Education/outreach
Role of SDSC:
• Unite the users with the resources they need to do their science.
• Create standards for the resource in partnership with the community.
• Design a reference architecture that supports interoperability, evolvability, and extensibility.
• Provide a common point of contact for project members and the community.
• Create a resource that gives persistence to the work created under this funding.
For each project function we must ask:
• What are the overall goals?
• What resources do we have presently?
• Where is IT support limited?
• What are the requirements for delivering on goals?
• How can we formalize communication?
Of the project management we must ask:
• How shall we prioritize the allocation of resources to goals?
• How much effort should go into stabilizing/wrapping legacy SW (and which SW)?
• What are the trade-offs of architectural specifications (SW and HW)?
• What are the computational requirements (grid vs. local)?
• What functions do we need to provide immediately?
• What is the right mix of short-term/long-term investment?
Synergies: Features under development in other SDSC projects
• HEC (DataStar: IBM POWER4; 7 TF / 235 GB memory; SunFire 12K)
• Web/grid services (Blue Titan)
• Workflows (Ptolemy II; SciTegic)
• e-Notebook (partnership with Kinematik; internal efforts)
• Molecular visualization tools (mbt.sdsc.edu)
• Biological data federation (IBM DL federation; http://quatermass.sdsc.edu:8080/federation/index.html)
Where do we start?
• Create infrastructure that supports communication: http://landscape.sdsc.edu:8080/PHYLO/
• Create a specifications document/project plan for each functional area.
• Map the specifications to existing resources; strategize around weaknesses.
• Propose one or more architectures that map to the group's needs.
Proposed timeline
• November 2003, Inception:
  • Possible architectures described.
  • Requirements for systems identified.
  • Project planning and staffing.
  • Life Cycle Objectives documentation with at least one feasible system architecture.
• February 2004, Elaboration:
  • System Software Architecture Description created.
  • Life Cycle Architecture review and commitment.
• October 2004, Construction:
  • Major coding of product occurs, culminating in the delivery of "alpha" capability to the user, together with documentation.
SDSC Tree of Life Team
• Mark Miller – Management/Logistics/Strategy
• David Stockwell – Database/Architecture
• Alex Borchers – Architecture/Data delivery tools/User requirements
• Dana Jermanis – Graphic Art/Web Master
[Architecture diagram; labels: SOAP Server; Virtual community messaging; XML/RDF store; Metadata sharing; BLAST Data; Keyword data; Stored queries; Annotations; Session info; Database; TOL Notebook; Encyclopedia of Life; SOAP Queries; Invoke Scheduler; BLAST; Keyword queries]