Virtual Laboratory: Exploring e-Science in CAS CNIC,CAS Jianjun Yu yujj@cnic.cn
Outline • Motivation • e-Science and Virtual Laboratory • Current Work on VLAB • Planning in Future Second EchoGrid Workshop Beijing – 29 & 30 October 2007
The Definition of e-Science • Three kinds of e-Science: • Computationally intensive science in highly distributed network environments • Science that uses immense data sets that require grid computing • Distributed collaboration, such as the Access Grid • Examples: • social simulations, particle physics, earth sciences, bio-informatics, … • Cited from Wikipedia
Why e-Science • Challenges in modern scientific research (from the researchers' point of view) • Scientific problems are more complex than ever • Research subjects are no longer isolated, but cross-disciplinary and large-scale • Data processing, simulation and computing have become indispensable methods • Researchers need closer communication, collaboration and coordination than ever
Why e-Science • From the perspective of resources • Large total amount of resources, but limited utilization • Lack of effective usage • Urgent demand for more capabilities • People need to collaborate by sharing their resources and knowledge
Why e-Science • From the perspective of IT: • What researchers need • Huge demand for computing technologies • Huge demand for resource sharing and collaboration • What IT provides • High-speed networks • HPC • Large-scale scientific data • How to provide it • e-Science environment • Give scientists an EASY-TO-USE interface to help them use the infrastructure
What e-Science Does • Provides a collaborative research environment • Combines all resources: computing, network, data, human, … (Diagram labels: Videoconference & On-line Forum; Communication & Collaboration; Experiments & Field Stations; Supercomputer Center; Observation & Experiment; Computing Facility; Database; Storage Facility; Specimen Library; Computation & Simulation; Documentation; Theory Analysis; Networks; Software Tools) Cited from CAS 11th Five-year Informatization Program (2006-2010)
Outline • Motivation • e-Science and Virtual Laboratory • Current Work on VLAB • Planning in Future
CAS 11th Five-year Informatization Program (2006-2010) • Continue to develop the infrastructure and existing applications • e-Science Facility • Networks of field stations/instruments (60), mobile equipment, digital library of natural resources • e-Science Applications • HEP, Astro, Bio, Geo, Chemistry, … (about 10 or 16) • Resource Integration Platform • Supporting Environment for Applications
Our vision of e-Science • Virtual Laboratory based e-Science • An integrated collaboration environment to support e-Science • Composed of shared and collaborative hardware, software, data, information, human resources, … • An EASY-TO-USE interface for scientific researchers • A basic form and tool for e-Science activities
e-Science vs. e-Science Virtual Labs • e-Science would be application-driven • “Virtual Labs” hold the key position in our e-Science framework and are the core component for making e-Science a reality • It should be emphasized that Virtual Labs do not mean running experiments or simulations online Cited from CAS 11th Five-year Informatization Program (2006-2010)
What Virtual Labs Solve • Currently: • The infrastructure may be (almost) ready, but e-Science is not yet • Many resources are in place, but only a few can be made publicly available • The bottleneck may be the gap between the products built by computer experts and the domain scientists who are their end users • Bridging this gap takes much more effort than expected • The Virtual Lab is proposed to be • a basic unit of research activity in the e-Science environment • a connector between infrastructure/resources and domain scientists • the right user interface between scientists and their e-Science environment
Virtual Labs Goals • With Virtual Labs, • all kinds of resources could be integrated behind a single access point • customized and flexible services could be provided, more easily than ever before, according to the specific requirements of different domains • multidisciplinary, multi-organization collaboration could be carried out online
Features of Virtual Labs • Ease of use • much easier to use than current systems • Resource integration • provide the user with a single operating environment • many kinds of resources, such as supercomputers, mass storage facilities, scientific databases, digital libraries, high-bandwidth networks and scientific equipment, can be accessed seamlessly • Customized service • provide each user with exactly what he or she wants • each user may have an individual workbench • Ubiquitous research • benefit from state-of-the-art mobile computing and related technologies so that users can use the Virtual Lab anytime, anywhere
Features of Virtual Labs (cont’d) • Collaborative work • enable many scientists from multiple independent institutions, multiple sites across the world and different professional backgrounds to work together on a collaborative project or a common problem • Scalability • support hundreds of users from tens of institutions, but work just as well for three or five users • Management • interact with external management systems, such as ARP in CAS, to help improve efficiency over the whole lifetime of a research project or other research activity
Virtual Labs Goals in 2006-2010 • Provide a virtual lab environment • Scientific Virtual Organization management • Scientists can access, organize and manage resources easily and transparently • Integrate resources from CAS, such as computing, storage, databases and libraries • Document sharing and collaboration • Organize Internet-based activities, such as project reviews and conferences • At least three e-Science applications on Virtual Labs • Biology • Astronomy • High Energy Physics • ...
VLAB Projects • An exploration of the e-Science Virtual Lab based on our vision • A product built on the Virtual Lab concept • We focus more on how to develop an e-Science environment for real scientific applications
Architecture of VLAB (Diagram labels: Virtual Workbench; CA; Scientific Workflow; Core toolkit: VO management, Document Collaboration, Activity Collaboration; Resources & services: AV Plugin, Computing Resource Plugin, Device Plugin, Database Plugin, Digital Library Plugin, Other Plugins; each plugin connects to a Resource)
New technologies Advancing VLAB • SOA • Grid computing • Web Services • Scientific workflow • Portal (WSRP) • …
Research and standards lead the way • Put forward the theory and technology framework for the VLAB project • Solve key technical problems • Provide guidance for the VLAB project • Publish universal standards for integrating different kinds of resources
The core components of VLAB • Virtual Workbench • Core Toolkits • e-Science Security Infrastructure • Virtual Labs services
Virtual Workbench • A universal portal for e-Science activities • An open, scalable, flexible platform that integrates different resources • Supports the varied requirements of scientists • Supports component reuse • Ensures usability and accessibility
Virtual Workbench (cont’d) • Application-centered portlets can be plugged into the workbench • Currently supported: • Resource add-in • Calendar management • Instant messaging • Document sharing • Virtual community • Mail subscription • An application can be encapsulated as a portlet and added to the virtual workbench
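The plug-in idea on this slide can be illustrated with a small Python sketch. It is not the VLAB workbench, which is portal-based; the `Workbench` class, its `plug_in` and `render_page` methods, and the example portlets are hypothetical names chosen only to show how independently written applications might be registered behind one page.

```python
from typing import Callable, Dict, List


class Workbench:
    """Toy workbench that hosts named portlets (hypothetical, for illustration)."""

    def __init__(self) -> None:
        # portlet name -> function that renders the portlet for a given user
        self._portlets: Dict[str, Callable[[str], str]] = {}

    def plug_in(self, name: str, render: Callable[[str], str]) -> None:
        """Register an application wrapped as a portlet."""
        if name in self._portlets:
            raise ValueError(f"portlet {name!r} is already plugged in")
        self._portlets[name] = render

    def render_page(self, user: str) -> List[str]:
        """Render every plugged-in portlet for one user, in registration order."""
        return [f"[{name}] {render(user)}" for name, render in self._portlets.items()]


workbench = Workbench()
workbench.plug_in("calendar", lambda user: f"no events today for {user}")
workbench.plug_in("documents", lambda user: f"0 shared files for {user}")
page = workbench.render_page("alice")
```

The workbench only needs each application to expose one rendering function, which is what lets new applications be added without changing the workbench itself.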
Core Toolkits • Virtual organization management tool • Document collaboration tool • Activity organization tool
e-Science Security Infrastructure • Construct a PKI-based security infrastructure • CAS e-Science CA • Authorized by APGrid PMA in 2006 • Trusted by IGTF • Based on the OpenCA package
Virtual Labs Services • Work and develop together with front-line scientists • Applied to real scientific applications • Biology • Astronomy • High energy physics • …
Current Work on VLAB • SDG (Scientific Data Grid) • Data processing centered e-Science • Virtual Lab Core Toolkit • Document Share Tool • VO Management Tool • Activity Management Tool • AVLAB (Astronomical Virtual Lab) • e-Science application
Scientific Data Grid • Introduction to the Scientific Database (SDB) • SDB is a long-term project, running since 1983, in which multi-disciplinary scientific data has been accumulated through research activities in CAS • many institutes involved • long-term, large-scale collaboration • data from research, for research • Aimed at: • connecting the massive data resources in the Scientific Database • effectively sharing these geographically distributed, heterogeneous and autonomous data resources via grid computing technology, especially data grid technology
SDB status • 45 institutes across 16 cities • 503 databases • 16.6TB total volume
What SDG provides • DAS • Data Access Service • IMS • Grid Information Service • Security Infrastructure • Storage Service • SDG Portal • SDG Toolkit
Scientific Data Grid • http://www.sdg.ac.cn (Diagram labels: Scientific Data Grid Applications; Avian Flu Alert System; Virtual Observatory Grid; High Energy Physics Grid; Other Grids; Scientific Data Grid Middleware; Scientific Database)
DAS • Goal: • Access over forty scientific databases in CAS • Integrate immense scientific data sets • Provide universal interfaces for researchers • Functions: • Metadata extraction • Database schema mapping • Block data set • GT4 compatible
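Two of the functions listed above, schema mapping and metadata extraction, can be sketched in a few lines of Python. This is an illustration of the general idea only, not the DAS implementation or its interface; the function names, the two example databases and their column names are all hypothetical.

```python
from typing import Any, Dict, List


def map_record(record: Dict[str, Any], mapping: Dict[str, str]) -> Dict[str, Any]:
    """Rename source columns to common-schema fields; unmapped columns are dropped."""
    return {common: record[source] for source, common in mapping.items() if source in record}


def extract_metadata(records: List[Dict[str, Any]]) -> Dict[str, Any]:
    """Derive simple metadata: the sorted field names and the record count."""
    fields = sorted({key for record in records for key in record})
    return {"fields": fields, "count": len(records)}


# Two hypothetical source databases that name the same concepts differently.
bird_db = [{"sp_name": "Anser indicus", "obs_dt": "2006-05-01", "internal_id": 7}]
plant_db = [{"species": "Pinus tabuliformis", "date": "2006-06-12"}]

bird_mapping = {"sp_name": "species", "obs_dt": "observed_on"}
plant_mapping = {"species": "species", "date": "observed_on"}

unified = [map_record(r, bird_mapping) for r in bird_db] + [
    map_record(r, plant_mapping) for r in plant_db
]
metadata = extract_metadata(unified)
```

Once every source is mapped to the same field names, a single query interface can serve all of them, which is the sense in which a mapping layer supports "universal interfaces".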
SDG Portal
Document Share Tool • Features • Share files among VO teams • User-defined file tags for more precise classification • Document search engine • Automatic summary generation • ACL or RBAC based privilege control • Duckling • Wiki-based document publishing and editing • A web-based document sharing portal • CLB: document sharing tools • File upload, lock, unlock, update • REST API • CLB-U: document sharing client • Windows shell application • Office add-in (supports Office XP/2003/2007)
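The upload, lock, unlock and update operations listed for CLB suggest a check-out style of editing. The following Python sketch shows one plausible set of semantics for those four operations, assuming that only the lock holder may update a file and that each update creates a new version. It is an in-memory illustration; `DocumentStore`, `LockedError` and all method signatures are hypothetical and do not describe the CLB REST API.

```python
from typing import Dict, List


class LockedError(Exception):
    """Raised when a user acts on a document without holding its lock."""


class DocumentStore:
    """In-memory stand-in for a document sharing service (hypothetical)."""

    def __init__(self) -> None:
        self._versions: Dict[str, List[bytes]] = {}  # name -> contents, oldest first
        self._locks: Dict[str, str] = {}             # name -> user holding the lock

    def _require(self, name: str) -> None:
        if name not in self._versions:
            raise KeyError(f"no such document: {name}")

    def upload(self, name: str, content: bytes) -> None:
        """Create a new document; existing documents must be changed via update()."""
        if name in self._versions:
            raise ValueError(f"{name} already exists; use update()")
        self._versions[name] = [content]

    def lock(self, name: str, user: str) -> None:
        """Take the lock, failing if another user already holds it."""
        self._require(name)
        holder = self._locks.get(name)
        if holder is not None and holder != user:
            raise LockedError(f"{name} is locked by {holder}")
        self._locks[name] = user

    def unlock(self, name: str, user: str) -> None:
        """Release the lock; only the current holder may do so."""
        self._require(name)
        if self._locks.get(name) != user:
            raise LockedError(f"{user} does not hold the lock on {name}")
        del self._locks[name]

    def update(self, name: str, user: str, content: bytes) -> int:
        """Append a new version; returns the new 1-based version number."""
        self._require(name)
        if self._locks.get(name) != user:
            raise LockedError(f"{user} must lock {name} before updating")
        self._versions[name].append(content)
        return len(self._versions[name])

    def latest(self, name: str) -> bytes:
        self._require(name)
        return self._versions[name][-1]


store = DocumentStore()
store.upload("proposal.doc", b"draft 1")
store.lock("proposal.doc", "alice")
new_version = store.update("proposal.doc", "alice", b"draft 2")
store.unlock("proposal.doc", "alice")
```

Keeping every version rather than overwriting is a design choice of this sketch; it makes a mistaken update recoverable at the cost of storage.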
Duckling-Wiki
Duckling-file upload/update
Duckling-Search
Duckling-Tag
VO Management Tool • UMT • Hierarchical model based on a directed acyclic graph • A group can have subgroups, and a subgroup can have multiple parent groups • Members of a group inherit the roles and privileges of its parent groups • ACL or RBAC based resource access control
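The group model on this slide can be illustrated with a minimal Python sketch. It is not UMT's implementation; the `Group` class, its methods and the example group and role names are hypothetical. The sketch shows the two properties the slide names: a group may have several parents, and it inherits the roles of every ancestor. It also rejects any parent link that would introduce a cycle, so the graph stays acyclic.

```python
from dataclasses import dataclass, field
from typing import List, Set


@dataclass(eq=False)  # compare groups by identity, not by field values
class Group:
    """A node in the group DAG (hypothetical model for illustration)."""

    name: str
    roles: Set[str] = field(default_factory=set)
    parents: List["Group"] = field(default_factory=list)

    def ancestors(self) -> List["Group"]:
        """Every group reachable through parent links, each listed once."""
        seen: List[Group] = []
        stack = list(self.parents)
        while stack:
            group = stack.pop()
            if group not in seen:
                seen.append(group)
                stack.extend(group.parents)
        return seen

    def add_parent(self, parent: "Group") -> None:
        """Link to a parent group, refusing links that would create a cycle."""
        if parent is self or self in parent.ancestors():
            raise ValueError(f"{parent.name} -> {self.name} would create a cycle")
        self.parents.append(parent)

    def effective_roles(self) -> Set[str]:
        """Own roles plus all roles inherited from ancestor groups."""
        roles = set(self.roles)
        for group in self.ancestors():
            roles |= group.roles
        return roles


# Hypothetical organization: one project group with two parent groups.
academy = Group("academy", roles={"reader"})
institute_a = Group("institute_a", roles={"editor"})
institute_b = Group("institute_b", roles={"observer"})
project = Group("project")

institute_a.add_parent(academy)
institute_b.add_parent(academy)
project.add_parent(institute_a)
project.add_parent(institute_b)

project_roles = project.effective_roles()
```

An ACL or RBAC check would then test whether a required role is in `effective_roles()` for one of the user's groups; that step is omitted here.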
UMT
Road Map • To develop Virtual Lab version 1.0 • Popularize the VLAB environment (Timeline labels, in the order they appear: Vlab applications: Protein, Astronomy; 2006.12; Vlab framework design; 2006.12; Vlab proposal)
Planning in Future • Select typical science areas in which to deploy the Virtual Lab • Biology • Astronomy • High Energy Physics • ... • Proceed step by step, case by case and project by project, with worldwide cooperation, to realize e-Science in CAS • More international cooperation is needed!
Summary • e-Science, or scientific research through cyberinfrastructure, will be one of the main goals of CNIC, CAS in the next five years • e-Science needs more international collaboration on cyberinfrastructure and e-Science applications • Scientific domains and IT need to merge, not only in technology and scientific knowledge but also in people, e.g. e-scientists
Thanks!