190 likes | 370 Views
An Introduction to the Open Science Data Cloud. October 10, 2013. Heidi Alvarez Florida International University Robert L. Grossman University of Chicago Open Cloud Consortium. 1. Open Science Data Cloud (OSDC). Open Science Data Cloud (OSDC).
E N D
An Introduction to the Open Science Data Cloud October 10, 2013 Heidi Alvarez Florida International University Robert L. Grossman University of ChicagoOpen Cloud Consortium
Open Science Data Cloud (OSDC) • OSDC is a Science Cloud Service Provider (CSP) • Operated by not-for-profit Open Cloud Consortium • OSDC is a 6 PB / 12,000 core science cloud • 1 PB science data for the research community • 1 PB of biomedical data for medical research • We have been doubling in size each year • We run production services for NASA and NIH researchers • Interoperate with Amazon Web Services a.k.a. AWS (still rudimentary) • Hundreds of users (not thousands) • Typical job uses 1000s of core hours over 10-100’s TB
Designed to hold Protected Health Information (PHI) e.g. genomic data, electronic medical records, etc. (HIPAA, FISMA) Science Cloud Biomedical Cloud • Earth sciences • Biological sciences • Social sciences • Digital humanities • ACL, groups, etc.
What You Get with the OSDC • Login with your university credentials via InCommon • Launch virtual machines, virtual clusters, access to large Hadoop clusters, etc. • Access PB+ of open and protected data • Manage files, collections of files, collections of collections • Manage users, groups of users • Manage accounts, sub-accounts • Efficient transfer of large data (UDT, UDR)
www.opencloudconsortium.org • U.S based not-for-profit corporation. • Companies: Cisco, Yahoo!, Infoblox, … • Universities: University of Chicago, Northwestern Univ., Johns Hopkins, Calit2, etc. • Federal agencies and labs: NASA, LLNL, ORNL • International university and government partners • Manages cloud computing infrastructure to support scientific research: Open Science Data Cloud. • Manages cloud computing testbeds: Open Cloud Testbed.
Our Point of View • We want to develop as little technology and software as possible – we want others to develop software and technology. • We focus on providing researchers the ability to compute over large and very large datasets. • We need open source solutions. • Today it is difficult to interoperate with AWS for our protected data cloud, but we expect this to change (someday). • Run lights out over multiple data centers connected with 10G (soon 100G) networks.
OSDC Data Centers and Networks • We have three data centers • Chicago with 100G to StarLight • FIU with 10G to StarLight • Livermore Valley Open Campus 10G to StarLight • We’re planning one more data center with 100G connection to StarLight • We are looking to interoperate the OSDC with international partners over 10G and 100G networks
Challenges • We are focusing on the following: • How do we authenticate, authorize and provide access controls to researchers at our international partners to data and to cloud based services (storage and compute) • We need open source implementations of these services • We need trust relationships with our peers • We are running a series of interoperability workshops to try to get this right.
PARTNERSHIP FOR INTERNATIONAL • RESEARCH AND EDUCATION NSF Award # 1129076
PIRE & OSDC • National Science Foundation Partnership for International Research and Education 5 year program 2010 – 2014 at $3.5M. • Prepares students to compete in the global cyberinfrastructure community • Provides international research and education experiences around the world! • The student/faculty/scientist research teams help develop large-scale distributed computing capabilities, data and, State-of-the-art services for integrating, analyzing, sharing and archiving scientific data.
International Collaborators • Malcolm Atkinson – School of Informatics, Edinburgh University, Scotland, UK • Paola Grosso & Cees de Laat – Faculty of Science, Informatics Institute, University of Amsterdam, The Netherlands • Karen Langona and Tereza Cristina Carvalho- LARC – Laboratory of Computer Networks and Architecture at the University of Sao Paulo Brazil • Satoshi Sekiguchi – National Institute of Advanced Industrial Science and Technology (AIST),Japan • Chung-I Wu – Beijing Institute of Genomics (BIG), Chinese Academy of Sciences
Research Opportunities • What? • Funded internships for US citizens and residents, which provide the chance to participate in sophisticated international research collaborations. • When? • Summer of 2014 • How long? • 6 weeks • Where? • At any of our international partners.
OSDC Community of Scholars /Open-Science-Data-Cloud-PIRE-OSDC-PIRE /Open Cloud Consortium • Websites: news.opensciencedatacloud.org , opensciencedatacloud.org • Mailing list, Summer Workshops