Cyberinfrastructure Status July, 2011
2011 – First half
• NSF reverse site visit
• Refactoring and cleanup after review preparations
• Coordinating Node technology changes (reducing latency, generalizing)
• Authentication
• Identity management
• Access control
• TeraGrid Compute Node design
Goals for 2011
• Public release of cyberinfrastructure
  • Synchronization, replication, authentication, authorization, resolution, discovery, retrieval
• Coordinating Nodes
  • UNM, UCSB, ORC
• Member Nodes
  • KNB, Dryad, ORNL-DAAC, AKN, NBII, CUAHSI, replication targets, Merritt, other Metacat, other Mercury
• Investigator Toolkit
  • Libraries, Web interface, Zotero, R, Morpho, DataONE Drive
The Major Components (diagram)
• Investigator Toolkit: Web Interface; Analysis, Visualization; Data Management
• Client Libraries: Command Line, Java, Python
• Member Nodes: Service Interfaces, Bridge to non-DataONE Services, Data Repository
• Coordinating Nodes: Service Interfaces, Resolution, Discovery, Replication, Registration, Identifiers, Catalog, Identity, Authen, Preservation, Monitor, Object Store, Index
Software Delivered at Public Release (diagram)
• Investigator Toolkit Software: Mendeley, SearchPortal, R Client, Morpho, …, Zotero, DataONE FS (Excel)
• Client Libraries: Command Line, Java, Python
• DataONE Service Programming Interface (SPI)
• Member Node Software: Metacat, Dryad, GMN, (CUAHSI), (Merritt)
• Coordinating Node Software: Service Interfaces, Resolution, Registration, Replication, Discovery, Identifiers, Catalog, Preservation, Monitor, Object Store, Index
Member Node Tiers
• Tier 1: Public read, synchronization, search, resolve
• Tier 2: Read with access control
• Tier 3: Write using client tools
• Tier 4: Able to operate as a replication target
• Some Member Nodes with Tier 4 support and corresponding Coordinating Node functionality required for 2011
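As a rough illustration of the tiers above, the sketch below models them as an ordered enumeration. It assumes the tiers are cumulative (a node at a given tier also offers everything in the lower tiers), which the slide does not state outright. The `Tier` member names, the `supports` and `replication_targets` helpers, and the node names are hypothetical.

```python
from enum import IntEnum
from typing import Dict, List


class Tier(IntEnum):
    """Member Node tiers as listed on the slide (member names are illustrative)."""
    PUBLIC_READ = 1         # public read, synchronization, search, resolve
    ACCESS_CONTROL = 2      # read with access control
    WRITE = 3               # write using client tools
    REPLICATION_TARGET = 4  # able to operate as a replication target


def supports(node_tier: Tier, required: Tier) -> bool:
    # Assumption: tiers are cumulative, so a higher tier implies the lower ones.
    return node_tier >= required


def replication_targets(nodes: Dict[str, Tier]) -> List[str]:
    """Return the names of nodes that could act as replication targets."""
    return sorted(
        name for name, tier in nodes.items()
        if supports(tier, Tier.REPLICATION_TARGET)
    )


# Hypothetical node names, for illustration only.
example_nodes = {"mn_a": Tier.REPLICATION_TARGET, "mn_b": Tier.ACCESS_CONTROL}
print(replication_targets(example_nodes))
```

Under the cumulative assumption, a single comparison is enough to decide whether a node qualifies for a given capability.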
Member Nodes
Status: 0 = not implemented, 1 = partial, 2 = functional, 3 = complete
Priority: 1 = high, 2 = medium, 3 = low
Coordinating Node Services
Status: 0 = not implemented, 1 = partial, 2 = functional, 3 = complete
Priority: 1 = high, 2 = medium, 3 = low
Coordinating Node Components (diagram)
cn_rest, cn_rest_proxy, Mercury, JMS, cn_service, Search, Metacat, identity_manager, nodes_registry, synchronization, indexer, replication, LDAP, SOLR, Postgres
Investigator Toolkit
Status: 0 = not implemented, 1 = partial, 2 = functional, 3 = complete
Priority: 1 = high, 2 = medium, 3 = low
Data Lifecycle Morpho
Other Pieces
Status: 0 = not implemented, 1 = partial, 2 = functional, 3 = complete
Priority: 1 = high, 2 = medium, 3 = low
Versions
• Infrastructure will change over time:
  • Messages between services
  • Behavior of services, components
  • Message semantics
• How to accommodate change while maintaining operations?
Infrastructure + Component Versions
• Infrastructure Version: Tag identifying the overall functional capabilities, types, and messages supported by the DataONE infrastructure
• Component Version: Tag of a specific component indicating the current revision of the component
• Example: Version XXX of Metacat supports version YYY of the DataONE infrastructure
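One way to picture the relationship between the two kinds of version tag is a small record per component release. This is only a sketch: the `ComponentRelease` class, its field names, and the version strings are hypothetical placeholders, not actual DataONE or Metacat versions.

```python
from dataclasses import dataclass
from typing import FrozenSet


@dataclass(frozen=True)
class ComponentRelease:
    """A component revision plus the infrastructure versions it supports."""
    component: str
    component_version: str
    infrastructure_versions: FrozenSet[str]

    def supports(self, infrastructure_version: str) -> bool:
        # A release is usable only against infrastructure versions it declares.
        return infrastructure_version in self.infrastructure_versions


# Hypothetical values, for illustration only.
release = ComponentRelease(
    component="example-member-node",
    component_version="c2",
    infrastructure_versions=frozenset({"infra-1", "infra-2"}),
)
print(release.supports("infra-2"))
```

Allowing one component version to declare several infrastructure versions is a design assumption here; the slide's example names only one.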
DataONE Infrastructure Release
• Changes to Types Schema
• Update test data
• Build libcommon (PyXB / JiBX)
• Propagate changes through libclient
• Propagate changes through remainder of components
• Deploy CN test
• Update test Member Nodes
• Deploy staging
• Update / synchronize content to match production
• Switch staging and production
• Where to update all Member Node stacks?
• Where to update all Investigator Tools?
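The release steps above read as an ordered checklist. The sketch below treats them that way, assuming each step must finish before the next starts (the slide implies an order but does not require it). The `RELEASE_STEPS` list and the `next_step` helper are illustrative, and the two open questions from the slide are left out because their place in the sequence is undecided.

```python
from typing import List, Optional

# Step names paraphrase the slide; the strict ordering is an assumption.
RELEASE_STEPS: List[str] = [
    "change types schema",
    "update test data",
    "build libcommon (PyXB / JiBX)",
    "propagate changes through libclient",
    "propagate changes through remaining components",
    "deploy CN test",
    "update test Member Nodes",
    "deploy staging",
    "synchronize staging content with production",
    "switch staging and production",
]


def next_step(completed: List[str]) -> Optional[str]:
    """Return the next pending step, or None when the release is done."""
    # Completed steps must be a prefix of the checklist (no skipping ahead).
    if completed != RELEASE_STEPS[:len(completed)]:
        raise ValueError("release steps were completed out of order")
    if len(completed) == len(RELEASE_STEPS):
        return None
    return RELEASE_STEPS[len(completed)]


print(next_step(["change types schema", "update test data"]))
```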
Component Updates
• Release Coordinating Nodes
  • Touches everything deployed – MNs, ITK, monitor, ...
  • Likely to require content migration, synchronization
  • Expensive
• Update Member Nodes
  • Vendor driven updates (e.g. core changes to Metacat)
  • DataONE driven updates (e.g. access policy fix)
  • Known contact points for updates
• Update client tools
  • Automated version checks?
  • How to notify users?
  • How to notify third party developers?
Version Latency and Acceptable Delta
• For an infrastructure upgrade:
  • What is the acceptable latency between the change announcement and the infrastructure being back in sync?
  • Should DataONE maintain old versions of service interfaces? If so, how many revisions?
  • Some changes may not be backward compatible
  • Option to make old version CNs read-only while content is migrated to new systems?
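To make the "acceptable delta" question concrete, here is a minimal sketch of one possible policy, not a decided one. It assumes interface revisions are consecutive integers and that a hypothetical `max_delta` setting says how many older revisions stay available read-only, echoing the read-only option raised above.

```python
def access_mode(client_revision: int, current_revision: int, max_delta: int) -> str:
    """Classify a client by how far its interface revision lags the current one."""
    if client_revision > current_revision:
        # Client speaks a revision the infrastructure does not know yet.
        return "rejected"
    delta = current_revision - client_revision
    if delta == 0:
        return "read-write"
    if delta <= max_delta:
        # Older revision inside the supported window: served read-only.
        return "read-only"
    return "rejected"


# With a window of one revision, only the previous revision is still served.
for revision in (3, 2, 1):
    print(revision, access_mode(revision, current_revision=3, max_delta=1))
```

Setting `max_delta` to 0 corresponds to not maintaining old service interfaces at all.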
Administrative Process for Updates
• New package available with clear indication of affected components, signed off by component developers
• Affects CNs:
  • Install on staging, process metadata updates, lock production, switch
• Affects MNs:
  • Notify instance administrators of update
  • Install updates and verify operation
  • Registry picks up updated services
• Affects ITK:
  • Post update information
  • Notify active clients of update
  • Notify third party developers of changes
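The process above can be sketched as a lookup from affected component type to follow-up actions. The `UPDATE_STEPS` table and `update_plan` function are hypothetical names, and handling CNs before MNs before the ITK is an assumption taken from the slide's ordering.

```python
from typing import Dict, Iterable, List

# Actions per affected component type, paraphrased from the slide.
UPDATE_STEPS: Dict[str, List[str]] = {
    "CN": [
        "install on staging",
        "process metadata updates",
        "lock production",
        "switch staging and production",
    ],
    "MN": [
        "notify instance administrators",
        "install updates and verify operation",
        "registry picks up updated services",
    ],
    "ITK": [
        "post update information",
        "notify active clients",
        "notify third party developers",
    ],
}


def update_plan(affected: Iterable[str], signed_off: bool) -> List[str]:
    """Build an ordered action list for a package that has been signed off."""
    if not signed_off:
        raise ValueError("package must be signed off by component developers")
    wanted = set(affected)
    unknown = wanted - set(UPDATE_STEPS)
    if unknown:
        raise ValueError(f"unknown component types: {sorted(unknown)}")
    plan: List[str] = []
    for kind in ("CN", "MN", "ITK"):  # assumed handling order
        if kind in wanted:
            plan.extend(f"{kind}: {step}" for step in UPDATE_STEPS[kind])
    return plan


for action in update_plan(["ITK", "CN"], signed_off=True):
    print(action)
```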