XSEDE Operations. Patricia Kovatch, Victor Hazlewood, Justin Whitt, Randy Butler, Chris Jordan, Stephen McNally, Steve Quinn, Troy Baer, Linda Winkler. XSEDE Operations: improve user productivity through enhanced ease of use, reliability, and quality assurance.
XSEDE Operations
Patricia Kovatch, Victor Hazlewood, Justin Whitt
Randy Butler, Chris Jordan, Stephen McNally, Steve Quinn, Troy Baer, Linda Winkler
XSEDE Operations
• Improve user productivity through enhanced ease of use, reliability, and quality assurance
• Track metrics to gauge our success and continually improve
Operations (1.5 FTE)
• Patricia Kovatch (.5), Victor Hazlewood (.5), Justin Whitt (.5), NICS

Software Support – 3.25 FTEs
• Troy Baer, NICS (.5)
• Stuart Martin, GRAM, UChicago (.25)
• Raj Kettimuthu, GridFTP, UChicago (.25)
• Tom Howe, Registry, UChicago (.5)
• PSC (1.25)
• TACC (.5)

Systems Operational Support – 12 FTEs
• Stephen McNally, NICS (.5)
• Mike Lowe, IU (1)
• Justin Miller, IU (1)
• Nada Cagle, NCSA (1)
• Mark Fredericksen, NCSA (1)
• Mike Pingleton, NCSA (1)
• Frank Wells, NCSA (1)
• Rolf Wilson, NCSA (1)
• Tom Johnson, IU (.5)
• Dave Lifka, Cornell (.25)
• Tim Bouvet, NCSA (.25)
• Wayne Louis Hoyenga, NCSA (.25)
• Rick Mohr, NICS (.5)
• Dave Carver, TACC (.75)
• Leo Carson, SDSC (.5)
• Shava Smallen, SDSC (.5)
• Tom Howe, IaaS/SaaS, UChicago (.5)
• Byron Gill, PSC (.1)
• Anjana Kar, PSC (.2)
• Kevin Sullivan, PSC (.1)
• Jared Yanovich, PSC (.1)

Security – 4.25 FTEs
• Randy Butler, NCSA (.25)
• Jim Marsteller, PSC (.5)
• Adam Fest, PSC (.5)
• Nathaniel Mendoza, TACC (.75)
• Victor Hazlewood, NICS (.5)
• Ryan Braby, NICS (.5)
• James Barlow, NCSA (1)
• Jim Basney, NCSA (.25)

Networking – 3.25 FTEs
• Linda Winkler, UChicago (.25)
• Paul Wefel, NCSA (.25)
• Matt Ezell, NICS (1)
• Kathy Benninger, PSC (.5)
• Chris Rapier (.25)
• Joe Lappa, PSC (.5)
• William Jones, TACC (.5)

Data Services – 2.25 FTEs
• Chris Jordan, TACC (.25)
• Jack Kordas, UChicago (.5)
• Chad Kerner, NCSA (.25)
• Rick Mohr, NICS (.5)
• Josephine Palencia, PSC (.5)
• Tomislav Urban, TACC (.25)

Accounting and Account Management – 1.5 FTEs
• Steve Quinn, NCSA (.5)
• Ester Soriano, NCSA (.75)
• Ed Hanna, PSC (.25)
Deliverables and Goals
• Security: Deploy the XSEDE Certificate Authority, deploy a two-factor authentication service, federate two-factor authentication with BW, perform campus bridging with InCommon, provide security auditing services for XSEDE-connected hosts, and coordinate the response to resource intrusion events
• Data Services: Deploy an XSEDE-wide parallel file system, coordinate data movement and management services, and develop a framework for distributed archival replication
• Networking: Facilitate end-to-end performance for users, transition to XSEDEnet, and peer with R&E network
Deliverables and Goals
• Software Support: Deploy and perform acceptance testing of new capabilities and services into the production XSEDE environment, and provide feedback to developers
• Accounting and Account Management: Maintain the current TG automatic distributed accounting and account management service, streamline the account creation process, and improve user access to stats
• Systems Operational Support: Provide frontline user support, systems administration for all centralized XSEDE services, and monitoring through the 24x7 XSEDE Operations Center
Operational Metrics
Cybersecurity
• Security events, logins and login types, security items deployed, security awareness training events
Data management and coordination
• Wide area parallel file system usage and uptime
Networking
• Network uptime and usage
Software maintenance and coordination
• Software deployment issues and resolution
Operational Metrics – cont’d
Accounting and account management
• Account creation time for PI and non-PI (Goal: Decrease account creation time to within five business days)
System operational support
• Deliver 95% uptime on critical centralized services
• Respond meaningfully to all tickets within 24 hours
• Close 80% of all tickets within two business days
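The system operational support targets above can be checked with a few lines of arithmetic. The following is a minimal sketch, not the XSEDE Operations Center's actual tooling: the ticket field names (`opened`, `first_response`, `closed`) and the function names are hypothetical, holidays are ignored when counting business days, and whether a response is "meaningful" is a human judgment that the code does not attempt to assess.

```python
from datetime import datetime, timedelta


def business_days_elapsed(opened: datetime, closed: datetime) -> int:
    """Count weekdays (Mon-Fri) after the opening date, up to and
    including the closing date. Holidays are ignored (assumption)."""
    days = 0
    current = opened.date()
    while current < closed.date():
        current += timedelta(days=1)
        if current.weekday() < 5:  # 0-4 are Monday through Friday
            days += 1
    return days


def uptime_percent(total_hours: float, outage_hours: float) -> float:
    """Percentage of the reporting period a service was available."""
    if total_hours <= 0:
        raise ValueError("total_hours must be positive")
    return 100.0 * (total_hours - outage_hours) / total_hours


def ticket_metrics(tickets: list) -> dict:
    """Share of tickets answered within 24 hours and share closed within
    two business days. Each ticket is a dict with hypothetical keys
    'opened', 'first_response', and 'closed' (all datetimes)."""
    if not tickets:
        raise ValueError("no tickets to evaluate")
    n = len(tickets)
    responded = sum(
        1 for t in tickets
        if t["first_response"] - t["opened"] <= timedelta(hours=24)
    )
    closed = sum(
        1 for t in tickets
        if business_days_elapsed(t["opened"], t["closed"]) <= 2
    )
    return {
        "responded_within_24h_pct": 100.0 * responded / n,
        "closed_within_2_business_days_pct": 100.0 * closed / n,
    }
```

Under these assumptions, a reporting period meets the stated targets when `uptime_percent(...)` is at least 95, `responded_within_24h_pct` is 100, and `closed_within_2_business_days_pct` is at least 80.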
Review of activities to July 1
1.1.3.1 Deploy grid middleware infrastructure
1.1.3.3 Deploy account management software
1.1.3.4 Deploy information services infrastructure
1.1.3.5 Deploy common user environment
1.1.3.6 Deploy system of systems test environment
1.1.4.2 Deploy XSEDE website servers
1.2.1.1 Coordinate XSEDE security incident response
1.2.4.1 Test XSEDE software
1.2.6.1 Set up XSEDE Operations Center
1.2.3.1 Transition to XSEDEnet
1.3.2.1 Set up and populate XSEDE.ORG DNS
Review of activities to July 1 (continued)
1.2.6.5 Migrate AMIE to stand-alone server off of XDCDB at both primary and secondary
1.2.6.5 Upgrade XDCDB hardware at SDSC
1.3.2.1 Deploy XSEDE User Portal (XUP) servers
Preview of year 1 activities
1.1.3.2 Deploy data management software
1.2.1.1 Deploy XSEDE Certificate Authority (CA)
1.2.1.2 Develop security awareness program
1.2.1.3 Deploy security authentication program
1.2.1.4 Deploy security tools
1.2.1.5 Deploy security infrastructure
1.2.1.6 Deploy InCommon authentication service
1.2.2.1 Deploy global parallel file system
1.2.2.2 Design archival replication framework
Ongoing
1.2.3.1 Maintain and monitor XSEDEnet
1.2.3.2 Tune end-to-end performance
1.2.4.1/2 Test and deploy XSEDE software
1.2.5.1 Maintain accounting and account management databases
1.2.5.2 Provide usage reports
1.2.6.1 Provide frontline user support 24x7 XSEDE Operations Center (XOC)
1.2.6.2 Deploy and support XSEDE system infrastructure
1.2.6.3 Support deploy security tools/infrastructure
1.2.6.4 Report operational metrics (yearly)
DNS transition plan
• Ops Networking is leading the DNS transition
• xsede.org primary service is moving to NCSA, with backup at TACC
• Delegation of {site}.xsede.org to sites
• XSEDE staff should review DNS needs
• Determine teragrid.org entries to duplicate
• Determine new xsede.org entries
• Review and coordinate with XSEDE L3 manager
• XSEDE L3 manager or delegate submits DNS requests in a TG help ticket
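As an illustration of the "determine teragrid.org entries to duplicate" step, the sketch below maps an existing teragrid.org name to a proposed xsede.org name and notes whether it would fall under a delegated {site}.xsede.org zone or the central zone. It is only a planning aid under stated assumptions: the site labels are taken from the staffing list and may not match the real delegations, the function name `plan_entry` is hypothetical, the hostnames in the usage note are made up, and the code does not query or modify any DNS server.

```python
# Assumed site labels; the actual set of delegated {site}.xsede.org
# zones is not given in the slides.
SITES = {"ncsa", "tacc", "psc", "nics", "sdsc", "iu"}


def plan_entry(old_name: str,
               old_zone: str = "teragrid.org",
               new_zone: str = "xsede.org") -> tuple:
    """Return (proposed_new_name, responsible_party) for one DNS name.

    responsible_party is the site label when the name sits under a
    site subdomain, otherwise the string "central".
    """
    name = old_name.rstrip(".").lower()
    suffix = "." + old_zone
    if not name.endswith(suffix):
        raise ValueError(f"{old_name!r} is not inside {old_zone}")
    host = name[: -len(suffix)]      # e.g. "login.ncsa" or "www"
    last_label = host.split(".")[-1]  # label directly under the zone
    owner = last_label if last_label in SITES else "central"
    return f"{host}.{new_zone}", owner
```

For example, with the hypothetical name `login.ncsa.teragrid.org`, `plan_entry` proposes `login.ncsa.xsede.org` and marks it as belonging to the `ncsa` delegation, while `www.teragrid.org` maps to `www.xsede.org` in the central zone. The resulting list would still need review with the XSEDE L3 manager before any request is submitted.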