CITI Status Update for ECC Reporting Project August 21, 2009
June 4th: EMC Commitment to CITI • June 4th Meeting with Howard Elias, Tony DiSanto, Paul Stemmler and Gennevieve Schimpfle • ECC Use Cases: scheduled • Resource to Assist Reconciling Reports: in process • Agentless Technology Review: reviewed in context of SRM7 • On Site Senior Technical Resource: on site • Future-Proofing: underway for 6.1 planning • ER Tracking: underway for quarterly review • Other Commitments • Customer Service Improvements: PREM commitment solidified • CLARiiON RAID6 & CLARiiON Meta Lun Data Inaccuracies: HF provided July 30 • Host Agent Discovery Issues: under review for Root Cause and permanent fix
Baseline & Current Reporting Environment • Baseline Status as of July 31, 2009 • 7007 Hosts that should report on storage utilization • 5857 Hosts reporting • 1150 Hosts that were not reporting • Breakdown of Issues identified • Crosses RAID, Tier or array/internal boundaries; FS or DB on unknown storage • Last Discovery > 8 days ("LDT_NC" = Last Discovery Time / Not Current) • No Host Agent (or No ESX Agent or No VMware Agent) • Inactive Agent - Master/Host Agents that are down. • HDS/SP4 or Solaris/SP4 - Host Agents need to be upgraded. • Unsupported file-system type • Current Status as of August 21, 2009 • 7007 Hosts that should report on storage utilization • 6561 Hosts Reporting • 446 Hosts that were not reporting
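The "LDT_NC" rule above (last discovery more than 8 days old) can be illustrated with a minimal Python sketch. Only the 8-day threshold comes from the slide. The function names, host names, and timestamps are hypothetical, and this is not how ControlCenter itself evaluates discovery times.

```python
from datetime import datetime, timedelta

# Threshold taken from the slide: a host is "LDT_NC" when its last
# discovery is more than 8 days old.
LDT_THRESHOLD = timedelta(days=8)


def is_ldt_nc(last_discovery, as_of):
    """Return True when the last discovery time is not current."""
    return (as_of - last_discovery) > LDT_THRESHOLD


def split_by_ldt(hosts, as_of):
    """Split a {host_name: last_discovery} mapping into two sorted name lists:
    (current, not_current)."""
    current, not_current = [], []
    for name, last_discovery in sorted(hosts.items()):
        if is_ldt_nc(last_discovery, as_of):
            not_current.append(name)
        else:
            current.append(name)
    return current, not_current


# Hypothetical sample data; host names and timestamps are made up.
as_of = datetime(2009, 8, 21)
sample_hosts = {
    "host-a": datetime(2009, 8, 20),  # 1 day old: current
    "host-b": datetime(2009, 8, 13),  # exactly 8 days old: still current
    "host-c": datetime(2009, 7, 31),  # 21 days old: LDT_NC
}
current, not_current = split_by_ldt(sample_hosts, as_of)
print("current:", current)
print("LDT_NC:", not_current)
```

The comparison is strict, so a host discovered exactly 8 days ago still counts as current. That reading of "> 8 days" is an assumption.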
ECC Reporting Approach • #1: Short Term Goal: FMO for ICG Report for Tony DiSanto • Execute Plan – Modify Reports, Remediate Agents 7/30 through 8/5 • Report 8/6 • Tweaking Reports - 8/11 • #2: Establish a Repeatable Process to maintain LDT current • Continue remediation and document process for maintaining LDT current • Already in Process • Deliver By 8/31 • #3: RCA and Recommendations for Long-Term Stability • Document Findings and Recommendations • Already in Process • Deliver By 9/30 • Note: “Deliver by” provides a final delivery date; could be delivered earlier.
Short Term Goal: ICG Business Unit FMO • Baseline as of July 31, 2009 • 2433 Hosts that should report on storage utilization • 1715 Hosts reporting • 718 Hosts that were not reporting • Current Status as of August 21, 2009 • 2433 Hosts that should report on storage utilization • 2283 Hosts reporting • 150 Hosts that were not reporting • Additional Information: • 150 ICG Hosts could not be corrected without direct access to the physical hosts. This would require engagement of Citi SAs. • 73% of ICG Allocated Storage is now being reported correctly for Storage Utilization • Total capacity being reported on is 1.1 PB. This is an increase of 80 TB of storage since 8/6/2009
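The host counts on the baseline and ICG slides can be turned into reporting-coverage percentages with a few lines of arithmetic. The sketch below is illustrative only: the counts are copied from the slides, while the function name and labels are assumptions.

```python
def reporting_coverage(reporting, expected):
    """Percentage of expected hosts that are actually reporting."""
    if expected <= 0:
        raise ValueError("expected must be positive")
    return 100.0 * reporting / expected


# (reporting, expected) host counts copied from the slides.
snapshots = {
    "All hosts, baseline (July 31)": (5857, 7007),
    "All hosts, current (August 21)": (6561, 7007),
    "ICG, baseline (July 31)": (1715, 2433),
    "ICG, current (August 21)": (2283, 2433),
}

for label, (reporting, expected) in snapshots.items():
    pct = reporting_coverage(reporting, expected)
    missing = expected - reporting
    print(f"{label}: {reporting}/{expected} reporting ({pct:.1f}%), {missing} not reporting")
```

On these counts, ICG host coverage moves from roughly 70% to roughly 94% of hosts between the two dates. This is a host-count measure and is separate from the 73% allocated-storage figure quoted on the slide.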
Establish a Repeatable Process: Maintain LDT current • Process was documented and execution began on August 10th • Citi-wide remediation project for hosts • 2,257 Hosts identified with Agent Discovery Problems site wide • 1,150 Hosts have been processed using the batch deletion script • Results: • 704 (61%) Hosts are reporting current discoveries after action by EMC • 446 (39%) Hosts will require individual remediation by CITI SAs with EMC • 1,107 have not been attempted, due to CITI’s own restrictions • These include hosts that contain database agents and hosts that are determined to be critical • Purpose is to extend the short term remedial action • EMC will have up to 2 resources at CITI to continue agent remediation through the remainder of the Hosts as directed by CITI • This team will also work as much as possible with the CITI SAs to get the Agents reinstalled on the hosts which cannot be restarted
Establish a Repeatable Process: Maintain LDT current, cont. • Repeatable Process Includes: • Achieve host storage utilization reporting by restarting the host agents • Using the Master Agent Restart Script provided to remediate the host • Master Agent Restart Script Handles • MO deletion from CC Repository • MO with multiple domains • Rediscovery of Databases on remediated hosts • AAD DB batch load script (Provided by EMC Engineering) • Provides the ability to feed the database discovery information from a file • Allows for bulk processing to discover DBs • Identification of physical servers which need Citi SA intervention
Long-Term Stability: RCA & Recommendations • EMC to Document Findings and Recommendations • This will cover: • Use Case Analysis • Gap Analysis • RCA (Root Cause Analysis) • Findings will explain why CITI is experiencing problems • This will be based on the Use Case Analysis, Gap Analysis, and first-hand experience while working issues and interviewing the Citi Dev and Ops staff. • Recommendations will cover • Bugs in ControlCenter • Enhancements to ControlCenter • Changes to CITI operations procedures • Changes to CITI Development
Moving Forward • Continue to work the host agent issues with the Agent Remediation process • Address the Host manifest issue with the Citi SA group, which has allowed a pilot for pushing Agent patches and upgrades from the ControlCenter Console • This will streamline the agent remediation process and all ongoing ControlCenter Agent activity • Identified causes of the Citi Agent problems (some are still under investigation) • Windows Host Agents • Most of the installs of ControlCenter Agents on Windows systems are corrupt • Cause has been identified as the way Citi SAs handle fixing server agent problems • EMC has addressed most of this problem with the latest ControlCenter 6.1 UB Native Install Packages • Agents are identified with different Server FQDNs • Still under investigation • Assumption is that different agents are finding different FQDNs for the host they are installed on, because Citi has three different sources for resolving the Host FQDN • These sources are not all in sync, so they do not all resolve to the same Host FQDN
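The FQDN assumption above (several name sources that are not in sync) suggests a simple consistency check: collect the FQDN each source returns for a host and flag hosts where the answers differ. The Python sketch below is only an illustration of that idea. The slide does not name the three sources, so the source labels, host names, and domains are all hypothetical.

```python
def find_fqdn_mismatches(resolved):
    """Given {host: {source_name: fqdn}}, return {host: sorted distinct FQDNs}
    for every host whose sources disagree."""
    mismatches = {}
    for host, by_source in resolved.items():
        # DNS names are case-insensitive and may carry a trailing dot,
        # so normalise before comparing. Empty answers are ignored.
        distinct = {
            fqdn.strip().rstrip(".").lower()
            for fqdn in by_source.values()
            if fqdn
        }
        if len(distinct) > 1:
            mismatches[host] = sorted(distinct)
    return mismatches


# Hypothetical lookup results; "source_1".."source_3" stand in for the
# three unnamed resolution sources mentioned on the slide.
resolved = {
    "host-a": {
        "source_1": "HOST-A.example.com.",
        "source_2": "host-a.example.com",
        "source_3": "host-a.example.com",
    },
    "host-b": {
        "source_1": "host-b.example.com",
        "source_2": "host-b.corp.example.com",
        "source_3": "host-b.example.com",
    },
}

mismatches = find_fqdn_mismatches(resolved)
for host, names in mismatches.items():
    print(f"{host}: sources disagree -> {names}")
```

In this sample only host-b is flagged, because host-a differs only in case and a trailing dot.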
Getting ControlCenter 6.1 • Two-day planning meeting was held August 19 & 20 which addressed: • In-depth planning on Citi Agent Packaging • Discussion and requirements for Solutions Enabler • Citi Certification Process • Readiness requirements for first ControlCenter 6.1 Upgrade • Citi Agent Wrapper, how and what is needed • Citi requirement to get all pre-ControlCenter 6.0 agents upgraded to 6.0 agents • Meeting planned for August 27 for John Maloney with EMC Engineering • Ability to meet the Vmax deployment into Citi • Discussed need to ControlCenter during Citi development period • Established first ControlCenter 6.1 deployment date (10/30), which is in line with the Vmax schedule into Citi production