
Intelligent Archiving Strategies: Toward ILM

Intelligent Archiving Strategies: Toward ILM. Arun Taneja, Founder and Consulting Analyst, Taneja Group; Alex Gorbansky, Senior Analyst, Taneja Group. Agenda: A Bit of Historical Perspective; Why Archive?; What to Archive?; The ILM Panacea; Developing an Operational Archival Strategy.


Presentation Transcript


  1. Intelligent Archiving Strategies: Toward ILM • Arun Taneja, Founder and Consulting Analyst, Taneja Group • Alex Gorbansky, Senior Analyst, Taneja Group

  2. Agenda • A Bit of Historical Perspective • Why Archive? • What to Archive? • The ILM Panacea • Developing an Operational Archival Strategy • Key Considerations • Representative Vendors and Solutions • Conclusions

  3. Archival ≠ Backup • BACKUP: Copying production data to an alternative medium for restorability in the event of data loss, corruption, or unavailability. • ARCHIVAL: Retention of historical data for future access for business reasons such as audits, customer issues, or litigation.

  4. Some History On Archiving • 3000 BCE: Ancient Egypt: Library of Alexandria, Engravings • Middle Ages: Shift from Feudalism to Nation State: Records, Property rights • 1600s: American colonists: Births, Marriages, Businesses • 1789: French Revolution: Property records • 1884: American Historical Association: Archival standards, Marriages, Businesses

  5. Archival Business Drivers Today • REGULATORY COMPLIANCE REQUIREMENTS • EXPLOSIVE DATA GROWTH • APPLICATION PERFORMANCE DEGRADATION • RISING COSTS

  6. What to Archive? • Structured Data: • ERP/CRM DB tiers • Business transactions • Unstructured Data: • Documents • X-Rays • Check Images • Voice recording • Semi-structured Data: • Email • Instant Messaging

  7. ILM…ShmILM • “ILM” is an abstract framework for describing the processes and technology used to manage information throughout its life according to its business value. • “ILM” is NOT the panacea for your storage management challenges.

  8. Archival is a key component of what vendors are calling “ILM” • Applications: ERP, CRM, Email, Call Recording, Image Access • Application Data: Structured, Unstructured, Semi-Structured • Policies and Rules: Business Context, Referential Integrity, Regulatory Compliance • Data Movement Technologies: Snapshots, Replication, Backup, Archival, HSM • Storage Infrastructure Tiers: Primary, Secondary, Tertiary

  9. Developing an Archival Strategy • 1. PLAN: When/How, Data Classification, Requirements • 2. DESIGN • 3. IMPLEMENT • 4. REPORT & TEST

  10. Why Plan and When to Start • Upfront Planning will Result in Significant Benefits in Future Phases. • Develop an Archival Strategy as part of your application design and development process. • Engage Key Stakeholders: • Application Owners • Business Decision Makers: Compliance Officers, Legal • Identify Key Archival Business Drivers: • Regulatory Compliance • Other: Data Growth, Increasing Costs, Poor Performance

  11. The Data Classification Puzzle • Assess the application data in your shop according to the following categories: • Structured: database • Unstructured: files, videos, images • Semi-structured: email • Identify specific data sets impacted by regulatory compliance: • Examples: Email, Medical Records, Call Recordings

  12. Requirements Definition • Engage Application Owners • Compliance is not the ONLY archival driver • Separate requirements processes for applications impacted by compliance. • Compliance-specific: Retention period, Media characteristics, Data restorability rates, Access control policies, Data availability/DR • General archival: Data Access Patterns, Restore time requirements, Application performance, Cost structure, Access control policies, Data availability/DR

  13. Taming the Compliance Monster • Understand the Regulations: Significant Variance by Industry • Assess/Communicate Requirements to Key Business Stakeholders • Judge Products for Yourself – Just because a vendor says a solution is “Compliant” doesn’t make it so. • Stay abreast of changes in regulatory mandates.

  14. Defining Key Archival Metrics • Archive Distribution Percentages Across: • Online: Disk, Object-based storage • Near-line: Optical, Tape (local) • Off-line: Off-site vaults • Number of data copies • Local • Remote
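
The distribution metric on the slide above can be computed from a simple inventory. The following is a minimal sketch, assuming a hypothetical inventory of (tier, size) pairs; the tier names follow the slide, while the function name `tier_distribution` and the sample sizes are invented for illustration.

```python
from collections import defaultdict

def tier_distribution(datasets):
    """Return the percentage of archived bytes held on each storage tier.

    `datasets` is an iterable of (tier_name, size_in_bytes) pairs.
    This shape is a hypothetical choice for illustration only.
    """
    totals = defaultdict(int)
    for tier, size_bytes in datasets:
        totals[tier] += size_bytes
    grand_total = sum(totals.values())
    if grand_total == 0:
        return {}  # nothing archived yet, so no percentages to report
    return {tier: 100.0 * size / grand_total for tier, size in totals.items()}

# Illustrative inventory (sizes are made-up units, not measurements).
inventory = [
    ("online", 600),
    ("near-line", 300),
    ("off-line", 100),
]
distribution = tier_distribution(inventory)
for tier, percent in distribution.items():
    print(f"{tier}: {percent:.1f}%")
```

Tracking these percentages over time, alongside the number of local and remote copies, gives a baseline against which alternative solutions can be compared.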

  15. Designing an Archival Solution • Requires an application specific assessment – look for commonality in application requirements • Wholly enterprise-wide strategies will be difficult to build and sustain • Evaluate alternative solutions based on application requirements and metrics

  16. Don’t Ignore the Organizational Dynamics • Archival Touches Multiple Organizations: • IT – Applications • IT – Infrastructure • Legal • Users • Consequences of mistakes are enormous: • Fines • Litigation • Consider organizing a cross-functional team led by an archival champion with a combination of technical and business expertise

  17. Comprehensive Application Assessment • Data Classification Exercise • Data Set Size, plus Historical and Predicted Data Growth Rates based on business drivers • Is Regulatory Compliance an Issue? • Data Valuation over Time: • Access patterns of data 90 days old and older • Cost of data loss • Going it alone can be difficult • Available resources: • Services organizations: GlassHouse, Accenture, EDS, Storage Vendor • Application Management Tools: File-Level SRM, Precise • Budgetary Requirements
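
One way to start the data valuation step is to separate recently used data from data that has not been touched for 90 days or more. This is a rough sketch, assuming each data set is described by a hypothetical record with `name`, `size_gb`, and `last_access` fields; real assessments would pull these values from an SRM or application management tool.

```python
from datetime import datetime, timedelta

def split_by_last_access(records, now, threshold_days=90):
    """Split records into (active, candidates) by age of last access.

    A record is an archive candidate when its last access is
    `threshold_days` old or older. The 90-day default comes from the
    slide; the record layout is a hypothetical choice.
    """
    cutoff = now - timedelta(days=threshold_days)
    active, candidates = [], []
    for record in records:
        if record["last_access"] <= cutoff:
            candidates.append(record)
        else:
            active.append(record)
    return active, candidates

# Fixed "now" so the example is reproducible; all values are illustrative.
now = datetime(2005, 1, 1)
records = [
    {"name": "q4_orders", "size_gb": 40, "last_access": datetime(2004, 12, 20)},
    {"name": "q1_orders", "size_gb": 120, "last_access": datetime(2004, 3, 2)},
    {"name": "q3_orders", "size_gb": 80, "last_access": datetime(2004, 10, 3)},
]
active, candidates = split_by_last_access(records, now)
archivable_gb = sum(r["size_gb"] for r in candidates)
print([r["name"] for r in candidates], archivable_gb)
```

Last-access age is only one input. The cost of data loss and any compliance requirements still need to be weighed before a data set is moved.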

  18. Components of the Archival Stack • Application Data • Application Specific Module (Management & Control): • Discovery and analysis of data assets • Business rules and policies definitions • Identification and movement of specific data to appropriate storage medium • Management, indexing of data and metadata • Access control mechanism • Data Flow • Storage Infrastructure (Physical Repository): • Physical archive repository • Data Preservation and Protection • Indexing Technologies for Retrieval

  19. Structured Data Archival Challenges to Investigate • ERP deployments are still very nascent • Preventing application downtime during archival • Preserving referential data integrity: • Archival of core data and associated data in other tables • Enforcing single read-only state across related data • Delivering transparent access to archived/combined data via native app UI • Maintaining performance of remote queries and union views. • Update process: • Restate vs. entire reload
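
The referential integrity and union view points above can be illustrated with a small sketch. It assumes a hypothetical two-table schema (`orders` and `order_items`) in an in-memory SQLite database; real ERP schemas are far more complex, and the table, view, and function names here are invented for illustration.

```python
import sqlite3

def archive_order(conn, order_id):
    """Move one order and its line items to archive tables atomically.

    Using the connection as a context manager commits on success and
    rolls back on any exception, so parent and child rows move together.
    """
    with conn:
        conn.execute(
            "INSERT INTO orders_archive SELECT * FROM orders WHERE id = ?",
            (order_id,),
        )
        conn.execute(
            "INSERT INTO order_items_archive "
            "SELECT * FROM order_items WHERE order_id = ?",
            (order_id,),
        )
        # Delete child rows first, then the parent row.
        conn.execute("DELETE FROM order_items WHERE order_id = ?", (order_id,))
        conn.execute("DELETE FROM orders WHERE id = ?", (order_id,))

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE orders (id INTEGER PRIMARY KEY, customer TEXT);
CREATE TABLE order_items (
    id INTEGER PRIMARY KEY,
    order_id INTEGER REFERENCES orders(id),
    sku TEXT
);
CREATE TABLE orders_archive (id INTEGER PRIMARY KEY, customer TEXT);
CREATE TABLE order_items_archive (
    id INTEGER PRIMARY KEY, order_id INTEGER, sku TEXT
);
-- Union view: gives the application one place to read live and archived rows.
CREATE VIEW all_orders AS
    SELECT * FROM orders UNION ALL SELECT * FROM orders_archive;
INSERT INTO orders VALUES (1, 'acme'), (2, 'globex');
INSERT INTO order_items VALUES (10, 1, 'A'), (11, 1, 'B'), (12, 2, 'C');
""")

archive_order(conn, 1)
live_orders = conn.execute("SELECT COUNT(*) FROM orders").fetchone()[0]
archived_items = conn.execute(
    "SELECT COUNT(*) FROM order_items_archive"
).fetchone()[0]
visible_orders = conn.execute("SELECT COUNT(*) FROM all_orders").fetchone()[0]
print(live_orders, archived_items, visible_orders)
```

In this sketch both tables live in one database, so the union view is cheap. When the archive sits on a remote system, the performance of such views is exactly the concern the slide raises.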

  20. Unstructured Data Considerations • Scalability • Sustained performance with data growth • Hierarchical file systems are limited at large scales • Content Access and Visibility • Using metadata to intelligently manage and maintain the archive addresses traditional file system limitations • Scalability of Index (Content addresses)

  21. Email Archival Challenges • Stringent regulations: SEC Rule 17a-4 • Non-rewriteable, non-reusable media • Verification of writes • Serialize units of media • Solution Requirements • Server-based capture • Support for multiple distributed Email Servers
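
The "verification of writes" requirement can be sketched as a write followed by a read-back checksum comparison. This is only an illustration of the idea on an ordinary file system, not a compliant implementation: it does not provide non-rewriteable media, and the function name `write_and_verify` and the sample message are assumptions.

```python
import hashlib
import os
import tempfile

def write_and_verify(path, payload):
    """Write payload to a new file, read it back, and compare SHA-256 digests.

    Mode "xb" refuses to overwrite an existing file, which loosely mimics
    write-once behavior at the application level only.
    """
    expected = hashlib.sha256(payload).hexdigest()
    with open(path, "xb") as handle:
        handle.write(payload)
        handle.flush()
        os.fsync(handle.fileno())  # ask the OS to push the data to storage
    with open(path, "rb") as handle:
        actual = hashlib.sha256(handle.read()).hexdigest()
    if actual != expected:
        raise IOError(f"verification failed for {path}")
    return expected

message = b"Subject: hello\r\n\r\nbody"
with tempfile.TemporaryDirectory() as directory:
    target = os.path.join(directory, "msg-0001.eml")
    digest = write_and_verify(target, message)
    try:
        write_and_verify(target, b"tampered")
        overwrite_blocked = False
    except FileExistsError:
        overwrite_blocked = True
print(digest, overwrite_blocked)
```

Storing the returned digest alongside the message index makes later integrity checks possible without trusting the storage layer.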

  22. Meta Data Holds Real Value • Meta Data is data about data: • Object Age and creation date • Object Change History • Associated application/users • Access control • Priority/Criticality • Data Access/Frequency • Traditional File Systems: • Digital asset tied to specific infrastructure • No value outside of infrastructure context • Object-based systems: • Self-describing attributes for digital asset • Enables powerful policy-based data movement applications
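
To make policy-based data movement concrete, here is a small sketch in which metadata attributes like those on the slide drive a tier decision. The `ObjectMeta` fields, the `choose_tier` function, and every threshold in it are hypothetical; an actual policy would come out of the requirements process described earlier.

```python
from dataclasses import dataclass

@dataclass
class ObjectMeta:
    """Self-describing attributes carried with an archived object."""
    name: str
    age_days: int
    accesses_last_90_days: int
    critical: bool

def choose_tier(meta):
    """Pick a storage tier from metadata alone (thresholds are made up)."""
    if meta.critical or meta.accesses_last_90_days > 10:
        return "online"
    if meta.age_days < 365:
        return "near-line"
    return "off-line"

objects = [
    ObjectMeta("contract.pdf", age_days=900, accesses_last_90_days=0, critical=True),
    ObjectMeta("xray-0042.img", age_days=120, accesses_last_90_days=1, critical=False),
    ObjectMeta("call-1999.wav", age_days=2000, accesses_last_90_days=0, critical=False),
]
placement = {obj.name: choose_tier(obj) for obj in objects}
print(placement)
```

Because the decision depends only on attributes that travel with the object, the same rule can be applied regardless of which infrastructure currently holds the data.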

  23. Choosing the Right Storage Medium [Chart. Axis labels: Amount of Data, Probability of Reuse, Life Expectancy, Recovery Time. Time scale: 1 Week, 1 Month, 3 Months, 1 Year, 18 – 30 Years. Recovery times: < Seconds, Minutes, Hours to Days. Storage options shown: Disk Systems, D2D Systems, Object Storage, Libraries, Drives.]

  24. Key Considerations for Storage Media • Cost • Access time • Application access method: • NFS/CIFS • Application-specific API • Reliability/Availability • Data Preservation Capability • Scalability • Archival solution integration

  25. Storage Media Considerations

  26. Shifting towards an On-line Model • Tape • Primary • Object Storage • SATA

  27. Representative Vendors • Start with your application vendor

  28. Trust But Verify • Develop processes to periodically access historical data to test: • Data integrity • Access time • Manage capacity growth using vendor-supplied reporting tools
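
A periodic test of data integrity and access time might look like the following sketch. It assumes a hypothetical manifest that maps each archived key to the SHA-256 digest recorded at archive time, plus a `fetch` callable that retrieves the bytes; an in-memory dictionary stands in for the archive here.

```python
import hashlib
import time

def audit_archive(manifest, fetch):
    """Re-read each archived item, check its digest, and time the retrieval.

    `manifest` maps key -> expected SHA-256 hex digest.
    `fetch` is any callable that returns the stored bytes for a key.
    """
    report = []
    for key, expected in manifest.items():
        start = time.perf_counter()
        data = fetch(key)
        elapsed = time.perf_counter() - start
        report.append({
            "key": key,
            "intact": hashlib.sha256(data).hexdigest() == expected,
            "seconds": elapsed,
        })
    return report

# Stand-in archive: a dictionary instead of real storage.
store = {"a": b"alpha", "b": b"beta"}
manifest = {key: hashlib.sha256(value).hexdigest() for key, value in store.items()}
store["b"] = b"bet4"  # simulate silent corruption after archiving

report = audit_archive(manifest, store.__getitem__)
damaged = [entry["key"] for entry in report if not entry["intact"]]
print(damaged)
```

Running a check like this on a sample of historical data at regular intervals also shows whether retrieval times still meet the restore time requirements defined during planning.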

  29. Summary • Archival is not backup and is not just about compliance • Successful strategy requires application-centric approach • Engage with key corporate stakeholders to define requirements and select solutions • Look for automated and interoperable software and hardware modules. • Be Paranoid!

  30. Thank you! • Arun Taneja arunt@tanejagroup.com • Alex Gorbansky alex@tanejagroup.com
