
Data De-Duplication VMUG Dallas

Data De-Duplication VMUG Dallas. March 26, 2008. Kyle Green, Director, South Central U.S. 972-768-4896, kgreen@datadomain.com. Where are we? LZ Compression ~2x (white space reduction). Storage Array 1:1. Single Instance Storage ~5x (file level). Fixed Block ~8x.


Presentation Transcript


  1. Data De-Duplication: VMUG Dallas. March 26, 2008. Kyle Green, Director, South Central U.S. 972-768-4896, kgreen@datadomain.com. Confidential

  2. Where are we?

  3. Where are we?

  4. Hierarchy of Data Reduction Types
     • LZ Compression: ~2x (white space reduction)
     • Storage Array: 1:1
     • Single Instance Storage: ~5x (file level)
     • Fixed Block: ~8x (fixed blocks, snapshots)
     • Data Deduplication: ~20x
     Data Deduplication Significantly Reduces:
     • Power
     • Heat
     • Cooling
     • Management

  5. Data Domain: Leadership and Innovation
     • Deduplication Storage Systems
     • > 3,700 systems installed
     • > 1,500 customers
     • > 325 petabytes under Data Domain protection worldwide
     • A History of Industry Firsts (timeline, 2003 to 2007): First Dedupe NAS, First Dedupe Gateway, Largest Dedupe Array, First Dedupe Volume Replication, First Dedupe Directory Replication, First Dedupe VTL, First Dedupe Nearline Storage

  6. Storage 3.0 - The Long Term Play
     • Storage 1.0: Primary, Tape
     • Storage 2.0: Primary, SATA & RAID, Tape
     • Storage 3.0: Primary, Deduplicated Storage, Tape

  7. Key Attributes of Data Domain Technology
     • Easily Integrates with Existing Infrastructure
     • Retention: Deduplication
     • Recovery: Data Invulnerability Architecture
     • Replication: WAN Efficient
     Data Domain Deduplication Storage for Nearline Applications

  8. Today’s Data Protection Challenges
     Challenges:
     • Massive data growth
     • Economic pressures
     • Regulatory compliance
     Challenges with tape:
     • Questionable reliability
     • Mechanical failures
     • DR via trucks
     • Longer recovery times
     The Solution

  9. Easily Integrates with Existing Infrastructure. No rip and replace.
     • CIFS, NFS, NDMP over Ethernet; VTL over FC; Replication
     • 3U, (15) 500 GB SATA drives, RAID-6, NVRAM, N+1 fan, 1-4 ports
     • 5.4 to 21.6 TB with shelves
     • File system (gateway to: EMC, HDS, Nexsan, Pillar, NetApp, 3PAR)
     • … plus other nearline applications

  10. Data Deduplication: Under the Hood. Store more backups in a smaller footprint.

     [Diagram: the Friday full, Monday through Thursday incremental, and second Friday full backups drawn as lettered segments; the unique segments are A through L.]

     | Backup | Logical data | Estimated reduction | Physical |
     |---|---|---|---|
     | Friday full | 1 TB | 2-4x | 250 GB |
     | Monday incr | 100 GB | 7-10x | 10 GB |
     | Tuesday incr | 100 GB | 7-10x | 10 GB |
     | Wednesday incr | 100 GB | 7-10x | 10 GB |
     | Thursday incr | 100 GB | 7-10x | 10 GB |
     | 2nd Friday full | 1 TB | 50-60x | 18 GB |
     | TOTAL | 2.4 TB | 7.8x | 308 GB |
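
The segment idea behind slide 10 can be illustrated with a toy content-addressed store. This is only a sketch and not Data Domain's algorithm: the `SegmentStore` class, the one-letter segments, and the segment lists per backup are all made up for illustration, since the slide's exact segment-to-backup mapping is not recoverable from the transcript.

```python
import hashlib


class SegmentStore:
    """Toy dedup store (hypothetical): keeps one copy of each unique segment."""

    def __init__(self):
        self.segments = {}      # fingerprint -> segment bytes
        self.logical_bytes = 0  # everything the backup software sent

    def write(self, segment: bytes) -> str:
        """Store the segment only if its fingerprint is new; return the fingerprint."""
        fingerprint = hashlib.sha256(segment).hexdigest()
        self.logical_bytes += len(segment)
        self.segments.setdefault(fingerprint, segment)
        return fingerprint

    @property
    def physical_bytes(self) -> int:
        """Bytes actually kept on disk after deduplication."""
        return sum(len(s) for s in self.segments.values())


# Illustrative backups only; each letter stands for one segment.
backups = {
    "Friday full": "ABCDEFG",
    "Monday incr": "ABH",
    "Thursday incr": "ACK",
}

store = SegmentStore()
for name, letters in backups.items():
    for letter in letters:
        store.write(letter.encode())

reduction = store.logical_bytes / store.physical_bytes
print(f"logical={store.logical_bytes} physical={store.physical_bytes} "
      f"reduction={reduction:.2f}x")
```

Segments that reappear in later backups add to the logical total but not to the physical total, which is why the reduction ratio grows as more backups are retained.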

  11. Longer Retention: Store More with Less. Over 1 year of retention in 3U of Data Domain protection storage.

     | Backup | Date | Logical data | Estimated reduction | Physical |
     |---|---|---|---|---|
     | Week 1 | April 7 | 2.4 TB | 8x | 308 GB |
     | Week 2 | April 14 | 3.8 TB | 10x | 366 GB |
     | Week 3 | April 21 | 5.2 TB | 12x | 424 GB |
     | Month 1 | April 28 | 6.6 TB | 14x | 482 GB |
     | Month 2 | May 31 | 12.2 TB | 17x | 714 GB |
     | Month 3 | June 30 | 17.8 TB | 19x | 946 GB |
     | Month 4 | July 31 | 23.4 TB | 20x | 1178 GB |
     | TOTAL | | 23.4 TB | 20x | 1178 GB |
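
The retention figures on slide 11 appear to follow from the weekly cycle on slide 10. The sketch below reproduces them under two assumptions that are inferred from the numbers rather than stated on the slides: every week after the first adds four 100 GB incrementals (10 GB physical each) plus one 1 TB full (18 GB physical), and each "month" row covers four such weeks. The function name `project` is hypothetical.

```python
# Figures from slide 10, in GB, using 1 TB = 1000 GB as the slide totals do.
FIRST_FULL = (1000, 250)   # (logical, physical) for the initial full backup
INCREMENTAL = (100, 10)    # each daily incremental
LATER_FULL = (1000, 18)    # each subsequent full backup


def project(weeks: int) -> tuple[int, int]:
    """Return cumulative (logical_gb, physical_gb) after `weeks` weekly cycles."""
    # Week 1: first full, four incrementals, then the second Friday full.
    logical = FIRST_FULL[0] + 4 * INCREMENTAL[0] + LATER_FULL[0]
    physical = FIRST_FULL[1] + 4 * INCREMENTAL[1] + LATER_FULL[1]
    # Assumed pattern for each later week: four incrementals plus one full.
    for _ in range(weeks - 1):
        logical += 4 * INCREMENTAL[0] + LATER_FULL[0]
        physical += 4 * INCREMENTAL[1] + LATER_FULL[1]
    return logical, physical


for weeks in (1, 2, 3, 4, 8, 12, 16):
    logical, physical = project(weeks)
    print(f"week {weeks:2d}: {logical / 1000:5.1f} TB logical, "
          f"{physical:5d} GB physical, {logical / physical:4.1f}x")
```

Under these assumptions the logical and physical columns match the table exactly; the slide's reduction column looks like the same ratio rounded to a whole number.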

  12. Inline Deduplication for Optimized Time-to-DR
     • Post-process DR restore point is usually obsolete
     • Additional 2-3x backup time to get to DR Ready
     Time to DR-Ready, relative to the backup window:
     • Data Domain Inline Dedupe/Replication: Replicate During Backup, then DR-Ready
     • VTL/Tape/Truck: Backup to VTL, Copy to Tape, Truck to DR Site, then DR-Ready
     • Post-Process Dedupe: Backup to Cache, Dedupe & Replicate, then DR Ready

  13. In Line vs Post Process (5 TB addressable on each system)

     | Backup | Inline dedupe @ 60 MB/s | Post-process dedupe @ 30 MB/s |
     |---|---|---|
     | 5 TB initial full @ 2:1 | 2.5 TB written; 2.5 TB remaining | 2.5 TB cached to disk while 2.5 TB is deduped to 1.25 TB; 1.25 TB remaining |
     | 500 GB incremental @ 7:1 | 71 GB written daily; 426 GB total (6 days of incrementals); 2.926 TB total written to the system; 2.074 TB remaining | 250 GB cached to disk while 250 GB is deduplicated to 36 GB; the remainder is deduped after backup; 2.926 TB total written; 2.074 TB remaining |
     | 5 TB subsequent full @ 50:1 | 100 GB written; 3.026 TB total written to the system; 1.974 TB remaining | 2.5 TB cached to disk while 2.5 TB is deduped to 50 GB: OUT OF SPACE (2.5 TB needed, 2.0 TB available) |

     After 1 week of retention, a 5 TB post-processing system is out of space for caching. All backups must slow to accommodate incoming data without caching.
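
The capacity arithmetic on slide 13 can be replayed with a short simulation. This is a sketch of the slide's own simplified model, not a description of any real product: it assumes a post-process system holds half of each incoming backup in cache while the other half is deduplicated, and it uses integer GB throughout. The names `WEEK` and `simulate` are hypothetical.

```python
CAPACITY_GB = 5000  # 5 TB addressable, as on the slide

# (label, logical GB, dedup ratio, number of backups), taken from the slide.
WEEK = [
    ("initial full", 5000, 2, 1),
    ("daily incremental", 500, 7, 6),
    ("subsequent full", 5000, 50, 1),
]


def simulate(post_process: bool) -> list[tuple[str, int, bool]]:
    """Return (label, stored_gb_after_backup, fits_in_capacity) per backup."""
    stored = 0
    results = []
    for label, logical, ratio, count in WEEK:
        for _ in range(count):
            # Slide's model: a post-process system caches half the backup
            # on disk while deduplicating the other half.
            cache = logical // 2 if post_process else 0
            peak = stored + cache + (logical - cache) // ratio
            # The sketch keeps counting even when a backup would not fit.
            stored += logical // ratio
            results.append((label, stored, peak <= CAPACITY_GB))
    return results


for mode, name in ((False, "inline"), (True, "post-process")):
    for label, stored, fits in simulate(mode):
        status = "ok" if fits else "OUT OF SPACE"
        print(f"{name:12s} {label:18s} stored={stored:4d} GB  {status}")
```

Both systems end the week holding the same deduplicated data; the difference is the temporary cache, which pushes the post-process system past 5 TB during the second full backup.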

  14. Recovery: Data Invulnerability Architecture. Trust but verify – hope is not a strategy.
     • Data Verification: CheckSum; Dedupe, write to disk; Verify
     • Self-healing file system: Cleaning; Expired data; Defrag; Verify
     • Other: RAID-6; NVRAM; Snapshots
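
The "checksum, write to disk, verify" sequence on slide 14 can be sketched as a write followed by a read-back comparison. This is a generic illustration, not Data Domain's implementation: the function name `write_verified`, the choice of SHA-256, and the file name are assumptions.

```python
import hashlib
import os
import tempfile


def write_verified(path: str, data: bytes) -> str:
    """Write data, read it back, and raise if the checksum does not match."""
    expected = hashlib.sha256(data).hexdigest()
    with open(path, "wb") as f:
        f.write(data)
        f.flush()
        os.fsync(f.fileno())  # ask the OS to push the bytes to stable storage
    # Note: this read may be served from the OS cache; a real system would
    # need to verify against the storage medium itself.
    with open(path, "rb") as f:
        actual = hashlib.sha256(f.read()).hexdigest()
    if actual != expected:
        raise IOError(f"verification failed for {path}")
    return expected


with tempfile.TemporaryDirectory() as tmp:
    checksum = write_verified(os.path.join(tmp, "segment.bin"), b"backup segment")
    print("verified", checksum[:16])
```

The point of the pattern is that a write is not treated as successful until the stored bytes have been checked against the checksum computed before the write.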

  15. Replication: WAN Efficient. True DR; lowers WAN costs; improves SLAs.
     • 95-99% Bandwidth Reduction
     • [Diagram: Source: Remote Sites (Backup Data, Archive Data, home, DIR A), each sending 1-5% of its data over the WAN to Destination: Data Center Hub.]
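
To put the 95-99% bandwidth reduction on slide 15 in concrete terms, the sketch below compares transfer times for a full copy against sending only 1-5% of the data. The 100 GB nightly backup and the 10 Mbit/s link are hypothetical figures chosen for illustration, and `transfer_hours` is a made-up helper that ignores protocol overhead.

```python
def transfer_hours(data_gb: float, link_mbit_s: float,
                   fraction_sent: float = 1.0) -> float:
    """Hours needed to send `fraction_sent` of `data_gb` over the link."""
    bits = data_gb * 1e9 * 8 * fraction_sent
    return bits / (link_mbit_s * 1e6) / 3600


# Hypothetical remote site: 100 GB nightly backup over a 10 Mbit/s WAN link.
full_copy = transfer_hours(100, 10)
worst_case = transfer_hours(100, 10, fraction_sent=0.05)  # 95% reduction
best_case = transfer_hours(100, 10, fraction_sent=0.01)   # 99% reduction

print(f"full copy: {full_copy:.1f} h")
print(f"deduplicated: {best_case:.2f} h to {worst_case:.2f} h")
```

With these assumed numbers, a copy that would not fit in a nightly window without deduplication takes roughly an hour or less when only the changed segments cross the WAN.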

  16. So … How does this work with VMware?

  17. Backing Up VMware to Data Domain

  18. “…is he still talking..?” - Summary Concepts
     • Data Domain enables NAS (CIFS, NFS), NDMP & VTL backup targets for all virtualized applications
     • Drops into existing enterprise backup architectures
     • Works with virtualized and non-virtualized environments
     • In 80/20 data centers, centralized capacity optimization provides a single instance store across all applications and systems, virtual or actual
     • Back up VMs to DDR with an agent, or at the service console level
     • Choose to place an agent on critical VMs for file-level restore
     • Choose to place an agent on the service console as well
     • Back up all to the same DDRs and watch compression happen
     • Consolidated backups sent from proxy to DDR
     • If you prefer an agent-free virtual machine…
     • Global rule: all data is compared to all other data in the DDR
     • Replicate all or some to anywhere, whenever, and back
     • DR, test, development, virtual application migration

  19. Data Domain: Dedupe Simplified
     • High-speed, inline deduplication storage; disk target for nearline applications
     • Any leading backup software, archive apps, or custom nearline use
     • All data types: structured and file
     • Any fabric: NFS / CIFS / NDMP via Ethernet, or VTL via Fibre Channel
     • Disk storage: internal, or gateway to SAN array
     • One dedupe infrastructure: remote office, datacenter with inline replication
     [Diagram labels: Clients, Server, Primary storage, Backup/media server, Archive Application Server, ‘Drag&Drop’ Archiving, Data Domain Archive, Backup, Retention/Restore, Onsite Retention Storage, Replication over WAN, DR, Offsite Disaster Recovery Storage, Archive to tape as required.]

  20. Summary: Key Attributes
     • Easily Integrates with Existing Infrastructure
       • No rip/replace
     • Retention: Deduplication for Nearline Applications
       • Store more backups and archived data in smaller footprint
     • Recovery: Data Invulnerability Architecture
       • Trust but verify – hope is not a strategy
     • Replication: WAN efficient
       • True DR
       • Lowers cost of WAN
       • Improves SLAs

  21. Summary: Simplifying Deduplication Storage
     • Lower TCO
       • Much lower cost for disk-based retention
       • Lower operational costs, smaller footprint
       • Neutral to price of tape automation
       • Low bandwidth for replication, DR
     • Faster
       • Handles variable streams smoothly, unlike tape
       • Better SLAs: random access to restores and archives
     • Secure
       • Designed as store of last resort
       • No tapes on a truck
     • Simple
       • Set it and forget it
       • Any backup or archive software, any storage fabric, all data types

  22. Thank You
