Explore Yahoo's storage consolidation journey, from legacy architecture challenges to new solutions that boost reliability and scalability while reducing management overhead. Learn about the benefits, drawbacks, and outcomes of adopting NAS and SAN for improved data management.
Case Study - Storage Consolidation Steve Curry Yahoo Inc.
About Yahoo! Quick Stats 300+ million registered users 2 billion page requests per day 25 countries, 14 languages 500TB data on disk 1PB data on tape
Yahoo! Storage Operations Responsibilities All US storage administration Data archiving / backups US/Global storage architecture / standards 2nd tier support for global operations Tool development 24/7 global issue/outage response Reporting
Case Study #1 – Y! Photos/Briefcase • Online photo album • Online file storage
Case Study #1 – Y! Photos/Briefcase Legacy Architecture Cheap… *repeat* cheap JBODs Single host support JBOD array A/B mirror for redundancy FreeBSD OS 150TB of content Custom apps
Case Study #1 – Y! Photos/Briefcase …Legacy Architecture Advantages Low cost hardware Extremely distributed Disadvantages Not very scalable Management headache No longer meets reliability requirements
Case Study #1 – Y! Photos/Briefcase …Legacy Architecture Management Issues Management is per host (over 160 storage hosts) Synchronous mirror between A/B pair No “Hot-Swap” support Single spindle performance
Case Study #1 – Y! Photos/Briefcase This… X 12! Single tier, single spindle performance.
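The synchronous A/B mirror listed among the legacy management issues can be illustrated with a minimal sketch. This is not Yahoo's custom application code; the `Replica` class, `mirrored_write` function, and in-memory storage are hypothetical stand-ins. It only shows the general property of a synchronous mirror: a write is acknowledged after both sides have stored it, so either host being unavailable blocks writes for the pair.

```python
class MirrorWriteError(Exception):
    """Raised when a write cannot be stored on both sides of the pair."""


class Replica:
    """Hypothetical stand-in for one storage host in an A/B pair."""

    def __init__(self, name):
        self.name = name
        self.blocks = {}      # in-memory storage, for illustration only
        self.online = True

    def write(self, key, data):
        if not self.online:
            raise IOError(f"{self.name} is offline")
        self.blocks[key] = data


def mirrored_write(a, b, key, data):
    """Acknowledge a write only after both replicas have stored it.

    Note: if A succeeds and B fails, A already holds the data. A real
    system would need a resync or rollback step, which is omitted here.
    """
    for replica in (a, b):
        try:
            replica.write(key, data)
        except IOError as exc:
            raise MirrorWriteError(str(exc)) from exc
    return True


host_a, host_b = Replica("A"), Replica("B")
mirrored_write(host_a, host_b, "photo-1", b"jpeg bytes")
```

With more than 160 storage hosts managed individually, each pair carries this failure coupling on its own, which is consistent with the slide's point that management is per host.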
Case Study #1 – Y! Photos/Briefcase Consolidation Plan NAS or SAN? Requirements Reliability Scalability Reduce management overhead Considerations Current hardware investment Application support
Case Study #1 – Y! Photos/Briefcase Network Attached Storage Solution Management Filers are heavily deployed Smart appliance Suite of tools already developed for filers Advantages RAID redundancy Multi-spindle performance Takes advantage of existing hardware Ease of application port
Case Study #1 – Y! Photos/Briefcase …Network Attached Storage Solution Disadvantages Initial cost of deployment (cutover, SCSI vs. IDE) Lots of JBODs to get rid of! ;-)
Case Study #1 – Y! Photos/Briefcase New Architecture NAS solution FreeBSD app servers Load balanced 10 storage hosts Point in time snapshots Dedicated SAN backup fabric Distributed-farm model
Case Study #1 – Y! Photos/Briefcase Simple 2 tier model. Scalable, redundant, multi-spindle RAID performance, hot-swap support.
Case Study #1 – Y! Photos/Briefcase Consolidation Wins! Cost considerations Performance Backups Management High availability Hot swap
Case Study #2 - Data Mining Global data mining Global log collection
Case Study #2 - Data Mining Current Architecture DAS attached arrays Custom scripts Stacker type tape libraries Single-tier disk storage
Case Study #2 - Data Mining Management Issues Large storage host count Many small tape libraries No redundancy Does not scale for future requirements
Case Study #2 - Data Mining Storage Requirements High write performance Data growth 2TB per day!! Store data on disk for 30 days Archive to tape Consolidation Considerations Reduce host management Create a multi-tier storage architecture Consolidate to one large tape library Increase write performance
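The storage requirements above imply some simple sizing arithmetic, sketched below. The 2TB per day growth and 30 day disk retention come from the slide; the decimal unit convention (1 TB = 10**12 bytes) and the assumption that ingest is spread evenly over 24 hours are mine, so the write rate is only an average and real peaks would be higher.

```python
# Figures taken from the slide
DAILY_GROWTH_TB = 2
RETENTION_DAYS = 30

# Assumptions (not from the slide): decimal units, evenly spread ingest
BYTES_PER_TB = 10**12
SECONDS_PER_DAY = 86_400


def disk_tier_capacity_tb(daily_tb, retention_days):
    """Disk needed to hold the full retention window before tape archive."""
    return daily_tb * retention_days


def average_write_rate_mb_s(daily_tb):
    """Average sustained write rate in MB/s if ingest were perfectly even."""
    return daily_tb * BYTES_PER_TB / SECONDS_PER_DAY / 10**6


print(disk_tier_capacity_tb(DAILY_GROWTH_TB, RETENTION_DAYS))   # 60
print(round(average_write_rate_mb_s(DAILY_GROWTH_TB), 1))       # 23.1
```

Under these assumptions the disk tier must hold about 60TB of rolling data, before any allowance for RAID overhead, snapshots, or growth headroom.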
Case Study #2 - Data Mining • Common Y! model • Multi-tier storage • Scalable