
Managing Explosive Data Growth

EDUCAUSE 2008. Presenter: Ronan Glynn. Case study: MSKCC.


Presentation Transcript


  1. Managing Explosive Data Growth • EDUCAUSE 2008 • Presenter: Ronan Glynn

  2. Case Study: MSKCC • Established in 1884 • 9000+ Employees • Regularly rated #1 US Cancer Hospital • In 1948, Sloan-Kettering Institute formed • 1500+ Employees • SKI Computing Group formed in 2003

  3. Our User Environment • 45% PC • 55% Mac

  4. 2003: New Research Building Announced • Space for 100+ new labs • New Graduate School • All documents and materials to be online • At the time, many labs stored research locally

  5. 2003-2005: Growth Spurt • Scaled from 0TB to 5TB to 15TB • Architecture is FC disk in a high-end enterprise array • Tape backup sets with off-site rotation • Still in the process of migrating data online

  6. 2003-2005: Growth Spurt • Meet with labs & analyze their data usage patterns • Evaluate Vendor hardware, data sizes and projected growth • Learn about user tolerance for clients & “extra-clicks” • Storage budget is drawn up and analyzed

  7. Results of User/Data Analysis • Data is 99% unstructured • Typically used in a 1 year cycle • Use after 3 years is extremely low • Users do not know how much storage they need • Users want to minimize workflows • Users are not grasping the concept of data redundancy

  8. Unique Situations • Labs survive on grant funding. If a grant runs out, the lab leaves MSKCC • The data then needs to be copied twice: 1 copy for MSKCC, and 1 for the departing lab • This created a need for near-line storage • Some lab equipment is unsuitable for data storage; other labs have a specific need for ultra-fast data writes • VIP users had to be catered for differently

  9. Research automation accelerates data growth • Genescan Device • High-Throughput Screening

  10. Data Growth vs. Drive Capacity Growth • Storage growth is outpacing storage budget growth by 150% per year for SKI • Drive capacity increases not keeping pace with data growth • Most apparent with Enterprise-class drives like FC, SCSI & SAS

  11. Searching for a solution • Must be Mac compatible • Client installs not feasible • Macs cannot use DFS, have issues with file locking and resource forks • No login scripts possible • Data archives a necessity

  12. 2006: New Building Opens • Zuckerman building opens • 23 stories (4 floors below ground). 558K sq. feet. • Deepest bedrock excavation in history of Manhattan (75 feet) • New labs bringing 1-5TB with them • Innovative Storage Tiering in place

  13. Data Storage Environment (by data age) • < 9 months • 9 months - 3 years • > 3 years

  14. Storage Environment Details • Tier 1 (50TB): EMC Celerra NAS array w/ FC & high-density SATA • Tier 2 (32TB): EMC Centera, an ATA-based CAS solution. Auto-archives from Tier 1 based on file age policy • Tier 3 (10TB): Powerfile A3 Archiver, a NAS-enabled optical jukebox that uses 200 ejectable 50GB Blu-Ray discs • Unique Situations: Fast data writes handled by Isilon clustered NAS solution. VIP needs satisfied by backup televaulting using “Time Machine” and “Backup PC”. Time Machine is built into OS X 10.5; data is written to DAS storage attached to an OS X Server. Backup PC is an open source backup solution for Windows & Linux; data is written to a share on the Tier 1 storage array.
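The age-based tiering described in slides 13 and 14 can be illustrated with a minimal sketch. It assumes the three age ranges on slide 13 map in order onto Tiers 1, 2 and 3, and it approximates 9 months as 274 days and 3 years as 1095 days; the function names are hypothetical and this is not the EMC policy engine itself.

```python
from datetime import datetime, timedelta

# Assumed thresholds: slide 13 gives "< 9 months", "9 mo. - 3 yrs", "> 3 yrs".
# The day counts below are approximations chosen for this sketch.
TIER2_AFTER = timedelta(days=274)       # roughly 9 months
TIER3_AFTER = timedelta(days=3 * 365)   # roughly 3 years


def tier_for_age(age: timedelta) -> int:
    """Return the storage tier (1, 2 or 3) for data of the given age."""
    if age < TIER2_AFTER:
        return 1  # recent data stays on the NAS array
    if age <= TIER3_AFTER:
        return 2  # older data is auto-archived to the CAS tier
    return 3      # rarely used data goes to the optical archive


def tier_for_file(last_modified: datetime, now: datetime) -> int:
    """Classify a file by the time elapsed since it was last modified."""
    return tier_for_age(now - last_modified)


if __name__ == "__main__":
    now = datetime(2008, 10, 1)
    for stamp in (datetime(2008, 6, 1), datetime(2007, 1, 1), datetime(2004, 1, 1)):
        print(stamp.date(), "-> tier", tier_for_file(stamp, now))
```

Whether age is measured from last modification or last access is a policy choice the slides do not specify; the sketch uses modification time only for illustration.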

  15. Powerfile Blu-Ray Archiver • WORM long term archiving • Stores 50GB data on a Dual Layer disc • Low power & cooling costs • Blu-Ray now industry HD standard • Disc size set to grow to 100GB
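As a quick check of the capacity figures, the sketch below multiplies the disc count by the disc size. The 200-disc count comes from slide 14; the function name is hypothetical, decimal terabytes (1TB = 1000GB) are assumed, and the 100GB figure is the projected disc size from the slide above, not a shipped configuration.

```python
def library_capacity_tb(discs: int, gb_per_disc: int) -> float:
    """Raw capacity of an optical library in decimal terabytes."""
    return discs * gb_per_disc / 1000


if __name__ == "__main__":
    print(library_capacity_tb(200, 50))   # current dual-layer discs: 10.0
    print(library_capacity_tb(200, 100))  # projected 100GB discs: 20.0
```

With 50GB discs this matches the 10TB quoted for Tier 3; the same jukebox would hold about 20TB if 100GB discs became available.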

  16. Isilon: Main Campus Storage • Main Campus storage needed for labs that require near real-time data writes to disk • Virtualized storage accommodates rapidly evolving storage growth • Isilon clustered NAS solution selected to meet these labs' needs, with 33TB usable space • Fast data writes on a private VLAN satisfy the demand for high throughput on data copies

  17. Backup PC • Free Linux based solution • Schedule set by admin console • Backup by hostname • User can have their own restore interface • Compression & pooling features
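The "compression & pooling" idea can be illustrated with a small sketch: identical files backed up from different hosts are stored once, compressed, and referenced by a content hash. This is only an illustration of the general technique, not Backup PC's actual implementation; the class name, the use of SHA-256 and zlib, and the in-memory storage are all assumptions.

```python
import hashlib
import zlib


class FilePool:
    """Hypothetical in-memory pool that stores each distinct file once."""

    def __init__(self) -> None:
        self._blobs = {}    # content digest -> compressed bytes
        self._backups = {}  # (hostname, path) -> content digest

    def store(self, hostname: str, path: str, data: bytes) -> str:
        """Record a backup of one file, reusing pooled content if present."""
        digest = hashlib.sha256(data).hexdigest()
        if digest not in self._blobs:
            self._blobs[digest] = zlib.compress(data)
        self._backups[(hostname, path)] = digest
        return digest

    def restore(self, hostname: str, path: str) -> bytes:
        """Return the original bytes for a previously backed-up file."""
        return zlib.decompress(self._blobs[self._backups[(hostname, path)]])

    def pooled_count(self) -> int:
        """Number of distinct file contents actually stored."""
        return len(self._blobs)


if __name__ == "__main__":
    pool = FilePool()
    report = b"shared lab protocol\n" * 100
    pool.store("lab-pc-1", "docs/protocol.txt", report)
    pool.store("lab-pc-2", "protocol-copy.txt", report)
    print(pool.pooled_count())
```

Backing up by hostname, as the slide describes, fits this model naturally: each host keeps its own file listing while the pooled content is shared.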

  18. Time Machine • Account created on OSX Server • Point Time Machine to Mac Server • Incremental backups every hour (system files can be excluded) • Users set schedule and restore files themselves

  19. Questions
