  1. Smart Storage and Linux: An EMC Perspective Ric Wheeler ric@emc.com

  2. Why Smart Storage? • Central control of critical data • One central resource to fail over in disaster planning • Banks, trading floors, airlines want zero downtime • Smart storage is shared by all hosts & OSes • Amortize the costs of high availability and disaster planning over all of your hosts • Use different OSes for different jobs (UNIX for the web, IBM mainframes for data processing) • Zero-time “transfer” from host to host when both are connected • Enables cluster file systems

  3. Data Center Storage Systems • Change the way you think of storage • Shared Connectivity Model • “Magic” Disks • Scales to new capacity • Storage that runs for years at a time • Symmetrix case study • Symmetrix 8000 Architecture • Symmetrix Applications • Data center class operating systems

  4. Traditional Model of Connectivity • Direct Connect • Disk attached directly to host • Private - OS controls access and provides security • Storage I/O traffic only • Separate system used to support network I/O (networking, web browsing, NFS, etc)

  5. Shared Models of Connectivity • VMS Cluster • Shared disk & partitions • Same OS on each node • Scales to dozens of nodes • IBM Mainframes • Shared disk & partitions • Same OS on each node • Handful of nodes • Network Disks • Shared disk/private partition • Same OS • Raw/block access via network • Handful of nodes

  6. New Models of Connectivity • Every host in a data center could be connected to the same storage system • Heterogeneous OS & data format (CKD & FBA) • Management challenge: no central authority to provide access control • [Diagram: shared storage connected to FreeBSD, VMS, Linux, Solaris, IRIX, DGUX, NT, HPUX, and MVS hosts]

  7. Magic Disks • Instant copy • Devices, files or databases • Remote data mirroring • Metropolitan area • 100s of kilometers • 1000s of virtual disks • Dynamic load balancing • Behind-the-scenes backup • No host involved

  8. Scalable Storage Systems • Current systems support • 10s of terabytes • Dozens of SCSI, Fibre Channel, ESCON channels per host • Highly available (years of run time) • Online code upgrades • Potentially 100s of hosts connected to the same device • Support for chaining storage boxes together locally or remotely

  9. Longevity • Data should be forever • Storage needs to overcome network failures, power failures, blizzards, asteroid strikes … • Some boxes have run for over 5 years without a reboot or halt of operations • Storage features • No single point of failure inside the box • At least 2 connections to a host • Online code upgrades and patches • Call home on error, ability to fix field problems without disruptions • Remote data mirroring for real disasters

  10. Symmetrix Architecture • 32 “directors” based on PowerPC 750s • Up to 32 GB of central “cache” for user data • Support for SCSI, Fibre Channel, ESCON, … • 384 drives (over 28 TB with 73 GB units)

  11. Symmetrix Basic Architecture

  12. Data Flow through a Symm

  13. Read Performance

  14. Prefetch is Key • Read hit gets RAM speed, read miss is spindle speed • What helps cached storage array performance? • Contiguous allocation of files (extent-based file systems) preserves the logical-to-physical mapping • Hints from the host could help prediction • What might hurt performance? • Clustering small, unrelated writes into contiguous blocks (foils prefetch on a later read of the data) • Truly random read IOs
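
As a rough illustration of the prefetch idea above, here is a minimal sketch (in Python) of a sequential-detect read cache; the run-length threshold and prefetch depth are invented for the example, not actual Symmetrix parameters.

    # Toy sequential-detect prefetch cache: read hits come from RAM,
    # misses go to the "spindle" (the backend callable).
    class PrefetchCache:
        def __init__(self, backend, prefetch_depth=8):
            self.backend = backend       # callable: block number -> data
            self.cache = {}              # block number -> cached data
            self.last_block = None
            self.seq_run = 0             # length of the current sequential run
            self.prefetch_depth = prefetch_depth

        def read(self, block):
            # Contiguous ascending reads extend the sequential run.
            if self.last_block is not None and block == self.last_block + 1:
                self.seq_run += 1
            else:
                self.seq_run = 0
            self.last_block = block

            if block in self.cache:      # read hit: RAM speed
                return self.cache[block]
            data = self.backend(block)   # read miss: spindle speed
            self.cache[block] = data

            # A sequential run predicts more sequential reads: fetch ahead.
            # This is exactly the prediction that contiguous file allocation
            # preserves and that random or reshuffled IO defeats.
            if self.seq_run >= 2:
                for b in range(block + 1, block + 1 + self.prefetch_depth):
                    if b not in self.cache:
                        self.cache[b] = self.backend(b)
            return data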

  15. Symmetrix Applications • Instant copy • TimeFinder • Remote data copy • SRDF (Symmetrix Remote Data Facility) • Serverless Backup and Restore • Fastrax • Mainframe & UNIX data sharing • IFS

  16. Business Continuance Problem: the “Race to Sunrise” • “Normal” daily operations cycle: online day, then a backup/DSS window from 2 am to 6 am, then resume the online day • 4 hours of data inaccessibility

  17. TimeFinder • Creation and control of a copy of any active application volume • Capability to allow the new copy to be used by another application or system • Continuous availability of production data during backups, decision support, batch queries, DW loading, Year 2000 testing, application testing, etc. • Ability to create multiple copies of a single application volume • Non-disruptive re-synchronization when the second application is complete • [Diagram: production application volumes paired with business continuance volumes (BCVs) driving sales backups, decision support, data warehousing, and Euro conversion] • A BCV is a copy of real production data

  18. Business Continuance Volumes • A Business Continuance Volume (BCV) is created and controlled at the logical volume level • Physical drive sizes can be different, but the logical size must be identical • Several ACTIVE copies of data at once per Symmetrix

  19. Using TimeFinder • Establish BCV • Stop transactions to clear buffers • Split BCV • Start transactions • Execute against BCVs • Re-establish BCV • [Diagram: mirrors M1 and M2 with the BCV]

  20. Re-Establishing a BCV Pair • BCV pair “PROD” and “BCV” have been split • Tracks on “PROD” updated after the split • Tracks on “BCV” updated after the split • Symmetrix keeps a table of these “invalid” tracks after the split • At re-establish of the BCV pair, “invalid” tracks are written from “PROD” to “BCV” • Synch complete • [Diagram: the split pair with updated tracks marked, then the re-established pair]

  21. Restore a BCV Pair • BCV pair “PROD” and “BCV” have been split • Tracks on “PROD” updated after the split • Tracks on “BCV” updated after the split • Symmetrix keeps a table of these “invalid” tracks after the split • At restore of the BCV pair, “invalid” tracks are written from “BCV” to “PROD” • Synch complete • [Diagram: the split pair with updated tracks marked, then the restored pair]
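
To make the invalid-track bookkeeping of slides 19-21 concrete, here is a small Python model of a BCV pair; tracks as integers and the method names are hypothetical stand-ins, not SYMAPI.

    # Toy model of a TimeFinder BCV pair; volumes are dicts of track -> data.
    class BCVPair:
        def __init__(self, prod, bcv):
            self.prod, self.bcv = prod, bcv
            self.is_split = False
            self.invalid = set()        # tracks changed on either side after a split

        def establish(self):
            self.bcv.clear()
            self.bcv.update(self.prod)  # initial full synchronization
            self.is_split = False

        def split(self):
            self.is_split = True
            self.invalid.clear()        # start tracking post-split changes

        def write(self, volume, track, data):
            volume[track] = data
            if self.is_split:
                self.invalid.add(track) # the box notes the "invalid" track

        def reestablish(self):          # copy only invalid tracks PROD -> BCV
            for t in self.invalid:
                self.bcv[t] = self.prod.get(t)
            self.invalid.clear()
            self.is_split = False

        def restore(self):              # copy only invalid tracks BCV -> PROD
            for t in self.invalid:
                self.prod[t] = self.bcv.get(t)
            self.invalid.clear()
            self.is_split = False

In this model, the multiple copies of slide 22 below are simply several BCVPair objects against the same production volume, each split at a different time.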

  22. Make as Many Copies as Needed • Establish BCV 1 • Split BCV 1 • Establish BCV 2 • Split BCV 2 • Establish BCV 3 • [Diagram: mirrors M1 and M2 with BCV copies taken at 4 PM, 5 PM, and 6 PM]

  23. The Purpose of SRDF • Local data copies are not enough • Maximalist • Provide a remote copy of the data that will be as usable after a disaster as the primary copy would have been. • Minimalist • Provide a means for generating periodic physical backups of the data.

  24. Synchronous Data Mirroring • Write is received from the host into the cache of the source • I/O is transmitted to the cache of the target • ACK is provided by the target back to the cache of the source • Ending status is presented to the host • Symmetrix systems destage writes to disk • Useful for disaster recovery

  25. Semi-Synchronous Mirroring • An I/O write is received from the host/server into the cache of the source • Ending status is presented to the host/server. • I/O is transmitted to the cache of the target • ACK is sent by the target back to the cache of the source • Each Symmetrix system destages writes to disk • Useful for adaptive copy
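
A minimal sketch of the ordering difference between the two modes on slides 24 and 25 (the cache dicts and pending queue are invented for illustration):

    # Synchronous (slide 24): the host sees ending status only after the
    # remote target has acknowledged the write.
    def synchronous_write(source_cache, target_cache, block, data):
        source_cache[block] = data     # 1. write received into source cache
        target_cache[block] = data     # 2. I/O transmitted to target cache
        remote_ack = True              # 3. target ACKs back to the source
        return remote_ack              # 4. ending status presented to host

    # Semi-synchronous (slide 25): ending status goes to the host first;
    # the remote transfer drains later, off the host's critical path.
    def semi_synchronous_write(source_cache, pending, block, data):
        source_cache[block] = data     # 1. write received into source cache
        pending.append((block, data))  # queued for the target
        return True                    # 2. ending status presented to host

    def drain(pending, target_cache):
        while pending:
            block, data = pending.pop(0)
            target_cache[block] = data # 3. transmit to target, 4. target ACKs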

  26. Backup / Restore of Big Data • Exploding amounts of data cause backups to run too long • How long does it take you to back up 1 TB of data? • Shrinking backup windows and constant pressure for continuous application uptime • Avoid using the production environment for backup • No server CPU or I/O channels • No involvement of the regular network • Performance must scale to match the customer’s growth • Heterogeneous host support
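
For a sense of scale (assuming a single sustained 100 MB/s backup stream, a number chosen for illustration rather than taken from the slides): 1 TB / 100 MB/s = 10,000 seconds, or just under 3 hours, which by itself nearly fills the 2 am to 6 am window of slide 16.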

  27. Fastrax Overview • [Diagram: UNIX and Linux hosts at Location 1 and Location 2 with Symmetrix volumes (STD1, STD2, BCV1, BCV2, R1, R2); SCSI and Fibre Channel point-to-point links into the Fastrax Data Engine and tape libraries; Fastrax-enabled backup/restore applications drive the flow through SYMAPI]

  28. Host to Tape Data Flow • [Diagram: data path from host through Symmetrix and Fastrax to the tape library]

  29. Fastrax Performance • Performance scales with the number of data movers in the Fastrax box & the number of tape devices • Restore runs as fast as backup • No performance impact on the host during restore or backup • [Diagram: RAFs and data movers (DMs) linked over SRDF to the Fastrax box]

  30. Moving Data from Mainframes to UNIX

  31. InfoMover File System • Transparent availability of MVS data to Unix hosts • MVS datasets available as native Unix files • Sharing a single copy of MVS datasets • Uses standard MVS access methods for security and locking

  32. IFS Implementation • Mainframe: IBM MVS / OS390, attached via ESCON or parallel channels • Open systems: IBM AIX, HP HP-UX, Sun Solaris, attached via FWD SCSI, Ultra SCSI, or Fibre Channel • Symmetrix with ESP sits between the two • Minimal network overhead: no data transfer over the network!

  33. Symmetrix APIs

  34. Symmetrix API Overview • SYMAPI Core Library • Used by “Thin” and Full Clients • SYMAPI Mapping Library • SYMCLI Command Line Interface

  35. Symmetrix APIs • SYMAPI is the set of high-level functions • Used by EMC’s ISV partners (Oracle, Veritas, etc.) and by EMC applications • SYMCLI is the “Command Line Interface” which invokes SYMAPI • Used by end customers and some ISV applications

  36. Basic Architecture • User access to the Solutions Enabler is via SymCli or a storage management application • [Diagram: the Symmetrix Command Line Interpreter (SymCli) and other storage management applications layered over the Symmetrix Application Programming Interface (SymAPI)]

  37. Client-Server Architecture • SymAPI server runs on the host computer connected to the Symmetrix storage controller • SymAPI client runs on one or more host computers • [Diagram: a client host running storage management applications over the SymAPI client library, and a thin client host running a thin SymAPI client, both talking to the SymAPI server on the server host]

  38. SymmAPI Components • Initialization • InfoSharing • Gatekeepers • Calypso Controls • Discover and Update • Optimizer Controls • Configuration • DeltaMark Functions • Device Groups • SRDF Functions • Statistics • TimeFinder Functions • Mapping Functions • Base Controls

  39. Data Object Resolve • [Mapping stack: RDBMS data file → file system → logical volume → host physical device → Symmetrix device extents]

  40. File System Mapping • File system mapping information includes: • File system attributes and host physical location • Directory attributes and contents • File attributes and host physical extent information, including inode information and fragment size • [Diagram: i-nodes, directories, file extents]
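
As a sketch of what the mapping stack of slides 39-40 computes, here is a toy resolver that walks a file’s file-system extents through a logical volume layout down to device extents; every structure and name here is hypothetical, not the SYMAPI mapping library.

    from dataclasses import dataclass

    @dataclass
    class Extent:
        device: str   # Symmetrix device name
        start: int    # starting block on that device
        length: int   # length in blocks

    def resolve(path, fs_map, lv_map):
        """fs_map: path -> list of (logical volume, offset, length);
           lv_map: logical volume -> ordered list of Extents covering it."""
        result = []
        for lv, offset, length in fs_map[path]:
            remaining, cursor = length, offset
            for ext in lv_map[lv]:
                if cursor >= ext.length:   # this extent lies before our range
                    cursor -= ext.length
                    continue
                take = min(ext.length - cursor, remaining)
                result.append(Extent(ext.device, ext.start + cursor, take))
                remaining -= take
                cursor = 0
                if remaining == 0:
                    break
        return result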

  41. Data Center Hosts

  42. Solaris & Sun Starfire • Hardware • Up to 62 IO channels • 64 CPUs • 64 GB of RAM • 60 TB of disk • Supports multiple domains • Starfire & Symmetrix • ~20% use more than 32 IO channels • Most use 4 to 8 IO channels per domain • Oracle instances usually above 1 TB

  43. HPUX & HP 9000 Superdome • Hardware • 192 IO channels • 64 CPUs • 128 GB RAM • 1 PB of storage • Superdome and Symm • 16 LUNs per target • Want us to support more than 4000 logical volumes!

  44. Solaris and Fujitsu GP7000F M1000 • Hardware • 6-48 I/O slots • 4-32 CPUs • Cross-bar switch • 32 GB RAM • 64-bit PCI bus • Up to 70 TB of storage

  45. Solaris and Fujitsu GP7000F M2000 • Hardware • 12-192 I/O slots • 8-128 CPUs • Cross-bar switch • 256 GB RAM • 64-bit PCI bus • Up to 70 TB of storage

  46. AIX 5L & IBM RS/6000 SP • Hardware • Scale to 512 Nodes (over 8000 CPUs) • 32 TB RAM • 473 TB Internal Storage Capacity • High Speed Interconnect 1GB/sec per channel with SP Switch2 • Partitioned Workloads • Thousands of IO Channels

  47. IBM RS/6000 pSeries 680 AIX 5L • Hardware • 24 CPUs • 64-bit RS64 IV at 600 MHz • 96 GB RAM • 873.3 GB internal storage capacity • 53 PCI slots (33 32-bit, 20 64-bit)

  48. Really Big Data • IBM (Sequent) NUMA-Q • 16 NUMA “quads” • 4-way 450 MHz CPUs • 2 GB memory • 4 x 100 MB/s FC-SW • Oracle 8.1.5 with up to 42 TB (mirrored) DB • EMC Symmetrix • 20 small Symm 4s • 2 medium Symm 4s

  49. Windows 2000 on IA32 • Usually lots of small (1U or 2U) boxes share a Symmetrix • 4 to 8 IO channels per box • Qualified up to 1 TB per meta volume (although usually deployed with ½ TB or less) • Management is a challenge • Will Windows 2000 on IA64 handle big data better?

  50. Linux Data Center Wish List
