EMC Centera Technical Review
EMC Centera for Information Governance and Compliance
Chart labels: Storage Costs; Information's Business Value
• "Purpose-built", active archiving platform
• >5,500 customers and >370 PB shipped
• >11,000 systems shipped
• Assured content authenticity and online access
• Highly available, high performance
• Five-9s: no single point of failure
• Integrity protection at all levels
• Continuous disk scrubbing
• Multiple file protection
• Self-healing: files, databases, disks
EMC Centera: An Archive Solution
Diagram labels: Content Authenticity; Easy to manage; Low TCO
• Content authenticity ensures:
  • Internal storage: content/data faults are detected and automatically healed
  • Network transmissions: transmission errors are detected and the transfer is repeated
• Easy to manage
  • Administrators can manage up to 50 times more content
  • Works with any application or any platform
  • Centralize archive silos from multiple data repositories
  • Enable thousands of users to share a single multi-application repository
• Helps meet governance and compliance requirements
ARCHITECTURE: Redundant Array of Independent Nodes (RAIN)
Diagram labels: four-node configuration; 16-node cube; 2 cubes per cabinet; multiple cubes form a single cluster
Centera node
• Storage nodes/access nodes
• 2.8 GHz P4 processor
• 1024 MB DDR RAM
• Four 1 TB or 2 TB SATA-II disks
• Two 1 Gbit network interfaces
• One 1 Gbit connection to the outside LAN (copper/optical)
• A node can be in one of three modes: access node, storage node, or access/storage node
Centera network
• Dual 24-port cube switches
• Gigabit Ethernet connections to facilitate additional cubes
• Redundant connection to each node
Extreme scalability
• Massive parallel processing
• Adding storage also adds processing power, memory, and bandwidth
ARCHITECTURE: Centera Stores and Retrieves Objects
Objects have metadata (diagram labels: date, name, photo):
  <My_Archiving_Application>
    <MagazineCover name="Time" photo="Annan" date="Sep 4, 2000"/>
    <Reviewer name="Jones, Ted"/>
  </My_Archiving_Application>
• Applications create metadata associated with one or more objects
• Objects are stored independent of volume/directory information
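As a rough illustration of the metadata idea on this slide, the sketch below parses the example XML with Python's standard library and reads fields by name rather than by file path. The variable names are illustrative only, and nothing here is the Centera API.

```python
import xml.etree.ElementTree as ET

# Metadata document from the slide, written with plain ASCII quotes.
METADATA = """
<My_Archiving_Application>
  <MagazineCover name="Time" photo="Annan" date="Sep 4, 2000"/>
  <Reviewer name="Jones, Ted"/>
</My_Archiving_Application>
"""

root = ET.fromstring(METADATA)
cover = root.find("MagazineCover")
reviewer = root.find("Reviewer")

# The application describes the object through metadata fields,
# not through a volume or directory location.
summary = {
    "application": root.tag,
    "cover": cover.get("name"),
    "photo": cover.get("photo"),
    "date": cover.get("date"),
    "reviewer": reviewer.get("name"),
}
print(summary)
```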
ARCHITECTURE: Centera Requires No Backups
Centera
• Dramatically reduces opportunities for errors to affect data access or authenticity
• If an error occurs, it can be discovered and healed
How?
• Fixed content prevents data overwrites by applications
• Content authenticity, independent copies, self-monitoring, self-healing
• Detection and healing of bad disk blocks
• Content regeneration from loss of an entire disk
• Detection and healing of filesystem (FS) errors
• Content regeneration from total loss of a filesystem
Limited configuration
• Human error cannot affect the archive filesystems or disks
• No active management of these resources
ARCHITECTURE: Centera Failure and Self-Healing Model
Failure
• Node failure
• Full disk failure
• Database failure
• Block failure
• Filesystem failure
• Network failure
• Software failure
Detection
• Presence of node
• Presence of disk
• Database health
• Read/write errors
• Disk scrubbing (blobs, metadata, blocks)
• Connectivity
• Software heartbeats
Remedy
• Regenerate node
• Regenerate disk
• Regenerate database
• Regenerate blob
• Regenerate filesystem
• Restore data
• Alert EMC
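The detect-then-regenerate pattern in this model can be sketched in a few lines. The example below is a toy, not CentraStar code: it assumes two in-memory copies of a blob and a SHA-256 digest as the integrity check (the slides do not name the checksum), and the function names `digest` and `scrub` are made up for illustration.

```python
import hashlib
from typing import List

def digest(data: bytes) -> str:
    # Assumption: SHA-256 stands in for whatever checksum the system uses.
    return hashlib.sha256(data).hexdigest()

def scrub(copies: List[bytearray], expected: str) -> int:
    """Check every copy against the expected digest and heal bad ones.

    Returns the number of copies repaired. Raises if no intact copy
    remains, which is the point where an operator would be alerted.
    """
    good = next((bytes(c) for c in copies if digest(bytes(c)) == expected), None)
    if good is None:
        raise RuntimeError("no intact copy left")
    repaired = 0
    for c in copies:
        if digest(bytes(c)) != expected:
            c[:] = good  # regenerate the damaged copy from the intact one
            repaired += 1
    return repaired

blob = b"archived record"
expected = digest(blob)
copies = [bytearray(blob), bytearray(blob)]
copies[1][0] ^= 0xFF  # simulate a bad block on one copy

repaired = scrub(copies, expected)
print(repaired)
```

A second scrub pass over the same copies finds nothing to repair, which is the steady state a continuous scrubber would normally see.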
ARCHITECTURE: Single Instance Storage
Duplicate information is stored only once. Regardless of how many copies of an object are sent to the Centera, the object is stored only a single time.
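A minimal sketch of single-instance storage follows, assuming objects are keyed by a hash of their content. SHA-256 and the `SingleInstanceStore` class are assumptions made for illustration; they are not taken from the slides.

```python
import hashlib

class SingleInstanceStore:
    """Toy store that keeps one copy per distinct content (hypothetical)."""

    def __init__(self):
        self._blobs = {}  # content address -> bytes
        self._refs = {}   # content address -> number of writes seen

    def write(self, data: bytes) -> str:
        address = hashlib.sha256(data).hexdigest()
        if address not in self._blobs:
            self._blobs[address] = data  # first copy is the only one stored
        self._refs[address] = self._refs.get(address, 0) + 1
        return address

    def stored_objects(self) -> int:
        return len(self._blobs)

store = SingleInstanceStore()
song = b"song G"
addresses = [store.write(song) for _ in range(3)]
print(store.stored_objects())
```

All three writes return the same address, so the application can keep three references while the store holds one object.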
ARCHITECTURE: How Centera Works: Application Example
Content Address (CA), produced by the Content Address algorithm
• Digital fingerprint
• Globally unique
• Location-independent
Steps, in order
• Object is created and sent to the application server
• Application server sends the object to Centera over the IP network (LAN)
• Centera performs the content address calculation and sends the address back to the application
• Application stores the Content Address for future reference
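The store-and-return-address flow can be sketched as below. This is an illustration under stated assumptions: `ToyCluster` is a hypothetical in-memory stand-in for the archive, SHA-256 is assumed as the fingerprint function, and `app_index` plays the part of the application's own record of addresses.

```python
import hashlib

class ToyCluster:
    """Hypothetical stand-in for the archive: objects keyed by content address."""

    def __init__(self):
        self._objects = {}

    def store(self, data: bytes) -> str:
        # Assumption: SHA-256 as the fingerprint; the slides name no algorithm.
        address = hashlib.sha256(data).hexdigest()
        self._objects[address] = data
        return address

    def retrieve(self, address: str) -> bytes:
        data = self._objects[address]
        # Recompute the fingerprint to confirm the content is unchanged.
        if hashlib.sha256(data).hexdigest() != address:
            raise ValueError("content does not match its address")
        return data

cluster = ToyCluster()
app_index = {}  # the application's own record: name -> content address

app_index["xray-001"] = cluster.store(b"scan bytes")
restored = cluster.retrieve(app_index["xray-001"])
print(restored)
```

Because the address depends only on the content, the application never needs to know which node or disk holds the object.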
ARCHITECTURE: Content Protection Mirror
Diagram labels: storage nodes; access/storage nodes; two cube switches on a dual, self-managed private LAN; network switch; redundant power
ARCHITECTURE: Content Protection Parity
Diagram labels: storage nodes; access/storage nodes; two cube switches on a dual, self-managed private LAN; network switch; redundant power
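The slides do not describe how parity protection is laid out, so the following is only a generic sketch of the idea: split an object into fragments, add one XOR parity fragment, and rebuild any single lost fragment from the survivors. The fragment count and the helper names are assumptions.

```python
from functools import reduce

def xor_bytes(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))

def split_with_parity(data: bytes, k: int):
    """Split data into k equal-size fragments (zero-padded) plus one XOR parity fragment."""
    size = -(-len(data) // k)  # ceiling division
    padded = data.ljust(size * k, b"\0")
    fragments = [padded[i * size:(i + 1) * size] for i in range(k)]
    parity = reduce(xor_bytes, fragments)
    return fragments, parity

def rebuild(fragments, parity, lost_index):
    """Recover one lost fragment by XOR-ing the parity with all surviving fragments."""
    survivors = [f for i, f in enumerate(fragments) if i != lost_index]
    return reduce(xor_bytes, survivors, parity)

data = b"fixed content object"
fragments, parity = split_with_parity(data, 4)
recovered = rebuild(fragments, parity, lost_index=2)
print(recovered == fragments[2])
```

Compared with mirroring, a layout like this uses less raw capacity per object but can only tolerate the loss of one fragment at a time.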
ARCHITECTURE: Regeneration: Self-Healing!
Diagram labels: storage nodes; access/storage nodes; two cube switches on a dual, self-managed private LAN; network switch; redundant power
MANAGEMENT: Centera Monitor
• Web-based (J2EE)
• Properties view
• Alert views: current and historic
• Performance/event views
• Capacity: current and historic
• Trending
MANAGEMENT: Centera: Low TCO
• No complex storage area networking management
• No file system management
• No LUN/RAID group carving or allocation
• Investment protection: multi-generation hardware support
• One addressable pool: an ingestion machine for content
• Constant validation of content objects and structures
MANAGEMENT: Multiple "Virtual Pools" in One Physical Cluster
Diagram labels: EMC Centera "Cluster Pool" containing Default Pool, Pool 1, Pool 2, and Pool 3; Application Pool 1, Application Pool 2, Application Pool 3; Blob; CDF
MANAGEMENT: Universal Access Makes Archiving Easy
Diagram labels: Active Archive; XAM; FTP; New; NFS/CIFS; HTTP; EMC Centera API; Emulation; EMC Centera
Anywhere, any time, any application, from virtually any platform
MANAGEMENT: Centera Viewer
MANAGEMENT: Centera Console
EMC Centera Console is a web-based user interface that enables administrators to monitor their EMC Centera environment. It can be used to:
• Monitor EMC Centera alerts
• View capacity and performance data in real time or over a defined period
• View replication topologies and status
• Check the progress of self-healing tasks
• Export data to comma-separated-value (CSV) and HTML file formats
SCALABILITY: Nondisruptive Scalability: Self-Configuring!
Diagram labels: Rack 1; Rack 2; upper cube; lower cube; root switches; 4-node Centera; IP address
SCALABILITY: Centera Virtual Archive: Vision
• Information policy management across a large virtual archive
• Serving up authenticated and secure information wherever needed
• Allowing content to migrate freely in and out of the archive
SCALABILITY: Centera Virtual Archive: Breakthrough Technology
• Ultimate scale, without disruption
• Seamlessly aggregate multiple clusters
• Virtualize new and existing clusters
• Increase single view capacity (PBs)
• Improved manageability at scale
• Applications interact with a single virtual environment
• Retrieve objects stored on any cluster
• Better resource utilization of available capacity
• Eliminate geographic boundaries
• Overcome the limitations of space and distance
SCALABILITY: Virtual Archive Federation
Technology agnostic: advancing the abstraction of archive implementation
Diagram labels: Gen2/Gen3 (CentraStar 4); Gen4 (LP) (CentraStar 4); GenX (CentraStarX ?); Centera Virtual Archive software; Centera Virtual Archive 1.0
• The lifetime of a digital archive far exceeds the life of any computer technology, whether hardware, software, or architecture
• Centera/VE is software, hardware, and architecture agnostic
• Older and newer technology live concurrently in a Federation
• Seamless CAS functionality for different technologies in one digital archive
• Support for different software versions
SCALABILITY: Adding Virtual Archive to an Existing Cluster
Diagram labels: Centera Virtual Archive software; Centera Virtual Archive 1.0
• Install the Centera cluster
• Install the Virtual Archive software
• For replication, install the target site in a similar way and enable replication
• The application communicates directly with the Virtual Archive software installed on the new cluster
• Virtual Archive redirects traffic to the existing cluster as needed
AVAILABILITY: Distributed Content for Business Continuity and Disaster Recovery
Diagram labels: Source and Target clusters, each holding Pool 1, Pool 2, and Pool 3
• Pools and replication
• Replicate selected Virtual Pools
• Replicate all Virtual Pools
AVAILABILITY: Centera Replication
Diagram labels: application server and LAN at each site, connected through routers
• Asynchronous over IP
• Unlimited distance
• Unidirectional, bidirectional, chain, or star
• Ability to "pause" replication
• No host or human resources
• People are not duplicating optical platters or WORM tapes
• The same content address exists in both clusters
A study by the University of Texas reveals that only 2% of companies suffering from a catastrophic data loss survive after one year. The ability to resume normal operations and productivity rapidly is a critical business requirement.
COMPLIANCE: Centera and Compliance
Centera Basic
• Provides all functionality without enforcement of retention periods
Centera Governance Edition
• Process-centric: focuses on the lifecycle of electronic records and on enabling policies and technologies
• Restricts the retention and deletion of data but does not conform to SEC regulations
• Suitable for most regulations
Centera Compliance Edition Plus (CE+)
• Designed for the strictest regulatory requirements, specifically SEC 17a-4
• Restricts the retention and deletion of data according to SEC regulations
COMPLIANCE: CentraStar: Guaranteed Object Lifetimes
CentraStar provides both fixed and variable retention periods, along with an on-demand legal-hold facility.
Timeline diagram (axis: Time, marked at "C-Clip Created" and "Event"): four example C-Clips, C0 to C3, showing fixed retention, event-based retention, and "event not specified" periods before "delete allowed"; a litigation hold can be set and removed.
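One way to read the retention rules on this slide is sketched below. The exact CentraStar semantics are not recoverable from the flattened diagram, so the model is an assumption: deletion is allowed only when no litigation hold is set, the fixed retention period has elapsed, and any event-based retention has been triggered and has run out. The `Clip` class and `delete_allowed` function are hypothetical names.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Optional

@dataclass
class Clip:
    created: datetime
    fixed_retention: timedelta = timedelta(0)
    event_retention: Optional[timedelta] = None  # None: no event-based retention
    event_time: Optional[datetime] = None        # None: event not yet specified
    litigation_hold: bool = False

def delete_allowed(clip: Clip, now: datetime) -> bool:
    """Assumed rule set; see the lead-in for what is and is not from the slides."""
    if clip.litigation_hold:
        return False
    if now < clip.created + clip.fixed_retention:
        return False
    if clip.event_retention is not None:
        if clip.event_time is None:
            return False  # event-based retention is open-ended until the event occurs
        if now < clip.event_time + clip.event_retention:
            return False
    return True

clip = Clip(
    created=datetime(2020, 1, 1),
    fixed_retention=timedelta(days=365),
    event_retention=timedelta(days=30),
)
before_event = delete_allowed(clip, datetime(2022, 1, 1))

clip.event_time = datetime(2022, 1, 1)
after_event = delete_allowed(clip, datetime(2022, 3, 1))

clip.litigation_hold = True
under_hold = delete_allowed(clip, datetime(2022, 3, 1))
print(before_event, after_event, under_hold)
```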
Centera: Meeting the Needs of Today's Production Archives
• Centera delivers:
  • A multibillion-object, long-term archive
  • Sub-second time to first byte
  • Assured lifetime content authenticity
  • Bulletproof content protection
  • Five-nines availability
  • Low TCO
• De facto standard:
  • Healthcare and e-mail archiving