Storage Technology and Management W.lilakiatsakun
Storage Technology • JBOD (Just a Bunch Of Disks) • RAID (Redundant Array of Inexpensive Disks) • SSA (Serial Storage Architecture)
JBOD (Just a Bunch Of Disks) (2) • JBOD can be used as individual disks, in any RAID configuration, or as a concatenation (SPAN), depending on the Host Bus Adapter. • Concatenation or spanning of disks is a popular method for combining multiple physical disk drives into a single virtual disk. • It provides no data redundancy. • Disks are merely concatenated together so they appear to be a single large disk.
JBOD (Just a Bunch Of Disks) (3) • For example, JBOD could combine 3 GB, 15 GB, 5.5 GB, and 12 GB drives into a logical drive of 35.5 GB, which is often more useful than the individual drives separately.
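The concatenation example above can be sketched in a few lines of Python. This is only an illustration of the arithmetic: the function names `span_capacity` and `locate` are hypothetical, and a real Host Bus Adapter works in sectors rather than gigabytes.

```python
from bisect import bisect_right
from itertools import accumulate


def span_capacity(sizes_gb):
    """Capacity of a spanned (concatenated) volume: the sum of its members."""
    return sum(sizes_gb)


def locate(offset_gb, sizes_gb):
    """Map an offset in the spanned volume to (disk index, offset on that disk)."""
    if not 0 <= offset_gb < sum(sizes_gb):
        raise ValueError("offset is outside the spanned volume")
    ends = list(accumulate(sizes_gb))      # running end position of each disk
    disk = bisect_right(ends, offset_gb)   # first disk whose end lies past the offset
    start = ends[disk - 1] if disk else 0  # where that disk begins in the span
    return disk, offset_gb - start


sizes = [3, 15, 5.5, 12]          # the drives from the example, in GB
total = span_capacity(sizes)      # 35.5
where = locate(20, sizes)         # (2, 2): 2 GB into the third drive
```

Because the disks are simply laid end to end, losing any one of them leaves a hole in the logical drive, which is why spanning provides no redundancy.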
Redundant Array of Inexpensive Disks (RAID) • The organization distributes the data across multiple smaller disks, offering protection from a crash that could wipe out all data on a single, shared disk. • The benefits depend on the RAID level.
RAID 0 (stripe set or striped volume) • RAID Level 0 splits data evenly across two or more disks (striped) with no parity information for redundancy. • It is important to note that RAID 0 provides zero data redundancy. • RAID 0 is normally used to increase performance. • A RAID 0 can be created with disks of differing sizes, but the storage space added to the array by each disk is limited to the size of the smallest disk.
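A minimal sketch of the two RAID 0 rules above, assuming simple round-robin striping with a stripe unit of one block. The function names are hypothetical and the stripe unit size is an assumption; real controllers usually stripe in larger chunks.

```python
def raid0_capacity(sizes):
    """Usable capacity: every member contributes only as much as the smallest disk."""
    return len(sizes) * min(sizes)


def raid0_locate(block, n_disks):
    """Map a logical block number to (disk index, block offset on that disk)."""
    return block % n_disks, block // n_disks


capacity = raid0_capacity([100, 120, 80])   # 240: three disks limited to 80 each
position = raid0_locate(5, 3)               # (2, 1): third disk, second stripe row
```

Consecutive logical blocks land on different disks, which is where the performance gain comes from, and also why the loss of a single disk damages every file larger than one stripe.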
RAID0 – Summary (1) • RAID 0 uses a very simple design and is easy to implement with a HUGE performance advantage. • I/O performance is greatly improved by spreading the I/O load across many channels and drives while the best performance is achieved when data is striped across multiple controllers with only one drive per controller.
RAID 0 – Summary (2) • No parity calculation overhead is involved. • Not a "true" RAID because it is NOT fault-tolerant. • The failure of just one drive will result in all data in an array being lost.
RAID 1 (mirroring) • A RAID 1 creates an exact copy of a set of data on two or more disks. • This is useful when read performance or reliability is more important than data storage capacity. • Such an array can only be as big as the smallest member disk. • A classic RAID 1 mirrored pair contains two disks, which increases reliability.
RAID 1 – Summary (1) • RAID Level 1 requires a minimum of 2 drives to implement. • 100% redundancy of data means no rebuild is necessary in case of a disk failure, just a copy to the replacement disk. • Transfer rate per block is equal to that of a single disk. • Simplest RAID storage subsystem design.
RAID1 – Summary (2) • Highest disk overhead of all RAID types - inefficient due to the duplication of Write tasks. • Typically the RAID function is done by system software, loading the CPU/Server and possibly degrading throughput at high activity levels. • Hardware implementation is strongly recommended. • May not support hot swap of failed disk when implemented in "software".
RAID 0+1 (A Mirror of Stripes) • RAID Level 0+1 is implemented as a mirrored array whose segments are RAID 0 arrays. • RAID Level 0+1 requires a minimum of 4 drives to implement.
RAID 10 (A Stripe of Mirrors) • RAID 10 is implemented as a striped array whose segments are RAID 1 arrays. • RAID Level 10 requires a minimum of 4 drives to implement.
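The practical difference between the two nestings shows up when disks fail. The sketch below is an illustration rather than anything stated on the slides: it assumes a four-disk array grouped as (0, 1) and (2, 3), and counts how many of the six possible two-disk failures each layout survives. The function names are hypothetical.

```python
from itertools import combinations


def raid10_survives(failed, mirror_pairs):
    """A stripe of mirrors survives if every mirrored pair keeps at least one disk."""
    return all(any(d not in failed for d in pair) for pair in mirror_pairs)


def raid01_survives(failed, stripe_sets):
    """A mirror of stripes survives if at least one stripe set is fully intact."""
    return any(all(d not in failed for d in s) for s in stripe_sets)


groups = [(0, 1), (2, 3)]   # mirrored pairs for RAID 10, stripe sets for RAID 0+1
two_disk_failures = [set(c) for c in combinations(range(4), 2)]

raid10_ok = sum(raid10_survives(f, groups) for f in two_disk_failures)   # 4 of 6
raid01_ok = sum(raid01_survives(f, groups) for f in two_disk_failures)   # 2 of 6
```

Both layouts tolerate any single-disk failure and give the same usable capacity; under these assumptions the stripe of mirrors simply survives more of the double-failure cases.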
RAID 3 (Parallel access with a dedicated parity disk) • RAID Level 3 uses byte-level striping with a dedicated parity disk. • A RAID 3 array generally cannot service multiple requests simultaneously, because any single block of data is spread across all members of the set and resides at the same location on each. • So, any I/O operation requires activity on every disk.
RAID3 – Summary • Level 3 only requires one dedicated disk in the array to hold parity information. • The server's data is then striped across the remaining drives, usually one byte at a time. • The parity drive then keeps track of all the info on the striped drive(s) and uses it to restore info if the drive should fail. • Because of the parity information that is stored and because Write operations take place on a byte level, Read/Write operations often take longer than other RAID configurations.
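How the parity drive restores lost data can be sketched with XOR parity, the usual choice for single-parity RAID. This is a toy model under stated assumptions: disks are Python `bytes` objects, data is striped one byte at a time, and the names `raid3_write` and `raid3_rebuild` are hypothetical.

```python
from functools import reduce


def xor_bytes(*chunks):
    """XOR equally long byte strings together, position by position."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*chunks))


def raid3_write(data, n_data_disks):
    """Stripe data byte by byte across the data disks and compute the parity disk."""
    padded = data + bytes(-len(data) % n_data_disks)   # zero-pad to a full stripe
    disks = [padded[i::n_data_disks] for i in range(n_data_disks)]
    return disks, xor_bytes(*disks)


def raid3_rebuild(disks, parity, lost):
    """Recompute the contents of one failed data disk from parity and the survivors."""
    survivors = [d for i, d in enumerate(disks) if i != lost]
    return xor_bytes(parity, *survivors)


disks, parity = raid3_write(b"storage!", 3)
recovered = raid3_rebuild(disks, parity, lost=1)   # equals disks[1]
```

Every write touches the parity disk and every read of a block touches all data disks, which matches the slide's point that any I/O operation keeps the whole set busy.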
RAID5 (Independent access with distributed parity) • A RAID 5 uses block-level striping with parity data distributed across all member disks. • A minimum of 3 disks is generally required for a complete RAID 5 configuration. • In the example, a read request for block "A1" would be serviced by disk 0. • A simultaneous read request for block B1 would have to wait, but a read request for B2 could be serviced concurrently by disk 1
RAID 5 – Summary • Level 5 also relies on parity information to provide redundancy and fault tolerance, using independent data disks with distributed parity blocks. • Each entire data block is written onto a data disk; parity for blocks in the same rank is generated on Writes, recorded in a distributed location and checked on Reads. Compared to RAID 3, RAID 5 uses striping to spread parity information across multiple drives. • Requirements: RAID Level 5 requires a minimum of 3 drives to implement.
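The distributed parity layout can be sketched as a table of block labels. The rotation rule used here (parity starts on the last disk and moves one disk to the left per stripe) is an assumption: it is one common arrangement, chosen because on four disks it agrees with the read example above, where A1 and B1 share disk 0 and B2 sits on disk 1. The function name is hypothetical.

```python
def raid5_layout(n_disks, n_stripes):
    """Return one row per stripe; each row lists the block held by each disk."""
    rows = []
    for stripe in range(n_stripes):
        parity_disk = (n_disks - 1 - stripe) % n_disks   # parity rotates leftwards
        label = chr(ord("A") + stripe)
        row, data_index = [], 1
        for disk in range(n_disks):
            if disk == parity_disk:
                row.append(label + "p")                   # parity block of this stripe
            else:
                row.append(f"{label}{data_index}")        # next data block
                data_index += 1
        rows.append(row)
    return rows


layout = raid5_layout(4, 4)
# layout[0] == ['A1', 'A2', 'A3', 'Ap']
# layout[1] == ['B1', 'B2', 'Bp', 'B3']
```

Reads of A1 and B1 both go to disk 0 and must be serialized, while A1 and B2 sit on different disks and can be read at the same time.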
SSA (Serial Storage Architecture) (1) • Serial Storage Architecture (SSA) defines a high-performance serial link for the attachment of input/output devices. • It has been optimized for storage applications such as hard disk drives, host adapter cards, and array controllers. • SSA has many advantages over existing parallel interfaces such as the Small Computer Systems Interface (SCSI-2). • It uses compact cables and connectors, and it has better performance, connectivity, and reliability.
SSA (Serial Storage Architecture) (2) • The disk subsystem provides a peak data rate of 20 MB/s in each direction. • However, a typical loop configuration with one host adapter can provide a total sustained bandwidth of up to 80 MB/s, and higher speeds are becoming available. • The physical medium is usually a copper cable up to 20 meters long, but fiber optics can also be used for longer distances.
SSA (Serial Storage Architecture) (4) • Architecture overview • SSA is defined in three layers: • SSA-PH1 defines the electrical specifications, cables, and connectors. • SSA-TL1 is a general-purpose transport layer. It defines the transmission protocol, configuration, and error recovery. • SSA-S2P is a mapping of the SCSI-2 queuing model, command set, status, and sense bytes.
Storage Area Network • A SAN is a specialized, high-speed network attaching servers and storage devices • It is sometimes referred to as “the network behind the servers.” • A SAN introduces the flexibility of networking to enable one server or many heterogeneous servers to share a common storage utility, which may comprise many storage devices, including disk, tape, and optical storage.
SAN Components • SAN Connectivity: the connectivity of storage and server components, typically using Fibre Channel (FC) • SAN Storage: tape, RAID, JBOD (Just a Bunch Of Disks), SSA (Serial Storage Architecture) • SAN Servers: Windows, Unix, Linux, etc.
Switched Fabric • A fabric is an infrastructure specially designed to handle storage communications. • A typical Fibre Channel SAN fabric is made up of a number of Fibre Channel switches. • Today, all major SAN equipment vendors also offer some form of Fibre Channel routing solution, and these bring substantial scalability benefits to the SAN architecture by allowing data to cross between different fabrics without merging them.
Fibre Channel protocol (1) • FC0: The physical layer, which includes cables, fiber optics, connectors, pinouts etc. • FC1: The data link layer, which implements the 8b/10b encoding and decoding of signals. • FC2: The network layer, defined by the FC-PI-2 standard, consists of the core of Fibre Channel, and defines the main protocols.
Fibre Channel protocol (2) • FC3: The common services layer, a thin layer that could eventually implement functions like encryption or RAID. • FC4: The Protocol Mapping layer. Layer in which other protocols, such as SCSI, are encapsulated into an information unit for delivery to FC2.
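The cost of the 8b/10b encoding done at FC1 is easy to quantify: every 8 data bits travel as 10 bits on the wire, so one fifth of the line rate is coding overhead. The sketch below assumes a line rate of 1.0625 Gbaud, the nominal figure for 1 Gbit/s-class Fibre Channel; that number is not taken from the slides, and the function name is hypothetical. Frame headers and other protocol overhead are ignored.

```python
def payload_bit_rate(line_rate_baud):
    """Usable bits per second after 8b/10b coding (8 data bits per 10 line bits)."""
    return line_rate_baud * 8 // 10


LINE_RATE = 1_062_500_000                 # assumed line rate in baud
bits_per_second = payload_bit_rate(LINE_RATE)            # 850_000_000
megabytes_per_second = bits_per_second / 8 / 1_000_000   # 106.25
```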
Storage Management • Monitoring disk use: a disk monitor agent scans the server volumes to collect disk use information • Hierarchical storage management: files are archived according to certain criteria • Prevention against data loss: to protect against and recover from loss • Outsourcing storage management
Monitoring disk use • One or more of the following categories of information can be collected • Volumes (disks): total space, used and available • Directories: what they contain • Directory and file owners: who created them, who uses them, and when they were created
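A disk monitor agent of this kind can be sketched with the Python standard library. This is a minimal illustration, not a description of any particular product: the function names and report fields are hypothetical, owners are reported as numeric user IDs, and the modification time stands in for "when created" because creation time is not available portably.

```python
import os
import shutil
import tempfile


def volume_usage(path):
    """Total, used and available bytes of the volume that holds `path`."""
    usage = shutil.disk_usage(path)
    return {"total": usage.total, "used": usage.used, "available": usage.free}


def file_report(root):
    """Walk a directory tree and collect owner, size and timestamps per file."""
    report = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            full_path = os.path.join(dirpath, name)
            info = os.stat(full_path)
            report.append({
                "path": full_path,
                "owner_uid": info.st_uid,     # numeric owner (0 on Windows)
                "size": info.st_size,
                "modified": info.st_mtime,    # last content change
                "accessed": info.st_atime,    # last use, if the volume records it
            })
    return report


# Small demonstration on a throwaway directory.
with tempfile.TemporaryDirectory() as demo_root:
    with open(os.path.join(demo_root, "sample.txt"), "w") as handle:
        handle.write("hello")
    demo_report = file_report(demo_root)
```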
Hierarchical storage management • When disk space becomes exhausted, data files need to be backed up (as archive files or on backup tape) • Software tools (backup tools) • When a file system reaches a predefined threshold of X percent full, automated procedures are initiated that determine which files are eligible for archiving and are currently backed up • The file catalog is then updated to indicate that the files have been archived, and the files are deleted from the disk file system
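The threshold procedure can be sketched as a pure selection function. Everything specific here is an assumption made for illustration: the `FileEntry` fields, the idle-days eligibility rule, the least-recently-used ordering, and the choice to stop as soon as usage falls below the threshold.

```python
from dataclasses import dataclass


@dataclass
class FileEntry:
    name: str
    size: int                 # same unit as `used` and `capacity`
    days_since_access: int
    backed_up: bool


def select_for_archive(files, used, capacity, threshold_pct, min_idle_days):
    """Pick files to archive once the file system is at least threshold_pct full."""
    if used * 100 < threshold_pct * capacity:
        return []                                  # below the threshold: do nothing
    eligible = sorted(
        (f for f in files if f.backed_up and f.days_since_access >= min_idle_days),
        key=lambda f: f.days_since_access,
        reverse=True,                              # least recently used first
    )
    chosen = []
    for entry in eligible:
        if used * 100 < threshold_pct * capacity:
            break                                  # enough space has been freed
        chosen.append(entry.name)
        used -= entry.size
    return chosen


catalog = [
    FileEntry("a.dat", 100, 400, True),
    FileEntry("b.dat", 50, 200, True),
    FileEntry("c.dat", 300, 500, False),   # not backed up, so never eligible
    FileEntry("d.dat", 100, 10, True),     # used recently, so not eligible
]
to_archive = select_for_archive(catalog, used=900, capacity=1000,
                                threshold_pct=80, min_idle_days=90)
# to_archive == ['a.dat', 'b.dat']
```

A real tool would then move the selected files to archive storage, mark them as archived in the file catalog, and delete them from the disk file system.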
Prevention against data loss (1/2) • Data perspective • Backups sent off-site at regular intervals • Use a remote backup facility if possible to minimize data loss • Storage Area Networks (SANs) over multiple sites make data immediately available without the need to recover or synchronize it
Prevention against data loss (2/2) • Facility perspective • Surge protectors: to minimize the effect of power surges on delicate electronic equipment • Uninterruptible Power Supply (UPS) and/or backup generator • Fire prevention: more alarms, accessible extinguishers • Anti-virus software and other security measures
Techniques to prevent data loss • Mirroring • Disk mirroring: RAID 1 • Server mirroring: web / ftp / email • On-site data storage • Backup: tape / optical disk • Off-site data storage (backup site) • Cold sites • Warm sites • Hot sites
Mirroring • Mirroring can occur locally or remotely. • Locally means that a server has a second hard drive that stores data. • A remote mirror means that a remote server contains an exact duplicate of the data. • Data is written to the original drive when a write request is issued and then copied to the mirrored drive, providing a mirror image of the primary drive.
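The write path described above can be sketched with two dictionaries standing in for the primary and mirrored drives. The class and method names are hypothetical, and the model ignores everything a real mirror has to handle, such as partial writes and resynchronization after a network outage.

```python
class MirroredVolume:
    """Toy mirror: each write goes to the primary first, then to the mirror."""

    def __init__(self):
        self.primary = {}
        self.mirror = {}
        self.primary_failed = False

    def write(self, block, data):
        self.primary[block] = data   # the original drive is written first
        self.mirror[block] = data    # then the data is copied to the mirror

    def read(self, block):
        source = self.mirror if self.primary_failed else self.primary
        return source[block]

    def fail_primary(self):
        self.primary = {}            # simulate losing the primary drive
        self.primary_failed = True

    def replace_primary(self):
        self.primary = dict(self.mirror)   # a plain copy; no parity rebuild needed
        self.primary_failed = False


volume = MirroredVolume()
volume.write(0, b"payroll")
volume.fail_primary()
still_readable = volume.read(0)    # b'payroll', served from the mirror
volume.replace_primary()
```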
Disk mirroring (RAID1) • The replication of logical disk volumes onto separate physical hard disks in real time to ensure continuous availability, currency and accuracy. • A mirrored volume is a complete logical representation of separate volume copies
Server mirroring • Mirror sites are most commonly used to provide multiple sources of the same information, and are of particular value as a way of providing reliable access to large downloads. • Web server • To preserve a website or page, especially when it is closed or is about to be closed • Load balancing • Email server • To protect against loss of email information • FTP server • To allow faster downloads for users at a specific geographical location • Load balancing
Back up site • A backup site is a location where a business can easily relocate following a disaster, such as fire, flood, or terrorist threat. This is an integral part of the disaster recovery plan of a business. • A backup site can be another location operated by the business, or contracted via a company that specializes in disaster recovery services. • In some cases, a business will have an agreement with a second business to operate a joint disaster recovery facility.
Cold Sites • A cold site is the least expensive type of backup site for a business to operate. • It provides office space in which to operate. • It does not include backed up copies of data and information from the original location of the business, nor does it include hardware already set up. • The lack of hardware contributes to the minimal startup costs of the cold site, but requires additional time following the disaster to have the operation running at a capacity close to that prior to the disaster.
Warm Sites • A warm site is a location where the business can relocate to after the disaster that is already stocked with computer hardware similar to that of the original site, but does not contain backed up copies of data and information.
Hot Sites • A hot site is a duplicate of the original site of the business, with full computer systems as well as near-complete backups of user data. • Ideally, a hot site will be up and running within a matter of hours. This type of backup site is the most expensive to operate. • Hot sites are popular with stock exchanges and other financial institutions who may need to evacuate due to potential bomb threats and must resume normal operations as soon as possible.
How to choose (1) • Choosing the type is mainly decided by a company's cost vs. benefit strategy. • Hot sites are traditionally more expensive than cold sites since much of the equipment the company needs has already been purchased and thus the operational costs are higher. • However if the same company loses a substantial amount of revenue for each day they are inactive then it may be worth the cost.
How to choose (2) • The advantage of a cold site is simple: cost. It requires far fewer resources to operate a cold site because no equipment has been bought prior to the disaster. • The downside of a cold site is the potential cost that must be incurred to make the cold site effective. • The costs of purchasing equipment on very short notice may be higher, and the disaster may make the equipment difficult to obtain.
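The cost versus benefit trade-off can be made concrete with a rough expected-cost comparison. All figures below are invented for illustration, and the model (a fixed yearly site cost plus expected losses from downtime and emergency purchases) is an assumption, not a method from the slides. The function name is hypothetical.

```python
def expected_annual_cost(site_cost, downtime_days, revenue_loss_per_day,
                         disasters_per_year, emergency_purchases=0):
    """Yearly site cost plus the expected yearly loss caused by disasters."""
    loss_per_disaster = downtime_days * revenue_loss_per_day + emergency_purchases
    return site_cost + disasters_per_year * loss_per_disaster


def cheaper_site(revenue_loss_per_day):
    """Compare a hypothetical hot site and cold site for a given daily loss."""
    hot = expected_annual_cost(500_000, 1, revenue_loss_per_day, 0.1)
    cold = expected_annual_cost(50_000, 14, revenue_loss_per_day, 0.1,
                                emergency_purchases=400_000)
    return "hot" if hot < cold else "cold"


small_business = cheaper_site(200_000)     # 'cold'
stock_exchange = cheaper_site(2_000_000)   # 'hot'
```

With these made-up numbers the cold site wins while a day of downtime is relatively cheap, and the hot site wins once the daily revenue loss is large enough to outweigh its higher running cost.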