Storage Solutions for PACS. What you do and don’t need! David Harvey With thanks to : Jacob Farmer, Cambridge Computer Inc (USA). Storage Issues Discussed. Capacity requirements Terminology Backup Cost. PACS Pricing Issues.
Storage Solutions for PACS What you do and don’t need! David Harvey With thanks to: Jacob Farmer, Cambridge Computer Inc (USA)
Storage Issues Discussed • Capacity requirements • Terminology • Backup • Cost
PACS Pricing Issues • In early PACS, storage was the major cost as disk was so expensive • Required storage space has increased massively (esp. multi-slice CT) • But disk is now massively cheaper • PACS is now like a RIS: the major cost is the complexity • But PACS is far simpler than a RIS and still more expensive!
How Much Storage do we Need? • Usage always increasing • Used to be dominated by CR • CT has taken over • MRI not far behind
Ways of Storing PACS Data • Disk • RAID (multiple attachment methods) • Optical • MOD • CD / DVD • Tape • Many sorts • Hierarchical Storage (mix of above)
The Disk “Variables” (Jargon Buster) • Type of Drive • Speed/Reliability/Heat & Vibration Tolerance • Local Interfacing • ATA/SATA/SCSI/SAS/Fibre Channel • Server Connection • Direct Connection (any one of above) • Block access: SAN/iSCSI • File Access: Fileserver/NAS • RAID Configuration
Drive Types • Desktop / Laptop Marketplace • Typ. 7200 rpm • Low heat and vibration tolerance • Normally ATA/SATA • Enterprise Marketplace • Typ. 10000-15000 rpm • Good heat and vibration tolerance • Designed to be “close mounted” • Normally SCSI/SAS/FC
Storage Interface Choices • SCSI – Now called “Parallel SCSI” • Familiar daisy-chained bus (various speeds/formats). • FCAL – Fibre Channel Arbitrated Loop • Familiar serial interface for SCSI in a loop topology. • SAS – Serial Attached SCSI • A new serial interface for SCSI drives and storage subsystems. • ATA – “AT Attached”, a.k.a. IDE • Familiar drive interface used for personal computers. • SATA – Serial ATA • A new incarnation of ATA that uses a serial interface. • Intrinsic speed differences are not great, but tend to be linked to the other factors listed on the previous slide
Serial v. Parallel Interfaces • Should make little or no difference to you. • You should buy the storage system that meets your needs and budget and not worry about the components inside. • Serial ATA v. Parallel ATA • Despite all of the market hype about SATA, the vast majority of ATA enterprise storage systems use the parallel interface not the SATA interface.
Types of Disk Interconnection • Simple Direct Connection to Server • Needs downtime to connect • Includes ATA, SATA, SCSI & SAS • File Server (Network Attached Storage) • Very flexible • You buy a new server with every disk set • Storage Area Network using Fibre Channel • Very flexible option, but expensive • Intended for server consolidation • iSCSI • SCSI via a network • Seems very promising but relatively new • Content Addressable Storage (CAS) • Proprietary filing system
File Servers are Great for PACS Storage • Network file serving protocols are very mature. • Everyone supports them. • NFS (Network File System) – UNIX • CIFS (Common Internet File System) – Windows • NetWare File System – Novell NetWare • By using network file servers you eliminate any interoperability issues. • If your PACS software works with a file server, it will work with any file server and any storage behind the file server.
What Do SANs Get You? • A better, more flexible way to plug in storage devices. • Great if you need a better, more flexible way to plug in storage devices. • Not such a big deal if you don’t need it. • Centralized storage administration • Great if you have a lot of things to manage. • Enables more powerful backup solutions • But you do have to buy these separately. • Only useful if doing “full” backups • BTW, you can do SAN backup without SAN disk
Who Needs a SAN? • SANs are designed to provide a single disk storage platform to a diverse and complex enterprise. • The more servers you have and the more often you are making moves and changes, the better your justification for a SAN. • When you buy a SAN, you are buying management and paying a premium for capacity. • Hospital IT departments are very good candidates for SANs. • Hospitals typically have many vendors and they struggle to maintain storage standards. • Sometimes hospitals have many SANs.
SANs Are Not Necessarily a Fit for PACS • Hospital IT is willing to pay a premium for SAN storage because they need the management features. • PACS storage consumption can drive up the cost of the SAN for the rest of hospital IT. • The cost of SAN storage might preclude the expansion of the PACS system and/or the adoption of new modalities.
Fibre Channel is Not Just for SANs • Use FC to connect multiple storage devices to a single server. • Add more disk without downtime. • Fibre Channel-Attached Disk Arrays
Something New: iSCSI • Actually, not that new. iSCSI solutions have been shipping for over 2 years. • Alternative to Fibre Channel • SCSI in a star topology, but • SCSI runs over TCP/IP and Ethernet instead of Fibre Channel • Surprising performance. Many iSCSI systems out-perform Fibre Channel systems. • Adds new confusion to the term “network-attached storage”.
CAS – Content Addressable Storage • Proprietary technology from EMC although others have borrowed the concepts. • 3rd party products on the market today. • EMC is targeting the jukebox marketplace. • The fact that it’s “content-addressable” is not terribly relevant to PACS. What is relevant: • Data is not addressed in the conventional way of Drive:\Directory\Subdirectory\File.Ext • Pros - harder to hack and easier to manage expansion • Cons - vendor lock-in, in some cases performance • Capable of write-once-read-many (WORM)
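To make the "not addressed by Drive:\Directory\File" point concrete, here is a minimal sketch of the general content-addressing idea. It is not EMC's implementation: the class name ContentStore, the in-memory dictionary and the choice of SHA-256 are all assumptions made for illustration.

```python
import hashlib


class ContentStore:
    """Toy in-memory content-addressable store (illustrative only)."""

    def __init__(self):
        self._objects = {}

    def put(self, data: bytes) -> str:
        # The address is derived from the content itself, not from a path.
        address = hashlib.sha256(data).hexdigest()
        # Write-once behaviour: an existing object is never overwritten,
        # and identical content always maps to the same address.
        self._objects.setdefault(address, data)
        return address

    def get(self, address: str) -> bytes:
        data = self._objects[address]
        # Re-hashing on read detects silent corruption of the stored bytes.
        if hashlib.sha256(data).hexdigest() != address:
            raise ValueError("stored object does not match its address")
        return data
```

The PACS database would hold the returned address instead of a file path, which is why there is no directory tree to browse or reorganise when capacity is added.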
What is RAID? • Redundant Array of Inexpensive Disks • A means of being able to tolerate a disk failure • Uses a simple mathematical operation called “parity” • Various “levels” possible, but RAID 5 is the most common • Sacrifices some capacity to improve reliability
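The "parity" operation is a bytewise XOR. The sketch below shows, under simplifying assumptions, how one parity block lets you rebuild any single lost data block. The function names are hypothetical, and a real RAID 5 controller also rotates parity across the disks, which is not modelled here.

```python
from functools import reduce


def xor_blocks(blocks):
    """XOR equal-length byte blocks together, byte by byte."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


def make_parity(data_blocks):
    # The parity block is the XOR of all data blocks in the stripe.
    return xor_blocks(data_blocks)


def rebuild_missing(surviving_blocks, parity):
    # XOR of the survivors and the parity gives back the lost block,
    # because every surviving block cancels itself out.
    return xor_blocks(list(surviving_blocks) + [parity])
```

With three data blocks and one parity block, a quarter of the raw capacity goes to parity: this is the capacity sacrificed for reliability.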
Types of RAID/Disk • Desktop/DIY • Direct Attachment/Software RAID • ATA/SATA RAID • Commonly SCSI or FireWire externally • SCSI RAID • “Standard” very high quality RAID • Managed Solutions • Lots of bells and whistles!
“Enterprise Class” Disk Storage • “Enterprise” storage systems are designed and priced to address the complexities of an enterprise data centre. • Computers of different vintages • Computers running different operating systems • Different applications with different needs • Regular moves, adds, and changes • High data throughput • You pay a premium for management tools • The cost of storage management is often more than the storage itself.
TWO sorts of PACS storage requirements • For the “Database” • A Classic “enterprise” requirement • High throughput • Small Size (normally only a few GByte) • For the “Images” • Rapid Access • Reliability • Low Management costs • Ability to Expand as needed
PACS Image Storage is Easy • PACS applications tend to be storage smart • PACS storage architectures are relatively simple. • Most files (images) are only ever accessed a few times • Speed limit is the network, not the Disk • The only thing that makes PACS storage unusual is the never ending capacity demand • All you really need for PACS is a way to cope with ever-growing capacity.
The Perfect PACS Storage System • Fast enough • Not necessarily the fastest • System speed is more likely limited by network • Easy to back up • Live copy of data at an offsite location • Easy to expand without shutting the system down. • Inexpensive enough not to limit your capacity
How much “management” does a PACS archive need? • Indexing data • full management and recovery etc. • Image Data • Two or more secure and separated locations • A few backup copies of each image • Occasional “readability” audit
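An occasional "readability" audit can be as simple as re-reading every file and comparing it with a checksum recorded when the image was first stored. The following is a minimal sketch under that assumption. The manifest format (a dict of path to SHA-256 hex digest) and the function names are hypothetical, not part of any PACS product.

```python
import hashlib


def file_digest(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def audit(manifest):
    """Return the paths that are unreadable or no longer match the manifest.

    manifest: dict mapping file path -> expected SHA-256 hex digest
    (an assumed format, recorded when each image was archived).
    """
    failures = []
    for path, expected in manifest.items():
        try:
            if file_digest(path) != expected:
                failures.append(path)
        except OSError:
            # Missing or unreadable files count as audit failures too.
            failures.append(path)
    return failures
```

Running the same audit against each of the separated copies tells you which copy to repair from which.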
Hierarchical / Tiered Storage • As studies age they are migrated to less expensive and slower access media. • Online: Data resides on disk for fast, easy access. • Near line: Data resides on a jukebox where it can be easily accessed albeit with some latency. • Offline: Data is not automatically accessible. Some administrative process is needed to access the data.
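The migration rule behind tiered storage can be sketched as a function of study age. The two thresholds below (one year online, five years near line) are invented purely for illustration. The slides do not give any figures, and real HSM software may also consider last access time or prefetch requests.

```python
from datetime import date

# Assumed thresholds, for illustration only.
ONLINE_DAYS = 365
NEARLINE_DAYS = 5 * 365


def storage_tier(study_date: date, today: date) -> str:
    """Pick a storage tier from the age of a study."""
    age_days = (today - study_date).days
    if age_days <= ONLINE_DAYS:
        return "online"    # on disk, fast access
    if age_days <= NEARLINE_DAYS:
        return "nearline"  # in a jukebox, some latency
    return "offline"       # needs an administrative step to retrieve
```

Even this toy version needs something to track which tier each study is in, which is the "complexity cost" of the approach.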
Shortcomings of HSM & Jukeboxes • Mechanical devices. • Slow access • Maintenance headaches • Jukebox model makes you pay a lot up front. • Locks you in to today’s cost of storage. • Tape & MOD are having a hard time keeping pace with the declining cost of disk. • “Complexity Costs” • Software to keep track of what is where • Interfaces to trigger prefetch requests
Eliminating the Jukebox • Many organizations are choosing to eliminate the jukebox from tiered storage. • A few options • Keep tiered storage model and use CAS instead of jukebox. • Skip the tiered storage model and do everything on disk arrays. • Mirror your disk devices for fault tolerance and run a separate tape backup. • If you currently have HSM, consider moving the tape jukebox from HSM role to a backup role.
Backup vs HSM (Nearline/Offline) • Backup is for disaster recovery • You hope never to have to use it • Tape is great for backup • Do not confuse Backup and HSM • HSM makes poor backup: • It is on-line and susceptible to viruses etc. • And many people keep their jukeboxes in the same room as their RAID!
You Need Tape Backup • Bad things happen to data all the time. • Natural disaster • User error • Computer Viruses • Sabotage • You should have an off-line, off-site copy of your data.
“Normal” Tape Backup • Tape rotation • Full backups on the weekend • Incremental backups during the week • Full backup again the next weekend • This approach is inefficient for files that don’t change. • A file that has not changed in 5 years gets backed up 260 times!
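The 260 figure is simply one full backup per week for five years. As a worked check (the function name is hypothetical):

```python
WEEKS_PER_YEAR = 52


def full_backup_copies(years_unchanged, fulls_per_week=1):
    """How many times an unchanged file is copied by weekly full backups."""
    return years_unchanged * WEEKS_PER_YEAR * fulls_per_week
```

For example, full_backup_copies(5) gives the 260 copies quoted above.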
Tape Rotation Does Not Work for PACS • Too much data to perform full backups regularly • On average, a conventional NAS or file server will back up at a rate of 2TB per day. • If you have 20TB, that’s 10 days to do a full backup. • Tape hardware requirements are excessive. • 20TB = 100 LTO-2 Tapes
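Both figures follow from simple division. The sketch below reproduces them, assuming decimal units (1 TB = 1000 GB) and the 200 GB native, uncompressed capacity of an LTO-2 cartridge. The function names are hypothetical.

```python
import math

LTO2_NATIVE_GB = 200  # native (uncompressed) capacity of one LTO-2 tape


def full_backup_days(archive_tb, rate_tb_per_day):
    """Days needed to stream the whole archive to tape once."""
    return archive_tb / rate_tb_per_day


def tapes_needed(archive_tb, tape_capacity_gb=LTO2_NATIVE_GB):
    """Number of tapes for one full copy, rounded up to whole tapes."""
    return math.ceil(archive_tb * 1000 / tape_capacity_gb)
```

Here full_backup_days(20, 2) is 10 days and tapes_needed(20) is 100 tapes, and that is for a single full backup, before any rotation.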
How Do You Back Up PACS? • Backup software that does not need to do full backups regularly. • Incremental forever • Synthetic Full Backup • Tapes are not reused • Ideally they go directly into a fire-safe and then off-site
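The core of an "incremental forever" scheme is selecting only what has changed since the previous run. The sketch below does this by file modification time, which is an assumption made for brevity. Commercial backup software typically keeps its own catalogue instead, and the function name here is hypothetical.

```python
import os


def files_changed_since(root, last_backup_time):
    """List files under root modified after the previous backup run.

    last_backup_time is a POSIX timestamp (seconds since the epoch).
    """
    changed = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            if os.path.getmtime(path) > last_backup_time:
                changed.append(path)
    return sorted(changed)
```

Because PACS images are rarely modified after they are written, each run copies little more than the newly acquired studies.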
On-Line Redundancy • Even RAID can fail • Backups take a long time to restore • Mirrored storage is a great way to stay “live” when failures occur • Can be done at many different “levels” • Hardware/File Servers (transparent to OS) • Operating System (with clustering) • Application (PACS Server) • Multiple Application (regional archive)
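At the application level, mirroring amounts to writing every image to two (or more) storage roots and reading from whichever one answers. This is a minimal sketch of that idea only. The function names and the use of plain directory paths as "storage roots" are assumptions, and a real PACS server would add verification, retries and resynchronisation.

```python
import os


def mirrored_write(data, relative_path, roots):
    """Write the same bytes under every storage root."""
    for root in roots:
        target = os.path.join(root, relative_path)
        os.makedirs(os.path.dirname(target), exist_ok=True)
        with open(target, "wb") as handle:
            handle.write(data)


def read_any(relative_path, roots):
    """Return the file from the first storage root that can supply it."""
    for root in roots:
        try:
            with open(os.path.join(root, relative_path), "rb") as handle:
                return handle.read()
        except OSError:
            continue  # this copy is unavailable, try the next mirror
    raise FileNotFoundError(relative_path)
```

If the roots are file-server mounts in different buildings, losing one server leaves the archive readable while it is repaired.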
Put Storage Devices in Different Buildings • If you have the network connections, you can put the two file servers in different buildings. • Ideally, put the tape system in a different building. • Still take tapes off site. • The data is too valuable.
Commercial/Cost Issues • How do you buy your PACS Image storage • From the PACS Vendor • From a 3rd Party • “Consolidated” with other IT systems in the hospital • DIY • What sort of contract should you have?
How Much should we pay? • Storage should now be one of the cheapest components of a PACS:
Buy Database Server from PACS Vendor • Have your PACS vendor supply the database server, DICOM servers, and other specialty computers. • It will simplify support. Eliminate finger pointing. • Buy file storage from the open market • Take advantage of commodity prices. • Easier cost justification for mirrored storage. • Use conventional file server technology • Industry standard stuff. • Your PACS vendor is out of touch if they refuse to support you using a standard file server.
The PACS pricing Problem • PACS pricing is still seen as based on storage space • So vendors make price proportional to space, at huge markups • They may in fact be undercharging for the “core” components and their expertise • RAID increases sometimes tied to upgrades • Result is undersized systems and reduced reliability due to lack of mirroring
PACS Pricing Suggestions • Insist on the best possible RAID/disks/backup for the database • Specify “off-the-shelf” ATA RAID for your image storage at commodity prices with dual-site duplication • Only buy RAID when you need it, as it is constantly getting cheaper • Pay more for the core components to compensate PACS companies for the profit they were hoping to take on RAID upgrades once you realise that you haven’t got enough
Things to Ask your PACS/Storage Supplier • Do they truly understand the difference between PACS image storage and classical “enterprise” database requirements? • Will they add storage incrementally as you need it? • How much will they mark up the price compared to “industry standards”? • Can you add as much as you like when you like? • How many copies do they keep (and where)?
Conclusions • RAID is dirt cheap • PACS does not need “enterprise” RAID for Image Storage • You should be paying your PACS company properly for what they are good at (management and integration), not for what they are NOT good at (value for money storage) • A good contract on these principles will help you both.
Open Questions • How do we solve the confidence problem? (“No-one ever got sacked for buying the best”) • How does this all relate to CfH? • How long before PACS is cheaper than current RIS?