Management of large scale Terabyte Store information servers. Third Indo-Australian Conference on Information Technology Security 10 th July 2007. S. Ramakrishnan ramki@cdac.in. WELCOME. Presentation Outline. Introduction Storage architectures Management of storage servers
Management of large scale Terabyte Store information servers Third Indo-Australian Conference on Information Technology Security 10th July 2007 S. Ramakrishnan ramki@cdac.in WELCOME
Presentation Outline • Introduction • Storage architectures • Management of storage servers • Security issues • Conclusion
Everything can be stored… • The total number of different books produced since printing began does not exceed one billion • If an average book has 500 pages at 2,000 characters per page, then even without compression it fits comfortably in one megabyte • One billion megabytes, or 10^15 bytes (one petabyte), is sufficient to store all books • At a commercial price of $20 per gigabyte, this much disk capacity could be purchased for $20 million
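The arithmetic on this slide can be checked with a short sketch. All inputs come from the slide; the one-byte-per-character figure and the decimal (10^9 bytes) gigabyte are assumptions made here for the calculation.

```python
# Back-of-envelope check of the slide's figures.
PAGES_PER_BOOK = 500
CHARS_PER_PAGE = 2_000
BYTES_PER_CHAR = 1              # assumption: one byte per character, no compression
TOTAL_BOOKS = 1_000_000_000     # "does not exceed one billion"
PRICE_PER_GB_USD = 20           # commercial price quoted on the slide

bytes_per_book = PAGES_PER_BOOK * CHARS_PER_PAGE * BYTES_PER_CHAR
total_bytes = bytes_per_book * TOTAL_BOOKS
total_gb = total_bytes // 10**9  # assumption: decimal gigabytes
total_cost_usd = total_gb * PRICE_PER_GB_USD

print(f"bytes per book: {bytes_per_book:,}")
print(f"total bytes:    {total_bytes:,}")
print(f"total cost:     ${total_cost_usd:,}")
```

Under these assumptions one book is one megabyte, a billion books come to one petabyte, and the cost matches the $20 million quoted above.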
C-DAC PARAM • 2003 (PARAM Padma) • 5 TB of primary storage • 12 TB of backup and archival storage • 2007 (NextGen PARAM) • 200 TB of primary storage • 1 petabyte for backup / archival • Storage requirement increased 40 times over a period of 4 years • I/O bandwidth requirement of the WRF application is 10 GB/s for a 12 km domain • STORAGE REQUIREMENT DRIVEN BY SIZE & PERFORMANCE
Data Growth and Performance Need [Chart: data accumulation at a 50% annual data growth rate: 1X today, 1.5X in 1 year, 2.3X in 2 years, 3.4X in 3 years, 5.1X in 4 years, 7.6X in 5 years; performance demand series marked 50%, 75%, 113%, 169%, 253%]
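The data-accumulation multiples in this chart follow from compounding a 50% annual growth rate. The sketch below recomputes them; the function name is hypothetical, and the last line applies the same formula in reverse to the "40 times over 4 years" figure from the C-DAC PARAM slide to get the implied yearly factor.

```python
def growth_multiples(annual_rate, years):
    """Return the capacity multiple for year 0..years under compound growth."""
    return [(1 + annual_rate) ** n for n in range(years + 1)]

# Values read off the chart (today through year 5).
chart_values = [1.0, 1.5, 2.3, 3.4, 5.1, 7.6]
computed = growth_multiples(0.50, 5)

for year, (calc, shown) in enumerate(zip(computed, chart_values)):
    print(f"year {year}: computed {calc:.3f}X, chart shows {shown}X")

# 40x growth over 4 years implies a yearly factor of 40 ** (1/4).
implied_annual_factor = 40 ** (1 / 4)
print(f"implied yearly factor for 40x in 4 years: {implied_annual_factor:.2f}")
```

The computed multiples agree with the chart to within rounding, which suggests the chart is a plain compound-growth projection rather than measured data.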
Oil and Natural Gas Corporation (ONGC) • ONGC discovers oil wells through seismic data processing techniques and data visualization • A seismic survey generates huge volumes of data, with an approximate file size of 60 GB • Processed seismic data files can be in the range of 10-50 TB per survey location • Every survey carried out must be stored • Hence the storage requirement is ever growing
State Bank of India • 14,000 branches in total all over India • 11,000 offices connected to one centralized data center • Customer base larger than the population of Australia • Around 100 TB of primary storage • Security through • Compression / Encryption • Antivirus • Firewall and IDS
Storage: Indian Scenario • 3000 TB of storage was procured in 2003 • IDC had predicted that the Indian storage market would grow at an annual rate of 65 percent up to 2007 • SME, Telecom, BFSI, Manufacturing, and BPO have been the biggest spenders during this period • Storage management challenges: • Cost • Interoperability • Protection of data • Scalability
Types of Storage • Direct Attached Storage • Network Attached Storage • Storage Area Network
Storage Devices • Fixed Location • Storage arrays of hard disks • Tape Drive/Tape Library • Portable • USB flash disk • iPod / iPhone • Mobile phone with a camera and memory card • CD/DVD • Network
Storage Technologies • Media • Fibre Channel / SAS / SATA / FATA • DLT / SDLT / LTO tapes • Protocols • iSCSI (SCSI over IP) • iSER (iSCSI Extensions for RDMA)
Mapping Data to Architecture [Diagram: storage tiers by cost: High-End $40/GB, Midrange $20/GB, SATA $5/GB, Tape $0.5/GB; availability levels marked 100%, 99.999%, 99.9%; business need ranging from "High Performance, Time Critical" to "Cost Critical, High Capacity, Long Term"]
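To make the cost spread between tiers concrete, the sketch below prices the same capacity on each tier using the per-gigabyte figures from this slide. The 200 TB capacity is borrowed from the NextGen PARAM slide purely as an illustration, the function name is hypothetical, and decimal units (1 TB = 1,000 GB) are assumed.

```python
# Per-gigabyte prices as quoted on the slide (USD).
PRICE_PER_GB_USD = {
    "High-End": 40.0,
    "Midrange": 20.0,
    "SATA": 5.0,
    "Tape": 0.5,
}

def tier_costs(capacity_tb):
    """Cost of holding capacity_tb terabytes on each tier (1 TB = 1,000 GB assumed)."""
    capacity_gb = capacity_tb * 1000
    return {tier: capacity_gb * price for tier, price in PRICE_PER_GB_USD.items()}

costs = tier_costs(200)  # illustrative capacity, not a figure from this slide
for tier, cost in costs.items():
    print(f"{tier:<9} ${cost:,.0f}")
```

The 80-fold gap between the high-end and tape figures is the reason for mapping data to tiers by business need instead of keeping everything on the fastest storage.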
Data Access • Network • Wireless / mobile • Console • Physical
Follow Your Data… [Diagram labels: People & Things; Devices; Desktops; Laptops; Web Tier; App Tier; Data Base Tier]
Various Storage Servers [Diagram: App Servers, Main Frame, Web Servers and Data Base Servers connected through a Storage Network to High-End, Midrange, Low Cost Media and Archival storage]
Storage Management [Diagram labels: Process Management; Service Management Platform; Operational Management; Best Practices] Software building blocks for storage management: Storage Process Manager, Change and Configuration Management Database • Configuration management • Automated workload-based provisioning • Policy-based storage security validation
STORAGE SECURITY IS EXTREMELY IMPORTANT AS INFORMATION LIVES IN STORAGE
High Performance, Petascale Storage Components: Parallel File System • Three components: the client, a metadata server cluster (MDS), and a cluster of storage devices, such as network-attached disks or object storage devices (OSD) • Key concept: decoupling of metadata and data paths. Clients send all namespace operations, such as open(), to the MDS and all file I/O operations, such as read() and write(), to the storage devices
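The decoupling of metadata and data paths can be illustrated with a toy in-memory model. This is only a sketch: the class names, the round-robin striping and the 4-byte stripe size are assumptions made here and do not describe any particular parallel file system.

```python
class MetadataServer:
    """Handles namespace operations only; never sees file contents."""
    def __init__(self, num_devices):
        self.num_devices = num_devices
        self.layouts = {}  # path -> ordered list of storage device ids

    def open(self, path):
        # Hypothetical policy: stripe every file over all devices in order.
        if path not in self.layouts:
            self.layouts[path] = list(range(self.num_devices))
        return self.layouts[path]


class StorageDevice:
    """Stores stripes of file data; knows nothing about the namespace."""
    def __init__(self):
        self.objects = {}  # (path, stripe index) -> bytes

    def write(self, path, stripe_index, data):
        self.objects[(path, stripe_index)] = data

    def read(self, path, stripe_index):
        return self.objects[(path, stripe_index)]


class Client:
    def __init__(self, mds, devices, stripe_size=4):
        self.mds, self.devices, self.stripe_size = mds, devices, stripe_size

    def write(self, path, data):
        layout = self.mds.open(path)  # metadata path: ask the MDS for the layout
        count = 0
        for offset in range(0, len(data), self.stripe_size):
            device = self.devices[layout[count % len(layout)]]
            # Data path: bytes go straight to the storage device.
            device.write(path, count, data[offset:offset + self.stripe_size])
            count += 1
        return count  # number of stripes written

    def read(self, path, num_stripes):
        layout = self.mds.open(path)
        return b"".join(
            self.devices[layout[i % len(layout)]].read(path, i)
            for i in range(num_stripes)
        )


devices = [StorageDevice() for _ in range(3)]
mds = MetadataServer(num_devices=3)
client = Client(mds, devices)
stripes = client.write("/wrf/out.dat", b"0123456789")
restored = client.read("/wrf/out.dat", stripes)
print(stripes, restored)
```

In this model the MDS stays off the I/O path entirely, so aggregate bandwidth can grow with the number of storage devices. It also means each device must decide for itself whether a client request is authorized.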
High Performance, Petascale Storage Components: Parallel File System Petascale file systems are a much more challenging environment to secure for the following reasons: • Very large, distributed data • Huge number of clients • Many storage devices • Demanding I/O, measured in GBytes/s • Varying access patterns • Threat environment: by design, no implicit trust is placed in the clients
High Performance, Petascale Storage Components: Grid File System • Global logical namespace with UNIX-like directory structure • Federation of distributed storage systems • Support for heterogeneous data objects • Location-independent data • Data fetched on demand directly by applications through a standard interface like POSIX [Diagram: raw input data from NCEP, Maryland; processing at C-DAC, Pune; output to any other location]
High Performance, Petascale Storage Components: Grid File System • Very large, geographically distributed data of varying nature • Heterogeneous storage devices • Varying access patterns • Stringent operational policies • Every time data is requested it travels over the wire: a security challenge
Where really is my data? • When information is digitized and accessible over a network, it makes little sense to speak of its "location," although it is technically resident on at least one storage device somewhere, and that device is connected to at least one computer • If the information is available at multiple mirror sites, it is even less meaningful to speak of it being in a "place"
Storage Risks • Theft • Of laptops, computers, portable storage devices etc. • Loss • In transit / shipment • Unauthorized access / manipulation : Confidentiality and Integrity issue • External intrusion • From within the organization • Capture
Risk Classification Matrix [Diagram: 2×2 matrix with axes Threat Probability (Low to High) and Consequence (Low to High); cells labelled Low risk, Medium risk, High risk and Very high risk]
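A matrix of this kind amounts to a lookup from (threat probability, consequence) to a risk class. The sketch below shows one plausible reading. The placement of the two mixed cells cannot be recovered from the slide and is an assumption here, as is the function name.

```python
# Assumed cell placement: only the four labels come from the slide;
# which mixed cell is "Medium" and which is "High" is a guess.
RISK_MATRIX = {
    ("high", "high"): "Very high risk",
    ("high", "low"): "Medium risk",   # assumption
    ("low", "high"): "High risk",     # assumption
    ("low", "low"): "Low risk",
}

def classify_risk(threat_probability, consequence):
    """Look up the risk class for a (probability, consequence) pair."""
    key = (threat_probability.lower(), consequence.lower())
    if key not in RISK_MATRIX:
        raise ValueError("both arguments must be 'low' or 'high'")
    return RISK_MATRIX[key]

# Hypothetical example: an unencrypted backup tape lost in shipment might be
# rated as low probability but high consequence.
tape_loss = classify_risk("low", "high")
print(tape_loss)
```

An organization applying the matrix would substitute its own cell assignments and rating scale.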
Storage Security Administration The keys to successful security management • Smart objectives • Specific • Measurable • Achievable • Realistic • A Plan to implement these objectives • The Resources to carry out the plan • Accountability
Detailed Vulnerability Assessment An example with data that is visible via ‘http’ access
Data Theft • December 2002: The US Department of Defense’s TriCare system announced the theft of computers and files from its Phoenix offices. The stolen data included names, addresses, Social Security numbers, and other personally identifiable information such as diagnoses • January 2004: Airlines Reporting Corp. reported that two computers, one containing airline ticketing data, had been stolen. The stolen data included confidential customer information • May 2005: Long-distance carrier MCI investigated the loss of employee data after a laptop was stolen from an MCI financial analyst’s car. The laptop contained names and Social Security numbers of about 16,500 employees. A company spokesperson said the machine was password protected but didn’t indicate whether the employee data were encrypted
Data Loss • December 2004: Bank of America announced that tapes containing personal information on 1.2 million federal employees were lost in shipment. Customers affected included members of Congress • May 2005: TimeWarner reported that tapes containing personal information for 600,000 current and former employees were lost in shipment • June 2005: CitiFinancial notified some 3.9 million US customers that computer tapes containing information about their accounts—including Social Security numbers and payment histories—had been lost. Parent company Citigroup said that the courier UPS lost the tapes on their way to a credit bureau • July 2005: Boston’s Iron Mountain, the world’s largest data-archiving company, reported that it had misplaced data backup tapes belonging to City National Bank of Los Angeles
Data Capture • January 1968: The USS Pueblo, a US Navy spy ship gathering intelligence signals off the coast of North Korea, was captured, and a crew member died in the process of physically destroying critical information • April 2001: A US Navy EP-3E surveillance aircraft was forced to land in China after colliding with a Chinese F-8 fighter. Although the crew apparently succeeded in destroying information before capture, the incident highlighted the vulnerability of military data
Data Protection: Encryption • In Band Appliances • In Device • At Creation
Three Key Elements Needed for Data Encryption on Tape or Disk • “Crypto-Ready” Drive • Key Management Station • Key Transmission
Unauthorised access / manipulation • Packet sniffing -> gain account names / passwords • Website data manipulation • University database access and manipulation by student hackers • Patient prescription manipulation • Bank account tampering
Formulate IT Security & Policies • Defining Importance of Information • Threats: Employees, Hackers, Competitors • Vulnerabilities • Developing Information Security Policies and Procedures • Establishing Information Security Architecture
Secured IT Infrastructure • Solution Implementation as per policies • Solution Integration • Security Implementation Building Blocks: Encryption Solutions, Intrusion Detection, Vulnerability Scanning, PKI Solutions, Penetration Testing Tools, Firewall Solutions, Directory Services, VPN Solutions, Remote Access Solutions • Solutions Based on Business Specific Need • Security Architecture • Security Profiling • Storage Servers and Data Security Architecture Implementation
Conclusion • Management of large-scale, terabyte-store information servers is not going to be an easy task • Storage and data owners need to visualize all aspects of where and how far their data is going to travel • All access points, both physical and over the network, need to be studied • As far as possible, data needs to have a backup copy • All important data should be encrypted • A mechanism to auto-destruct data in case of intrusion or capture needs to be considered
Advanced Computing for Human Advancement Thank You! www.cdac.in