
Hybrid Cloud Security: Replication and Direction of Sensitive Data Blocks

Glenn Michael Koch, Eric Drew
Advisor: Dr. XiaoFeng Wang · Mentor: Kehuan Zhang
School of Informatics and Computing, Indiana University, Bloomington, Indiana

INTRODUCTION

Processing of large-scale data sets in a cloud computing environment carries inherent security concerns (see FIGURE 2). Data sent out to public commodity servers is at greater risk of being compromised than data kept on local servers. A hybrid cloud solution separates sensitive data, which is confined to a private domain (the private cloud), from public data (the public cloud). This research addresses one component of the hybrid cloud security solution: the replication and direction of sensitive data with changing replica values. The task was to create and modify Java source code within the Hadoop Distributed File System (HDFS) to implement alternative replication factors, and then to test and verify that data was replicated to the proper domain based on its security tag.

HADOOP AND CLOUD COMPUTING

Hadoop is a set of open-source technologies that supports reliable and cost-efficient ways of dealing with large amounts of data [1]. The exponential growth of individual data footprints, as well as of the amount of data generated by machines [2], calls for a means to process that data. Hadoop handles large data sets by splitting them into subsets and distributing those subsets to multiple processors; the processors, working in parallel, handle the data at a much higher aggregate rate, and Hadoop then reassembles the partial outputs into a single result set (a toy illustration of this pattern appears in the first code sketch below). Complex data operations shifted to clusters of computers are known as clouds [3], and software such as Hadoop orchestrates the merging of these processes. Hadoop in its present form does not provide data security. It does provide data replication, for performance enhancement and fault tolerance, but it does not distinguish private from sanitized data. Our work modifies data replication and control in a hybrid cloud structure.

[FIGURE 1: Hadoop original structure. The client asks the namenode to allocate a block; the namenode sets up a data replication pipeline; the client transfers the data to a data node; and the data is replicated to further data nodes, all within a single cloud.]

HYBRID CLOUD DATA REPLICATION

Data replication in a secure hybrid cloud environment involves:
• Replicating data that is tagged sensitive only to private nodes, as identified in namenode metadata
• Replicating sanitized or public data to random nodes, either public or private, so as to provide optimum performance and fault tolerance

[FIGURE 3: Hybrid cloud structure. 1. The client requests block allocation from the namenode. 2. The namenode sets up the data replication pipeline. 3. The client transfers the data to a private data node. 4. Sanitized data is replicated to data nodes in either cloud, while sensitive data remains on private-cloud data nodes.]

EDITED HADOOP JAVA CODE

• The original Hadoop system was designed to work on a single cloud (FIGURE 1).
• Hadoop is therefore not designed to automatically detect sensitive data, and nothing prevents such data from becoming accessible from the public cloud.
• We modified the original Java code to distribute data over the public and private clouds while keeping data that is considered sensitive on the private cloud (FIGURE 3).
• The code makes two distinct calls, one to the public and one to the private cloud, distinguished by a true or false value; a simplified sketch of this selection logic appears in the code sketches below.
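The split / process-in-parallel / reassemble pattern described under HADOOP AND CLOUD COMPUTING can be illustrated with plain Java threads. This is only a toy model of the idea, not Hadoop code; the class, its inputs, and the word-count task are invented for illustration.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Toy model of Hadoop's split / parallel-process / reassemble pattern.
// Plain Java threads stand in for the cluster; this is not Hadoop code.
public class SplitProcessMerge {

    // Process one subset independently (here: count the words in one split).
    static Map<String, Long> countWords(List<String> words) {
        Map<String, Long> counts = new HashMap<>();
        for (String w : words) counts.merge(w, 1L, Long::sum);
        return counts;
    }

    public static void main(String[] args) throws Exception {
        // The "large" input, pre-split into subsets
        // (Hadoop would split a file into blocks).
        List<List<String>> splits = List.of(
                List.of("cloud", "data", "cloud"),
                List.of("data", "node", "data"));

        ExecutorService pool = Executors.newFixedThreadPool(splits.size());

        // Send each subset to its own worker.
        List<Future<Map<String, Long>>> partials = new ArrayList<>();
        for (List<String> split : splits) {
            partials.add(pool.submit(() -> countWords(split)));
        }

        // Reassemble the partial results into a single result set.
        Map<String, Long> merged = new HashMap<>();
        for (Future<Map<String, Long>> f : partials) {
            f.get().forEach((w, c) -> merged.merge(w, c, Long::sum));
        }
        pool.shutdown();
        System.out.println(merged); // {cloud=2, data=3, node=1} (order may vary)
    }
}
```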
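The placement rule under HYBRID CLOUD DATA REPLICATION and the true/false distinction under EDITED HADOOP JAVA CODE amount to the selection logic sketched below. This is a minimal, self-contained model of that logic, not the actual HDFS patch; in the real modification the decision sits inside the namenode, and every name here (DataNode, isPrivate, chooseTargets) is illustrative.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Simplified model of the replica-target selection described above.
// Not actual Hadoop code: all class and method names are illustrative.
public class HybridPlacementSketch {

    /** A datanode tagged as belonging to the private or the public cloud. */
    static class DataNode {
        final String host;
        final boolean isPrivate;          // true = private cloud, false = public cloud
        DataNode(String host, boolean isPrivate) {
            this.host = host;
            this.isPrivate = isPrivate;
        }
        @Override public String toString() { return host; }
    }

    /**
     * Choose replica targets for one block.
     * @param sensitive   security tag, read from namenode metadata
     * @param replication desired replication factor
     */
    static List<DataNode> chooseTargets(List<DataNode> cluster,
                                        boolean sensitive, int replication) {
        List<DataNode> candidates = new ArrayList<>();
        for (DataNode dn : cluster) {
            // Sensitive blocks may only land on private nodes;
            // sanitized/public blocks may land on any node.
            if (!sensitive || dn.isPrivate) {
                candidates.add(dn);
            }
        }
        Collections.shuffle(candidates);  // random spread for performance and fault tolerance
        return candidates.subList(0, Math.min(replication, candidates.size()));
    }

    public static void main(String[] args) {
        List<DataNode> cluster = List.of(
                new DataNode("private-1", true), new DataNode("private-2", true),
                new DataNode("public-1", false), new DataNode("public-2", false));
        System.out.println(chooseTargets(cluster, true, 2));   // private hosts only
        System.out.println(chooseTargets(cluster, false, 3));  // any mix of hosts
    }
}
```

Running main prints only private hosts for the sensitive request and a random mix of hosts for the sanitized one, mirroring the intended behavior: sensitive replicas stay in the private cloud, while sanitized replicas spread across both clouds.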
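For the alternative-replication-factor part of the task, note that the stock Hadoop client API already lets a caller adjust a per-file replication factor; the security tag, by contrast, lives in namenode metadata and is not settable this way. A brief example against the standard org.apache.hadoop.fs client API (the file path is hypothetical):

```java
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

// Setting a per-file replication factor through the stock HDFS client API.
// The path below is hypothetical; the sensitivity tag is not part of this
// public API and, in the modified code, lives in namenode metadata.
public class SetReplicationExample {
    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();  // reads core-site.xml / hdfs-site.xml
        FileSystem fs = FileSystem.get(conf);
        fs.setReplication(new Path("/private/records.txt"), (short) 2);
        fs.close();
    }
}
```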
REFERENCES
[1] http://atbrox.com/2010/02/17/hadoop/
[2] White, T. Hadoop: The Definitive Guide, p. 2.
[3] http://www.businessweek.com/magazine/content/07_52/b4064000281756.htm

FIGURE 2 source: Awareness, Trust and Security to Shape Government Cloud Adoption. Lockheed Martin, LM Cyber Security Alliance and Market Connections, Inc., April 2010.

This project is supported in part by NSF CNS-0716292.
