Google for storage within an enterprise

Ibrahim El-Dewak, 7/24/2004, Pace University, School of Computer Science and Information Systems

Background
• Google and other search engines cover the web and the internet, but nothing within an enterprise focuses on storage, uses user-specific context, and relates that context to the storage objects in the enterprise.
• The idea is to gather metadata (name, creation date, last-accessed date, size, owner, access rights) for storage objects/files within an enterprise and merge it with user profile data (name, job type, position, etc.), so that views can be made available to administrators and users based on their credentials and search criteria.
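The gather-and-merge idea can be sketched as follows. This is a minimal illustration, not the presentation's design: the `FileMetadata` and `UserProfile` records, the `gather_metadata` and `view_for` functions, and the rule "admins see everything, other users see only files they own" are all assumptions made for the example. Metadata is read with the standard `os.stat` call.

```python
import os
from dataclasses import dataclass
from datetime import datetime, timezone


@dataclass
class FileMetadata:  # hypothetical record for one storage object
    path: str
    size: int
    owner_uid: int
    mode: int
    modified: datetime
    accessed: datetime


@dataclass
class UserProfile:  # hypothetical profile record
    name: str
    uid: int
    job_type: str
    is_admin: bool = False


def gather_metadata(root):
    """Walk a directory tree and collect one metadata record per file."""
    records = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for filename in filenames:
            path = os.path.join(dirpath, filename)
            st = os.stat(path)
            records.append(FileMetadata(
                path=path,
                size=st.st_size,
                owner_uid=st.st_uid,
                mode=st.st_mode,
                modified=datetime.fromtimestamp(st.st_mtime, tz=timezone.utc),
                accessed=datetime.fromtimestamp(st.st_atime, tz=timezone.utc),
            ))
    return records


def view_for(user, records, name_contains=""):
    """Return the records this user may see whose file name matches the text."""
    # Assumed visibility rule: admins see everything, others only their own files.
    visible = [r for r in records if user.is_admin or r.owner_uid == user.uid]
    needle = name_contains.lower()
    return [r for r in visible if needle in os.path.basename(r.path).lower()]
```

A real deployment would presumably take profiles from a directory service and apply richer access rules; the point here is only that the view is a function of both the file metadata and the user's profile.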

Research Problem
As more and more data is deposited on storage devices, it is becoming increasingly difficult to locate previously saved files. People tend to forget and misplace files even within their own folders. How do you look for a file on a petabyte storage device? There is a need to reconsider how data should be organized, partitioned and stored.

Research Summary
• The research idea is to automate a way to “google” all the files on a large storage device, such as a NAS (Network Attached Storage) or Symmetrix device.
• Offer a simple search mechanism personalized to each user of the device.
• Gather metadata related to storage objects and merge it with user profile data.
• Generate data views based on user credentials.

Research Summary (continued)
• Going further, the data could be indexed and the views ranked based on file contents, file metadata, user profile and context from previous searches.
• Once metadata is available, a value may be assigned to each file storage object based on its file metadata and the user profile.
• Create policies based on the value assigned from the metadata.

Research Summary (continued)
• Policy can be used to ensure that files are stored as cost-effectively as possible.
• Policy can be enforced to meet data retention regulations and enterprise requirements.

Research Results (How to do it)
• Design an extensible tag search method and re-sorted data views based on user preferences. The research will revolve around data stream processing and tagging.
• Develop a method for hashing a file to a unique key and validating this key against what is already in the system (global store), with links to who owns the file.
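The hashing step in the second bullet could look roughly like this. The slides do not name a hash function or a store layout, so SHA-256, the `file_key` function and the in-memory `GlobalStore` class are assumptions for illustration. A cryptographic hash is also not strictly unique; it only makes collisions extremely unlikely.

```python
import hashlib
from collections import defaultdict


def file_key(path, chunk_size=1 << 20):
    """Hash a file's contents to a hex key, reading it in chunks."""
    digest = hashlib.sha256()  # assumed choice of hash function
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


class GlobalStore:
    """Hypothetical store: content key -> owners holding a file with that content."""

    def __init__(self):
        self._owners = defaultdict(set)

    def register(self, path, owner):
        """Record that `owner` holds the file; return (key, already_known)."""
        key = file_key(path)
        already_known = key in self._owners  # validate against what is stored
        self._owners[key].add(owner)
        return key, already_known

    def owners_of(self, key):
        """Return a copy of the set of owners linked to this key."""
        return set(self._owners.get(key, ()))
```

When two users register files with identical contents, the second call reports `already_known` as true and both users end up linked to the same key.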

Research Results (How to do it)
The real research here is to determine the appropriate method and keying to collect “hints” on files coming into the box and to cross-match these hints into a per-user search match store.
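One possible reading of the “hints” idea is a per-user inverted index that is filled as files arrive. The slides leave the method open, so everything below is an assumption: the hints are simply tokens from the file name, the extension and the owner, and `extract_hints` and `MatchStore` are hypothetical names.

```python
import os
import re
from collections import defaultdict


def extract_hints(path, owner):
    """Derive simple lowercase hints from the file name, extension and owner."""
    stem, ext = os.path.splitext(os.path.basename(path))
    hints = {t for t in re.split(r"[^a-z0-9]+", stem.lower()) if t}
    if ext:
        hints.add(ext.lower().lstrip("."))
    hints.add(owner.lower())
    return hints


class MatchStore:
    """Hypothetical per-user index: user -> hint -> set of file paths."""

    def __init__(self):
        self._index = defaultdict(lambda: defaultdict(set))

    def ingest(self, path, owner, visible_to):
        """Cross-match the file's hints into the store of each listed user."""
        hints = extract_hints(path, owner)
        for user in visible_to:
            for hint in hints:
                self._index[user][hint].add(path)

    def search(self, user, query):
        """Return the paths that match every query term for this user."""
        terms = query.lower().split()
        if not terms or user not in self._index:
            return set()
        user_index = self._index[user]
        result = None
        for term in terms:
            paths = user_index.get(term, set())
            result = set(paths) if result is None else result & paths
        return result
```

Hints drawn from file contents, or from context of previous searches, would slot into `extract_hints` without changing the store.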

Relevance and significance of the Research
• The idea of a “google” for storage takes advantage of the enterprise environment: unlike on the internet, the user profile is available and the employee’s job function is known, so both can be taken into account when listing or searching.
• Determine the highest ROI (return on investment) of sharing data storage.
• Provide the most cost-effective options for storage.
• Deliver maximum value at the lowest TCO (total cost of ownership).

Make storage most cost effective
Example: A trader’s transaction logs might be kept on high-speed storage such as a Symmetrix with RAID-1 for 30 days, then moved to a RAID-5 NAS device for 6 months, and then to ATA disks (cheap store) for 3 years before being migrated to tape. Conversely, traders might not be allowed to store MP3s at all, although a person in marketing working on advertisements might be allowed to store media files.
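The example policy can be written down as a small rule table. This is only a sketch under stated assumptions: the retention windows are read as consecutive periods, 6 months is approximated as 182 days and 3 years as 1095 days, and `tier_for`, `may_store` and the job-type strings are hypothetical names.

```python
import os
from datetime import timedelta

# Tiers from the example, in order; each window starts when the previous one ends.
TIERS = [
    ("Symmetrix RAID-1", timedelta(days=30)),
    ("NAS RAID-5", timedelta(days=182)),    # roughly 6 months (assumed day count)
    ("ATA disk", timedelta(days=3 * 365)),  # roughly 3 years (assumed day count)
]
FINAL_TIER = "tape"

MEDIA_EXTENSIONS = {".mp3"}         # file types restricted by job function
ALLOWED_MEDIA_JOBS = {"marketing"}  # job types allowed to store media files


def tier_for(age):
    """Return the storage tier for a file of the given age (a timedelta)."""
    remaining = age
    for name, window in TIERS:
        if remaining < window:
            return name
        remaining -= window
    return FINAL_TIER


def may_store(job_type, filename):
    """Return True if a user with this job type may store the file."""
    ext = os.path.splitext(filename)[1].lower()
    if ext in MEDIA_EXTENSIONS:
        return job_type in ALLOWED_MEDIA_JOBS
    return True
```

Under these assumptions a 10-day-old log stays on the Symmetrix, a 100-day-old log sits on the NAS, and a trader's attempt to store an `.mp3` file is rejected while a marketing user's is accepted.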

Related Work
“Metadata’s Role in a Scientific Archive”:

References
Metadata’s Role in a Scientific Archive. Judi Thomson, Dan Adams, Paula J. Cowley, Kevin Walker. Publication date: December 2003, pp. 27-34.
Visualising Document Content with Metadata to Facilitate Goal-directed Search. Mischa Weiss-Lijn, Janet T. McDonnell, Leslie James. University College London, London. Publication date: July 2001.
The Skinny on Metadata. Giovanni Flammia. IEEE Intelligent Systems. Publication date: July 1999, pp. 20-22.
A Visual Representation of Search-engine Queries and Their Results. Ratvinder Singh Grewal, Mike Jackson, Peter Burden, Jon Wallis. Publication date: June 2000, pp. 0352.
Tag Insertion Complexity. Yeates, I.H. Witten, D. Bainbridge. Publication date: March 2001, pp. 0243.

References (continued)
A Similarity Search Method of Time Series Data with Combination of Fourier and Wavelet Transforms. Kyoji Kawagoe, Tomohiro Ueda. Publication date: July 2002.
A Meta-search Method Reinforced by Cluster Descriptors. Yipeng Shen, Dik Lun Lee. Publication date: December 2001, pp. 0125.
An Efficient Hash-based Method for Discovering the Maximal Frequent Set. Don-Lin Yang, Ching-Ting Pan, Yeh-Ching Chung. Publication date: October 2001, pp. 511.
