
Google for storage within an enterprise

Google for storage within an enterprise. Ibrahim El-Dewak, 7/24/2004, Pace University, School of Computer Science and Information Systems.


Presentation Transcript


  1. Google for storage within an enterprise. Ibrahim El-Dewak, 7/24/2004, Pace University, School of Computer Science and Information Systems

  2. Background
     • Google and other search engines cover the web and the internet, but nothing within an enterprise focuses on storage and uses user-specific context and how it relates to the storage objects in the enterprise.
     • The idea is to gather metadata (name, creation date, last-accessed date, size, owner access rights) for storage objects/files within an enterprise and merge it with user profile data (name, job type, position, etc.), so that views can be made available to administrators and users based on their credentials and search criteria.

  3. Research Problem
     As more and more data is deposited onto storage devices, it becomes increasingly difficult to locate previously saved files. People tend to forget and misplace files within their own folder environment. How do you look for a file on a petabyte storage device? There is a need to reconsider how data should be organized, partitioned and stored.

  4. Research Summary
     • The research idea is how to automate “googling” all the files on a large storage device such as a NAS (Network Attached Storage) or Symmetrix device.
     • Offer a simple search mechanism personalized to each user of the device.
     • Gather metadata related to storage objects and merge it with user profile data.
     • Generate data views based on user credentials.
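The gather, merge and view steps above can be sketched as follows. This is a minimal illustration, not the author's design: the `FileRecord` type, the `gather_metadata` and `build_view` functions, and the profile fields (`name`, `job_type`, `is_admin`) are all hypothetical names, and the credential rule (administrators see everything, other users see only their own files) is an assumption. Modification time stands in for creation date, since a portable creation time is not available from `os.stat`.

```python
import os
from dataclasses import dataclass


@dataclass
class FileRecord:
    """Metadata for one storage object (hypothetical record layout)."""
    path: str
    size: int
    modified: float
    accessed: float
    owner_uid: int


def gather_metadata(root):
    """Walk root and collect one FileRecord per file found."""
    records = []
    for dirpath, _dirs, files in os.walk(root):
        for name in files:
            path = os.path.join(dirpath, name)
            st = os.stat(path)
            records.append(
                FileRecord(path, st.st_size, st.st_mtime, st.st_atime, st.st_uid)
            )
    return records


def build_view(records, profiles, viewer_uid, query=""):
    """Merge file metadata with owner profiles and filter by credentials.

    Assumed rule: admins see every file, other users only their own.
    The query is a case-insensitive substring match on the file name.
    """
    viewer = profiles[viewer_uid]
    view = []
    for rec in records:
        if not viewer["is_admin"] and rec.owner_uid != viewer_uid:
            continue
        if query.lower() not in os.path.basename(rec.path).lower():
            continue
        owner = profiles.get(
            rec.owner_uid, {"name": "unknown", "job_type": "unknown"}
        )
        view.append(
            {
                "path": rec.path,
                "size": rec.size,
                "owner": owner["name"],
                "job_type": owner["job_type"],
            }
        )
    return view
```

In a real enterprise the profile data would come from a directory service rather than an in-memory dictionary, and the credential rule would follow the file system's access rights.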

  5. Research Summary (continued)
     • Going further, the data could be indexed and views ranked based on file contents, file metadata, user profile and context from previous searches.
     • As we work with metadata, a value may be assigned to each file storage object based on its file metadata and the user profile.
     • Create policies based on the value assigned to the metadata.

  6. Research Summary (continued)
     • Policies can be used to ensure that files are stored as cost-effectively as possible.
     • Policies can be enforced to meet data retention regulations and enterprise requirements.
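One way to picture a value assigned from metadata and user profile, and a policy built on it, is the sketch below. The slides do not give a scoring formula, so everything here is an assumption made for illustration: the `FileInfo` fields, the job weights, the recency decay, the tier thresholds and the tier names.

```python
from dataclasses import dataclass


@dataclass
class FileInfo:
    """Inputs to the value score (hypothetical fields)."""
    name: str
    days_since_access: int
    owner_job: str


# Assumed weights per job type; unknown jobs fall back to 0.3.
JOB_WEIGHT = {"trader": 1.0, "marketing": 0.6}


def file_value(info):
    """Score in (0, 1]: recently accessed files of heavily weighted jobs score highest."""
    recency = 1.0 / (1 + info.days_since_access / 30)
    job = JOB_WEIGHT.get(info.owner_job, 0.3)
    return recency * job


def apply_policy(info, age_days, retention_days):
    """Pick a storage tier from the value and check the retention period."""
    value = file_value(info)
    if value >= 0.5:
        tier = "fast"
    elif value >= 0.1:
        tier = "capacity"
    else:
        tier = "archive"
    # A file may only be deleted once its retention period has elapsed.
    return {"tier": tier, "may_delete": age_days >= retention_days}
```

The cost-effectiveness policy and the retention policy are kept as separate outputs so that a low-value file can move to cheap storage while still being protected from deletion.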

  7. Research Results (How to do it)
     • Design an extensible tag search method and re-sorted data views based on user preferences. The research will revolve around data stream processing and tagging.
     • Develop a method for hashing a file to a unique key and validating this key against what is already in the system (global store), with links to who owns the file.
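The hashing step can be illustrated with a content hash and an in-memory store. This is a sketch under assumptions: the slides do not name a hash function, so SHA-256 from Python's `hashlib` is swapped in, and the `GlobalStore` class and its method names are hypothetical. A cryptographic hash is collision-resistant rather than strictly unique, which is usually close enough for this purpose.

```python
import hashlib


def key_for_stream(stream, chunk_size=65536):
    """Hash a binary stream chunk by chunk and return its hex digest."""
    digest = hashlib.sha256()
    while True:
        chunk = stream.read(chunk_size)
        if not chunk:
            break
        digest.update(chunk)
    return digest.hexdigest()


class GlobalStore:
    """Maps each content key to the set of users who own a copy."""

    def __init__(self):
        self._owners = {}

    def register(self, stream, owner):
        """Record owner for the stream's content.

        Returns (key, is_new), where is_new is False when identical
        content is already in the store.
        """
        key = key_for_stream(stream)
        is_new = key not in self._owners
        self._owners.setdefault(key, set()).add(owner)
        return key, is_new

    def owners(self, key):
        """Return a copy of the owner set for key (empty if unknown)."""
        return set(self._owners.get(key, ()))
```

Reading in chunks keeps memory use flat for large files; a file opened with `open(path, "rb")` can be passed as the stream.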

  8. Research Results (How to do it)
     The real research here is to determine the appropriate method and keying to collect “hints” on files coming into the box and to cross-match these hints into a per-user search match store.
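As a rough picture of hints and a per-user match store, the sketch below derives hints from a file name and the owner's job type and indexes them per user. The slides leave the actual hint method open, so the hint format (`ext:` and `job:` prefixes), the `extract_hints` function and the `PerUserStore` class are assumptions made for illustration only.

```python
import os
import re
from collections import defaultdict


def extract_hints(filename, owner_job):
    """Derive a set of hints from a file name and the owner's job type."""
    stem, ext = os.path.splitext(filename)
    hints = set(re.findall(r"[a-z0-9]+", stem.lower()))
    if ext:
        hints.add("ext:" + ext.lstrip(".").lower())
    hints.add("job:" + owner_job)
    return hints


class PerUserStore:
    """Inverted index per user: hint -> set of file keys."""

    def __init__(self):
        self._index = defaultdict(lambda: defaultdict(set))

    def add(self, user, file_key, hints):
        """Cross-match an incoming file's hints into the user's index."""
        for hint in hints:
            self._index[user][hint].add(file_key)

    def search(self, user, *hints):
        """Return the file keys of this user that match every given hint."""
        index = self._index.get(user, {})
        matches = [index.get(hint, set()) for hint in hints]
        return set.intersection(*matches) if matches else set()
```

For example, after `store.add("alice", "k1", extract_hints("Q3_trade_log.csv", "trader"))`, the call `store.search("alice", "trade", "ext:csv")` returns the key `"k1"`, while the same search for another user returns nothing.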

  9. Relevance and Significance of the Research
     • The idea of a “google” for storage takes advantage of the enterprise environment, where, unlike on the internet, the user profile is available and the employee’s job function is known and can be taken into account when listing or searching.
     • Determine the highest ROI (return on investment) of sharing data storage.
     • Provide the most cost-effective options for storage.
     • Deliver maximum value at the lowest TCO (total cost of ownership).

  10. Make storage most cost-effective
     Example: A trader’s transaction logs might be kept on high-speed storage such as a Symmetrix with RAID-1 for 30 days, then moved to a RAID-5 NAS device for 6 months, and then to ATA disks (cheap store) for 3 years before being migrated to tape. Conversely, traders might not be allowed to store MP3s at all, although a person in marketing working on advertisements might be allowed to store media files.
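The example above can be written down as a small lifecycle table. This is only a sketch: the function and constant names are hypothetical, the three durations are read as consecutive stages, and 6 months and 3 years are approximated as 180 and 1095 days.

```python
import os

# (days spent in the tier, tier name), in order; durations are consecutive.
TRADER_LOG_LIFECYCLE = [
    (30, "Symmetrix RAID-1"),
    (180, "NAS RAID-5"),   # about 6 months (assumed 180 days)
    (1095, "ATA disk"),    # about 3 years (assumed 1095 days)
]
FINAL_TIER = "tape"

# Assumed rule from the example: traders may not store MP3s.
BLOCKED_EXTENSIONS = {"trader": {".mp3"}, "marketing": set()}


def tier_for_age(age_days, lifecycle=TRADER_LOG_LIFECYCLE, final=FINAL_TIER):
    """Return the tier a file of the given age should live on."""
    elapsed = 0
    for duration, tier in lifecycle:
        elapsed += duration
        if age_days < elapsed:
            return tier
    return final


def may_store(job, filename):
    """Check a file's extension against the job type's blocked list."""
    ext = os.path.splitext(filename)[1].lower()
    return ext not in BLOCKED_EXTENSIONS.get(job, set())
```

With these assumptions a 100-day-old log sits on the RAID-5 NAS, and anything 1305 days or older has been migrated to tape.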

  11. Related Work
     “Metadata’s Role in a Scientific Archive”:

  12. References
     • “Metadata’s Role in a Scientific Archive.” Judi Thomson, Dan Adams, Paula J. Cowley, Kevin Walker. Publication date: December 2003, pp. 27-34.
     • “Visualising Document Content with Metadata to Facilitate Goal-directed Search.” Mischa Weiss-Lijn, Janet T. McDonnell, Leslie James. University College London, London. Publication date: July 2001.
     • “The Skinny on Metadata.” Giovanni Flammia. IEEE Intelligent Systems. Publication date: July 1999, pp. 20-22.
     • “A Visual Representation of Search-engine Queries and Their Results.” Ratvinder Singh Grewal, Mike Jackson, Peter Burden, Jon Wallis. Publication date: June 2000, pp. 0352.
     • “Tag Insertion Complexity.” Yeates, I.H. Witten, D. Bainbridge. Publication date: March 2001, pp. 0243.

  13. References
     • “A Similarity Search Method of Time Series Data with Combination of Fourier and Wavelet Transforms.” Kyoji Kawagoe, Tomohiro Ueda. Publication date: July 2002.
     • “A Meta-search Method Reinforced by Cluster Descriptors.” Yipeng Shen, Dik Lun Lee. Publication date: December 2001, pp. 0125.
     • “An Efficient Hash-based Method for Discovering the Maximal Frequent Set.” Don-Lin Yang, Ching-Ting Pan, Yeh-Ching Chung. Publication date: October 2001, pp. 511.
