INTELLIGENT WAYS OF RETRIEVING & REPRESENTING CONTENT ON THE WEB

  1. INTELLIGENT WAYS OF RETRIEVING & REPRESENTING CONTENT ON THE WEB

    ADITYA BIR: aab2178@columbia.edu COMS E6125 WEB-ENHANCED INFORMATION MANAGEMENT Spring 2011 Prof. Gail Kaiser
  2. Information Overload
  3. Information Overload

    The Web is flooded with information. From this information, more information is created. The user becomes overwhelmed: there is too much information to be able to make a decision.
  4. Way to Filter Information
  5. Way to Filter Information

    Today, Web users don't just want to search for information on the Web; they want information to come to them. Various mechanisms have been proposed to extract, filter and present information to the user. Effectively extracting and organizing this information has been a challenge from the beginning.
  6. The Entire Process
  7. The Entire Process

    Searching, scanning, retrieving, extracting or filtering, presenting.
  8. THIN LINE BETWEEN INFORMATION RETRIEVAL AND INFORMATION EXTRACTION

    Information Retrieval. Definition: information retrieval (IR) is the part of computer science that studies the retrieval of information from a collection of written documents.

    Information Extraction. Definition: information extraction (IE) denotes any activity whose goal is to automatically identify and acquire pre-specified sorts of information or data from natural language texts, aggregate them, and store them in a unified and structured database.
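The difference between the two definitions can be made concrete with a minimal sketch. The three-document collection, the term-overlap ranking and the e-mail pattern below are assumptions invented for illustration; the slides do not prescribe any particular method. Retrieval returns whole documents that match a query, while extraction pulls a pre-specified kind of data out of the text into structured records.

```python
import re

# Hypothetical three-document collection, invented for illustration.
DOCS = {
    "d1": "Columbia University is located in New York.",
    "d2": "Contact the course staff at staff@example.edu for details.",
    "d3": "New York has many universities and many museums.",
}


def retrieve(query, docs):
    """Information retrieval: rank document ids by query-term overlap."""
    terms = set(query.lower().split())
    scores = {}
    for doc_id, text in docs.items():
        words = set(re.findall(r"[a-z]+", text.lower()))
        overlap = len(terms & words)
        if overlap:
            scores[doc_id] = overlap
    # Highest overlap first; ties broken by document id.
    return sorted(scores, key=lambda d: (-scores[d], d))


def extract_emails(docs):
    """Information extraction: collect e-mail addresses as structured records."""
    pattern = re.compile(r"[\w.+-]+@[\w-]+(?:\.[\w-]+)+")
    return [
        {"doc": doc_id, "email": match}
        for doc_id, text in docs.items()
        for match in pattern.findall(text)
    ]


if __name__ == "__main__":
    print(retrieve("new york university", DOCS))
    print(extract_emails(DOCS))
```

The retrieval function answers "which documents are about this?", whereas the extraction function answers "what are the values of this field?", which is the thin line the slide refers to.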
  9. WEB BASED AGGREGATION OF METADATA

    Comparison Aggregation; Relationship Aggregation.
  10. Comparison Aggregation
  11. Relationship Aggregation

    People maintain multiple bank accounts, and they don't want to remember every login. Relationship aggregation takes care of this hurdle: with one logon, a user's information can be retrieved automatically. E.g., Facebook and Google accounts.
  12. Technologies involved to make this happen

    Tagging, Mashups, RSS.
  13. TAGGING

    Tagging is a process by which users assign labels, in the form of keywords, to web objects in order to share, discover and recover them.
  14. TAGGING

    Consider the example of Columbia University in New York. If you had to organize information about Columbia University in a file system on your computer, you might file it in either of the following ways:

    articles\United States\University

    articles\United States\New York
  15. FOLKSONOMY

    Aggregation of tags creates a folksonomy. The word "folksonomy", a contraction of "folk taxonomy", was coined by Thomas Vander Wal. A folksonomy is a way in which a group of users who share a common vocabulary classify objects with similar tags. It enables easy searching and aggregation of metadata, which can be presented to users with similar search preferences.
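As a rough sketch of how tag aggregation produces a searchable folksonomy, the example below builds an inverted index from individual tag assignments. The users, URLs and tags are hypothetical sample data, and the index layout is one simple assumption rather than how any particular tagging site works.

```python
from collections import Counter, defaultdict

# Hypothetical (user, object, tag) assignments, invented for illustration.
ASSIGNMENTS = [
    ("alice", "http://www.columbia.edu", "university"),
    ("alice", "http://www.columbia.edu", "newyork"),
    ("bob", "http://www.columbia.edu", "University"),
    ("bob", "http://www.nyu.edu", "university"),
    ("carol", "http://www.nyu.edu", "newyork"),
]


def build_folksonomy(assignments):
    """Aggregate individual tag assignments into tag -> {object: count}."""
    index = defaultdict(Counter)
    for _user, obj, tag in assignments:
        # Lower-casing is the only normalization applied here.
        index[tag.lower()][obj] += 1
    return index


def search(index, tag):
    """Return objects carrying the tag, most frequently tagged first."""
    counts = index.get(tag.lower(), Counter())
    return [obj for obj, _count in counts.most_common()]


if __name__ == "__main__":
    folksonomy = build_folksonomy(ASSIGNMENTS)
    print(search(folksonomy, "university"))
```

No hierarchy is agreed on in advance: the same page can be found under "university" and under "newyork", unlike the single file-system path a folder layout would force.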
  16. ADVANTAGES OF FOLKSONOMY

    Folksonomy saves users time and effort. It has a large impact on communication and the sharing of information, as well as on personal organization. Groups of users do not need to agree on hierarchical rules to tag web content; they only need to understand the meaning of a tag to label similar material. Because the Web has become very social in nature, folksonomy enhances sharing: it has an underlying social-networking character built in.
  17. DISADVANTAGES OF FOLKSONOMY

    Ambiguous documents can be retrieved as a result of the irregular and unsynchronized use of vocabulary for tagging. Lack of synonym control can lead to different word forms being used for the same meaning.
  18. DELICIOUS & FLICKR

    Delicious was an online bookmarking website where users could bookmark documents and describe each bookmark with tags. Flickr, on the other hand, allowed users to tag photographs at the time of publishing. It also has a mechanism that allows friends and family to add tags to photographs, subject to the consent of the content's creator.
  19. MASHUPS

    About three years ago, a hacker named Paul Rademacher found a security hole in the Google Maps web application. Rather than disclosing the issue privately to Google, he built and published an exploit. Instead of landing him in a courtroom, his exploit, the housingmaps.com mashup, landed him a job with Google. Moreover, instead of patching the security hole, Google documented it and called it an API.
  20. MASHUPS

    Mashups are one of the components of Web 2.0. A mashup is a combination of application components, such as Web services, content and openly published APIs, used to dynamically extract information from two or more Web applications and create one integrated, intellectual, dynamic entity. Mashup designers aim to give web users ad hoc integration of a wide variety of applications, live data sources, services and rich navigation.
  21. MASHUPS
  22. TYPES OF MASHUPS

    Data mashups: bringing together and cross-referencing data from various web sources.

    Consumer mashups: different visualizations and data elements for more appealing consumption of information.

    Business mashups: internal combinations of company resources, often enhanced with external web services.
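A data mashup, the first type listed above, can be sketched as a join between two independent sources. The example below loosely echoes the housing-plus-maps idea mentioned earlier, but everything in it is an assumption for illustration: the listings, the address-to-coordinate table and the made-up coordinates stand in for what would normally be fetched from two separate web APIs.

```python
# Hypothetical records standing in for two independent web sources.
# Source A: rental listings. Source B: a geocoding lookup table.
# The coordinates are made-up illustrative values, not real geocoding output.
LISTINGS = [
    {"id": 1, "address": "116th St and Broadway", "rent": 2400},
    {"id": 2, "address": "Amsterdam Ave and 120th St", "rent": 1900},
    {"id": 3, "address": "Unknown Rd", "rent": 1500},
]
GEOCODES = {
    "116th St and Broadway": (40.8075, -73.9626),
    "Amsterdam Ave and 120th St": (40.8098, -73.9597),
}


def mash_up(listings, geocodes):
    """Cross-reference listings with coordinates, keyed on the address."""
    merged = []
    for listing in listings:
        coords = geocodes.get(listing["address"])
        if coords is None:
            # The second source has no data for this address; skip it.
            continue
        lat, lon = coords
        merged.append({**listing, "lat": lat, "lon": lon})
    return merged


if __name__ == "__main__":
    for pin in mash_up(LISTINGS, GEOCODES):
        print(pin)
```

The merged records are what a map widget would plot. Note that the result is only as complete and as trustworthy as the weaker of the two sources, which is the integrity concern raised under the disadvantages of mashups.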
  23. Advantages of Mashups

    Mashups enable you to leverage Web Parts effectively. They enable reuse of existing Web applications, which reduces the time and cost of rebuilding similar applications. Mashups are easy to create and lightweight by nature. They let users create customized applications of their own interest simply by integrating content from different sources.
  24. Disadvantages of Mashups

    Although mashups provide dynamic integration of information, they can by nature be very insecure, and the information presented to a user may raise various privacy and policy concerns. Although the number of APIs tends to grow over time, there are not many Web services available to support mashups. There are no standards for building mashups. The integrity of content extracted from a particular source cannot be guaranteed.
  25. YAHOO PIPES
  26. RSS (REALLY SIMPLE SYNDICATION)

    RSS is a collection of Web formats used for publishing updates from dynamic web sites, portals and services, such as blog entries, headlines, audio, video and other resources, in a standardized format. RSS is a kind of content aggregation in which a particular section of a website is shared among various other websites. Many operating systems are RSS-centric, giving users access to weather, news and sports updates right on their desktop.
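To illustrate what "a standardized format" buys the consumer, here is a minimal sketch that reads an RSS 2.0 document with Python's standard `xml.etree.ElementTree` module. The feed itself is a hypothetical sample embedded as a string; a real aggregator would download it from the publisher's feed URL.

```python
import xml.etree.ElementTree as ET

# Hypothetical RSS 2.0 feed, embedded so the sketch runs without a network.
SAMPLE_FEED = """<?xml version="1.0"?>
<rss version="2.0">
  <channel>
    <title>Example News</title>
    <link>http://example.com/</link>
    <description>Hypothetical feed used for illustration</description>
    <item>
      <title>First headline</title>
      <link>http://example.com/1</link>
      <pubDate>Mon, 07 Mar 2011 09:00:00 GMT</pubDate>
    </item>
    <item>
      <title>Second headline</title>
      <link>http://example.com/2</link>
      <pubDate>Tue, 08 Mar 2011 09:00:00 GMT</pubDate>
    </item>
  </channel>
</rss>"""


def parse_feed(xml_text):
    """Return the feed's items as a list of plain dictionaries."""
    root = ET.fromstring(xml_text)
    channel = root.find("channel")
    return [
        {
            "title": item.findtext("title", default=""),
            "link": item.findtext("link", default=""),
            "published": item.findtext("pubDate", default=""),
        }
        for item in channel.findall("item")
    ]


if __name__ == "__main__":
    for entry in parse_feed(SAMPLE_FEED):
        print(entry["published"], entry["title"], entry["link"])
```

Because every publisher uses the same `channel`/`item` structure, the same few lines can read any number of feeds, which is what makes gathering all information in one place practical.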
  27. Facebook News Feed
  28. Advantages of RSS

    When information is published on a website, it is automatically propagated to all subscribed users in a timely fashion. All information is in one place. RSS benefits the user as well as the publisher. The publisher can publish easily and does not have to maintain a database, which saves the cost of storing information. It saves search cost and time. It can be used for advertisement. It is noise-free and spam-free, because the user subscribes to the service.
  29. Disadvantages of RSS

    Web content publishers are unaware of the number of users using their information. RSS feeds can put a heavy load on the server. Users can get overwhelmed with information if it is not filtered appropriately.
  30. COMPARISON BETWEEN TAGGING (FOLKSONOMY), MASHUP AND RSS

    Social Web; Time and Search Costs; Security, Privacy and Policy Concerns; Intelligence; Information Retrieval, Extraction; Information Presentation or Content Aggregation; Which Technology is Better?
  31. COMPARISON BETWEEN TAGGING (FOLKSONOMY), MASHUP AND RSS

    Social Web. As the Web has become more social over time, tagging and RSS feeds have played an important role in making it so. Tagging, through folksonomy, groups together the tags of people with identical vocabulary. RSS helps in publishing and sharing information. Since RSS will eventually end up as part of mashup technology, mashups contribute indirectly to the social Web.
  32. COMPARISON BETWEEN TAGGING (FOLKSONOMY), MASHUP AND RSS

    Time and Search Costs. Tagging, mashups and RSS are all quite successful in saving time and search cost.
  33. COMPARISON BETWEEN TAGGING (FOLKSONOMY), MASHUP AND RSS

    Security, Privacy and Policy Concerns. Although mashups are quite popular among web users, they expose users to a variety of security risks. Mashup sites can in turn be used as data sources by other mashup websites, which makes it difficult to determine how each mashup component is being used. Since RSS is eventually a part of mashups and functions under a similar paradigm, it too raises security, privacy and policy concerns. Tagging has no impact on security, privacy and policy concerns.
  34. COMPARISON BETWEEN TAGGING (FOLKSONOMY), MASHUP AND RSS

    Intelligence. Tagging and RSS feeds have begun to exhibit more intelligent behavior, whereas mashups just aggregate Web components. Various algorithms are being implemented to search and filter tags based on folksonomy.
  35. COMPARISON BETWEEN TAGGING (FOLKSONOMY), MASHUP AND RSS

    Information Retrieval, Extraction. Mashups can be categorized as an information retrieval technology. Tagging facilitates information retrieval and, through folksonomy, further filtering and information extraction. RSS can be categorized as both an information retrieval and an information extraction technology.
  36. COMPARISON BETWEEN TAGGING (FOLKSONOMY), MASHUP AND RSS

    Information Presentation or Content Aggregation. Only mashups and RSS can be categorized as web content presentation technologies.
  37. COMPARISON BETWEEN TAGGING (FOLKSONOMY), MASHUP AND RSS

    Which Technology is Better? Mashups, RSS and tagging have all contributed to fetching information and presenting it to the web user faster and more effectively. Ranking one technology above another will not yield a strong, collaborative solution for enhancing information retrieval and presentation.
  38. SOLUTION FOR A BETTER WEB CONTENT RETRIEVAL AND PRESENTATION MECHANISM

    It is understood that using mashups, tagging or RSS independently will not provide an optimal solution. Focus on the advantages of each technology. With intelligent algorithms for clustering tags in a folksonomy, web content retrieval and filtering will become much quicker, and the retrieval of irrelevant information and noise will be reduced.
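The slides do not name a specific clustering algorithm, so the following is only one possible sketch of clustering tags in a folksonomy. It assumes that two tags belong together when the sets of objects they label overlap strongly (Jaccard similarity), and it groups tags by single-link merging with a union-find structure. The tag data and the 0.5 threshold are hypothetical.

```python
from itertools import combinations

# Hypothetical tag -> set of tagged object ids, invented for illustration.
TAGGED = {
    "nyc": {"a", "b", "c"},
    "newyork": {"a", "b", "d"},
    "university": {"e", "f"},
    "college": {"e", "f", "g"},
    "recipe": {"h"},
}


def jaccard(left, right):
    """Size of the intersection divided by size of the union."""
    union = left | right
    return len(left & right) / len(union) if union else 0.0


def cluster_tags(tagged, threshold=0.5):
    """Single-link clustering: merge tags whose similarity meets the threshold."""
    parent = {tag: tag for tag in tagged}

    def find(tag):
        # Follow parent links to the cluster representative (path halving).
        while parent[tag] != tag:
            parent[tag] = parent[parent[tag]]
            tag = parent[tag]
        return tag

    for first, second in combinations(tagged, 2):
        if jaccard(tagged[first], tagged[second]) >= threshold:
            parent[find(first)] = find(second)

    clusters = {}
    for tag in tagged:
        clusters.setdefault(find(tag), set()).add(tag)
    return sorted(sorted(group) for group in clusters.values())


if __name__ == "__main__":
    print(cluster_tags(TAGGED))
```

A search for one tag in a cluster can then be expanded to its neighbours, which partly compensates for the missing synonym control noted among the disadvantages of folksonomy. Comparing every pair of tags is quadratic, so a large folksonomy would need a more scalable method.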
  39. SOLUTION FOR A BETTER WEB CONTENT RETRIEVAL AND PRESENTATION MECHANISM

    Further, applying adaptive algorithms to RSS will cause specific information to be propagated to the user based on the user's behavior. Once the security, privacy and policy concerns of mashups are eliminated, all of this will eventually make web content retrieval, aggregation and presentation a stroll in the park.
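What an adaptive algorithm over RSS might look like is left open in the slides. As one hedged illustration, the sketch below assumes the only behavioral signal is which headlines the user clicked, builds a word-frequency interest profile from those clicks, and keeps only new headlines whose average word weight reaches a threshold. The class name, the scoring rule and the sample headlines are all hypothetical.

```python
import re
from collections import Counter


def tokenize(text):
    """Lower-case the text and split it into alphabetic words."""
    return re.findall(r"[a-z]+", text.lower())


class AdaptiveFeedFilter:
    """Hypothetical filter that learns word-level interests from clicks."""

    def __init__(self):
        self.interest = Counter()

    def record_click(self, title):
        # Every clicked headline raises the weight of the words it contains.
        self.interest.update(tokenize(title))

    def score(self, title):
        # Average learned weight per word; unseen words contribute zero.
        words = tokenize(title)
        if not words:
            return 0.0
        return sum(self.interest[word] for word in words) / len(words)

    def filter(self, titles, min_score=0.5):
        # Keep headlines at or above the threshold, best match first.
        kept = [title for title in titles if self.score(title) >= min_score]
        return sorted(kept, key=self.score, reverse=True)


if __name__ == "__main__":
    feed_filter = AdaptiveFeedFilter()
    feed_filter.record_click("Columbia wins basketball game")
    feed_filter.record_click("Basketball playoffs schedule")
    incoming = [
        "Basketball finals tonight",
        "Stock market update",
        "Columbia basketball schedule",
    ]
    print(feed_filter.filter(incoming))
```

The profile keeps adapting as more clicks are recorded, so the same feed yields different selections for different users. A practical version would also need stop-word removal and some decay of old interests, both omitted here.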
  41. THANK YOU