
The Evolving Semantic World

Barbara McGlamery, Taxonomist, Martha Stewart Living Omnimedia


Presentation Transcript


  1. The Evolving Semantic World • Barbara McGlamery • Taxonomist • Martha Stewart Living Omnimedia

  2. About me • Masters in Library and Information Science • Long Island University • New York Public Library • Branch librarian • NYPL for the Performing Arts – Drama reference • Entertainment Weekly • Data Manager • Time Inc. • Senior Data Manager, Taxonomist, Metadata Architect, Ontologist • Martha Stewart Living Omnimedia • Taxonomist

  3. Agenda • What is the Semantic Web? • Big “S” and little “s” semantics • What we used to believe • Time Inc. & the theory of overkill • What we know now • Martha Stewart and the theory that less is more • Where we’re going • Leaner and meaner (but more standards)

  4. What is the Semantic Web?

  5. The Semantic Web is a web of data…. (it) provides a common framework that allows data to be shared and reused across applications, enterprise, and community boundaries. --W3C

  6. “The Semantic Web is not a separate Web but an extension of the current one, in which information is given well-defined meaning, better enabling computers and people to work in cooperation.” --Tim Berners-Lee, James Hendler, and Ora Lassila, Scientific American, 2001

  7. The Semantic Web is about making knowledge machine- and human-readable

  8. --Amit Agarwal, http://www.labnol.org/internet/web-3-concepts-explained/8908/

  9. Big S Semantic Web • Little s semantic web

  10. Big S Semantic Web …big "S" web technologies provide a framework for describing data on a web page when the data on the website is published. If data is read or captured, because the data's semantic meaning has already been described, you don't have to go through the process of understanding the meaning of the data after the fact. --Sean Martin, CEO of Cambridge Semantics

  11. Little s Semantics Little "s" web technologies capture and filter data with no description or understanding of the data provided after the capture process. The process of understanding the meaning of that data starts once data capture has happened. People have to intervene to provide the context and meaning for language on the web. --Sean Martin, CEO of Cambridge Semantics

  12. Big S – W3C-approved standard • Little s – Looser groups of unaffiliated standards

  13. Big S semantics

  14. Essentials of Big S Semantic Web • URI – Uniform Resource Identifier • RDF – Resource Description Framework • OWL – Web Ontology Language • Semantic reasoner (inference engine)

  15. URI – Uniform Resource Identifier • Way to identify things • Images, pages of text, locations • De-referenceable • Freebase • http://www.freebase.com/view/en/will_smith • URIs are unique, no two are the same • Will Smith • http://www.freebase.com/view/en/will_smith

  16. RDF – Resource Description Framework • Framework used to describe relationships between objects • Extends and formalizes XML • Subject>Predicate>Object

  17. RDF – Resource Description Framework • Subject>Predicate>Object • Will Smith > is the lead actor in > Bad Boys • Subject: http://ew.com/PersonsTax/Will_Smith • Predicate: http://ew.com/EntertainmentOnt/leadPerformanceIn • Object: http://ew.com/EntertainmentTax/Movies/Bad_Boys
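
The Subject>Predicate>Object statement on this slide can be sketched in a few lines of Python. This is only an illustration, not how TOPICS or any RDF library stores data: the `Triple` type, the `objects` helper, and the `to_ntriples` helper are hypothetical names, and the graph is an ordinary Python set. The three URIs are the ones shown on the slide.

```python
from typing import NamedTuple


class Triple(NamedTuple):
    """One RDF-style statement: subject > predicate > object."""
    subject: str
    predicate: str
    obj: str


# URIs copied from the slide.
WILL_SMITH = "http://ew.com/PersonsTax/Will_Smith"
LEAD_PERFORMANCE_IN = "http://ew.com/EntertainmentOnt/leadPerformanceIn"
BAD_BOYS = "http://ew.com/EntertainmentTax/Movies/Bad_Boys"

# Assumption: a plain set stands in for a real triple store.
graph = {Triple(WILL_SMITH, LEAD_PERFORMANCE_IN, BAD_BOYS)}


def objects(graph, subject, predicate):
    """Return every object asserted for the given subject and predicate."""
    return sorted(
        t.obj for t in graph
        if t.subject == subject and t.predicate == predicate
    )


def to_ntriples(triple):
    """Write one triple in N-Triples style: three bracketed URIs and a dot."""
    return f"<{triple.subject}> <{triple.predicate}> <{triple.obj}> ."


for triple in graph:
    print(to_ntriples(triple))
```

Because every part of the statement is a URI, asking "what did Will Smith have a lead performance in?" becomes a lookup on two URIs (`objects(graph, WILL_SMITH, LEAD_PERFORMANCE_IN)`) instead of a text search.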

  18. OWL – Web Ontology Language …designed to be used by applications that need to process the content of information instead of just presenting it to humans -- W3C

  19. OWL – Web Ontology Language • Metadata model • Extends RDF to further define properties • Ex: Equivalent relationships • [Diagram: two "is married to" statements; the subjects and objects were images and are not in the transcript]

  20. Semantic reasoner • Software able to infer logical consequences from a set of asserted facts • Follows inference rules specified by OWL properties • Inverse • Transitive • Symmetric • Functional/Inverse functional • Equivalent
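
A toy forward-chaining loop shows what "inferring logical consequences from asserted facts" can look like for three of the property types listed above (symmetric, inverse, transitive). It is a minimal sketch, not Jena or any OWL reasoner. The `infer` function, the `hasLeadPerformer` and `subGenreOf` properties, and the `Person_A`, `Person_B`, and genre names are all hypothetical; only the Will Smith statement comes from the slides.

```python
def infer(facts, symmetric=(), inverse=None, transitive=()):
    """Apply the rules repeatedly until no new (s, p, o) facts appear."""
    inverse = dict(inverse or {})
    # An inverse pair works in both directions.
    inverse.update({v: k for k, v in list(inverse.items())})
    known = set(facts)
    while True:
        new = set()
        for s, p, o in known:
            if p in symmetric:            # A p B  =>  B p A
                new.add((o, p, s))
            if p in inverse:              # A p B  =>  B inverse(p) A
                new.add((o, inverse[p], s))
            if p in transitive:           # A p B, B p C  =>  A p C
                for s2, p2, o2 in known:
                    if p2 == p and s2 == o:
                        new.add((s, p, o2))
        if new <= known:                  # nothing new: fixed point reached
            return known
        known |= new


asserted = {
    ("Will_Smith", "leadPerformanceIn", "Bad_Boys"),   # from the slides
    ("Person_A", "marriedTo", "Person_B"),              # placeholder names
    ("Buddy_Cop", "subGenreOf", "Action"),              # hypothetical taxonomy
    ("Action", "subGenreOf", "Movies"),
}

inferred = infer(
    asserted,
    symmetric={"marriedTo"},
    inverse={"leadPerformanceIn": "hasLeadPerformer"},
    transitive={"subGenreOf"},
)

# Print only the facts the rules added.
for fact in sorted(inferred - asserted):
    print(fact)
```

The editor asserts four facts and the rules add three more, which is the practical appeal: relationships such as "Bad Boys has lead performer Will Smith" never have to be entered by hand.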

  21. Putting it all together • Ontology • Rule set • Classes and Properties • Taxonomy • Application of Rule Set • Tags and Relationships • Everything is a statement • Subject>Predicate>Object Ex: Will Smith is lead performer in Bad Boys

  22. Benefits of RDF/OWL • Persistent URIs • Verifiable XML • Unambiguous Relationships • Polyhierarchy • Interoperability

  23. Limitations of RDF/OWL • Difficult to propagate across web • Challenge to integrate with legacy systems • Expensive queries • No “Killer App”

  24. Semantic Web Layer cake

  25. Little s Semantics

  26. RDFa – Resource Description Framework in Attributes • W3C recommendation that adds a set of attribute-level extensions to XHTML for embedding rich metadata within Web documents • Easy to implement • Not HTML 5 compliant
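
To make "attribute-level extensions" concrete, the sketch below embeds two statements in an XHTML fragment using the RDFa attributes `about`, `property`, and `content`, then reads them back with Python's standard `html.parser`. The product page, the `ex:` vocabulary prefix, and the `RdfaExtractor` class are made up for illustration. It handles only this flat case and is not a conforming RDFa processor (real RDFa also requires the prefix to be declared and has rules for nesting).

```python
from html.parser import HTMLParser

# Hypothetical product fragment; only the attribute names are RDFa.
XHTML = """
<div about="http://example.com/products/tv-123">
  <span property="ex:name">Example 42-inch TV</span>
  <span property="ex:price" content="499.99">$499.99</span>
</div>
"""


class RdfaExtractor(HTMLParser):
    """Collect (subject, property, value) statements from a flat fragment."""

    def __init__(self):
        super().__init__()
        self.subject = None    # most recent about="..." value
        self.pending = None    # property still waiting for its text
        self.statements = []

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if "about" in attrs:
            self.subject = attrs["about"]
        if "property" in attrs:
            if "content" in attrs:
                # content="..." supplies the machine-readable value.
                self.statements.append(
                    (self.subject, attrs["property"], attrs["content"])
                )
            else:
                self.pending = attrs["property"]

    def handle_data(self, data):
        if self.pending and data.strip():
            self.statements.append((self.subject, self.pending, data.strip()))
            self.pending = None


parser = RdfaExtractor()
parser.feed(XHTML)
statements = parser.statements
for statement in statements:
    print(statement)
```

The visitor still sees "$499.99" on the page, while a crawler reads the value `499.99` tied to the product's URI, which is the same Subject>Predicate>Object shape as full RDF.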

  27. RDFa: Best Buy

  28. Linked open data 2007 “Linking Open Data cloud diagram, by Richard Cyganiak and Anja Jentzsch. http://lod-cloud.net/”

  29. Linked Open Data 2010 “Linking Open Data cloud diagram, by Richard Cyganiak and Anja Jentzsch. http://lod-cloud.net/”

  30. Microformats • Semantic markup which seeks to re-use existing HTML/XHTML class attributes to structure data • Easy to implement • Limited formats
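
The idea of re-using class attributes can be sketched with an hCard-style fragment. `vcard`, `fn`, `title`, and `org` are hCard class names; the person and organization come from the title slide. The `HCardExtractor` class is a hypothetical helper that covers only this simple case, not a full microformats parser.

```python
from html.parser import HTMLParser

# hCard-style markup: ordinary HTML whose class names carry the meaning.
HTML = """
<div class="vcard">
  <span class="fn">Barbara McGlamery</span>
  <span class="title">Taxonomist</span>
  <span class="org">Martha Stewart Living Omnimedia</span>
</div>
"""

# Assumption: only these three hCard properties are of interest here.
HCARD_PROPERTIES = {"fn", "title", "org"}


class HCardExtractor(HTMLParser):
    """Map recognised hCard class names to the text inside the element."""

    def __init__(self):
        super().__init__()
        self.current = None
        self.card = {}

    def handle_starttag(self, tag, attrs):
        classes = (dict(attrs).get("class") or "").split()
        matches = [c for c in classes if c in HCARD_PROPERTIES]
        self.current = matches[0] if matches else None

    def handle_data(self, data):
        if self.current and data.strip():
            self.card[self.current] = data.strip()
            self.current = None


parser = HCardExtractor()
parser.feed(HTML)
card = parser.card
print(card)
```

Nothing here is a URI and there is no predicate vocabulary beyond the fixed class names, which illustrates both bullets on the slide: easy to implement, but limited to the formats that have been defined.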

  31. Microformats: Bon Appétit

  32. Microdata • A WHATWG HTML5 specification used to nest semantics within existing content on web pages • Officially supported by Bing, Yahoo, & Google • Can embed other markup languages like RDFa, microformats, and Dublin Core • Not well-known (yet)

  33. Microdata: Steve: the museum social tagging project

  34. Open Graph Protocol • Facebook-created markup language that turns any web page into an Open Graph object, allowing any page to become a Facebook page • I “Like” you • Good for targeted advertising • Limited in scope
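
Open Graph markup is a handful of `<meta>` tags in the page head. The sketch below reads the four basic properties (`og:title`, `og:type`, `og:url`, `og:image`) from a hypothetical page. The URLs and values are placeholders, not the Martha Stewart markup from the next slide, and `OpenGraphExtractor` is an illustrative name.

```python
from html.parser import HTMLParser

# Hypothetical page head; the values are placeholders.
HTML = """
<head>
  <meta property="og:title" content="Example Recipe" />
  <meta property="og:type" content="article" />
  <meta property="og:url" content="http://www.example.com/recipe" />
  <meta property="og:image" content="http://www.example.com/recipe.jpg" />
</head>
"""


class OpenGraphExtractor(HTMLParser):
    """Collect og:* properties from <meta> tags into a dictionary."""

    def __init__(self):
        super().__init__()
        self.properties = {}

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        prop = attrs.get("property") or ""
        if tag == "meta" and prop.startswith("og:"):
            self.properties[prop] = attrs.get("content")


parser = OpenGraphExtractor()
parser.feed(HTML)
og = parser.properties
print(og)
```

The whole page is described by a flat set of key/value pairs, which is why the format is simple to adopt and also why the slide calls it limited in scope.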

  35. OGP: Martha Stewart

  36. Back-of-the-napkin comparison

  37. Status report on S Semantic Web • Linked Open Data graph growing • Many countries have developed government sites with rich semantics • Development of Semantic search • More widespread adoption of lighter semantics

  38. Where we might be going • Pharmaceutical industry identifies trends across clinical studies, and not just within them • News industry better targets content by locale • Department of Defense using it to make better decisions in the field • Utilized in advertising to drive more and more revenue

  39. What we used to believe

  40. Time Inc. and TOPICS

  41. Time Inc. • Largest magazine media company in U.S. • 48 websites worldwide • Websites attract more than 50M unique visitors each month • Domains include lifestyle, entertainment, style, news, sports, and business • Early adopter (2005-2006) of SW technologies

  42. Goals • Enhance data integrity • Improve editorial efficiency • Create contextual presentation of content • Develop relationships that cannot be derived from content • Share resources among titles • Improve search and facilitate guided navigation

  43. Challenges • Aging CMS with sites on different versions • Many different domains • Scalability to accommodate volume of data and development of complex relationships • Lack of resources, money, and time

  44. Why we need controlled vocabularies (or why freeform keywords just don’t work) • Star Wars: Episode I -- The Phantom Menace • Episode 1 • Episode I • Phantom Menace • Star Wars Episode I The Phantom Menace • Star Wars Episode I: The Phantom Menace • Star Wars prequel • Star Wars: Episode 1 -- The Phantom Menace • Star Wars: Episode i -- the Phantom Menace • Star Wars: Episode I: The Phantom Menace • Star Wars: Episode I--The Phantom Menace • Star Wars: Episode I--The Phantom Menance • Star Wars: Episode One -- The Phantom Menace • Star Wars: The Phantom Menace • Star Wars: The Phantom Menace -- Episode I • The Phantom Menace • The Phanton Menace • Star Wars: Episode I -- The Phantom Menace
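
A controlled vocabulary collapses all of those spellings onto one preferred term. The sketch below assumes the final title on the slide is the preferred term and maps a few of the listed variants to it. The `normalize` and `key` helpers and the choice of which variants to include are illustrative only, not the TOPICS implementation.

```python
# Assumption: the last title on the slide is the preferred (controlled) term.
PREFERRED = "Star Wars: Episode I -- The Phantom Menace"

# A few of the freeform keywords from the slide, misspellings included.
VARIANTS = [
    "Episode 1",
    "Phantom Menace",
    "Star Wars Episode I The Phantom Menace",
    "Star Wars: Episode I--The Phantom Menance",
    "Star Wars: Episode One -- The Phantom Menace",
    "The Phanton Menace",
]


def key(term):
    """Lowercase and collapse whitespace so trivial differences match."""
    return " ".join(term.lower().split())


SYNONYMS = {key(variant): PREFERRED for variant in VARIANTS}
SYNONYMS[key(PREFERRED)] = PREFERRED


def normalize(keyword):
    """Return the preferred term for a known variant, or None if unmapped."""
    return SYNONYMS.get(key(keyword))


print(normalize("The Phanton Menace"))
print(normalize("Bad Boys"))
```

Case and spacing differences can be handled mechanically, but misspellings and partial titles such as "Episode 1" only resolve because someone curated the mapping, which is the argument the slide makes for a managed vocabulary over freeform tags.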

  45. What standard to adopt? • RDF • Flexible • Scalable • Fits business needs • New technology but industry standard • Microformats • Easy to implement • No inferencing • Solved some business needs but not all • No standards • Limited formats

  46. Search for vendors • In 2005, few commercial RDF/OWL tools were available that fit our needs • Open source reasoners like Jena and a proprietary design seemed more cost-effective and realistic

  47. TOPICS • Time Ontologies for Publishing, Inference, Classification and Semantics

  48. What is TOPICS? • Librarian Tool – allows librarians to create resources and properties • Relationship Tool - generates unambiguous connections between data • Classification Tool - allows editors to add uniform, structured metadata to content • Semantic reasoner - finds new facts from existing data • Query Engine - manages logical retrieval of data

  49. Technical Details of System • Java application • Jena semantic reasoner • Joseki query engine • Sybase database
