Microdata in HTML 5.0. Technologies for Web Application Development. Martin Nečaský, Department of Software Engineering, Faculty of Mathematics and Physics, Charles University in Prague, Czech Republic.
Machine-readability
• machine-readability of an HTML page means that machines are able to interpret the data on that page
• HTML 5.0 elements provide machine-readability only partly, e.g.
  • time element
  • address element
• we could keep standardizing more and more semantic elements …
• … but that would be the wrong approach
  • it is not possible to standardize everything
  • we need a freely extensible way to achieve machine-readability
Machine-readability
<div>
  Hi, I’m Martin Nečaský and I work at Charles University in Prague.
</div>
<h2>Interesting Events</h2>
<section>
  <div>
    <a href="http://www.cs.vsb.cz/dateso/2011/">DATESO 2011</a>
  </div>
  <div>After a year, we will again meet at DATESO 2011. We ...</div>
  <ul>
    <li><b>When:</b>
      <time datetime="2011-04-22">April 20th 2011</time> to
      <time datetime="2011-04-24">April 24th 2011</time>
    </li>
    <li><b>Where:</b>
      <address>Hotel OtavArena, Burketova 303, 397 01, Písek</address>
    </li>
  </ul>
</section>
Machine-readability
• it is useful to increase the machine-readability of your web pages by annotating content with machine-readable values
• How to achieve machine-readability?
  • microformats
  • microdata
  • RDFa
Microdata for machine-readability
• microdata allows nested groups of name-value pairs to be added to documents alongside the “classical” content
• the basic microdata concept is the item
  • a group of name-value pairs
  • item ~ object
  • name-value pair ~ attribute value
  • value: an atomic value (integer, string, date, …) or another item
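The item concept can be pictured as a small data structure. The following is only an illustrative sketch in Python: the class name `Item`, the `add` helper and the choice to keep atomic values as strings are assumptions made for this example, not part of microdata itself.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Union

# A property value is either an atomic value (kept as a string here)
# or another, nested item.
Value = Union[str, "Item"]


@dataclass
class Item:
    """One microdata item: a group of name-value pairs (hypothetical model)."""
    properties: Dict[str, List[Value]] = field(default_factory=dict)

    def add(self, name: str, value: Value) -> None:
        # The same property name may occur several times,
        # so the values for each name are collected in a list.
        self.properties.setdefault(name, []).append(value)


# An address item with atomic values only.
address = Item()
address.add("street-address", "Burketova 303")
address.add("locality", "Písek")

# An event item whose "location" value is another item.
event = Item()
event.add("summary", "DATESO 2011")
event.add("location", address)
```

Here `event.properties["location"][0]` is the nested `address` item, which corresponds to "value: … or another item" in the list above.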
Microdata for machine-readability
• attribute itemscope
  • any HTML element with this attribute represents a single item
Microdata for machine-readability
<div itemscope>
  Hi, I’m Martin Nečaský and I work at Charles University in Prague.
</div>
<h2>Interesting Events</h2>
<section itemscope>
  <div>
    <a href="http://www.cs.vsb.cz/dateso/2011/">DATESO 2011</a>
  </div>
  <div>After a year, we will again meet at DATESO 2011. We ...</div>
  <ul>
    <li><b>When:</b>
      <time datetime="2011-04-22">April 20th 2011</time> to
      <time datetime="2011-04-24">April 24th 2011</time>
    </li>
    <li><b>Where:</b>
      <address itemscope>
        Hotel OtavArena, Burketova 303, 397 01, Písek
      </address>
    </li>
  </ul>
</section>
Microdata for machine-readability
• each item has a type
  • type ~ class, item ~ class instance
• attribute itemtype
• the type should come from a standardized vocabulary
  • otherwise no one will be able to interpret our content
• e.g. http://data-vocabulary.org
  • http://data-vocabulary.org/Person
  • http://data-vocabulary.org/Address
  • http://data-vocabulary.org/Event
  • http://data-vocabulary.org/Product
  • http://data-vocabulary.org/Review
Microdata for machine-readability
<div itemscope itemtype="http://data-vocabulary.org/Person">
  Hi, I’m Martin Nečaský and I work at Charles University in Prague.
</div>
<h2>Interesting Events</h2>
<section itemscope itemtype="http://data-vocabulary.org/Event">
  <div>
    <a href="http://www.cs.vsb.cz/dateso/2011/">DATESO 2011</a>
  </div>
  <div>After a year, we will again meet at DATESO 2011. We ...</div>
  <ul>
    <li><b>When:</b>
      <time datetime="2011-04-22">April 20th 2011</time> to
      <time datetime="2011-04-24">April 24th 2011</time>
    </li>
    <li><b>Where:</b>
      <address itemscope itemtype="http://data-vocabulary.org/Address">
        Hotel OtavArena, Burketova 303, 397 01, Písek
      </address>
    </li>
  </ul>
</section>
Microdata for machine-readability
• each item has a set of properties
  • name-value pairs
• attribute itemprop
• the property name should come from a standardized vocabulary
  • otherwise no one will be able to interpret our data
• e.g. an item of type http://data-vocabulary.org/Person has the following properties:
  • name
  • photo
  • url
  • friend
  • address: street-address, locality, region, postal-code, country-name
  • …
Microdata for machine-readability
<div itemscope itemtype="http://data-vocabulary.org/Person">
  Welcome,<br/>
  my name is <span itemprop="name">Martin Nečaský</span>
  and I work as an <span itemprop="title">assistant professor</span>
  at <span itemprop="org">Charles University</span> in Prague.
</div>
Microdata for machine-readability
• and an item of type http://data-vocabulary.org/Event has the following properties:
  • summary
  • url
  • startDate
  • endDate
  • location
  • description
  • …
Microdata for machine-readability
<section itemscope itemtype="http://data-vocabulary.org/Event">
  <div>
    <a itemprop="url" href="http://www.cs.vsb.cz/dateso/2011/">
      <span itemprop="summary">DATESO 2011</span>
    </a>
  </div>
  <div itemprop="description">
    After a year, we will again meet at DATESO 2011. We ...
  </div>
  <ul>
    <li><b>When:</b>
      <time itemprop="startDate" datetime="2011-04-22">April 20th 2011</time> to
      <time itemprop="endDate" datetime="2011-04-24">April 24th 2011</time>
    </li>
    <li><b>Where:</b>
      <address itemprop="location" itemscope itemtype="http://data-vocabulary.org/Address">
        Hotel OtavArena,
        <span itemprop="street-address">Burketova 303</span>,
        <span itemprop="postal-code">397 01</span>,
        <span itemprop="locality">Písek</span>
      </address>
    </li>
  </ul>
</section>
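To show what a consumer could read out of markup like the Event example, here is a rough Python sketch. It is not a complete microdata parser: the function name `extract_item` is hypothetical, the snippet is a shortened copy of the example written as well-formed XML (so `itemscope` is spelled `itemscope=""`) in order to use the standard `xml.etree.ElementTree` module, and the value lookup is deliberately reduced to `datetime`, `href` or the element text.

```python
import xml.etree.ElementTree as ET

# Shortened copy of the Event example, written as well-formed XML.
EVENT = """<section itemscope="" itemtype="http://data-vocabulary.org/Event">
  <a itemprop="url" href="http://www.cs.vsb.cz/dateso/2011/">
    <span itemprop="summary">DATESO 2011</span>
  </a>
  <time itemprop="startDate" datetime="2011-04-22">April 20th 2011</time>
  <address itemprop="location" itemscope=""
           itemtype="http://data-vocabulary.org/Address">
    Hotel OtavArena, <span itemprop="locality">Písek</span>
  </address>
</section>"""


def extract_item(item):
    """Return the item's type and its properties as nested dictionaries."""
    result = {"type": item.get("itemtype"), "properties": {}}

    def crawl(element):
        for child in element:
            name = child.get("itemprop")
            nested = child.get("itemscope") is not None
            if name is not None:
                if nested:
                    # The value of this property is another item.
                    value = extract_item(child)
                else:
                    # Simplified lookup: datetime, then href, then the text.
                    text = " ".join("".join(child.itertext()).split())
                    value = child.get("datetime") or child.get("href") or text
                result["properties"].setdefault(name, []).append(value)
            if not nested:
                # Properties inside a nested item belong to that item only.
                crawl(child)

    crawl(item)
    return result


event = extract_item(ET.fromstring(EVENT))
```

With this input, `event["properties"]["location"][0]` is itself a dictionary of type `http://data-vocabulary.org/Address`, mirroring the nesting of the address item inside the event item.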
Microdata for machine-readability
• a closer look: where does the value of a property come from?
• let E be an element with the itemprop attribute; its value V is then determined by the following table
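A minimal sketch of such a lookup in Python, assuming the rules of the HTML microdata specification as it stood around 2011. The mapping is recalled from that specification rather than copied from the slide, the names `VALUE_ATTRIBUTE` and `property_value` are hypothetical, and two details are left out: an element that also carries itemscope (its value is the nested item) and the resolution of relative URLs.

```python
from typing import Dict

# Element name -> attribute that carries the property value
# (assumption: taken from the microdata specification, not from the slide).
VALUE_ATTRIBUTE = {
    "meta": "content",
    "audio": "src", "embed": "src", "iframe": "src", "img": "src",
    "source": "src", "track": "src", "video": "src",
    "a": "href", "area": "href", "link": "href",
    "object": "data",
}


def property_value(tag: str, attrs: Dict[str, str], text: str) -> str:
    """Return the value V of an element E that has an itemprop attribute."""
    tag = tag.lower()
    if tag == "time":
        # A time element uses datetime when present, else its text content.
        return attrs.get("datetime", text)
    attribute = VALUE_ATTRIBUTE.get(tag)
    if attribute is not None:
        # A missing attribute yields an empty value, not the text content.
        return attrs.get(attribute, "")
    # Every other element: the text content.
    return text
```

For instance, `property_value("a", {"href": "http://www.cs.vsb.cz/dateso/2011/"}, "DATESO 2011")` yields the URL rather than the link text, which is why the url and summary properties of the Event example sit on two different elements.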
Microdata for machine-readability
• a closer look: what if the data is separated from my item (e.g. due to page layout)?
• no problem: use the itemref attribute to refer to other elements that carry the item’s properties
Microdata for machine-readability
<div itemscope itemtype="http://data-vocabulary.org/Person" itemref="myfriendsA myfriendsB">
  Welcome,<br/>
  my name is <span itemprop="name">Martin Nečaský</span>
  and I work as an <span itemprop="title">assistant professor</span>
  at <span itemprop="org">Charles University</span> in Prague.
</div>
<div id="myfriendsA">
  My friends:<br/>
  <a itemprop="friend" href="http://john.black.com">John Black</a><br/>
  <a itemprop="friend" href="http://bill.white.com">Bill White</a>
</div>
<div id="myfriendsB">
  My other friends:<br/>
  <a itemprop="friend" href="http://joe.pink.com">Joe Pink</a>
</div>
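How a consumer might follow itemref can be sketched in Python as follows. This is a simplified illustration, not the algorithm of any particular tool: the function name `item_properties` is hypothetical, the page is a shortened copy of the example written as well-formed XML (so `itemscope=""`) to allow parsing with `xml.etree.ElementTree`, and a property value is taken from `href` when present and from the text otherwise.

```python
import xml.etree.ElementTree as ET

# Shortened copy of the Person example, written as well-formed XML.
PAGE = """<body>
<div itemscope="" itemtype="http://data-vocabulary.org/Person"
     itemref="myfriendsA myfriendsB">
  my name is <span itemprop="name">Martin Nečaský</span>
</div>
<div id="myfriendsA">
  <a itemprop="friend" href="http://john.black.com">John Black</a>
  <a itemprop="friend" href="http://bill.white.com">Bill White</a>
</div>
<div id="myfriendsB">
  <a itemprop="friend" href="http://joe.pink.com">Joe Pink</a>
</div>
</body>"""


def item_properties(root, item):
    """Collect (name, value) pairs of an item, following its itemref."""
    by_id = {el.get("id"): el for el in root.iter() if el.get("id")}
    # Start with the item's own children, then add the elements
    # whose IDs are listed (space-separated) in itemref.
    pending = list(item)
    for ref in item.get("itemref", "").split():
        if ref in by_id:
            pending.append(by_id[ref])
    found = []
    while pending:
        element = pending.pop(0)
        if element.get("itemprop") is not None:
            text = "".join(element.itertext()).strip()
            found.append((element.get("itemprop"), element.get("href") or text))
        if element.get("itemscope") is None:
            # Do not descend into nested items; they own their properties.
            pending.extend(list(element))
    return found


root = ET.fromstring(PAGE)
person = next(el for el in root.iter() if el.get("itemscope") is not None)
props = item_properties(root, person)
```

Note that itemref holds a space-separated list of IDs, so the sketch splits the attribute value on whitespace before looking the elements up.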
Projects
• Who makes use of machine-readable web pages?
  • Google Rich Snippets
  • Yahoo SearchMonkey
  • Facebook Open Graph Protocol
  • DBPedia.org
  • …
• Are there any other standards?
  • http://www.foaf-project.org/
  • http://trac.usefulinc.com/doap
  • http://www.heppnetz.de/projects/goodrelations/
  • …
Projects - Google Rich Snippets
• consumes microformats, microdata and RDFa
• data from your web pages may be used by Google for searching and for displaying search results
Projects – Yahoo SearchMonkey
• consumes microformats
• the project was discontinued in 2010
• but it has been integrated into Yahoo and Microsoft products, where it is developed further
• http://www.ysearchblog.com/2010/08/17/news-about-our-searchmonkey-program
Projects – FB Open Graph Protocol
• enables you to integrate your web pages into the FB social network
• its own way to achieve machine-readability (via meta elements)
<html xmlns="http://www.w3.org/1999/xhtml"
      xmlns:og="http://ogp.me/ns#"
      xmlns:fb="http://www.facebook.com/2008/fbml">
  <head>
    <title>The Rock (1996)</title>
    <meta property="og:title" content="The Rock"/>
    <meta property="og:type" content="movie"/>
    <meta property="og:url" content="http://www.imdb.com/..."/>
    <meta property="og:image" content="http://ia.media-imdb.com/..."/>
    <meta property="og:site_name" content="IMDb"/>
    <meta property="fb:admins" content="USER_ID"/>
    <meta property="og:description" content="A group of U.S. Marines, under ..."/>
    ...
  </head>
  ...
</html>
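Reading such meta elements back is straightforward. The sketch below uses Python's standard `html.parser` module; the class name `OpenGraphParser` is hypothetical and the input is a shortened copy of the example head, so it only illustrates the idea and does not show how Facebook itself processes pages.

```python
from html.parser import HTMLParser


class OpenGraphParser(HTMLParser):
    """Collect og:* properties from meta elements (illustrative only)."""

    def __init__(self):
        super().__init__()
        self.properties = {}

    def handle_starttag(self, tag, attrs):
        # Also called for self-closing tags such as <meta .../>.
        if tag != "meta":
            return
        attributes = dict(attrs)
        name = attributes.get("property")
        if name and name.startswith("og:") and "content" in attributes:
            self.properties[name] = attributes["content"]


# Shortened copy of the head element from the example.
HEAD = """<head>
<title>The Rock (1996)</title>
<meta property="og:title" content="The Rock"/>
<meta property="og:type" content="movie"/>
<meta property="og:site_name" content="IMDb"/>
<meta property="fb:admins" content="USER_ID"/>
</head>"""

parser = OpenGraphParser()
parser.feed(HEAD)
```

After `feed`, `parser.properties` holds the three og: entries; the fb:admins element is skipped because this sketch filters on the og: prefix only.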