Data-gov @ RPI. Li Ding, Jim Hendler and Deborah McGuinness Tetherless World Constellation, Rensselaer Polytechnic Institute July 27, 2010.
The Data-gov project is headed by Professors Jim Hendler and Deborah McGuinness and led by Li Ding. Other student team members include: Dominic DiFranzo, Sarah Magidson, James Michaelis, Alvaro Graves, Adam Bell, Jin Guang Zheng, Xian Li, Tim Lebo, Gregory Todd Williams, Peter Coons, Zhenning Shangguan, Devin Gaffney, William Cooper, Brian Zaik, and Johanna Flores.
Timeline, 2009 to 2010: data.gov online (May 21, 2009); “Putting Government Data online” (June 30, 2009); “Open Government Directive” released (December 8, 2009); data.gov.uk online (January 19, 2010); data.gov relaunch with semantic web featured (May 21, 2010). Other labels on the timeline: “Raw Government Data Now”; January 1, 2009; “Openness will strengthen our democracy and promote efficiency and effectiveness in Government.” (President Obama)
Semantic Web featured at data.gov http://www.data.gov/semantic/ http://www.data.gov/semantic/data/alpha • leveraged contributions from the Tetherless World Constellation at RPI • published 6.4 billion triples (almost doubling the LOD cloud, to 13 billion triples in total) • hosted a triple store (Virtuoso) and open source RDF mashups
Data-gov Wiki: Portal for Innovations at RPI The Data-gov Wiki explores and teaches the use of semantic web technologies, especially linked data, for producing, processing, and using government data from data.gov. 40+ Demos 400+ Datasets Tutorials & Videos The Data-gov Wiki is run by the Tetherless World Constellation at RPI, headed by Professors Jim Hendler and Deborah McGuinness and led by Li Ding. Other student team members include: Dominic DiFranzo, Sarah Magidson, James Michaelis, Alvaro Graves, Adam Bell, Jin Guang Zheng, Xian Li, Tim Lebo, Gregory Todd Williams, Peter Coons, Zhenning Shangguan, Devin Gaffney, William Cooper, Brian Zaik, and Johanna Flores.
Synopsis • Open Data: available for public use • Linked Data: easy to integrate • Visualization: easy to understand data • Mashups: enrich meaning of data • Provenance: make mashups accountable
A Typical Mashup: CASTNET Sources: Data.gov and epa.gov, CASTNET Ozone (CSV) and CASTNET Site (CSV) • 1 Convert raw datasets into linkable RDF • 2 Query multiple RDF datasets via a SPARQL endpoint • 3 Drill down for details • 4 Surf to EPA applications • Layers: Data Mashup, Web Application Mashup, Visualization Mashup (Visualization API, Exhibit) Created by Dominic DiFranzo, PhD student at RPI, http://www.data.gov/semantic/Castnet/html/exhibit
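Step 1 of the CASTNET mashup (convert a raw CSV dataset into linkable RDF) can be illustrated with a minimal sketch. It is not the project's actual converter: the base URI, the column names, and the sample row are all hypothetical, and every cell is emitted as a plain string literal in N-Triples syntax.

```python
import csv
import io
from urllib.parse import quote

# Hypothetical base URI; the real Data-gov URIs are not given on the slide.
BASE = "http://example.org/castnet/"

def escape_literal(value):
    """Escape a string for use inside an N-Triples literal."""
    return (value.replace("\\", "\\\\")
                 .replace('"', '\\"')
                 .replace("\n", "\\n")
                 .replace("\r", "\\r"))

def csv_to_ntriples(csv_text, id_column, kind):
    """Turn each CSV row into one subject with one triple per non-empty cell."""
    lines = []
    for row in csv.DictReader(io.StringIO(csv_text)):
        subject = f"<{BASE}{kind}/{quote(row[id_column], safe='')}>"
        for column, value in row.items():
            # Skip the key column, ragged extra cells, and empty values.
            if column is None or column == id_column or not value:
                continue
            predicate = f"<{BASE}vocab/{quote(column, safe='')}>"
            lines.append(f'{subject} {predicate} "{escape_literal(value)}" .')
    return lines

# Made-up sample row standing in for the CASTNET Site CSV.
sites_csv = "site_id,state,latitude\nXYZ001,CT,41.84\n"
triples = csv_to_ntriples(sites_csv, "site_id", "site")
for triple in triples:
    print(triple)
```

Because the site and ozone files would share the same subject URI per site identifier under this scheme, a SPARQL query could join them without any further alignment step, which is the point of making the data "linkable".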
Mashup: AGI vs Medicare Claims [Spatial Mashup] Data.gov (AGI + Medicare Claims + Population) Created by Peter Coons, http://data-gov.tw.rpi.edu/demo/stable/demo-1356-1623-health-claim-vs-income.html
Mashup: US and UK Foreign Aid [Spatial Mashup] Data sources: Data.gov (USAID) + Data.gov.uk (DFID) Created by James Michaelis, PhD student at RPI, http://data-gov.tw.rpi.edu/demo/linked/aidviz-1554-10030.html
Social Mashup: US Wildland Fire [Temporal Mashup] Data.gov (statistics + budget) + Wikipedia (famous fires) Sources shown: Category:Wildfires In The United States (Wikipedia); budget on wildfire, “DOI” and “USDA” (OMB); wildland fire (NIFC) Created by Li Ding, researcher at RPI, http://data-gov.tw.rpi.edu/demo/stable/demo-1187-40x-wildfire-budget.html
Mashup: White House Visitor Search Wikipedia dbpedia:Barack_Obama Data-gov Wiki “POTUS” whitehouse [Person Mashup (via Data-gov Wiki)] Data.gov (statistics) + Wikipedia (personal profiles) Created by Dominic DiFranzo, http://data-gov.tw.rpi.edu/demo/stable/white-house-visitor/top100-visitees.php
Mashup: USPS Spending and News [Temporal Mashup] Data.gov (spending and budget) + User-contributed Data (news) Created by Sarah Magidson, http://data-gov.tw.rpi.edu/demo/linked/demo-401-usps-news.html
Mashup: Supreme Court Justices [Person Mashup] Data.gov (budget) + SCDB (Voting History) + Wikipedia (personal profiles) Created by Xian Li, http://data-gov.tw.rpi.edu/demo/stable/supremeCourt/demo-10016-portal.html
More Mashups: Using Web Tools SPARQL results (XML) can be converted into other formats (e.g. JSON, CSV) and used as input to other Web tools: Yahoo Pipes, IBM Many Eyes, Microsoft Web n-gram Service, …
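The conversion mentioned on this slide can be sketched with the Python standard library alone. The example below parses a document in the W3C SPARQL Query Results XML Format and flattens it to CSV and to a simple list of JSON objects. The sample result (variable names and values) is made up, and the JSON produced here is a plain row list, not the official SPARQL JSON results format.

```python
import csv
import io
import json
import xml.etree.ElementTree as ET

# Namespace defined by the W3C SPARQL Query Results XML Format.
NS = {"sr": "http://www.w3.org/2005/sparql-results#"}

def parse_sparql_xml(xml_text):
    """Return (variable names, list of row dicts) from a SPARQL XML result."""
    root = ET.fromstring(xml_text)
    variables = [v.get("name") for v in root.findall("sr:head/sr:variable", NS)]
    rows = []
    for result in root.findall("sr:results/sr:result", NS):
        row = dict.fromkeys(variables, "")  # unbound variables stay empty
        for binding in result.findall("sr:binding", NS):
            term = binding[0]  # the <uri>, <literal> or <bnode> child
            row[binding.get("name")] = term.text or ""
        rows.append(row)
    return variables, rows

def to_csv(variables, rows):
    """Serialize the rows as CSV with the variables as the header line."""
    out = io.StringIO()
    writer = csv.DictWriter(out, fieldnames=variables, lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return out.getvalue()

# Made-up sample result; a real endpoint would return this over HTTP.
sample = """<sparql xmlns="http://www.w3.org/2005/sparql-results#">
  <head><variable name="site"/><variable name="ozone"/></head>
  <results>
    <result>
      <binding name="site"><uri>http://example.org/site/XYZ001</uri></binding>
      <binding name="ozone"><literal>41</literal></binding>
    </result>
  </results>
</sparql>"""

variables, rows = parse_sparql_xml(sample)
print(to_csv(variables, rows))
print(json.dumps(rows, indent=2))
```

Datatype and language tags are dropped in this flattening, which is usually acceptable for charting tools that expect plain tables.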
More Mashups: Provenance • Critical to accountability • Demo => Dataset => Agency • Where do the data come from? • Agency => Dataset => Comments • Supports users’ feedback (Diagram: Agency, Dataset, Demo)
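The two provenance paths on this slide can be read as lookups over a small graph. The sketch below only illustrates that idea: the demo, dataset, and agency names and the comment text are hypothetical placeholders, and the Data-gov Wiki's actual provenance model is not described here.

```python
# Hypothetical records; none of these names come from the Data-gov Wiki.
DEMO_USES = {"demo-ozone-map": ["dataset-A", "dataset-B"]}
DATASET_AGENCY = {"dataset-A": "Agency X", "dataset-B": "Agency X"}
DATASET_COMMENTS = {"dataset-A": ["Column units are unclear."]}

def trace_sources(demo):
    """Demo => Dataset => Agency: list (dataset, agency) pairs behind a demo."""
    return [(d, DATASET_AGENCY[d]) for d in DEMO_USES.get(demo, [])]

def comments_for_agency(agency):
    """Agency => Dataset => Comments: collect user feedback per dataset."""
    return {d: DATASET_COMMENTS.get(d, [])
            for d, a in DATASET_AGENCY.items() if a == agency}

print(trace_sources("demo-ozone-map"))
print(comments_for_agency("Agency X"))
```

The first lookup answers where a demo's data come from; the second routes user comments on a dataset back to the agency that published it.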
Conclusions • Now • 6.4 billion triples from data.gov • “data + visualization + mashup” is powerful • Low-cost solutions available for education • Future • Development • More raw data, data catalog, links, RDFa • More tools, especially Web visualization, SPARQL endpoint • More demos and applications in different domains • Research • Integration: link, search, social contribution, … • Provenance: source, versions, trust, … • Usability: scalability, quality, …