Damia: Data Mashups for Intranet Applications
David E. Simmen, et al.
IBM Almaden Research Center
Presented by John Nielsen
Terms and Acronyms (in order of appearance)
• Damia: DAta Mashups for Intranet Applications
• Web 2.0: Focus on applications, collaboration and interaction (rather than pages and browsing)
• AJAX: Asynchronous JavaScript and XML. Umbrella term for web development techniques that use background data transfer and scripted client-side applications
• Atom: XML-based web syndication (feed) format standard, nominally intended to replace RSS
• XML: eXtensible Markup Language
• JSON: JavaScript Object Notation. Lightweight data interchange format
• RSS: Really Simple Syndication. XML-based web syndication (feed) format
• REST: REpresentational State Transfer. Umbrella term for simple interfaces used to transfer data via (e.g.) HTTP without another messaging layer
• PHP: "PHP: Hypertext Preprocessor". Server-side scripting language that can be embedded in HTML
• Ruby on Rails: Complete framework for building database-backed web applications with (intended) relative ease
• API: Application Programming Interface. Abstraction of the functions, classes, etc. in a program or library that are available to other programs
• GUI: Graphical User Interface
• URL: Uniform Resource Locator
• LAMP stack: Originally Linux, Apache, MySQL, PHP. Generalized to mean any solution stack composed of free/open source software used to run a web application server
• Zend Framework: Open source, object-oriented web application framework written in PHP
• DB2: A family of IBM relational database products
Terms and Acronyms (continued)
• MySQL: A freely available relational database
• Dojo toolkit: Tools and utilities for creating AJAX/JavaScript applications
• LOB: Line of business
• ADM: Augmentation-level Data Model
• XQuery: Query language for extracting data from XML documents
• XDM: XQuery Data Model
• FLWOR expression: For, Let, Where, Order by, Return. Style of XQuery query that performs projections and joins on one or more XML sources and returns a sorted list of results
• Closed operator: Given an operator (or function or transformation) and a domain (or set of inputs), the operator is closed over the domain if, for every member of the domain, the result of the operation is also a member of the domain
• MIME types: Multipurpose Internet Mail Extensions. A standard and extensible set of document types used to identify the content of e-mail and HTML documents
• CSV: Comma-Separated Values. A simple text format for tabular data where fields (or columns) are delimited by a comma and records (or rows) are delimited by a newline character
• DOM: Document Object Model. Standard model for representing XML documents as objects
• cURL: Open source tool and libraries for retrieving remote files and documents via URL using HTTP, FTP and other protocols
• GNR: IBM Global Name Recognizer (phonetic similarity)
• EII: Enterprise Information Integration
• ETL: Extract, Transform, Load. A method of pulling data into a warehouse
• RDF: Resource Description Framework. Simple model for describing metadata, e.g. for the semantic web
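The FLWOR definition above may be easier to follow with a small analogue. The following is a minimal Python sketch, not XQuery and not Damia code; the record layout (`customers`, `orders`) and the `min_total` threshold are assumptions made up for illustration. Each FLWOR clause is marked in a comment.

```python
def flwor(customers, orders, min_total):
    """Python analogue of a FLWOR expression joining two record lists."""
    rows = []
    for c in customers:                        # for: iterate over the first source
        for o in orders:                       # for: iterate over the second source
            total = o["qty"] * o["price"]      # let: bind a derived value
            # where: join condition plus a filter on the derived value
            if o["customer_id"] == c["id"] and total >= min_total:
                rows.append((total, c["name"]))
    rows.sort()                                # order by: ascending total, then name
    return [{"name": name, "total": total} for total, name in rows]  # return

# Hypothetical sample data
customers = [{"id": 1, "name": "Alice"}, {"id": 2, "name": "Bob"}]
orders = [
    {"customer_id": 1, "qty": 2, "price": 10},
    {"customer_id": 2, "qty": 1, "price": 5},
    {"customer_id": 1, "qty": 3, "price": 1},
]

result = flwor(customers, orders, min_total=5)
```

The result is again a list of records, so it could be fed to another query of the same kind, which is the informal idea behind the "closed operator" entry.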
Motivation
• Business leaders want “situational” applications that use data from many sources, some of them nontraditional
• Web technologies have evolved to allow information exchange and collaboration using lightweight standards (web services, Web 2.0, etc.)
• Damia uses modern web technology to allow business users to create “mashups” using whatever data sources they choose
How does it work?
• User specifies sources either as existing RSS/Atom feeds or as custom feeds
• Custom feeds can be created by uploading documents of a known type (CSV) or through a data-source-specific connector
• User specifies filters, join conditions and other transformations using a fixed set of operators
• Damia converts the user input to an XML-formatted Mashup specification
• On execution the Augmentation engine reads the sources, performs the mashup operations, and publishes the result as a new RSS/Atom feed
• Mashup results intended to be requested and consumed by other applications
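The specify-then-execute flow above can be sketched roughly as follows. This is a toy interpreter under stated assumptions: the XML element names (`mashup`, `source`, `filter`, `sort`, `publish`), their attributes, and the `run_mashup` function are all invented for illustration and are not Damia's actual specification format or engine.

```python
import xml.etree.ElementTree as ET

# Hypothetical mashup specification; the real Damia format is not shown in the slides.
SPEC = """
<mashup>
  <source name="customers"/>
  <filter field="state" equals="CA"/>
  <sort field="name"/>
  <publish format="atom"/>
</mashup>
"""

# Hypothetical in-memory stand-in for feeds the engine would read.
SOURCES = {
    "customers": [
        {"name": "Zoe", "state": "CA"},
        {"name": "Bob", "state": "NY"},
        {"name": "Al", "state": "CA"},
    ]
}

def run_mashup(spec_xml, sources):
    """Apply each operator in the specification, in document order."""
    entries = []
    output_format = None
    for step in ET.fromstring(spec_xml.strip()):
        if step.tag == "source":
            entries = list(sources[step.get("name")])
        elif step.tag == "filter":
            field, value = step.get("field"), step.get("equals")
            entries = [e for e in entries if e.get(field) == value]
        elif step.tag == "sort":
            field = step.get("field")
            entries = sorted(entries, key=lambda e: e[field])
        elif step.tag == "publish":
            output_format = step.get("format")
        else:
            raise ValueError(f"unknown operator: {step.tag}")
    return output_format, entries

output_format, result = run_mashup(SPEC, SOURCES)
```

Here `result` holds the filtered, sorted entries and `output_format` records how a real engine would serialize them; the serialization step itself is omitted.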
Available Operators (from MashupHub)
• Source: Import data to the mashup
• Combine: Create one feed from two or more inputs
• Filter: Output only entries from the input satisfying certain conditions
• For Each: Place the values from one operator into the URL parameter for another operator, return results
• Group: Organize entries into categories based on a specified element
• Merge: Join two inputs based on certain match conditions
• Publish: Specify output format of the mashup
• Sort: Re-order entries based on a specified field value
• Transform: Modify entries from the input based on specified text or math functions
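As a rough illustration of what a few of these operators do, here is a minimal Python sketch that treats a feed as a list of dictionaries. The function names, the sample `offices` and `staff` feeds, and the `https://example.com/weather` URL are assumptions for illustration; MashupHub's real operators work on feed entries and are configured through its GUI.

```python
from collections import defaultdict
from urllib.parse import urlencode

def combine(*feeds):
    """Combine: one feed containing the entries of all inputs."""
    return [entry for feed in feeds for entry in feed]

def merge(left, right, key):
    """Merge: join entries from two feeds whose `key` values match."""
    index = defaultdict(list)
    for entry in right:
        index[entry[key]].append(entry)
    return [{**l, **r} for l in left for r in index.get(l[key], [])]

def group(entries, key):
    """Group: bucket entries by the value of `key`."""
    groups = defaultdict(list)
    for entry in entries:
        groups[entry[key]].append(entry)
    return dict(groups)

def for_each_url(entries, field, base_url, param):
    """For Each: build one request URL per entry, using a field as a URL parameter."""
    urls = []
    for entry in entries:
        query = urlencode({param: entry[field]})
        urls.append(f"{base_url}?{query}")
    return urls

# Hypothetical sample feeds
offices = [{"city": "San Jose", "zip": "95120"}, {"city": "New York", "zip": "10001"}]
staff = [{"zip": "95120", "employee": "Dana"}, {"zip": "95120", "employee": "Eli"}]

everything = combine(offices, staff)
joined = merge(offices, staff, "zip")
by_zip = group(staff, "zip")
urls = for_each_url(offices, "zip", "https://example.com/weather", "zip")
```

The sketch only builds the URLs for For Each; fetching each URL and returning the results is left out so that the example runs without network access.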
Usage Scenarios (from paper)
• Customer Service
  • Receive name suggestions from phonetic similarity matcher (source, transform)
  • Look up matches in customer service DB (source, merge (augment))
  • Adjust output to desired format (transform)
  • Publish
Usage Scenarios (continued)
• Weather Alerts for Insurance Agent
  • Upload insurance data spreadsheet (source, via custom feed)
  • Identify zip codes from spreadsheet (filter)
  • Import weather alerts from NWS (source)
  • Compare/match zip codes from spreadsheet with zip codes from weather alerts (merge (augment))
  • Publish formatted list of customers likely to be affected by severe weather (transform, publish)
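The weather-alert scenario can be walked through end to end with a small Python sketch. Everything here is an assumption for illustration: the spreadsheet columns, the alert records (a real mashup would import them from an NWS feed), and the `affected_customers` function are hypothetical, and the output is a plain list of strings rather than a published feed.

```python
import csv
import io

# Hypothetical uploaded spreadsheet (custom feed source)
SPREADSHEET = """customer,zip,policy
Alice,95120,home
Bob,10001,auto
Carol,95120,auto
"""

# Hypothetical weather alerts, standing in for an imported NWS feed
ALERTS = [
    {"zip": "95120", "event": "Flood Warning"},
    {"zip": "73301", "event": "Tornado Watch"},
]

def affected_customers(spreadsheet_text, alerts):
    # Source: read the uploaded spreadsheet into one record per row
    rows = list(csv.DictReader(io.StringIO(spreadsheet_text)))
    # Filter: identify the zip codes that appear in the spreadsheet
    zips_of_interest = {row["zip"] for row in rows}
    # Merge (augment): index the alerts that match those zip codes
    alerts_by_zip = {}
    for alert in alerts:
        if alert["zip"] in zips_of_interest:
            alerts_by_zip.setdefault(alert["zip"], []).append(alert["event"])
    # Transform: format one line per affected customer and alert
    report = []
    for row in rows:
        for event in alerts_by_zip.get(row["zip"], []):
            report.append(f'{row["customer"]} ({row["zip"]}): {event}')
    return report

report = affected_customers(SPREADSHEET, ALERTS)
```

Bob is absent from `report` because no alert matches his zip code, and the Tornado Watch is dropped because no customer lives in that zip code.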
Demonstration
• https://greenhouse.lotus.com/mashuphub (free login required)
Future Work
• Data Standardization
• Continuous mashups: true mashup subscription rather than just on-demand
• Additional data import connectors
• In-depth search
• Data quality (or pedigree)