Facilitating Data Integration: A Data Conversion Function Service
Mario Martínez Gómez
Supervisor: Mariano Cilia
Prof. Alejandro P. Buchmann

Facilitating Data Integration: A Conversion Function Manager Service

Contents
• Motivation
• Proposed Approach
• A Conversion Function Definition Language
• The Conversion Manager Service
• Conclusion & Future Work

Motivation (I)
• Current information revolution: everything is information-oriented.
• Computers and applications collaborate with each other by exchanging data.
• The traditional user's role is now assumed by computers and applications.
• Human beings can derive some of the data's context.
• Computers cannot!

Motivation (II)
• Data exchange → a common vocabulary is required
• But a common vocabulary is not sufficient
• Applications working at different locations may assume different contexts about data → data conversion (Conversion Functions)

Many Vocabularies, Many Contexts
• Vocabulary of Lufthansa: <Price, 100, "€">
• Vocabulary of IBERIA: <PricePerUnit, 100, "€">
• Vocabulary of KLM: <Price, 100, "€">
[Figure: the three vocabularies shown across two contexts. Labels: Context: USA, Context: EUROPE, Price, PricePerUnit, €]

Motivation (III)
• Previous approaches:
  • Common context implicitly assumed everywhere
  • Data Conversion Functions scattered among participating applications
    • The same Conversion Functions are written many times and in many programming languages
    • New participants with a previously unknown context may affect existing applications
  • Data Conversion Functions attached to the exchanged data
    • Data and behaviour together

Our Approach
• Eliminate Conversion Functions (CFs) from participants' applications
• Offer Conversion Functions as a service (CFM)
  • CFs are written only once
  • CFs are reusable
• Simplify the definition of CFs according to a classification
• Minimise the transmitted data

Common Vocabulary assumed
[Figure: data exchange between CONTEXT EUROPE and CONTEXT USA through the CFM Service. Exchanged data: <PricePerUnit, 12.34, "EURO">. Call to the CFM Service: convert <12.34, "€", "US$">. Other labels: EURO, 19]

Our Approach (II)
• We take two main aspects into account:
  • Specification of data conversions (Language)
  • Infrastructure of the Conversion Manager service

Language
• A simple language oriented toward the specification of CFs
  • Homogenizes the description of CFs
  • Avoids rewriting the same function in different programming languages
  • Reduces development and maintenance costs
• Different kinds of conversion:
  • Mathematical-transformation-based (time independent)
  • Time dependent
  • Lossy transformations
  • String handling
  • Mapping tables

Language Elements
• Tables: mapping among strings (synonyms, implicit graph construction)
• Measurement systems' conversion (e.g. imperial ↔ metric)
  • Graph: specifies conversion among the elements of systems
  • Scale: conversion within a system (e.g. m → dm: scale = 10)
  • Bridge: conversion between systems (e.g. in → m: bridge = 0.0254)

    map length {
        km[LC] kilometer[FN];
        dm[LC] decimeter[FN];
        in[LC] inch[FN];
        mi[LC] mile[FN];
        m[LC] meter[FN];
        cm[LC] centimeter[FN]
    }

    length {
        #scale 10;
        #default_ctx LC
        km (1000)m dm cm
        mile (63,360)in
        in (0.0254)m
    }

[Figure: conversion graph over km, m, dm, cm, in and mi. Edge labels: kilometer → meter 1000, meter → decimeter 10, decimeter → centimeter 10, mile → inch 63,360, inch → meter 0.0254]
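
The graph idea above (scale edges inside a system, bridge edges between systems) can be sketched in Python. This is only an illustration under stated assumptions, not the service's implementation: the names `build_graph` and `convert` are hypothetical, "63,360" is read as the integer 63360, the mile is keyed by its short code `mi`, and every edge is assumed to be usable in both directions.

```python
from collections import deque

def build_graph(edges):
    """Build a bidirectional graph from (source, factor, target) edges."""
    graph = {}
    for src, factor, dst in edges:
        graph.setdefault(src, {})[dst] = factor          # 1 src = factor dst
        graph.setdefault(dst, {})[src] = 1.0 / factor    # assumed inverse edge
    return graph

# Edges read off the slide: explicit factors, plus the default scale of 10
# for neighbours that are listed without a factor (m, dm, cm).
LENGTH = build_graph([
    ("km", 1000, "m"),
    ("m", 10, "dm"),
    ("dm", 10, "cm"),
    ("mi", 63360, "in"),
    ("in", 0.0254, "m"),   # bridge between the imperial and metric systems
])

def convert(graph, value, src, dst):
    """Breadth-first search from src to dst, multiplying factors on the way."""
    queue = deque([(src, float(value))])
    seen = {src}
    while queue:
        unit, amount = queue.popleft()
        if unit == dst:
            return amount
        for neighbour, factor in graph.get(unit, {}).items():
            if neighbour not in seen:
                seen.add(neighbour)
                queue.append((neighbour, amount * factor))
    raise ValueError(f"no conversion path from {src} to {dst}")
```

With these edges, `convert(LENGTH, 10, "km", "m")` gives 10000.0, and a mile-to-centimeter request is answered by crossing the inch-to-meter bridge.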

Time Independent

    convert(functionName, inputValue, sourceCtx, targetCtx)

Conversion Function for some temperature units:

    Temperature {
        #default_ctx LC
        C (( #input - 32 ) * 5 / 9) F
        K (#input + 273.15) C
        RA (#input * 1.8) K
        R (#input * 0.8) C
        export map temp_codes {
            C[LC] CELSIUS[FN];
            KELVIN[FN] K[LC];
            R[LC] REAUMUR[FN];
            RA[LC] RANKINE[FN]
        }
    }

Conversion function call: convert 10 celsius into reaumur (the value 10 is bound to #input)

    convert(Temperature, 10, CELSIUS, R) returns 8

Conversion Function for the metric system:

    Length {
        #scale 10
        km (1000)m dm cm mm (1000)um
    }

Conversion function call: convert 10 km into m

    convert(Length, 10, km, m) returns 10000
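
To make the formula-based rules concrete, here is a minimal Python sketch that mirrors the four temperature rules and the `convert(functionName, inputValue, sourceCtx, targetCtx)` call shape. The dictionaries `TEMP_CODES`, `TEMPERATURE` and `FUNCTIONS` are hypothetical stand-ins for the repository, and chaining rules along a path (for example F to C to K) is an assumption about how the service resolves indirect conversions.

```python
# Short codes for the full names listed in the slide's temp_codes map.
TEMP_CODES = {"CELSIUS": "C", "KELVIN": "K", "REAUMUR": "R", "RANKINE": "RA"}

# One entry per rule on the slide, keyed (source unit, target unit).
TEMPERATURE = {
    ("F", "C"): lambda x: (x - 32) * 5 / 9,
    ("C", "K"): lambda x: x + 273.15,
    ("K", "RA"): lambda x: x * 1.8,
    ("C", "R"): lambda x: x * 0.8,
}

# Hypothetical in-memory registry of conversion functions.
FUNCTIONS = {"Temperature": (TEMP_CODES, TEMPERATURE)}

def convert(function_name, input_value, source_ctx, target_ctx):
    codes, rules = FUNCTIONS[function_name]
    src = codes.get(source_ctx, source_ctx)   # accept full names or codes
    dst = codes.get(target_ctx, target_ctx)

    def search(unit, value, seen):
        """Depth-first search that applies each rule along the path."""
        if unit == dst:
            return value
        for (a, b), rule in rules.items():
            if a == unit and b not in seen:
                found = search(b, rule(value), seen | {b})
                if found is not None:
                    return found
        return None

    result = search(src, input_value, {src})
    if result is None:
        raise ValueError(f"no rule path from {src} to {dst}")
    return result
```

Calling `convert("Temperature", 10, "CELSIUS", "R")` reproduces the slide's result of 8. Only the rule directions written on the slide are available here, so a request such as K to C raises an error in this sketch.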

Time Dependent

    currency {
        var src_ctx, dst_ctx;
        /* The external currency conversion service works in the 3LC context */
        src_ctx = $$currency[][3LC](#src_ctx);
        dst_ctx = $$currency[][3LC](#dst_ctx);
        #connect(currencyExternService.cfg, [#input, src_ctx, dst_ctx]);
        map currency {
            EUR[3LC] EURO[FN] €[SYMBOL]
            LEU[3LC] LEU[FN] LEU[SYMBOL]
            USD[3LC] DOLLAR[FN] $[SYMBOL]
        }
    }

Conversion function call: convert 17 € into $

    convert(currency, 17, EURO, $)

In this call, 17 is bound to #input, EURO to #src_ctx, and $ to #dst_ctx.
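
The two steps in the currency function (normalise both contexts to the three-letter code, then delegate to an external service) might look as follows in Python. This is a sketch under assumptions: `to_3lc`, `convert_currency` and `fake_rate_service` are hypothetical names, and the exchange rate is made up purely so the example runs; a real deployment would call the service named in the connection file.

```python
# The slide's currency map, one row per currency and one key per context.
CURRENCY_MAP = [
    {"3LC": "EUR", "FN": "EURO", "SYMBOL": "€"},
    {"3LC": "LEU", "FN": "LEU", "SYMBOL": "LEU"},
    {"3LC": "USD", "FN": "DOLLAR", "SYMBOL": "$"},
]

def to_3lc(name):
    """Map a name from any context to its three-letter code."""
    for row in CURRENCY_MAP:
        if name in row.values():
            return row["3LC"]
    raise KeyError(f"unknown currency: {name}")

def convert_currency(amount, src_ctx, dst_ctx, rate_service):
    """Normalise both contexts, then delegate to the external service."""
    return rate_service(amount, to_3lc(src_ctx), to_3lc(dst_ctx))

def fake_rate_service(amount, src, dst):
    """Stand-in for the external service; the rate below is invented."""
    rates = {("EUR", "USD"): 1.25}
    return amount * rates[(src, dst)]
```

A call such as `convert_currency(17, "EURO", "$", fake_rate_service)` passes `EUR` and `USD` to the stub regardless of which context the caller used, which is the point of the map lookup.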

Lossy Transformations

    map clothing_size {
        ^0-10[ESP]  S[EUR] 4[USA]
        _11-20[ESP] M[EUR] 6[USA]
        =21-40[ESP] L[EUR] 8[USA]
        41-60[ESP]  X[EUR] 10[USA]
    }

    clothing_size {
        $$clothing_size[#src_ctx][#dst_ctx](#input)
    }

Convert X from context EUR into context ESP:

    convert(clothing_size,X,EUR,ESP) returns 41-60

Convert 4 from context USA into context ESP:

    convert(clothing_size,4,USA,ESP) returns 10
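
A table-driven lookup of this kind can be sketched in Python as below. The slide does not explain the `^`, `_` and `=` prefixes on the ESP ranges; the reading used here (upper bound, lower bound, midpoint) is a guess based only on the two example calls, and the helper names are hypothetical.

```python
import re

# The slide's clothing_size map, one row per size.
CLOTHING_SIZE = [
    {"ESP": "^0-10", "EUR": "S", "USA": "4"},
    {"ESP": "_11-20", "EUR": "M", "USA": "6"},
    {"ESP": "=21-40", "EUR": "L", "USA": "8"},
    {"ESP": "41-60", "EUR": "X", "USA": "10"},
]

def strip_marker(entry):
    """Drop an optional leading marker so entries can be matched as input."""
    return entry.lstrip("^_=")

def resolve(entry):
    """Pick a representative for a marked range (ASSUMED marker meanings)."""
    match = re.fullmatch(r"([\^_=])(\d+)-(\d+)", entry)
    if match is None:
        return entry                      # unmarked entries are returned as-is
    marker, low, high = match.group(1), int(match.group(2)), int(match.group(3))
    if marker == "^":
        return str(high)                  # assumed: upper bound
    if marker == "_":
        return str(low)                   # assumed: lower bound
    return str((low + high) // 2)         # assumed: integer midpoint

def convert(value, src_ctx, dst_ctx):
    """Find the row whose src_ctx entry matches value; return its dst entry."""
    for row in CLOTHING_SIZE:
        if strip_marker(row[src_ctx]) == str(value):
            return resolve(row[dst_ctx])
    raise KeyError(f"{value!r} not found in context {src_ctx}")
```

Under that reading, `convert("X", "EUR", "ESP")` yields `"41-60"` and `convert(4, "USA", "ESP")` yields `"10"`, matching the two calls on the slide. The conversion is lossy because several source values collapse onto one table row.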

String Handling Features

Conversion function among date formats:

    date {
        #separators "/", "."
        #allowed_ctx "USA", "GER", "SPA"
        var month, day, year;
        year = $2;
        if (#src_ctx = "USA") { month = $0; day = $1 }
        else { month = $1; day = $0 }
        if (#dst_ctx = "USA") return month "/" day "/" year
        else if (#dst_ctx = "GER") return day "." month "." year
        else if (#dst_ctx = "SPA") return day "/" month "/" year
    }

Conversion function call: convert a date from the USA format into the GER format

    convert(date, "07/01/2005", USA, GER) returns 01.07.2005
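
For comparison, the same date logic can be written directly in Python. This is a line-by-line transcription of the function above rather than generated code from the service; the function name `convert_date` and the error handling are additions for illustration.

```python
import re

ALLOWED_CTX = {"USA", "GER", "SPA"}

def convert_date(text, src_ctx, dst_ctx):
    """Reorder day and month according to the source and target contexts."""
    if src_ctx not in ALLOWED_CTX or dst_ctx not in ALLOWED_CTX:
        raise ValueError("unsupported context")
    parts = re.split(r"[/.]", text)        # the two separators from the slide
    if len(parts) != 3:
        raise ValueError(f"expected three fields in {text!r}")
    year = parts[2]
    if src_ctx == "USA":
        month, day = parts[0], parts[1]    # USA puts the month first
    else:
        day, month = parts[0], parts[1]    # GER and SPA put the day first
    if dst_ctx == "USA":
        return f"{month}/{day}/{year}"
    if dst_ctx == "GER":
        return f"{day}.{month}.{year}"
    return f"{day}/{month}/{year}"         # SPA
```

`convert_date("07/01/2005", "USA", "GER")` returns `"01.07.2005"`, the same result as the call on the slide.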

Components of the Infrastructure
• Client-side
  • Caching of Conversion Functions
  • Warm-up cache of Conversion Functions
  • Connection management
[Figure labels: CLIENT SIDE, CLIENT, LIBRARY, SERVER SIDE, MASTER, REPOSITORY]
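
The client-side behaviour listed above (a cache of conversion functions that can be warmed up in advance) could be sketched as follows. The class name `ConversionClient` and the `fetch` callback are hypothetical; the slides do not show the library's real interface, so this only illustrates the caching idea.

```python
class ConversionClient:
    """Client-side library sketch: caches fetched conversion functions."""

    def __init__(self, fetch, warm_up=()):
        # fetch(name) is assumed to contact the server and return a callable
        # taking (value, src_ctx, dst_ctx).
        self._fetch = fetch
        self._cache = {}
        for name in warm_up:               # warm-up: load before first use
            self._cache[name] = fetch(name)

    def convert(self, name, value, src_ctx, dst_ctx):
        function = self._cache.get(name)
        if function is None:               # cache miss: one server round trip
            function = self._fetch(name)
            self._cache[name] = function
        return function(value, src_ctx, dst_ctx)
```

After warm-up, repeated calls for the same function name are served locally, so the server is contacted at most once per function in this sketch.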

Components of the Infrastructure (II)
• Server-side
  • Forwarder: transparent connection to servers
    • Load balancing, no single point of failure
  • Server: warm-up caches, request relay (with/without load)
[Figure labels: CLIENT (×2), LIBRARY (×2), FORWARDER (×2), MASTER (×3), REPOSITORY (×3)]
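
One way a forwarder can balance load and skip failed servers is plain round-robin with failover, sketched below. The slides do not say which policy the forwarder actually uses, so the policy, the class name `Forwarder`, and the use of `ConnectionError` to signal an unreachable server are all assumptions.

```python
class Forwarder:
    """Round-robin forwarder sketch that skips servers that cannot be reached."""

    def __init__(self, servers):
        if not servers:
            raise ValueError("at least one server is required")
        self._servers = list(servers)      # each server is a callable
        self._next = 0                     # index of the next server to try

    def handle(self, request):
        count = len(self._servers)
        for attempt in range(count):
            index = (self._next + attempt) % count
            try:
                result = self._servers[index](request)
            except ConnectionError:
                continue                   # try the following server instead
            self._next = (index + 1) % count
            return result
        raise ConnectionError("no server available")
```

The client only ever talks to `handle`, which is what makes the connection to the servers transparent in this sketch.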

Architecture
[Figure: architecture diagram. Labels: CLIENT APPLICATION, Access Layer, EJB API, WS API, INSERT, CONVERT, Functionality Layer, Syntax Checker, Code Generator, Conversion Manager, Conversion Function Repository]

Conclusions & Future Work
• Conclusions
  • The cost of data integration is minimized
  • Conversion Functions have been unified and formalized
  • Conversion functions are now defined just once
  • Conversion-oriented language
  • Infrastructure has been built
  • Smoothly integrated within the Java platform
• Future Work
  • Administrative interface using the given API
  • Multiple vocabularies support
  • Benchmarks

Connection File

Web service connection:

    protocol = WS
    service_endpoint = "http://nagoya.apache.org:5049/axis/servlet/AxisServlet"
    operation_name = "doEcho"

EJB connection:

    protocol = EJB
    jndi_factory = "org.jnp.interfaces.NamingContextFactory"
    jndi_provider = "jnp://localhost:1099"
    jndi_factory_pkgs = "org.jboss.naming:org.jnp.interfaces"
    service_endpoint = "ConversionElementAccessBean"
    operation_name = "doEcho"
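
A connection file in this `key = "value"` shape is straightforward to read. The parser below is a sketch that assumes one setting per line and case-insensitive keys; the function name `parse_connection_file` is hypothetical and the service's real loader is not shown in the slides.

```python
def parse_connection_file(text):
    """Parse key = "value" lines into a dict (keys lower-cased)."""
    config = {}
    for raw_line in text.splitlines():
        line = raw_line.strip()
        if not line or "=" not in line:
            continue                          # skip blanks and stray lines
        key, _, value = line.partition("=")   # split at the first "=" only
        config[key.strip().lower()] = value.strip().strip('"')
    return config

# The web service example from the slide.
WS_EXAMPLE = '''
protocol = WS
service_endpoint = "http://nagoya.apache.org:5049/axis/servlet/AxisServlet"
Operation_name = "doEcho"
'''
```

Parsing `WS_EXAMPLE` yields a dictionary with the keys `protocol`, `service_endpoint` and `operation_name`, which a client could then use to choose between the WS and EJB access paths.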

Vocabulary definition:

    FlightOffer           Complex Object
      ClassOfService      String; DOMAIN = {Y,U,D}
      Price               Real
        Currency          String; DOMAIN = {USD, EUR,...}
        Scale             Integer
      FlightSegment       Complex Object
        FlightNumber      Integer
        ...               ...

Example object:

    < FlightOffer, {
        < ClassOfService, "Y", {<ClassOfServiceCode, "OneLetterClassCode">} >,
        < Price, 1430, {<Currency, "USD">, <Scale, 1>} >,
        < FlightSegment, {
            < FlightNumber, 400 >,
            < AirlineIdentifier, "Lufthansa", {<AirlineIdentifierCode, "FullAirlineName">} >,
            < DepartureDate, "Jun 06, 1998", {<DateFormat, "Mon DD, YYYY">} >,
            < DepartureTime, "10:35 AM", {<TimeFormat, "HH:MM AM/PM">} >,
            < DepartureAirport, "FRA", {<AirportIdentifierCode, "ThreeLetterCode">} >,
            < ArrivalAirport, "JFK", {<AirportIdentifierCode, "ThreeLetterCode">} >,
            < ArrivalTime, "01:00 PM", {<TimeFormat, "HH:MM AM/PM">} >,
            < Distance, 3850, {<Unit, "mile">, <Scale, 1>} >
        } >
    } >

A Complex Conversion Function Invocation

    cv(flight_offer, {<Currency, EUR>,<scale,10>})
      == cv(ClassOfService, {<Currency, EUR>,<scale,10>})
       + cv(Price, {<Currency, EUR>,<scale,10>})
       + cv(FlightSegment, {<Currency, EUR>,<scale,10>})
      == ....
    ....
    convert(convert(1430, Currency, USD, EUR), Scale, 1, 10)
      == convert(1120, Scale, 1, 10)
      == 112
    ....
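
The recursive expansion of `cv` over a complex object can be sketched in Python. Everything here is illustrative: nodes are modelled as `(name, value, context)` tuples, the two converters are stubs, and the USD to EUR rate is simply the one implied by the slide's own numbers (1430 becomes 1120), not a real exchange rate.

```python
def convert_currency(value, src, dst):
    """Stub converter; the rate is the one implied by the slide (1430 -> 1120)."""
    rates = {("USD", "EUR"): 1120 / 1430}
    return value if src == dst else round(value * rates[(src, dst)], 2)

def convert_scale(value, src, dst):
    """Re-express a value given in units of src in units of dst."""
    return value * src / dst

CONVERTERS = {"Currency": convert_currency, "Scale": convert_scale}

def cv(node, target_ctx):
    """Convert a node, recursing into the children of complex objects."""
    name, value, ctx = node
    if isinstance(value, list):                        # complex object
        return (name, [cv(child, target_ctx) for child in value], ctx)
    new_ctx = dict(ctx)
    for key, target in target_ctx.items():
        if key in ctx:                                 # only matching contexts
            value = CONVERTERS[key](value, ctx[key], target)
            new_ctx[key] = target
    return (name, value, new_ctx)

# A shortened version of the slide's FlightOffer object.
OFFER = ("FlightOffer", [
    ("ClassOfService", "Y", {"ClassOfServiceCode": "OneLetterClassCode"}),
    ("Price", 1430, {"Currency": "USD", "Scale": 1}),
    ("FlightSegment", [
        ("FlightNumber", 400, {}),
        ("Distance", 3850, {"Unit": "mile", "Scale": 1}),
    ], {}),
], {})
```

Calling `cv(OFFER, {"Currency": "EUR", "Scale": 10})` leaves `ClassOfService` untouched, turns the price into 112.0 in EUR at scale 10, and rescales the distance, mirroring the nested `convert(convert(...))` expansion above.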