Using i* modeling for the multidimensional design of data warehouses. Jose-Norberto Mazón, jnmazon@dlsi.ua.es Juan Trujillo, jtrujillo@dlsi.ua.es Toronto, 17th July 2008. Contents. Introduction Current research Requirements for DWs Reconciling with data sources
Using i* modeling for the multidimensional design of data warehouses Jose-Norberto Mazón, jnmazon@dlsi.ua.es Juan Trujillo, jtrujillo@dlsi.ua.es Toronto, 17th July 2008
Contents • Introduction • Current research • Requirements for DWs • Reconciling with data sources • Deriving logical representations • Conclusions and short term research
Introduction: Research problem • Data warehouse: integrated collection of historical data in support of the decision making process • Multidimensional (MD) modeling • Fact: contains interesting measures of a business process • Dimension: represents the context of analysis • Resembles the traditional method for database design • Model at the conceptual level • Abstracting details related to specific technologies
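The fact and dimension vocabulary above can be illustrated with a minimal sketch. The sales fact, its measures (quantity, amount), and its dimensions (product, city) are hypothetical examples, not taken from the talk; the point is only that measures are aggregated while dimensions supply the grouping context.

```python
from collections import defaultdict

# Hypothetical fact rows: each row holds measures (quantity, amount)
# plus dimension values (product, city) that give the context of analysis.
sales = [
    {"product": "pen", "city": "Alicante", "quantity": 3, "amount": 6.0},
    {"product": "pen", "city": "Toronto", "quantity": 1, "amount": 2.0},
    {"product": "ink", "city": "Alicante", "quantity": 2, "amount": 10.0},
]


def aggregate(rows, dimension, measure):
    """Sum one measure of the fact, grouped by one dimension."""
    totals = defaultdict(float)
    for row in rows:
        totals[row[dimension]] += row[measure]
    return dict(totals)


amount_by_city = aggregate(sales, "city", "amount")
quantity_by_product = aggregate(sales, "product", "quantity")
```

Swapping the dimension argument changes the context of analysis without touching the measures, which is the separation a conceptual MD model is meant to capture.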
Introduction: Research problem • Integrated collection of historical data in support of decision makers • [Figure: data warehouse architecture; labels: internal and external data sources, ETL, data warehouse, cubes, OLAP, data mining, reports, what-if analysis, decision makers] • Information needs cannot be understood by only analyzing data sources • Decision making processes must be understood by designers
Introduction: Drawbacks of the state of the art • Only data sources are analyzed to define the conceptual MD model: incorrect information needs may be modeled • Requirements are specified only after the conceptual MD model is defined (or even after the DW is deployed): incorrect MD elements may be modeled • Requirements and data sources are not reconciled: complex ETL processes are needed to populate the DW • Thus, the DW is not viewed as a valuable resource
Introduction: Novelty of our proposal • 1. Explicit requirement analysis stage • Focus on decision making processes • Information requirements • 2. Transformation to a conceptual MD model • Model Driven approach • MD model agrees with decision makers’ expectations • 3. Reconcile requirement model with data sources • MD model satisfies decision makers’ needs • MD model agrees with data sources • Completeness • Faithfulness
Introduction: Objectives of our proposal • Defining a goal-oriented approach for DWs • Based on i* • Model decision processes • Decision makers are concerned with goals, not directly with data • Traceability to a conceptual MD model • Align with MDA • Integrate requirements and data sources
MDA • Model Driven Architecture (MDA) • Object Management Group (OMG) standard • Using models in software development • Computation Independent Model (CIM) • Platform Independent Model (PIM) • Platform Specific Model (PSM) • Transformations between models • Query/View/Transformation language (QVT) • The code is obtained from PSMs
MDA • Model Driven Architecture (MDA) • CIM: describes user requirements • PIM: contains information about the functionality and structure of the system, without taking into account the technology used to implement it • PSM: includes information about the specific technology used to implement the system on a specific platform • Code: every PSM is transformed into executable code, yielding the final software product
MDA • Query/View/Transformation language (QVT) • Declarative part of QVT • A transformation is a set of relations • Relations between metamodels are formally defined and automatically performed • Relations are applied to models
MDA • The declarative approach of QVT specifies relationships that must hold between candidate models • [Figure: anatomy of a QVT relation R between Model 1 and Model 2; labels: candidate model, domain, metamodel name, kind of relation, when and where clauses]
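The idea of a relation that must hold between candidate models can be sketched in Python as an analogy. This is not QVT syntax, and the Fact and Table classes are hypothetical stand-ins for elements of two metamodels. The sketch mirrors the two readings a declarative relation can have: checking whether it holds, and enforcing it by creating the missing target elements.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Fact:
    """Element of a hypothetical source (MD) model."""
    name: str


@dataclass(frozen=True)
class Table:
    """Element of a hypothetical target (relational) model."""
    name: str


def fact_to_table_holds(facts, tables):
    """Check-only reading: every Fact must have a same-named Table."""
    table_names = {t.name for t in tables}
    return all(f.name in table_names for f in facts)


def enforce_fact_to_table(facts, tables):
    """Enforcing reading: add the missing Tables so the relation holds."""
    table_names = {t.name for t in tables}
    missing = [Table(f.name) for f in facts if f.name not in table_names]
    return list(tables) + missing


facts = [Fact("Sales")]
holds_before = fact_to_table_holds(facts, [])
holds_after = fact_to_table_holds(facts, enforce_fact_to_table(facts, []))
```

The relation is stated once as a condition over both models; whether it is used to verify or to generate the target model is a separate choice.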
Introduction: Our proposal [DOLAP 2005] [DaWaK 2006] [DSS 2008] [REBNITA 2005] [RIGIM 2007] [ER 2006] [ER 2007] [DKE 2007]
Requirements for DWs • Goal Oriented Requirement Engineering • DW supports the decision making process to fulfill goals of an organization • Decision makers are concerned about goals • Information requirements are obtained by refining decision makers’ goals • MDA approach • Information requirements must be derived into a conceptual MD model
Requirements for DWs • CIM • Goals and information requirements • PIM • Conceptual MD model • QVT • Transformation between models
Requirements for DWs: Defining a CIM • Classification of DW goals • Strategic goals • Change to a better situation • Decision goals • Take appropriate actions • Information goals • Related to required information • Information requirements • Interesting measures of business process • Context of analysis
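The goal classification above can be sketched as a small data structure. This is a minimal illustration assuming that each goal refines into subgoals and that information goals carry information requirements; the example goals, the class names, and the `to_md_elements` helper are hypothetical, and the authors' actual transformation is defined in QVT, which is not reproduced here.

```python
from dataclasses import dataclass, field


@dataclass
class InformationRequirement:
    measures: list   # interesting measures of the business process
    contexts: list   # context of analysis


@dataclass
class Goal:
    kind: str        # "strategic", "decision", or "information"
    text: str
    subgoals: list = field(default_factory=list)
    requirements: list = field(default_factory=list)


def collect_requirements(goal):
    """Gather information requirements from a goal and all its refinements."""
    found = list(goal.requirements)
    for sub in goal.subgoals:
        found.extend(collect_requirements(sub))
    return found


def to_md_elements(goal):
    """Hypothetical mapping: measures feed the fact, contexts become dimensions."""
    measures, dimensions = set(), set()
    for req in collect_requirements(goal):
        measures.update(req.measures)
        dimensions.update(req.contexts)
    return {"measures": sorted(measures), "dimensions": sorted(dimensions)}


info = Goal("information", "Analyze sales by city",
            requirements=[InformationRequirement(["amount"], ["city", "product"])])
decision = Goal("decision", "Open new stores", subgoals=[info])
strategic = Goal("strategic", "Increase sales", subgoals=[decision])
md_elements = to_md_elements(strategic)
```

Walking the refinement chain from the strategic goal keeps each derived measure and dimension traceable back to the goal that motivated it.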
Requirements for DWs: Defining a CIM • i* framework • Modeling goals of decision makers and the required tasks and resources to fulfil them • Several decision makers with different goals • Two extensions of UML • Profile for i* • Profile for adapting i* to the DW domain
Conceptual MD model • UML Profile for MD modeling • Luján, Trujillo, Song. A UML profile for Multidimensional Modeling in Data Warehouses. Data and Knowledge Engineering. 2006. • Class diagram
Reconciling with data sources • [Figure: reconciliation workflow; labels: user requirements, initial PIM, data sources, reconciliation, PIM, PSM]
Reconciling with data sources • The MD conceptual model is reconciled with the available data sources • The DW will be properly populated from data sources • The analysis potential provided by the data sources is captured by the MD conceptual model • Redundancies are avoided • Optional dimension levels are controlled to enable summarizability and to avoid inconsistent queries • The reconciliation process is performed automatically • QVT relations based on Multidimensional Normal Forms • Lechtenbörger and Vossen. Multidimensional normal forms for data warehouse design. Information Systems 28 (2003)
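A heavily simplified sketch of the reconciliation idea follows. It assumes that MD elements and source columns can be matched by name, which is an illustrative shortcut only: the approach on the slide uses QVT relations based on multidimensional normal forms, not name matching. The `reconcile` function and all element names are hypothetical.

```python
def reconcile(md_elements, source_columns):
    """Compare the elements of an MD model with the available source columns.

    supported:   MD elements that can be populated from the sources
    unsupported: MD elements with no matching source column
    candidates:  source columns not yet used, i.e. extra analysis potential
    """
    md, src = set(md_elements), set(source_columns)
    return {
        "supported": sorted(md & src),
        "unsupported": sorted(md - src),
        "candidates": sorted(src - md),
    }


report = reconcile(
    ["amount", "city", "weather"],
    ["amount", "city", "state", "quantity"],
)
```

Unsupported elements point at requirements the sources cannot populate, while candidates point at analysis potential that the requirements did not ask for.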
Reconciling with data sources • [Figure: example of a Rolls-upTo association between dimension levels, with roles +d (1..n) and +r (1); n_t1 = district, n_t2 = state]
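The district-to-state example can be read as a many-to-one roll-up: each district belongs to exactly one state. Under that reading, which is an assumption based on the multiplicities shown on the slide, a minimal check against source rows looks like this. The `rolls_up` function and the sample rows are hypothetical.

```python
def rolls_up(rows, child, parent):
    """Return True if every child value maps to exactly one parent value."""
    seen = {}
    for row in rows:
        c, p = row[child], row[parent]
        # setdefault stores the first parent seen for this child and
        # returns the stored one on later rows, so a mismatch is a conflict.
        if seen.setdefault(c, p) != p:
            return False
    return True


rows = [
    {"district": "D1", "state": "S1"},
    {"district": "D2", "state": "S1"},
    {"district": "D1", "state": "S1"},
]
conflicting = rows + [{"district": "D1", "state": "S2"}]

ok = rolls_up(rows, "district", "state")
broken = rolls_up(conflicting, "district", "state")
```

When the check fails, aggregating a measure from district to state would assign the same district to two states, which is the kind of summarizability problem the reconciliation step is meant to prevent.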
Deriving logical representations • PIM • UML profile for MD modeling [Luján et al. DKE 2006] • PSM • Common Warehouse Metamodel (CWM) • From PIM to each PSM • QVT transformation
Deriving logical representations • Common Warehouse Metamodel (CWM) • Resource layer • Standard to represent the structure of data according to certain technologies • Relational metamodel • Tables, columns, primary keys, and so on • Multidimensional metamodel • Generic data structures • Vendor specific extension • Oracle Express extension
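The step from a platform-independent MD model to a relational representation can be sketched as follows. This is an illustration under stated assumptions, not the authors' QVT transformation and not a CWM API: the `star_schema_ddl` function, the star-schema layout, the column types, and all names are hypothetical.

```python
def star_schema_ddl(fact, measures, dimensions):
    """Emit generic SQL for one fact table and one table per dimension."""
    statements = []
    for dim in dimensions:
        statements.append(
            f"CREATE TABLE {dim} ({dim}_id INTEGER PRIMARY KEY, name VARCHAR(100));"
        )
    # The fact table references every dimension and stores the measures.
    columns = [f"{dim}_id INTEGER REFERENCES {dim}({dim}_id)" for dim in dimensions]
    columns += [f"{m} NUMERIC" for m in measures]
    statements.append(f"CREATE TABLE {fact} ({', '.join(columns)});")
    return statements


ddl = star_schema_ddl("sales", ["amount"], ["city", "product"])
```

The same fact and dimension description could feed a different generator for a multidimensional target, which is why the PIM is kept free of platform details.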
Conclusions: Objectives • DW projects fail to support the decision making process • The requirement analysis stage is overlooked when defining a conceptual MD model • Using the i* framework together with MDA
Conclusions: Scientific contributions • MDA framework • UML profile for i* • Extension for using i* in the DW domain • Transformations to obtain a conceptual MD model • Several kinds of logical representations • Multidimensional normal forms • Reconciling data sources and requirements in a hybrid approach • Eclipse-based prototype
Conclusions: Related work at LUCENTIA research group • [Figure: related work arranged by MDA layer (CIM, PIM, PSM)] • MDA [DKE 2007 & DSS 2008] • Requirements for DWs [RIGiM 2007] • UML profile for data mining [DKE 2007] • UML profile for MD modeling [DKE 2006] • Data sources analysis [ER 2007] • Common Warehouse Metamodel • Security [DSS 2006 & IS 2007] • UML for physical modeling [JCIS 2006]
Short term research • Studying unstructured decision processes in depth to model them in i* diagrams • Taking advantage of every i* feature • Considering complex mechanisms to reason about goals and structure decision processes • Prioritization of goals