200 likes | 862 Views
Data Warehousing: A Definition. “A data warehouse is a single integrated store of data which provides the infrastructural basis for informational applications in the enterprise” Kelly, “Data Warehousing”, p55 Will be a collection of tools - not just a database
E N D
Data Warehousing: A Definition • “A data warehouse is a single integrated store of data which provides the infrastructural basis for informational applications in the enterprise” Kelly, “Data Warehousing”, p55 • Will be a collection of tools - not just a database • query system to support decision making • Ability to trawl (mine) in a way that suits the user • integrated store of consistent, up to date data • Main aim to maximise the effective use of data within a business. • Create a bridge between disparate systems
Old Rationale for IS • Automate the business process • reduce costs • Information seen as a fortuitous by-product • Emphasis on IS as a input/processing/output • define functions • specify what system does • difficult to change outputs
Problems With a Data Processing Approach • Data problems • resident in different systems • inconsistent • different attributes • different time bands/frames • irreconcilable • Deletion vs. archiving of data • inappropriate and inconsistent data models
Issues in Moving to Data Warehousing Philosophy • Ownership issues • Planning across department/functional areas • Economic issues • Standardisation of data model • similar data entities in multiple systems • data inextricably bound with application • Standards in application design • piecemeal approaches to system implementation • Multiple interpretations of reality
The Business Case for Data Warehousing • Reduce costs (?) • Increase business e.g. through better understanding of customers • Develop competitive advantage • Change the nature of the business
Advantages of Data Warehousing • Removes query and reporting load from TPS • Allows more appropriate technology for queries and reports • Provides a more simple query interface to users • May provide a higher integrity DB than a TPS • Easy way of reporting across multiple systems • May provide a DB with a longer memory than a TPS! • May make TPS more secure by reducing access
Disadvantages of Data Warehousing • May rely too heavily on data generated only from TPS • May complicate business processes by “institutionalising” reports, data for data’s sake • Learning curve too long - technical and business aspects • Culture of developing quick and dirty strategic applications • End-users may not have skills for building queries • Availability of data warehousing skills • Data warehouses require high maintenance • Cost of information may outweigh its benefit
Problems to Consider • Extracting, cleaning and loading of data may be time consuming • Undetected error in systems feeding the warehouse • Warehouse project may highlight unrecorded data in existing systems • End user training in query and reporting tools may increase requests for IS written reports • End user approaches to calculations may differ due to different business views • Creation of a large-scale data warehouse may homogenise data - reducing content • Conflict between “need to know” and “right to know” mindsets
Data Warehousing Tools • Meta-data modelling • Data transformation • Extract • Cleanse • Load • Database (relational, parallel) • Query language
User Application Information Retrieval Application Application Importing Process Data Warehouse Periodic transformation And integration process TPS 1 TPS 2 TPS 3 Relationship Between Data Warehouse and Other Systems
Some Political Issues (IS) • Who should a data warehousing development group report to? • Who should administrate over the warehouse? (DBS or development group) • How should the support of feeder system developers be gained? What about errors in the feeder system? • Who has responsibility for data quality monitoring? • What about changes to the feeder systems?
Some Political Issues (User-IS) • Why should users give up control of user managed databases? • How is the co-operation gained of a user who’s spreadsheet is being automated? • Should design be for the needs of the masses or the most demanding users? • How many data marts should there be?
Some Political Issues (User-User) • Who has access to what data? • Do all users define and interpret data the same way? • Who has the final say about the “correctness” of data
General Political Issues • Imposes new obligations whose responsibilities are unclear. • May require a change in processes that an organisation may not be comfortable with. • Requires agreement on some and not all definitions of data
Some Useful References... • Http://pwp.starnetinc.com/larryg • Kelly, (1998) “Data Warehousing”, Wiley