210 likes | 343 Views
Enterprise Business Processes and Re porting ( IS 6214 ). MBS MIMAS `17 th Feb 2010. Fergal Carton Bu siness Information Systems. Last week. Feedback on presentations Data cubes Types of data Dynamic / Static Cucina example Soft information Data recording. This week. Data storage
E N D
Enterprise Business Processes and Reporting(IS 6214) MBS MIMAS `17th Feb 2010 Fergal Carton Business Information Systems
Last week • Feedback on presentations • Data cubes • Types of data • Dynamic / Static • Cucina example • Soft information • Data recording
This week • Data storage • Deciding what information to collect • Exploiting data warehouses • Cucina and real time information • Extract, transform, load (ETL) • Real time data • Refresh rates and response times
Data storage • Series of ad-hoc systems manual and computer-based (spreadsheet, filed forms…) • Dedicated databases for manufacturing data (QC, shipping etc…) • Process Control Systems (technical parameters) • Other specialised proprietary systems (integration may not be easy). • ERP system with its own data structure or fed by existing systems
Deciding what information to collect • Information cost + overload mean not all data are useful • Some framework can be used – e.g. Critical Success Factors (CSF) • Questions that must be answered: • How is it measured and broken down? • How often should it be measured? • Who should know about it? • Where can the data be found? • How should it be presented?
Exploiting the DW data Static Reporting Data Staging Area Source Systems Scrutinising Multidimensional Data Warehouse Data Cubes OLAP tools Relational Database on a dedicated Server Extraction Cleaning Transformation Loading De normalised, data Discovering Data Mining …….
Think about real-time for Cucina • What information is required real time? • Can you differentiate between report types • Static • Scrutinising • Discovery
ETL Tools • Extraction, Transformation, and Loading • Specification based • Eliminate custom coding • Third party and DBMS based tools
Data extraction and transformation • Getting data out of legacy applications • Cleaning up the data • Enriching it with new data • Converting it to a form suitable for upload • Staging areas
Data Quality Problems • Multiple identifiers • Multiple field names • Different units • Missing values • Orphaned values • Multipurpose fields • Conflicting data • Different update times
Example 1 – the supplier file New supplier code to include city where firm is based Assignation of category based on amounts purchased OLD Sup code Sup name Sup address City Phone 4 digits NEW Sup code Sup name Sup address… Phone Cat 3 letters + 1,2,3 depending 4 digits on total purchases last year
Example 2: merging files • Complete customer file based on Accounts and Sales and Shipping OLD (finance) CustID name address city account number credit limit balance OLD (sales) CustID* name address city discount rates sales_to_date rep_name OLD (Shipping) CustID** name address city Preferred haulier
Refreshing databases • Timing • Criticality of information • Volume of data • Response time • Real-time requirement • Level of aggregation / granularity
Warehouse Database Life cycle of the DW First time load Operational Databases Refresh Refresh Purge or Archive Refresh
Real time information • Up to date • On-line • Actual data • Live feed • Decisions made on what basis?
Real time requirement? • Historical sales or accounting data, not real-time • Sales as quarter end approaches • Inventory levels for MRP • Exchange rates, when is Visa rate calculated? • Real-time processing: card transactions down
Response times • Response times are a function of : • response time, • Infrastructure elements, • Database sizing • Transaction processing • Interfaces • Reporting • Other processing demands • Peak times • …
Example • Revenue reports from EMC Data warehouse • Report can grow to >1million lines at quarter end • Should not be run on ERP server • Poorly designed?
Manager’s view • Volume has been increasing at a huge pace compared to … like, you go talk to Jonathan, … my answer to it will be, get used to it, it’s not going to go away, I don’t care what you do, it’s not my problem, I want the reports, you deal with the volume of records, it’s not going to go away, you deal with it.
Determining the Refresh Frequency • Maximize net refresh benefit • Value of data timeliness • Cost of refresh • Satisfy data warehouse and source system constraints