1 / 56

Construction of a database Per Weidenman PAR AB

Construction of a database Per Weidenman PAR AB. Database A collection of data It belongs together It models the ”world” Database management system (DBMS) The database (a collection of interrelated data) Software to manage and access the data. User: Searching Reporting.

teige
Download Presentation

Construction of a database Per Weidenman PAR AB

An Image/Link below is provided (as is) to download presentation Download Policy: Content on the Website is provided to you AS IS for your information and personal use and may not be sold / licensed / shared on other websites without getting consent from its author. Content is provided to you AS IS for your information and personal use only. Download presentation by click this link. While downloading, if for some reason you are not able to download a presentation, the publisher may have deleted the file from their server. During download, if you can't get a presentation, the file might be deleted by the publisher.

E N D

Presentation Transcript


  1. Construction of a database Per Weidenman PAR AB

  2. Database • A collection of data • It belongs together • It models the ”world” • Database management system (DBMS) • The database (a collection of interrelated data) • Software to manage and access the data

  3. User: • Searching • Reporting DBMS requirements DBMS Organised data Input: transactions Statistical analysis • ”Database” • Data Warehouse • etc.

  4. Database management systems (DBMS) • Microsoft Access • Microsoft SQL Server • DB2 • Oracle • MySQL • FirebirdSQL • etc. • SQL – Structured Query Language • A computer language to define and • search data

  5. Relational databases Tables containing data, organised in rows and columns Keys, used for linking data in different tables

  6. Example Simple database for collecting and organising statistical papers Created in Microsoft Access

  7. Paper name and details Link to dokument (pdf file) Autors

  8. A database with four tables Keys

  9. One of the tables, containing paper name and details Key Rows containing paper name and other details One paper on each row

  10. The keys are used to link data in the four tables

  11. Table ”artiklar” Table ”författare2” Table ”personer2” Key: artikel_id Key: artikel_id Key: person_id Key: person_id 1 2 3 4 5 6 7 8 9 1 1 1 … 5 6 1 2 3 … 4 4 1 2 3 4 Aaaa Bbbb Cccc Dddd One paper having 3 autors One person being the autor of 2 papers

  12. A query: the result of asking the database about papers and autors One paper and the corresponding 3 autors One autor and the corresponding 2 papers

  13. ”Business” users • User: • Searching • Reporting DBMS requirements DBMS Organised data Input: transactions Statistical analysis • ”Database” • Data Warehouse • etc. IT Department

  14. DBMS requirements from a statistical / analytical viewpoint • Data quality • Data types • Performance • Maximun information • Historical data • Regulation and secrecy

  15. DBMS requirements from a statistical / analytical viewpoint Data quality • User: • Searching • Reporting DBMS Sales System X Sales System X Input: transactions Enter customer name: Choose customer name: … use, ifpossible, selection from a list of valid values Instead of entering text/data by typing… Volvo Personvagnar AB Volvo Lastvagnar AB Volvo Construction AB Volvo Bussar AB Volvo Logistics AB …

  16. DBMS requirements from a statistical / analytical viewpoint • Wedontwant: • Negative values • 40+ • 1982 Data quality • User: • Searching • Reporting Definerules for valid input (values, intervals, etc.) DBMS Sales System X Input: transactions Enter customer age:

  17. DBMS requirements from a statistical / analytical viewpoint Data quality Missingvaluesshould stored as ”null” in the database. Not as 0 (digitzero) • User: • Searching • Reporting Handling of missingvalues … DBMS Input: transactions

  18. DBMS requirements from a statistical / analytical viewpoint Data types Text Numeric

  19. DBMS requirements from a statistical / analytical viewpoint Performance Searching for individualrecords Creating ”prepared” reports by counting or summing • User: • Searching • Reporting DBMS requirements DBMS Organised data Input: transactions Statistical analysis • Large datasets • Multivariatemethods • Iterative estimation • Etc.

  20. DBMS requirements from a statistical / analytical viewpoint Weneed to report on age groups: 20-29 30-39 40-49 … Maximum information • User: • Searching • Reporting 34 DBMS Sales System X Input: transactions Thuswe store age as an interval, not as a value! Enter customer age: The fallacy of beeing too user oriented!

  21. DBMS requirements from a statistical / analytical viewpoint … will be added to table Orders and stored as a ”new row” Historical data Table: Orders Customer ID Order date Order value • User: • Searching • Reporting DBMS Input: transactions Sales System X Each new order for a specificcustomer … Customer name: Customer address: Order value: Order date:

  22. DBMS requirements from a statistical / analytical viewpoint Historical data … willprobably UPDATE the existingrecord (row) for the specificcustomer Table: Customers • User: • Searching • Reporting Customer ID Customer name Customer address DBMS But a new address … Input: transactions Sales System X Thus, the old value of ”customer address” will be deleted and replaced with the new value. Customer name: Customer address: But this will do fine for users focusing on searching / reporting! Order value: Order date:

  23. DBMS requirements from a statistical / analytical viewpoint Historical data Each time a value is UPDATED for a certaincustomer … Table: Customers • User: • Searching • Reporting Create av new table to contain historic records Table: Customers_history Customer ID Customer name Customer address DBMS Customer ID Customer name Customer address From To Input: transactions … the complete (previous) record is transfered to the table Customers_history

  24. DBMS requirements from a statistical / analytical viewpoint Historical data Table: Customers • User: • Searching • Reporting This structure will make analysis of processes possible But not easy! Table: Customers_history Customer ID Customer name Customer address DBMS Customer ID Customer name Customer address From To Input: transactions

  25. DBMS requirements from a statistical / analytical viewpoint Regulation and sectrecy

  26. DBMS requirements from a statistical / analytical viewpoint Current data Current + historical data Operating on individual records Operating on many records Next on this channel…

  27. User: • Searching • Reporting DBMS requirements DBMS Organised data Input: transactions Statistical analysis A database containing historic transactions

  28. PAR / Bisnode database Tables Basic company data One record per company. Contains name, address, startdate, enddate, line of business, etc. FTG Sampling for times series statistics Historic company data Many records per company. Contains the accumulated historic records from table FTG FTG_H Serrano Statistical analysis Board data Balance sheet data One record per annual report (thus many records per company). Turnover, profit, key ratios, etc. BOKSLUT Historicnames etc. How? Board member data Many records per company and person. FUNKTION_PERIOD And many more tables!

  29. END

  30. Basic company data One record per company. Contains name, address, startdate, enddate, line of business, etc.

  31. Historic company data Many records per company. Contains the accumulated historic records from table FTG

  32. Balance sheet data One record per annual report (thus many records per company). Turnover, profit, key ratios, etc.

  33. Board member data Many records per company and person.

  34. Serrano Balance sheet data from different periods transformed to yearly data records

  35. Serrano Historic transactions from FTG_H transformed to yearly data records

  36. Serrano Board Data Balance member data from any mix of startdate, enddata and period length transformed to yearly data records

  37. Summing up register data to annual figures A • Exampel. • Register containingbalancesheet data: • Number of employes • Turnover • Profit • Tangible assets • Etc. ÅR Nu 3 2 1

  38. Summing up register data to annual figures A B ÅR Nu 3 2 1 Brutet räkenskapsår

  39. Summing up register data to annual figures A B C ÅR Nu 3 2 1 Omlagda räkenskapsår

  40. Summing up register data to annual figures A B C D ÅR Nu 3 2 1 Missing data

  41. Summing up register data to annual figures B ÅR Förslag: Bryt ner flödesvariablerna (omsättning, vinst, etc.) till månadsvärden … Nu 3 2 1

  42. Summing up register data to annual figures B ÅR Förslag: … och summera månadsvärdena till ett ’fingerat’ kalenderårsvärde. Nu 3 2 1 Förslag: … samt imputera för full täckning under sista året

  43. Summing up register data to annual figures B ÅR Nu 3 2 1 Database

  44. First exampel Register based transport statistics for SIKA: Decreased response burden Increased understanding of the transporting companies (as a complement to the ”usual” fokus on type of goods) Time series describing economic status and change .

  45. Objective: Describing economic status and change in transporting companies during the last ten years. Total number of employes and turnover …

  46. Objective: Describing economic status and change in transporting companies during the last ten years. … or turnover growth compared to BNP

  47. Objective: Describing economic status and change in transporting companies during the last ten years. … or profit development for different types of freight companies

  48. Objective: Describing economic status and change in transporting companies during the last ten years. … or the number of employes in a cohort of new companies.

  49. Tables based on balance sheet data from each company

More Related