Good Design for Historical Source-Based Databases

Hamish James, History Data Service


Presentation Transcript


  1. Good Design for Historical Source-Based Databases
     Hamish James, History Data Service

  2. Databases
     • A database is a computerised record keeping system.
     • A DataBase Management System (DBMS) is a computer application built around a database that provides a flexible way of storing, manipulating, and examining data.
     • A DBMS consists of data, hardware, software, and users.
     • A DBMS on a personal computer will provide facilities for:
       • inputting, modifying, retrieving, and deleting data
       • querying the data (SQL)
       • producing reports based on the data
       • building ‘front-ends’ for users
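The facilities listed above can be sketched with Python's built-in sqlite3 module. This is only an illustration, not the DBMS used in the presentation; the `person` table and its rows are invented.

```python
import sqlite3

# In-memory database; the table and its rows are invented for illustration.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE person (id INTEGER PRIMARY KEY, name TEXT, birth_year INTEGER)"
)

# Inputting data
conn.executemany(
    "INSERT INTO person (name, birth_year) VALUES (?, ?)",
    [("Ann Carter", 1820), ("John Hale", 1815), ("Mary Hale", 1843)],
)

# Modifying data
conn.execute("UPDATE person SET birth_year = 1816 WHERE name = ?", ("John Hale",))

# Deleting data
conn.execute("DELETE FROM person WHERE name = ?", ("Ann Carter",))

# Querying the data with SQL
rows = conn.execute(
    "SELECT name, birth_year FROM person ORDER BY birth_year"
).fetchall()

# A very small 'report' based on the query result
report = [f"{name} (b. {year})" for name, year in rows]
print("\n".join(report))
```

A desktop DBMS wraps the same operations in forms and report designers; the SQL underneath is comparable.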

  3. Data Models
     • Data models are abstract definitions of structures and relationships used to organise data in a database.
     • Data models can be characterised by how they organise the connections between different records:
       • flat file
       • hierarchical
       • network
       • relational
       • object-oriented
     • Most DBMSs available for personal computers are either flat file or relational.

  4. Entity Relationship Modelling
     • A data modelling technique that transforms information into a form that meets the requirements of the relational data model.
     • Entities are the things that the database will contain a representation of.
     • Entities can be anything: people, places, events, physical objects, or concepts.
     • All the entities with the same characteristics can be collectively called an entity type.
     • Relationships describe the way entities are connected to each other.

  5. Relationships
     • One-to-one relationships connect one entity to one other entity.
     • One-to-many relationships connect one entity to one or more other entities.
     • Many-to-many relationships connect many entities to many other entities.
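As a rough sketch of a one-to-many relationship, the example below links several people to a single household through a foreign key. The `household` and `person` tables and their contents are hypothetical, and SQLite is used only because it ships with Python.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# SQLite enforces foreign keys only when this pragma is switched on.
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
CREATE TABLE household (id INTEGER PRIMARY KEY, address TEXT);
CREATE TABLE person (
    id INTEGER PRIMARY KEY,
    name TEXT,
    household_id INTEGER NOT NULL REFERENCES household(id)
);
""")

# One household, many people: each person row points at exactly one household.
conn.execute("INSERT INTO household (id, address) VALUES (1, '12 Mill Lane')")
conn.executemany(
    "INSERT INTO person (name, household_id) VALUES (?, ?)",
    [("John Hale", 1), ("Mary Hale", 1)],
)

members = conn.execute("""
    SELECT h.address, COUNT(p.id)
    FROM household h JOIN person p ON p.household_id = h.id
    GROUP BY h.id
""").fetchall()

# A person pointing at a household that does not exist is rejected.
try:
    conn.execute("INSERT INTO person (name, household_id) VALUES ('Stray Record', 99)")
    orphan_rejected = False
except sqlite3.IntegrityError:
    orphan_rejected = True
```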

  6. Data
     • The field is the basic unit of data in a database. A field stores a single piece of information of a particular data type.
     • Fields are combined to form records. A record matches an entity.
     • A set of records with the same fields is collected together in a table.

  7. Historical Uses for a Database
     • To store and organise large amounts of information automatically.
     • To provide easy access to the information contained in the original source.
     • To provide an environment for manipulating (changing and adjusting) the source.
     • To search, filter, and summarise complex information quickly.

  8. Historical Database Example

  9. Historical Databases
     • Technical decisions are often the least important.
     • Historians work with information they do not control:
       • incomplete, poorly structured information of varying quality.
     • A historical source-based database is a representation of the primary source, but it is not an exact replica of the primary source.
       • Some information may be left out.
       • Some extra information may be included.
     • A historical source-based database mixes elements of a primary source with elements of a secondary source.

  10. The Three Layer Model
     • Interpretation Layer
       • Incorporates the researcher’s knowledge and judgement.
       • Links records and forms aggregates.
     • Standardisation Layer
       • Provides a foundation for analysing the data.
       • Codes and standardisation rules are applied.
     • Source Layer
       • An accurate digital representation of the source.
       • Defines the level of detail captured.

  11. Three Layer Design Examples

  12. Simple Design Hints
     • Make sure the smallest unit of data matches the smallest unit of analysis.
       • If you want to look at people by last name, then have separate first and last name fields, not just a name field.
     • Don’t mix data types.
       • Separate numbers and words.
     • Document everything you do, either in the database or with the database.
       • Data entry, data standardisation and coding, data transformations, limits of the data, etc.
     • Keep information that tracks the origin and history of the database.
     • Add information, don’t delete information.
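The first two hints can be illustrated with a small, invented table: because the last name has its own field and age is stored as a number, grouping and averaging by surname need no string manipulation. Table and column names are assumptions for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Separate name fields, and a numeric column for a numeric value.
conn.execute("""
    CREATE TABLE person (
        id INTEGER PRIMARY KEY,
        first_name TEXT,
        last_name TEXT,
        age INTEGER
    )
""")
conn.executemany(
    "INSERT INTO person (first_name, last_name, age) VALUES (?, ?, ?)",
    [("John", "Hale", 34), ("Mary", "Hale", 29), ("Ann", "Carter", 41)],
)

# Analysis by last name works directly on the last_name field.
by_surname = conn.execute("""
    SELECT last_name, COUNT(*), AVG(age)
    FROM person
    GROUP BY last_name
    ORDER BY last_name
""").fetchall()
print(by_surname)
```

With a single combined name field, the same query would first have to split each name, which is error-prone for historical spellings and multi-part surnames.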

  13. Further Information
     Starting Out
     • Michael J. Hernandez, Database Design for Mere Mortals: A Hands-On Guide to Relational Database Design, Addison-Wesley, 1997.
     • Database Central, http://databasecentral.com/
     • History Data Service, http://hds.essex.ac.uk/
     The ‘Classics’
     • Charles Harvey & Jon Press, Databases in Historical Research, Macmillan Press, 1996.
     • C. J. Date, An Introduction to Database Systems, 7th ed., Addison-Wesley, 1999.

  14. Source Layer
     • Acts as the reference version of the original source.
     • An accurate representation of the source, including errors, omissions, etc.
     • Contents determine the highest level of detail available about the source in the database.
     • Includes a reference to the non-digital original source.
     • Includes a unique identifier for each item.
     • Implementation:
       • as long text fields containing full text transcriptions.
       • as ‘blob’ fields containing scanned images.
       • as a regular database table.
       • as a pivoted database table.
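One possible shape for a source layer held as a regular database table is sketched below. The `source_entry` table, its column names, the archive references, and the transcribed text are all invented for illustration; the point is only that each item has a unique identifier, a pointer back to the original, and a transcription left exactly as written.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE source_entry (
        entry_id      INTEGER PRIMARY KEY,  -- unique identifier for each item
        archive_ref   TEXT NOT NULL,        -- reference to the non-digital original
        transcription TEXT NOT NULL         -- text as written, errors included
    )
""")

# Hypothetical entries; abbreviations and odd spellings are kept, not corrected.
conn.executemany(
    "INSERT INTO source_entry (archive_ref, transcription) VALUES (?, ?)",
    [
        ("Register A, folio 12", "Jno. Hale, labr."),
        ("Register A, folio 12", "Mary Hale, servt"),
    ],
)

entries = conn.execute(
    "SELECT entry_id, archive_ref, transcription FROM source_entry ORDER BY entry_id"
).fetchall()
```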

  15. Standardisation Layer
     • Organises the information into discrete units with fully defined contents.
     • Separates information in the source into separate fields according to data type and data content.
     • Simplifies the data by standardising and coding it.
     • Normalises the data.
     • Includes links back to the source layer.
     • Implementation:
       • Possibly as additional columns in source layer tables.
       • Probably as separate tables with, ideally, a one-to-one relationship to records in the source layer.
     Normalisation: a series of rules that are applied to data to ensure that it conforms to the relational data model:
       1. Remove repeating groups (first normal form).
       2. Remove partial dependencies (second normal form).
       3. Remove indirect dependencies (third normal form).
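A minimal sketch of a standardisation layer kept in a separate table follows. It assumes a hypothetical `source_entry` table, a `standard_entry` table that shares its key one-to-one with the source record, and an invented `occupation_code` lookup table; none of these names or codes come from the presentation.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE source_entry (
    entry_id INTEGER PRIMARY KEY,
    transcription TEXT NOT NULL
);
-- Codes live in their own table so each label is stored once.
CREATE TABLE occupation_code (code TEXT PRIMARY KEY, label TEXT NOT NULL);
-- Sharing the primary key gives a one-to-one link back to the source layer.
CREATE TABLE standard_entry (
    entry_id INTEGER PRIMARY KEY REFERENCES source_entry(entry_id),
    first_name TEXT,
    last_name TEXT,
    occupation_code TEXT REFERENCES occupation_code(code)
);
INSERT INTO source_entry VALUES (1, 'Jno. Hale, labr.'), (2, 'Mary Hale, servt');
INSERT INTO occupation_code VALUES ('LAB', 'Labourer'), ('SRV', 'Servant');
INSERT INTO standard_entry VALUES (1, 'John', 'Hale', 'LAB'), (2, 'Mary', 'Hale', 'SRV');
""")

# The standardised values can always be read alongside the original wording.
rows = conn.execute("""
    SELECT s.transcription, t.first_name, t.last_name, o.label
    FROM standard_entry t
    JOIN source_entry s ON s.entry_id = t.entry_id
    JOIN occupation_code o ON o.code = t.occupation_code
    ORDER BY t.entry_id
""").fetchall()
```

Keeping the standardised fields apart from the transcription means a coding decision can be revised later without touching the source layer.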

  16. Interpretation Layer
     • Creates historical entities from the data and the knowledge and expertise of the historian.
     • Incorporates interpolations and extrapolations from the data in the standardisation layer.
     • Selectively includes and excludes information from the standardisation layer.
     • Links separate records to form entities such as ‘individuals’ or ‘households’.
     • Many-to-many relationship with records in the standardisation layer.
     Note: Many-to-many relationships are usually converted into two one-to-many relationships to remove data redundancy.
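The conversion of a many-to-many relationship into two one-to-many relationships is commonly done with a linking table. The sketch below assumes hypothetical `standard_entry`, `individual`, and `individual_entry` tables with invented rows: one individual is built from several records, and one ambiguous record is attached to two candidate individuals.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE standard_entry (
    entry_id INTEGER PRIMARY KEY,
    first_name TEXT, last_name TEXT, year INTEGER
);
CREATE TABLE individual (individual_id INTEGER PRIMARY KEY, note TEXT);
-- Linking table: one row per (individual, record) pairing.
CREATE TABLE individual_entry (
    individual_id INTEGER REFERENCES individual(individual_id),
    entry_id INTEGER REFERENCES standard_entry(entry_id),
    PRIMARY KEY (individual_id, entry_id)
);
INSERT INTO standard_entry VALUES
    (1, 'John', 'Hale', 1841), (2, 'John', 'Hale', 1851), (3, 'John', 'Hale', 1851);
INSERT INTO individual VALUES
    (1, 'John Hale, labourer'), (2, 'John Hale, possibly the son');
-- Record 2 is ambiguous, so the historian links it to both candidates.
INSERT INTO individual_entry VALUES (1, 1), (1, 2), (2, 2), (2, 3);
""")

records_per_individual = conn.execute("""
    SELECT individual_id, COUNT(*) FROM individual_entry
    GROUP BY individual_id ORDER BY individual_id
""").fetchall()

candidates_for_record_2 = conn.execute("""
    SELECT individual_id FROM individual_entry
    WHERE entry_id = 2 ORDER BY individual_id
""").fetchall()
```

Because the links are stored as rows, a change of mind about who a record refers to is a change to the linking table only; the standardisation and source layers stay as they were.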
