Good Design for Historical Source Based Databases Hamish James History Data Service
Databases • A database is a computerised record keeping system. • A Database Management System (DBMS) is a computer application built around a database that provides a flexible way of storing, manipulating, and examining data. • A DBMS consists of data, hardware, software, and users. • A DBMS on a personal computer will provide facilities for: • inputting, modifying, retrieving and deleting data • querying the data (SQL) • producing reports based on the data • building ‘front-ends’ for users
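The facilities listed above (inputting, modifying, deleting and querying data with SQL) can be sketched with Python's built-in `sqlite3` module. This is only an illustration: the `person` table, its columns and the sample rows are hypothetical and do not come from the presentation.

```python
import sqlite3

# In-memory database; table name, columns and rows are hypothetical examples.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE person (id INTEGER PRIMARY KEY, last_name TEXT, birth_year INTEGER)"
)

# Input data.
conn.executemany(
    "INSERT INTO person (last_name, birth_year) VALUES (?, ?)",
    [("Smith", 1821), ("Jones", 1834), ("Smith", 1850)],
)

# Modify data.
conn.execute("UPDATE person SET birth_year = 1835 WHERE last_name = 'Jones'")

# Delete data.
conn.execute("DELETE FROM person WHERE birth_year > 1840")

# Query the data with SQL.
rows = conn.execute(
    "SELECT last_name, birth_year FROM person ORDER BY birth_year"
).fetchall()
print(rows)
```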
Data Models • Data models are abstract definitions of structures and relationships used to organise data in a database. • Data models can be characterised by how they organise the connections between different records: • flat file • hierarchical • network • relational • object orientated • Most DBMSs available for personal computers are either flat file or relational.
Entity Relationship Modelling • A data modelling technique that transforms information into a form that meets the requirements of the relational data model. • Entities are the things that the database will contain a representation of. • Entities can be anything: people, places, events, physical objects, or concepts. • All the entities with the same characteristics can be collectively called an entity type. • Relationships describe the way entities are connected to each other.
Relationships • One-to-one relationships connect one entity to one other entity. • One-to-many relationships connect one entity to one or more other entities. • Many-to-many relationships connect many entities to many other entities.
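As a rough sketch of a one-to-many relationship in a relational DBMS, the example below stores one parish with many baptism records, connected by a foreign key. The `parish` and `baptism` tables and their contents are invented for illustration and are not taken from the presentation.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Hypothetical entity types: one parish has many baptism records.
CREATE TABLE parish (
    parish_id INTEGER PRIMARY KEY,
    name      TEXT
);
CREATE TABLE baptism (
    baptism_id INTEGER PRIMARY KEY,
    parish_id  INTEGER REFERENCES parish(parish_id),  -- the 'many' side points at the 'one' side
    child_name TEXT
);
INSERT INTO parish VALUES (1, 'St Mary'), (2, 'St John');
INSERT INTO baptism VALUES (1, 1, 'Ann'), (2, 1, 'Thomas'), (3, 2, 'Jane');
""")

# Follow the relationship: count the baptisms recorded for each parish.
counts = conn.execute("""
    SELECT parish.name, COUNT(*)
    FROM parish JOIN baptism USING (parish_id)
    GROUP BY parish.name
    ORDER BY parish.name
""").fetchall()
print(counts)
```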
Data • The field is the basic unit of data in a database. A field stores a single piece of information of a particular data type. • Fields are combined to form records. A record matches an entity. • A set of records with the same fields is collected together in a table.
Historical Uses for a Database • To store and organise large amounts of information automatically. • To provide easy access to the information contained in the original source. • To provide an environment for manipulating (changing and adjusting) the source. • To search, filter and summarise complex information quickly.
Historical Database Example
Historical Databases • Technical decisions are often the least important. • Historians work with information they do not control: • incomplete, poorly structured information of varying quality. • A historical source based database is a representation of the primary source, but it is not an exact replica of the primary source. • Some information may be left out. • Some extra information may be included. • A historical source based database mixes elements of a primary source with elements of a secondary source.
The Three Layer Model • Interpretation Layer: incorporates the researcher’s knowledge and judgement; links records and forms aggregates. • Standardisation Layer: provides a foundation for analysing the data; codes and standardisation rules are applied. • Source Layer: an accurate digital representation of the source; defines the level of detail captured.
Three Layer Design Examples
Simple Design Hints • Make sure the smallest unit of data matches the smallest unit of analysis. • If you want to look at people by last name then have separate first and last name fields, not just a name field. • Don’t mix data types: • separate numbers and words. • Document everything you do, either in the database or with the database: • data entry, data standardisation and coding, data transformations, limits of data etc. • Keep information that tracks the origin and history of the database. • Add information, don’t delete information.
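Two of these hints (separate first and last name fields; add information rather than delete it) can be illustrated with a small Python sketch. The `PersonRecord` class, the `split_name` helper and its naive splitting rule are hypothetical; real sources would need more careful rules, and those rules should themselves be documented.

```python
from dataclasses import dataclass


@dataclass
class PersonRecord:
    name_as_written: str  # original text is kept: add information, don't delete it
    first_name: str
    last_name: str


def split_name(name_as_written: str) -> PersonRecord:
    """Naive, hypothetical rule: the last space-separated word is the last name."""
    first, _, last = name_as_written.strip().rpartition(" ")
    return PersonRecord(name_as_written, first, last)


records = [split_name(n) for n in ["Mary Ann Evans", "John Evans", "Wm. Price"]]

# Because last_name is its own field, selecting people by last name is trivial.
evans = [r.first_name for r in records if r.last_name == "Evans"]
print(evans)
```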
Further Information • Starting Out: Michael J. Hernandez, Database Design for Mere Mortals: A Hands-On Guide to Relational Database Design, Addison-Wesley, 1997; Database Central, http://databasecentral.com/; History Data Service, http://hds.essex.ac.uk/ • The ‘Classics’: Charles Harvey & Jon Press, Databases in Historical Research, Macmillan Press, 1996; C. J. Date, An Introduction to Database Systems, 7th ed., Addison-Wesley, 1999.
Source Layer • Acts as the reference version of the original source. • An accurate representation of the source, including errors, omissions etc. • Contents determine the highest level of detail available about the source in the database. • Includes a reference to the non-digital original source. • Includes a unique identifier for each item. • Implementation: • as long text fields containing full text transcriptions. • as ‘blob’ fields containing scanned images. • as a regular database table. • as a pivoted database table.
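The difference between a regular and a pivoted table can be sketched as follows. This assumes that "pivoted" means a long layout with one row per item, field and value, which the slide does not spell out; the record contents and the `pivot` helper are hypothetical.

```python
# One row of a regular source-layer table: one column per field.
# Values are stored as written in the source, errors and abbreviations included.
regular_row = {
    "item_id": "census1851/12",  # hypothetical unique identifier
    "name": "Jno. Smith",
    "age": "40",
    "occupation": "Ag Lab",
}


def pivot(row, key="item_id"):
    """Turn one regular row into (item id, field name, value) rows."""
    return [(row[key], field, value) for field, value in row.items() if field != key]


pivoted = pivot(regular_row)
for line in pivoted:
    print(line)
```

A pivoted layout of this kind copes with sources whose items do not all share the same fields, at the cost of more awkward queries.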
Standardisation Layer • Organises the information into discrete units with fully defined contents. • Separates information in the source into separate fields according to data type and data content. • Simplifies the data by standardising and coding it. • Normalises the data. Normalisation is a series of rules that are applied to data to ensure that it conforms to the relational data model: (1) remove repeating groups (first normal form); (2) remove partial dependencies (second normal form); (3) remove indirect dependencies (third normal form). • Includes links back to the source layer. • Implementation: • Possibly as additional columns in source layer tables. • Probably as separate tables with, ideally, a one-to-one relationship to records in the source layer.
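A minimal sketch of standardising and coding, assuming the standardised records live in a separate structure with a one-to-one link back to the source layer. The code list, field names and sample rows are all hypothetical.

```python
# Hypothetical code list: maps source spellings to a standard occupation code.
OCCUPATION_CODES = {
    "ag lab": "AGRIC",
    "agricultural labourer": "AGRIC",
    "blacksmith": "METAL",
}

# Source layer rows, kept exactly as transcribed.
source_rows = [
    {"source_id": 1, "occupation": "Ag Lab", "age": "40"},
    {"source_id": 2, "occupation": "Agricultural Labourer", "age": "abt 35"},
    {"source_id": 3, "occupation": "Scholar", "age": "9"},
]


def standardise(row):
    """Build one standardised record per source record (one-to-one)."""
    text = row["occupation"].strip().lower()
    digits = "".join(ch for ch in row["age"] if ch.isdigit())
    return {
        "source_id": row["source_id"],  # link back to the source layer
        "occupation_code": OCCUPATION_CODES.get(text, "UNCODED"),
        "age_years": int(digits) if digits else None,  # numbers separated from words
    }


standardised = [standardise(r) for r in source_rows]
for record in standardised:
    print(record)
```

The source rows are left untouched, so every coding decision can be checked against, or redone from, the original transcription.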
Interpretation Layer • Creates historical entities from the data and the knowledge and expertise of the historian. • Incorporates interpolations and extrapolations from the data in the standardisation layer. • Selectively includes and excludes information from the standardisation layer. • Links separate records to form entities such as ‘individuals’ or ‘households’. • Many-to-many relationship with records in the standardisation layer. Many-to-many relationships are usually converted into two one-to-many relationships to remove data redundancy.
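The conversion of a many-to-many relationship into two one-to-many relationships can be sketched with a link table between interpreted individuals and standardised records. The table names (`std_record`, `individual`, `individual_record`) and the sample data are hypothetical, and the decision that two records refer to the same person is the historian's judgement, not something the code works out.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Standardisation layer records (hypothetical).
CREATE TABLE std_record (record_id INTEGER PRIMARY KEY, name TEXT, year INTEGER);
-- Interpretation layer entities (hypothetical).
CREATE TABLE individual (individual_id INTEGER PRIMARY KEY, label TEXT);
-- Link table: each row is one side of two one-to-many relationships.
CREATE TABLE individual_record (
    individual_id INTEGER REFERENCES individual(individual_id),
    record_id     INTEGER REFERENCES std_record(record_id),
    PRIMARY KEY (individual_id, record_id)
);
INSERT INTO std_record VALUES
    (1, 'John Smith', 1841), (2, 'Jno. Smith', 1851), (3, 'Mary Smith', 1851);
INSERT INTO individual VALUES (1, 'John Smith (linked)');
-- The historian judges that records 1 and 2 describe the same person.
INSERT INTO individual_record VALUES (1, 1), (1, 2);
""")

linked = conn.execute("""
    SELECT s.name, s.year
    FROM individual_record AS l
    JOIN std_record AS s ON s.record_id = l.record_id
    WHERE l.individual_id = 1
    ORDER BY s.year
""").fetchall()
print(linked)
```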