260 likes | 378 Views
What search criteria would I use to search for James in 1881?. Name? Yes, allowing for spelling variations Age? Yes, allowing for some variance Occupation? No. Often changed over the decade Place of Birth? Yes and No. Information in 1881 censuses varies
E N D
What search criteria would I use to search for James in 1881? Name? Yes, allowing for spelling variations Age?Yes, allowing for some variance Occupation? No. Often changed over the decade Place of Birth? Yes and No. Information in 1881 censuses varies Marital status? No. Often changed over the decade Gender? Sure. It could be mis-coded, but is likely correct.
Forget diamonds or dogs… • MySQL: a true best friend http://www.mysql.com/
1NF: What are our entities? Family units? Works for family of 4, but what about the Eunson family with 8 in the family? We would have to add new columns, so no, Family units is no good as an entity.
1NF: What are our entities? Individuals? Yes, individuals works. So, let’s give each individual a Primary Key
1NF • Primary Key (PK)
1NF: 3 Goals Goal 1: no duplicated rows. Each row should have a PK. Goal 2: Each cell contains only one value. No groups or comma-separated lists Goal 3: Any given column contains the same kind of data Goal 1: no duplicated rows. Each row should have a PK. Goal 2: Each cell contains only one value. No groups or comma-separated lists Goal 3: Any given column contains the same kind of data
1NF: New Entities, New Tables • Primary Key (PK)
1NF: 3 Goals for the new tables Goal 1: no duplicated rows. Each row should have a PK. Goal 2: Each cell contains only one value. No groups or comma-separated lists Goal 3: Any given column contains the same kind of data Goal 1: no duplicated rows. Each row should have a PK. Goal 2: Each cell contains only one value. No groups or comma-separated lists Goal 3: Any given column contains the same kind of data Goal 1: no duplicated rows. Each row should have a PK. Goal 2: Each cell contains only one value. No groups or comma-separated lists Goal 3: Any given column contains the same kind of data Birthplace fails at Goal 2 and Goal 3 holding parish, county and country data in a comma-separated list
Primary Key (PK) • Junction Table • Foreign Key (FK)
2NF: “Sub-entities” • Repeated bits of information should now be moved into their own tables. Information repeated in Relate to HH, M. Stat and Gender columns = new tables.
3NF • If we delete a row, will we lose any data that other records might need? • If we deleted George Struther’s row, will we lose the information that the code for New Zealand is 5? • No, because it is entered into the Country table – the code in the Individual table is a Foreign Key
3NF • 2. If we add a row, could we accidentally make any data entry mistakes that would compromise our database’s integrity? • If we accidentally entered incorrect information into a row, would that impact any other part of the database? • No, because even if we code Dr Ross as a girl, the gender table still holds the accurate code for male.
3NF • 3. Have we stored any values that we should calculate instead? • In many circumstances we would calculate age because it is dynamic. • In the case of historic data like the census the recorded age is all we have, and it could have been mis-recorded, so we do not calculate the age in this case.
One other thing • Separate names into separate columns • If, for example, you want to search for the density of a given surname across all of Scotland, you will have trouble doing so unless surnames are separated out from first names.
Soldiers from CrikeyVillage, Wellington County, England who were wounded while in service in the Western Desert Campaign with the 7th Armoured Division