200 likes | 340 Views
IT Applications Theory Slideshows Data Validation By Mark Kelly mark@vceit.com Vceit.com. Types. Manual Electronic. Data Validation. The step taken after data input. Ensures data are reasonable . Cannot check for accuracy of data.
E N D
IT Applications Theory Slideshows Data Validation By Mark Kelly mark@vceit.com Vceit.com
Types • Manual • Electronic
Data Validation • The step taken after data input. • Ensures data are reasonable. • Cannot check for accuracy of data. • Helps prevents processing of bad data to create bad information. • GIGO = garbage in, garbage out
Validation • E.g. a form says a VCE student was born in England. • Validation can tell it’s reasonable - but it still may be complete rubbish. • Another form that says the student was born in 45.67 is clearly not reasonable.
What’s reasonable? • The main validation checks: • Type check • Range check • Existence check (or presence check)
Type check • A value should be of the right type • Names should be text • Dates should be dd/mm/yyyy format • Number of pets should be numeric • A photo field should contain a JPG image rather than a spreadsheet • A credit card number field rejects the letter “O” typed instead of a zero
Range check • Data should be within a certain range • Kindergarten students’ ages should be between 3 and 6 • Number of pets should be >= 0 • Data must exist in a limited list of options • State of residence must be in (Vic, SA, Qld, WA, ACT, Tas, NT). Sex must be M or F.
Range check • Data must be unique • e.g. a phone number, customer ID • Data must be of a minimum, maximum or fixed length • e.g. credit card number has 16 digits • Data must have a given format • E.g. user ID must have 3 letters and 2 digits • Date must be in dd-mm-yyyy format
Existence check • Is the data present? • In a pizza delivery list, leaving off the customer’s name may not be vital, but leaving off the address is. • Existence check can reject records that lack vital data.
Warning! • Don’t make validation so strict that valid data are rejected. • Data can be unexpected but accurate! • kindergarten child could be 8 years old in rare circumstances • mandating the entry of a phone number would wrongly reject people with no phone • Validation should reject absurd data, not slightly odd data!
Bad validation • American websites that insist on a 5 digit zip code (postcode) – even if you’re not American! • Databases that reject addresses with no Street/ Avenue/Road/Lane (etc) value: some addresses don’t have them. • Testing for existence in a limited list when the list is not actually limited. (e.g. Title must be one of:Mr, Miss, Mrs, Dr, Sir, Madam’)
Manual & Electronic • Some data needs a human’s common sense to detect unreasonable entries • Spelling of names • Entry of Adam Sandler in a database of talented actors • “Jane Smith” registered as a male • Other checks are better done electronically • Faster, more accurate to get a database to check for missing ID number values
Some validation techniques Check digits • Used for numerical data. An extra digit is added to a number which is calculated from the digits. • The computer checks this calculation when data are entered, e.g., The ISBN for a book. The last digit is a check digit calculated using a modulus 11 method. • Detects mis-typing.
Some validation techniques Spelling and grammar check • Looks for spelling and grammatical errors. Consistency Checks Checks fields to ensure data in these fields corresponds, e.g., If Title = "Mr.", then Gender = "M".
To prevent invalid data entry Drop-down menus or lists • Force data to be entered from a pre-made selection • States of Australia • Months of the year
Helping users • Give examples or advice about expected data formats… Enter your 10-character regocode (case sensitive) Date (dd/mm/yyyy) __ __ __ __ - __ __ __ __ - __ __ Note how this form clearly indicates which fields are required and which are optional
Calendar Control • Date cannot be invalid since it must be chosen from a set of valid dates • Can still be inaccurate however!
Calendar control helps users • User does not have to guess correct date format expected • 23/08/09? • 23-08-2009? • 23 Aug 09? • 08/23/09 (US format)? • 092308 (Asian format)?
Calendar control helps organisations • Date data is not ambiguous • Does 02/05/06 entered mean: • 2 May 2006? • 5 Feb 2006? • 6 May 2002? A bad date could be prevented with good validation
Thanks! • mark@vceit.com • Vceit.com