Learn about manipulating data, validation, and verification techniques. Understand how to ensure data accuracy through validation checks. Discover the benefits of spreadsheets and various data capture methods.
Manipulating data Data management: validation and verification
Valid data • Data that is valid is allowable • Valid data has to obey certain rules • Data can be incorrect yet still valid
Data can be valid and incorrect Example: • A person has a date of birth 19/12/87 • A user enters it incorrectly as 19/12/78 • Both are valid as dates • Yet one is incorrect
Two techniques for reducing errors • Verification • Validation
Verification • Checks that errors are not introduced during typing by the user • Checks data entered is the same as on a source document (e.g., order form, application form, etc.)
Two methods of verification • Visual checking/proof reading – checking what has been typed in against a source document • Double entry of data – two people enter the same data – only if both sets of data are the same will it be accepted
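Double entry can be sketched in a few lines of Python. This is only an illustration: the function name `double_entry_verify`, the field names and the name "A. Patel" are made up for the example; the two dates of birth are the ones from the earlier slide.

```python
def double_entry_verify(first_entry: dict, second_entry: dict) -> list:
    """Return the names of the fields where the two typed entries disagree."""
    fields = sorted(set(first_entry) | set(second_entry))
    return [f for f in fields if first_entry.get(f) != second_entry.get(f)]


# Two people type in the same source document (hypothetical field names).
entry_a = {"name": "A. Patel", "date_of_birth": "19/12/87"}
entry_b = {"name": "A. Patel", "date_of_birth": "19/12/78"}

mismatches = double_entry_verify(entry_a, entry_b)
accepted = not mismatches  # accept the data only if both entries are identical

print(mismatches, accepted)
```

Both dates are valid, so a validation check would not catch the mistake; comparing the two entries does.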
Validation checks include • Data type checks – is the data entered the right type for the field (e.g., letters are not entered into a numeric field)? • Presence checks – has a field been left empty? • Format checks – is the data of the right length and the right combination of characters for a field (e.g., code FF019J has a length of 6 characters: two letters, then three digits, then a letter)? • Range checks • Look-up lists • Check digits
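A minimal sketch of four of these checks in Python. The function names and the range limits used in the usage lines are assumptions for illustration; the format rule is the FF019J example from the slide (two letters, three digits, one letter).

```python
import re

# Format taken from the slide's example code FF019J.
CODE_PATTERN = re.compile(r"[A-Za-z]{2}[0-9]{3}[A-Za-z]")


def presence_check(value: str) -> bool:
    """True if the field has not been left empty."""
    return value.strip() != ""


def type_check_integer(value: str) -> bool:
    """True if the text contains only the digits 0-9."""
    return re.fullmatch(r"[0-9]+", value) is not None


def format_check(value: str) -> bool:
    """True if the text matches the code pattern exactly."""
    return CODE_PATTERN.fullmatch(value) is not None


def range_check(value: int, low: int, high: int) -> bool:
    """True if the number lies between low and high inclusive."""
    return low <= value <= high


print(format_check("FF019J"), format_check("FF19J"))
print(range_check(16, 11, 18), range_check(19, 11, 18))
```

These checks only show that data is allowable. A wrong but well-formed value, such as 19/12/78 typed instead of 19/12/87, would still pass.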
Parity checks Ensure that data sent over a network has not become corrupted
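One way a parity check can work, sketched in Python. Even parity and the 7-bit example value are assumptions for illustration; the slide does not say which scheme is used. The sender adds one extra bit so that the number of 1s is even, and the receiver counts the 1s again.

```python
def add_even_parity(data_bits: str) -> str:
    """Append a parity bit so that the total number of 1s is even."""
    parity = data_bits.count("1") % 2
    return data_bits + str(parity)


def parity_ok(received: str) -> bool:
    """True if the received bits (data plus parity bit) contain an even number of 1s."""
    return received.count("1") % 2 == 0


sent = add_even_parity("1011001")   # four 1s, so the parity bit is 0
corrupted = "0" + sent[1:]          # simulate the first bit flipping in transit

print(sent, parity_ok(sent), parity_ok(corrupted))
```

A single parity bit only reveals an odd number of flipped bits; two flipped bits would cancel out and go unnoticed.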
Hash and batch totals • Totals calculated from the data before it is entered and recalculated afterwards; if the two totals differ, an error has been introduced • A hash total is meaningless apart from its use in this check
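A small Python sketch of the idea, using made-up order records. The field names, product codes and quantities are hypothetical. Adding up product codes gives a number with no meaning of its own, which is what makes it a hash total; adding up quantities gives a total that does mean something.

```python
def field_total(records, field):
    """Add up one field across a batch of records."""
    return sum(record[field] for record in records)


# Batch as written on the source documents (hypothetical data).
source_orders = [
    {"product_code": 1042, "quantity": 3},
    {"product_code": 2217, "quantity": 10},
    {"product_code": 3305, "quantity": 1},
]

# The same batch after keying in, with two digits of one code swapped.
keyed_orders = [dict(order) for order in source_orders]
keyed_orders[1]["product_code"] = 2271

hash_before = field_total(source_orders, "product_code")
hash_after = field_total(keyed_orders, "product_code")

print(hash_before, hash_after, hash_before == hash_after)
```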
Manipulating data Spreadsheet software
Components of spreadsheets • Labels – used for titles, headings and names, and for identifying columns of data • Data – the values (text or numbers) that you enter into the spreadsheet • Formulas – used to perform calculations on the cell contents
Functions • A function is a specialized calculation that is built into the spreadsheet software • Examples: AVERAGE, MAX, MIN, MODE, MEDIAN, SUM, COUNT, COUNTA, COUNTIF, VLOOKUP, IF
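To show roughly what these functions calculate, here are Python equivalents applied to a made-up column of marks. The data, the pass mark of 15 and the price table are all hypothetical, and the Python versions are approximations rather than exact copies of spreadsheet behaviour.

```python
import statistics

marks = [12, 15, 15, 18, 20]  # hypothetical cell values

results = {
    "SUM": sum(marks),
    "AVERAGE": statistics.mean(marks),
    "MAX": max(marks),
    "MIN": min(marks),
    "MODE": statistics.mode(marks),
    "MEDIAN": statistics.median(marks),
    "COUNT": len(marks),  # every item here is a number, so all are counted
    "COUNTIF >= 15": sum(1 for m in marks if m >= 15),
}

# IF: choose one of two values depending on a condition.
outcome = "Pass" if results["AVERAGE"] >= 15 else "Fail"

# VLOOKUP: find a key in a table and return the value stored with it.
price_table = {"pen": 0.50, "ruler": 0.80}
ruler_price = price_table["ruler"]

print(results, outcome, ruler_price)
```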
The two types of cell referencing • Relative cell referencing • Absolute cell referencing
Relative cell referencing • Suppose cell B4 contains a relative reference to cell A1 • The spreadsheet treats this as a reference to the cell 3 cells up and one cell to the left of B4 • If cell B4 is copied to another position, say E5, the reference will still be to the same number of cells up and to the left, so it will now refer to cell D2
Absolute cell referencing • With absolute cell referencing, if cell B4 contains a reference to cell A1, then if the contents of B4 are copied to a new position, the reference will not be adjusted and it will still refer to cell A1
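The difference between the two kinds of reference can be sketched in Python. The function `copy_reference` is hypothetical, handles single-letter columns only, and assumes the common spreadsheet convention of marking an absolute column or row with a `$` sign, which the slides do not mention.

```python
import re

# Optional "$", one column letter, optional "$", then a row number.
CELL = re.compile(r"(\$?)([A-Z])(\$?)([1-9][0-9]*)")


def copy_reference(ref: str, source: str, target: str) -> str:
    """Return how `ref` is rewritten when the cell `source` is copied to `target`."""
    col_abs, col, row_abs, row = CELL.fullmatch(ref).groups()
    _, s_col, _, s_row = CELL.fullmatch(source).groups()
    _, t_col, _, t_row = CELL.fullmatch(target).groups()

    col_shift = ord(t_col) - ord(s_col)   # columns moved to the right
    row_shift = int(t_row) - int(s_row)   # rows moved down

    # Relative parts move with the copy; parts marked "$" stay fixed.
    new_col = col if col_abs else chr(ord(col) + col_shift)
    new_row = row if row_abs else str(int(row) + row_shift)
    return f"{col_abs}{new_col}{row_abs}{new_row}"


print(copy_reference("A1", "B4", "E5"))      # relative reference
print(copy_reference("$A$1", "B4", "E5"))    # absolute reference
```

Copying from B4 to E5 moves three columns right and one row down, so the relative reference A1 becomes D2 while the absolute reference stays at A1.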
The benefits of using spreadsheets • You can perform ‘what if’ investigations – you can make changes to the spreadsheet values to see what happens • Automatic recalculation – when an item of data changes, all the cells that are connected to the changed cell by a formula will also change • Accurate calculation – provided the formulas are all correct, the calculations on the numbers will always be correct • It is easy to produce graphs and charts – once the data has been entered, the spreadsheet can quickly produce graphs and charts based on it
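A ‘what if’ investigation can be imitated in Python by treating a formula as a function and changing one input. The profit formula and all the figures below are invented for illustration only.

```python
def profit(units_sold, price_per_unit, cost_per_unit, fixed_costs):
    """The 'formula': recalculated every time an input value changes."""
    return units_sold * (price_per_unit - cost_per_unit) - fixed_costs


baseline = profit(units_sold=100, price_per_unit=5.0, cost_per_unit=3.0, fixed_costs=150.0)

# What if the price goes up by 50p and everything else stays the same?
what_if = profit(units_sold=100, price_per_unit=5.5, cost_per_unit=3.0, fixed_costs=150.0)

print(baseline, what_if)
```

In a spreadsheet the same effect comes from editing one cell and letting the dependent formulas recalculate automatically.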
Graphs and modelling • Pie charts, bar charts, scatter graphs, line graphs • Modelling • Inputs • Variables • Constants • Constraints • Rules Have a go at activities 1–4 on page 86
Manipulating data Data capture
Data capture 1 • The method by which data from the outside world enters the computer for processing • For example, keyboard entry is a method of data capture • Other methods include optical mark recognition, the use of sensors, etc.
Data capture 2 The ideal method of data capture would be: • Comparatively accurate • Cheap • Automatic • Fast
Chip and PIN • Used to capture credit/debit card details • Details are encrypted on a chip • User has to enter a PIN that only they know • The PIN is checked and proves the user is authentic
Optical mark recognition (OMR) • Automatically reads marks made on a form • Forms are read/scanned at high speed • Readers are relatively cheap • Reject rate owing to people not filling in the form correctly can be high • If forms are folded they cannot be read
Bar code reading • A code is stored as a series of light and dark bars • Uses a scanner to read the bars • Used on items in stores, parcels, luggage handling at airports, etc. • Fast to read • Accurate • Can be read at a distance
Voice recognition • Uses microphone as the input device • Uses special voice recognition software to turn the sounds into letters that can be understood • Can input text into word-processing, email software, etc., using speech • Can also issue commands using speech
Biometrics • Uses unique features of the human body to recognize a person • Examples include retinal scanning, fingerprint recognition, etc. • Ideal for access control to buildings, rooms and computers • Has the advantage that, unlike passwords, there is nothing to remember
RFID tags • RFID means radio frequency identification • Data is stored on a small computer chip • Tags can be read at a distance • Tags can be read through clothing or a bag • They can store a lot of data • They are quite expensive to manufacture
Manipulating data Databases
A Database • Fields • Key fields / Primary key • Record • File – Table • Flat file and relational databases • Data redundancy • Keeping important data when deleting records • More efficient when doing searches.
Data Handling Applications • Financial Forecasting • Weather Forecasting • Flight Simulators • Expert Systems for decision making
Manipulating data Data handling software
Data handling software Covers any software used to store and manipulate data and output information: • Database software • Spreadsheet software
Updating, deleting and searching records • Update – change the data to bring up-to-date • Delete – remove data no longer needed • Search – look for specific records that match certain criteria (e.g., list the names of all students in Year 11)
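The three operations can be tried out with Python's built-in `sqlite3` module and a temporary in-memory database. The table name, column names and student records are hypothetical; the search is the slide's example of listing the names of all students in Year 11.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # temporary database held in memory
conn.execute(
    "CREATE TABLE students (student_id INTEGER PRIMARY KEY, name TEXT, year INTEGER)"
)
conn.executemany(
    "INSERT INTO students VALUES (?, ?, ?)",
    [(1, "Hughes", 11), (2, "Lee", 10), (3, "Patel", 11)],
)

# Update: bring a record up to date (Lee moves into Year 11).
conn.execute("UPDATE students SET year = 11 WHERE student_id = 2")

# Delete: remove a record that is no longer needed.
conn.execute("DELETE FROM students WHERE student_id = 3")

# Search: list the names of all students in Year 11.
year_11 = [
    row[0]
    for row in conn.execute("SELECT name FROM students WHERE year = 11 ORDER BY name")
]
print(year_11)
```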
Search criteria using operators Operators are used to construct search criteria and include: • = equals • > greater than • < less than • <> not equal to • >= greater than or equal to • <= less than or equal to
Examples of operators being used • = Patel (in a surname field, finds data for people with the surname Patel) • = 20 (in a quantity field, finds data for all occurrences where the quantity is 20) • >01/02/10 (in a date field, finds the data for all dates after, but not including, 01/02/10) • <>0 (in a quantity field, finds the data where the quantity does not equal zero)
Joining operators • Operators can be joined by AND or OR • Size = XL AND Type = Shirt finds the details of all extra large shirts • Pet = Dog OR Pet = Cat finds the details for pets that are dogs or cats
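The way criteria are built from operators and joined with AND or OR can be sketched in Python. The `search` function, the field names and the stock records are hypothetical; the six operators are the ones listed on the earlier slide.

```python
import operator

# The six search operators mapped to Python comparisons.
OPERATORS = {
    "=": operator.eq,
    ">": operator.gt,
    "<": operator.lt,
    "<>": operator.ne,
    ">=": operator.ge,
    "<=": operator.le,
}


def search(records, *criteria, join="AND"):
    """Return the records matching all (AND) or any (OR) of the criteria.

    Each criterion is a (field, operator, value) tuple.
    """
    combine = all if join == "AND" else any
    return [
        record
        for record in records
        if combine(OPERATORS[op](record[field], value) for field, op, value in criteria)
    ]


stock = [
    {"size": "XL", "type": "Shirt", "quantity": 20},
    {"size": "XL", "type": "Jumper", "quantity": 0},
    {"size": "M", "type": "Shirt", "quantity": 5},
]

xl_shirts = search(stock, ("size", "=", "XL"), ("type", "=", "Shirt"))
medium_or_jumper = search(stock, ("size", "=", "M"), ("type", "=", "Jumper"), join="OR")
in_stock = search(stock, ("quantity", "<>", 0))

print(len(xl_shirts), len(medium_or_jumper), len(in_stock))
```

AND narrows a search because every criterion must be true; OR widens it because any one criterion is enough.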
What do these searches do? • Pupil ID = 11809 • Age > 15 • Free meals = Y AND Age > 15 • Form = 11T OR Free meals = Y • Surname = Lee
Answers to check • Displays one record for the pupil with pupil ID 11809 (i.e., the pupil with surname Hughes) • Displays all the details of pupils whose age is over 15 (i.e., the details of all 16-year-old pupils in this set of data) • Displays data for all pupils who have free meals and are over 15 (i.e., all 16-year-olds in this case) • Displays data for pupils who have free meals or who are in Form 11T • Displays the details for pupils with surname Lee