Data and its manifestations. Storage and Retrieval techniques.
What is Data? Numbers, text, sentences, files, images, audio files.
Excel File: One way to store data. Columns and rows of data can easily be entered. Disadvantages: difficult to look for data; security; multiple files are not related to each other.
Excel File: Data redundancy. Data inconsistency.
Hierarchy of Data (smallest to largest): Bit, Byte, Field, Record, File, Database.
What are Keys? Primary keys. Secondary keys (alternate keys). Foreign keys (these are easier to understand with reference to a database).
Serial and Sequential Files. Master Files: permanent source, data of a permanent nature, data which will change every day. Transaction Files: used to update a Master, batch processing.
Types of File Organization: Serial, Sequential, Indexed Sequential, Direct Access (random).
Types of File Organization: SERIAL. Just add records as they come in. Used for transaction files. Discuss why?
Types of File Organization: SEQUENTIAL. Add records one after another, but in key sequence. Used for master files. Discuss why?
Types of File Organization: Direct Access Files. Store the record at an address which is calculated from the primary key.
Algorithms: Add a record to a Serial File
Open file
Append record to end of file
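The two steps above can be sketched in Python. This is only an illustration: the one-line-per-record "key,data" text layout, the file name and the function names `append_record` and `read_records` are assumptions, not part of the slides.

```python
import os
import tempfile

def append_record(path, key, data):
    """Add one record to the end of a serial file (one 'key,data' line per record)."""
    with open(path, "a", encoding="utf-8") as f:  # "a" opens the file for appending
        f.write(f"{key},{data}\n")

def read_records(path):
    """Read every record back in the order it was stored."""
    with open(path, encoding="utf-8") as f:
        return [tuple(line.rstrip("\n").split(",", 1)) for line in f]

# Demo in a temporary folder (hypothetical file name).
with tempfile.TemporaryDirectory() as folder:
    path = os.path.join(folder, "transactions.txt")
    append_record(path, 30, "Smith")
    append_record(path, 10, "Jones")
    serial_records = read_records(path)
```

Note that the records come back in arrival order (key 30 before key 10), which is what makes the file serial rather than sequential.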
Algorithms: Add a record to a Sequential File
• Open old file for reading
• Open new file for writing
• Start from beginning of old file
• Repeat
    • Read next record
    • If new record is not yet inserted and current record key > new record key
        • Write new record to new file
    • End if
    • Write current record to new file
• Until EOF
• If new record is not yet inserted then write new record to new file
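A minimal sketch of this insertion, assuming the file is modelled as a Python list of (key, data) tuples already in key order; the name `insert_sequential` is hypothetical. The `inserted` flag makes sure the new record is written only once, even though every later record also has a larger key.

```python
def insert_sequential(old_records, new_record):
    """Return a new key-ordered list with new_record inserted.

    old_records stands in for the old file: (key, data) tuples sorted by key.
    """
    new_file = []          # stands in for the new file opened for writing
    inserted = False
    for record in old_records:                      # read next record until EOF
        if not inserted and record[0] > new_record[0]:
            new_file.append(new_record)             # write new record just before the first larger key
            inserted = True
        new_file.append(record)                     # always copy the current record
    if not inserted:                                # new key is the largest: it goes at the end
        new_file.append(new_record)
    return new_file

master = [(10, "Ann"), (20, "Bob"), (40, "Dee")]
updated = insert_sequential(master, (30, "Cy"))
```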
Algorithms: Delete a record from a Serial or Sequential file
• Open old file for reading
• Open new file for writing
• Repeat (read from old file)
    • Read next record
    • If current record key <> key of record to be deleted
        • then write record to the new file
    • End if
• Until End Of File
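The same copy-to-a-new-file idea can be sketched as follows, again assuming a file is a list of (key, data) tuples; `delete_record` is a hypothetical name. The record order does not matter, so the sketch works for serial and sequential files alike.

```python
def delete_record(old_records, key_to_delete):
    """Copy every record except the one with key_to_delete into a new list."""
    new_file = []
    for record in old_records:              # read next record until EOF
        if record[0] != key_to_delete:      # "<>" in the pseudocode means "not equal"
            new_file.append(record)
    return new_file

serial_file = [(30, "Smith"), (10, "Jones"), (20, "Patel")]
after_delete = delete_record(serial_file, 10)
```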
Algorithms: Search for a record with a particular key (Serial File)
Open file
Repeat
    Read next record
    Test for match
Until EOF or match is made
Algorithms: Search for a record with a particular key (Sequential File)
Open file
Repeat
    Read next record
    Test for match
Until EOF, or match is found, or key of this record > key of wanted record
Note: Because the records are stored in key order, once the key read passes the key of the wanted record, the record can be deemed not found.
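The difference between the two searches can be sketched like this, assuming files are lists of (key, data) tuples. The function names and the extra "number of reads" return value are illustrative additions, included only to show how early the sequential search can stop.

```python
def search_serial(records, wanted_key):
    """Return (record or None, number of records read); must read to EOF on a miss."""
    reads = 0
    for record in records:
        reads += 1
        if record[0] == wanted_key:
            return record, reads
    return None, reads

def search_sequential(records, wanted_key):
    """As above, but records are in key order, so stop once a larger key is read."""
    reads = 0
    for record in records:
        reads += 1
        if record[0] == wanted_key:
            return record, reads
        if record[0] > wanted_key:          # passed the place where the key would be
            return None, reads
    return None, reads

ordered = [(10, "Ann"), (20, "Bob"), (40, "Dee"), (50, "Eve")]
```

Searching `ordered` for the missing key 30 takes four reads with `search_serial` but only three with `search_sequential`, which gives up as soon as it reads key 40.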
Logic: Update a Sequential Master file with Transaction records. Open a new file and copy records from the sequential master file into it until the position of the first transaction record (in key order) is reached. Now write the transaction record into the new file. Continue the process until all the remaining records from the master file and the transaction file have been written.
Algorithms: Update a Sequential Master File
Open master file for reading
Open transaction file for reading
Open new master file for writing
Read first master record
Repeat (transaction file records)
    Read next transaction record
    While not EOF (master) and master record key < transaction record key
        Write master record to new master file
        Read next master record
    End While
    Write transaction record to new master file
Until EOF (transaction)
Repeat (remaining master file records)
    Write master record to new master file
    Read next master record
Until EOF (master)
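A sketch of this merge, under stated assumptions: both files are lists of (key, data) tuples sorted by key, and every transaction carries a key that is not already in the master file (the slide's algorithm only adds records). The name `update_master` is hypothetical.

```python
def update_master(master, transactions):
    """Merge key-ordered transaction records into a key-ordered master list."""
    new_master = []
    m = 0                                   # position of the current master record
    for t_record in transactions:           # repeat for each transaction record
        # Copy master records that belong before this transaction.
        while m < len(master) and master[m][0] < t_record[0]:
            new_master.append(master[m])
            m += 1                          # read next master record
        new_master.append(t_record)
    new_master.extend(master[m:])           # copy any master records that remain
    return new_master

old_master = [(10, "Ann"), (30, "Cy"), (50, "Eve")]
batch = [(20, "Bob"), (40, "Dee"), (60, "Fay")]
merged = update_master(old_master, batch)
```

Each file is read once from start to finish, which is why this approach suits batch processing.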
Direct Access File: how records are stored. Also called hash, random or relative files. One hash algorithm could be: every record has a key. Take the key and divide it by the total number of record positions available in the file. The remainder is the address at which the record is stored.
Direct Access File: managing a collision. Hashing can give two different keys the same address; such keys are called synonyms and the clash is called a collision. One way to resolve a collision is to store the record at the next available address. When the highest address is reached, wrap around and continue from address 0.
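A minimal sketch of the remainder hash with the "next available address" rule, assuming whole-number keys and a file modelled as a fixed-size Python list in which `None` marks an empty position. The size of 7 and the names `hash_address`, `store` and `fetch` are made up for the illustration.

```python
FILE_SIZE = 7  # hypothetical number of record positions in the file

def hash_address(key, size):
    """Remainder after dividing the key by the number of positions."""
    return key % size

def store(table, key, data):
    """Store a record, moving to the next free address on a collision."""
    size = len(table)
    address = hash_address(key, size)
    for _ in range(size):                   # try each position at most once
        if table[address] is None:
            table[address] = (key, data)
            return address
        address = (address + 1) % size      # wrap around to address 0 after the last one
    raise RuntimeError("file is full")

def fetch(table, key):
    """Follow the same path as store() until the key or an empty position is found."""
    size = len(table)
    address = hash_address(key, size)
    for _ in range(size):
        slot = table[address]
        if slot is None:
            return None
        if slot[0] == key:
            return slot
        address = (address + 1) % size
    return None
```

With 7 positions, keys 13, 20 and 27 all hash to address 6, so they end up at addresses 6, 0 and 1 respectively.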
Direct Access File: managing a collision. Another method is to have a separate (overflow) area to store these “collision affected” records. Mark the record's new address at the original address location.
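One possible way to model the separate area, sketched under assumptions: the main file is a fixed-size list, the separate area is a second list, and each stored entry carries a `next` field holding the address of the next colliding record in the separate area. The entry layout and function names are illustrative choices, not something the slides specify.

```python
def store_with_overflow(home, overflow, key, data):
    """Store in the home area, or in the overflow area if the home address is taken."""
    address = key % len(home)
    if home[address] is None:
        home[address] = {"record": (key, data), "next": None}
        return ("home", address)
    # Collision: place the record in the overflow area.
    overflow.append({"record": (key, data), "next": None})
    new_address = len(overflow) - 1
    # Mark the new address at the end of the chain that starts at the home address.
    slot = home[address]
    while slot["next"] is not None:
        slot = overflow[slot["next"]]
    slot["next"] = new_address
    return ("overflow", new_address)

def fetch_with_overflow(home, overflow, key):
    """Check the home address first, then follow the marked overflow addresses."""
    slot = home[key % len(home)]
    while slot is not None:
        if slot["record"][0] == key:
            return slot["record"]
        slot = overflow[slot["next"]] if slot["next"] is not None else None
    return None
```

Unlike the "next available address" method, a collision here never takes up another key's home address.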
What kind of files to use, and when? Should retrievals be fast? Does the information need to be up to date? Can information be batched? Do reports need to be in order? What happens when information is lost or destroyed?
Hit Rate: The proportion of records accessed in any one run. It is calculated by dividing the number of records accessed by the total number of records on file, expressed as a percentage. If the hit rate is low, direct access is better; if it is high, sequential is OK. Payroll processing has a high hit rate; updating an address has a low hit rate.
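The calculation can be written as a one-line function. The record counts in the example are invented purely to illustrate a high and a low hit rate.

```python
def hit_rate(records_accessed, total_records):
    """Hit rate as a percentage of the records on file."""
    return 100 * records_accessed / total_records

# Invented figures: a payroll run touching 950 of 1000 records,
# and an address change touching 5 of 1000 records.
payroll_run = hit_rate(950, 1000)
address_update = hit_rate(5, 1000)
```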
Data Security Data Security is keeping data safe from the various hazards to which it may be subjected. Protection against loss, corruption, or unauthorized access to data.
How to keep data secure: Use of passwords. Immediate removal of access for employees who have been dismissed (handed the pink slip). Educating staff on ways data can be breached. Separation of duties and having different access levels. Appointing a security manager.
User Ids and Passwords Keep passwords and user ids in a safe place – database tables. Keep passwords encrypted. Passwords should not be displayed on screens or on printouts. They should be suppressed.
Encrypting data: Data is encrypted so that data transmitted to remote locations is secure from hackers and wire tappers. There is no limit to the damage that can occur if a line is tapped and the security of the data is compromised in any way. Many encryption algorithms are available, including ones that use encryption keys.
Access Rights: What do we mean by access rights? The right to see some or all information. Access rights are implemented through a levelled security structure in which people at a certain level can see certain data, or even only certain fields.
Backups: Needed to prevent loss of data due to a disaster. Protect against power failures, theft and viruses. Backup recovery should be properly tested before implementation. Sometimes replication is implemented in an organization to keep backups up to date. Backups taken on disks are transferred to remote locations to protect against major disasters.
Archiving The difference between archiving and backing up should be clear. What is Archiving ?
Data Representation A binary digit (1 or 0) is known as a bit. 8 bits make up a byte. One character can be represented as one byte.
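Denary to Binary number conversion: How do I represent 102 (denary) in binary? Write out the place values, then, starting from the left, put a 1 wherever the place value fits into what is left of the number, and a 0 everywhere else.
64 32 16  8  4  2  1
 1  1  0  0  1  1  0
So 102 = 64 + 32 + 4 + 2, which is 1100110 in binary.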
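The place-value method can be sketched as follows. The function name `denary_to_binary` and the default seven place values are assumptions made for the example; numbers above 127 would need more place values.

```python
def denary_to_binary(number, place_values=(64, 32, 16, 8, 4, 2, 1)):
    """Convert a whole number to binary by fitting place values, largest first."""
    bits = []
    remaining = number
    for value in place_values:
        if value <= remaining:      # the place value fits: write 1 and subtract it
            bits.append("1")
            remaining -= value
        else:                       # it does not fit: write 0
            bits.append("0")
    if remaining != 0:
        raise ValueError("number needs more place values")
    return "".join(bits)

binary_102 = denary_to_binary(102)
```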
Binary to Denary: Consider 1 1 0 0 1 1 0. Starting from the right, give each digit the place value 1, 2, 4, 8 and so on. Multiply each place value by its digit (1 or 0, as the case may be) and add the results together: 64 + 32 + 4 + 2 = 102.
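The reverse conversion can be sketched the same way; `binary_to_denary` is a hypothetical name, and spaces between the digits are ignored so the number can be typed as it appears on the slide.

```python
def binary_to_denary(bits):
    """Add up place value x digit, working from the rightmost digit."""
    total = 0
    place_value = 1                              # rightmost digit is worth 1
    for digit in reversed(bits.replace(" ", "")):
        total += place_value * int(digit)
        place_value *= 2                         # next digit to the left is worth double
    return total

denary_value = binary_to_denary("1 1 0 0 1 1 0")
```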
Data and Information: Raw data is a collection of numbers and characters stored in a particular way so that it can be read later. Information is what can be derived from the stored data: a communication that provides understandable and useful knowledge to the recipient.
What is BCD? Binary Coded Decimal: a 4-bit representation of each decimal digit. E.g. 20 in BCD would be 0010 0000. Advantage: easier to convert. Just split the bits into groups of 4 and convert each group to a decimal digit. In BCD arithmetic, rounding of fractions does not occur. In normal binary arithmetic some kind of rounding off occurs.
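A small sketch of the conversion in both directions, assuming non-negative whole numbers and BCD written as groups of 4 bits separated by spaces. The names `to_bcd` and `from_bcd` are illustrative.

```python
def to_bcd(number):
    """Encode each decimal digit of a non-negative whole number as 4 bits."""
    return " ".join(format(int(digit), "04b") for digit in str(number))

def from_bcd(bcd):
    """Split into groups of 4 bits and turn each group back into a decimal digit."""
    return int("".join(str(int(group, 2)) for group in bcd.split()))

twenty = to_bcd(20)
```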
Disadvantages of BCD: More bits are required to store a number. Calculations are more complex than in ordinary binary. Consider adding 1 and 19:
  0000 0001
+ 0001 1001
  ---------
  0001 1010
This is not correct: 1010 is not a valid BCD digit.
Disadvantage of BCD: This problem occurs because 9 is represented as 1001, after which the next 6 binary values (1010 to 1111) are unused. So we need to add 6 (0110) to the result:
  0001 1010
+ 0000 0110
  ---------
  0010 0000
which is 20, the correct result.
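The add-6 correction can be sketched digit by digit. This assumes both numbers are written as the same number of 4-bit groups separated by spaces; `bcd_add` is a hypothetical name.

```python
def bcd_add(a, b):
    """Add two BCD strings such as '0001 1001', correcting each digit by 6 when needed."""
    a_groups = [int(group, 2) for group in a.split()]
    b_groups = [int(group, 2) for group in b.split()]
    result = []
    carry = 0
    # Work from the rightmost (least significant) digit, as in pencil-and-paper addition.
    for x, y in zip(reversed(a_groups), reversed(b_groups)):
        total = x + y + carry
        if total > 9:               # not a valid BCD digit
            total += 6              # skip over the six unused codes 1010 to 1111
        carry = total >> 4          # anything beyond 4 bits carries into the next digit
        result.append(total & 0b1111)
    if carry:
        result.append(carry)
    return " ".join(format(group, "04b") for group in reversed(result))

one_plus_nineteen = bcd_add("0000 0001", "0001 1001")
```

For 1 + 19, the right-hand digits give 1 + 9 = 1010; adding 6 produces 1 0000, so a 0 is written and 1 is carried into the next digit, giving 0010 0000.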