Data File Hierarchy/Terminology
• Folder or Directory
  • A container for 0 or more files accessible from the same location.
• File
  • A collection of related records stored as a unit on external media.
• Record or Row
  • A collection of related fields, such as many pieces of information about a person or thing.
• Field
  • A single (meaningful) piece of information about a person or thing.
• Byte
  • The unit of storage that normally holds one character (character ~ byte); one or more bytes make up a field.
• Bit
  • The smallest unit of computer data, commonly written as 0 or 1.
  • Generally 8 bits = 1 byte, yielding 256 combinations, enough to hold any ASCII character code.
Data File Hierarchy Example
• Folder or Directory
  • A folder for class files called Courses.
• File
  • The file CPT250Grades.txt inside the Courses folder holds all class information for one course.
• Record or Row
  • Each record in the grades file contains Student Name, Student ID, three test grades, and ten project grades.
• Field
  • Student ID consists of the last 5 digits of the social security number.
• Byte
  • Each character of the ID requires one byte of storage.
• Bit
  • The character "2" of the student ID has the ASCII code 50, which is written as 00110010, where each symbol is a bit.
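The byte and bit levels of the example can be checked with a few lines of VB. This is only a sketch: ToBits is a hypothetical helper written for this illustration, not a built-in VB function.

```vb
Option Explicit

' Hypothetical helper: returns the 8-bit pattern of a character code (0-255).
Function ToBits(ByVal Code As Integer) As String
    Dim Bit As Integer
    Dim Result As String
    For Bit = 7 To 0 Step -1
        ' Test each bit from the most significant (128) down to the least (1).
        If (Code And 2 ^ Bit) <> 0 Then
            Result = Result & "1"
        Else
            Result = Result & "0"
        End If
    Next Bit
    ToBits = Result
End Function

Sub Main()
    Debug.Print Asc("2")            ' 50
    Debug.Print ToBits(Asc("2"))    ' 00110010
End Sub
```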
Physical Details of Files
• Before you can write programs that use files, you need to get answers to the following:
  • Can records be accessed sequentially?
  • Can records be accessed arbitrarily?
  • Do all of the records have the same length?
  • How are the fields ordered on each record?
  • What is the data type of each field?
  • Is the length of each field the same?
  • How are the fields separated?
Different Types of Files
• Random-access data files
  • Files designed to facilitate future retrieval of individual records in an arbitrary order.
  • Retrieving a record requires determining its location
    • Hashing
    • Index (ISAM)
  • Thus, the file is logically used like an array, but the data resides on disk
• Binary data files
  • A machine-readable format that can contain almost anything
  • Unreadable if loaded in a word processor or text editor
• Program files
  • Special type of binary file (EXE, DLL, OCX) that the operating system can load and run.
Different Types of Files (continued)
• Sequential data files
  • Data elements stored and retrieved one after another, in order
  • Examples
    • Text files (Report-Record or Display-Formatted format)
      • Simple documents such as README files created with an ASCII text editor like Notepad
      • Designed to be read by people, so there are no field delimiters
      • Carriage return and line feed mark the end of each line
    • Comma-separated value (CSV) files
      • Common means for transferring data across applications
      • Contain variable-length records whose fields are separated by commas
      • Certain types of data are surrounded by special characters
      • Carriage return and line feed mark the end of each line/record
    • Fixed-width data files
      • Commonly created from/for COBOL programs
General Usage of Files
1) Assign a file number
  • Used to refer to the file in subsequent file access statements
2) Open the file
  • Identify file name & path, purpose, reference number, and record length if random file
3) Read/Write the file (possibly within a loop)
  • Specify reference number, action, and data elements
  • Read retrieves data from the file into specified memory locations (variables)
  • Write copies data in specified memory locations (variables) to current location in file
4) Close the file
  • Specify reference number to release the number and the file
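The four steps can be sketched in one short procedure. The file name Input.dat and the procedure name ShowFile are placeholders chosen for illustration; the sketch assumes the file exists in the current directory.

```vb
Sub ShowFile()
    Dim InNbr As Integer
    Dim OneLine As String

    InNbr = FreeFile                        ' 1) assign a file number
    Open "Input.dat" For Input As #InNbr    ' 2) open the file
    Do Until EOF(InNbr)                     ' 3) read within a loop
        Line Input #InNbr, OneLine
        Debug.Print OneLine
    Loop
    Close #InNbr                            ' 4) close the file
End Sub
```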
Assigning a File Number
• Two choices
  • Use a literal constant (between 1 and 511)
  • Use the FreeFile function to have VB automatically assign an available reference number
• Use caution when working with multiple files simultaneously: FreeFile keeps returning the same number until a file is opened with it
• Incorrect
    Dim InNbr As Integer
    Dim OtNbr As Integer
    InNbr = FreeFile    ' Returns 1.
    OtNbr = FreeFile    ' Returns 1.
    Open "Input.dat" For Input As #InNbr
    Open "Output.dat" For Output As #OtNbr
• Correct
    Dim InNbr As Integer
    Dim OtNbr As Integer
    InNbr = FreeFile    ' Returns 1.
    Open "Input.dat" For Input As #InNbr
    OtNbr = FreeFile    ' Returns 2.
    Open "Output.dat" For Output As #OtNbr
Opening a File
• Provide information about the file before any access (braces mark optional parts)
    Open pathname For mode {Access access} {lock} _
        As #filenumber {Len = recordlength}
• Elements of Open statement
  • pathname: a string that defines the file name with full path
  • mode: (purpose: input, output, append, random, binary)
    • Establishes record pointer at start of file for all modes but append, which starts at end of file
  • access: specifies allowable operations (Read, Write, or Read Write; mainly used with random files)
  • lock: specifies allowable operations by other processes (n/a)
  • filenumber: reference number used in subsequent read/write operations
  • recordlength: number of bytes in a record of a random access file, or buffer size for sequential files
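A few Open statements, one per common mode, may make the syntax easier to read. All file names and the record length of 33 are hypothetical values picked for this sketch.

```vb
Sub OpenModesSketch()
    ' All file names below are hypothetical.
    Open "grades.txt" For Input As #1     ' must already exist, else run-time error 53
    Open "report.txt" For Output As #2    ' created, or emptied if it already exists
    Open "log.txt" For Append As #3       ' created if missing; pointer starts at end
    Open "emp.dat" For Random Access Read Write As #4 Len = 33
    Close                                 ' no list given: closes all open files
End Sub
```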
Reading from a File
• CSV file
    Input #filenumber, VariableList
  • Data is read from the current location of the file pointer into the corresponding variables
  • A comma or end-of-line delimits a field, unless it appears inside quotation marks
  • Normally, records were created using the Write statement
Reading from a File (continued)
• Report (display-formatted) file
    Line Input #filenumber, VariableName
  • Read the next line from current location of file pointer into specified variable
  • Normally, records were created using the Print statement
• Fixed record-length file
    Get #filenumber, {recordnumber}, VariableName
  • Read the next record from current location of file pointer (or the record at recordnumber, if given) into specified variable
  • Needed by fixed record-length files since there is no end-of-line character
  • Normally, records were created using the Put statement
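For the fixed record-length case, a sketch of Get with a user-defined type follows. The field sizes in EmpType, the file name emp.dat, and the choice of record 3 are assumptions made for illustration; the field names mirror the employee examples used elsewhere in these slides.

```vb
Option Explicit

' Hypothetical fixed-length record layout.
Private Type EmpType
    ID As String * 5
    Name As String * 20
    Rate As Single
    Hours As Single
End Type

Sub ReadThirdRecord()
    Dim Emp As EmpType
    Dim FileNbr As Integer

    FileNbr = FreeFile
    ' Len(Emp) supplies the record length, so every record is the same size.
    Open "emp.dat" For Random As #FileNbr Len = Len(Emp)
    Get #FileNbr, 3, Emp            ' jump straight to record 3
    Debug.Print Emp.ID, Trim$(Emp.Name), Emp.Rate, Emp.Hours
    Close #FileNbr
End Sub
```

Leaving out the record number (Get #FileNbr, , Emp) reads the record at the current pointer position instead.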
Detecting End-of-File
• Use EOF function
  • Special function that detects when the end of file has been reached
    EOF(filenumber)
• Use data sentinel
  • Record that contains specific, fixed values, which, when read, signal the last record has been processed
• Use unique first record that defines the number of records that follow
  • Atypical approach that allows counting loop to be used to read the fixed number of records
• Use an error handler to detect the "Input past end of file" error (run-time error 62)
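The error-handler option is the least familiar of the four, so here is a minimal sketch of it. The file name and procedure name are placeholders; the sketch relies on run-time error 62 ("Input past end of file") being raised when Input # runs out of data.

```vb
Sub ReadWithHandler()
    Dim FileNbr As Integer
    Dim Value As String
    Dim ErrNbr As Long

    FileNbr = FreeFile
    Open "Input.dat" For Input As #FileNbr
    On Error GoTo ReadFailed
    Do
        Input #FileNbr, Value       ' eventually raises error 62
        Debug.Print Value
    Loop

ReadFailed:
    ErrNbr = Err.Number             ' save the code before doing anything else
    Close #FileNbr
    If ErrNbr <> 62 Then
        MsgBox "Unexpected error " & ErrNbr
    End If
End Sub
```

In most programs the EOF function is simpler; the handler is mainly useful as a safety net.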
Writing to a File
• CSV file
    Write #filenumber, VariableList
  • Data in variable list is written to current file location
  • A comma is inserted between fields, and a carriage return and line feed are inserted at the end of the record
  • Quotation marks are placed around strings, and # around Booleans and dates
  • Normally, records will be read using the Input statement
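A short round trip shows why Write and Input are normally paired. The file name sample.csv and the data values are made up for this sketch.

```vb
Sub WriteThenRead()
    Dim FileNbr As Integer
    Dim ID As String, EmpName As String
    Dim Rate As Single, Active As Boolean

    FileNbr = FreeFile
    Open "sample.csv" For Output As #FileNbr
    Write #FileNbr, "12345", "Lee, Pat", 12.5, True
    Close #FileNbr
    ' The file now holds one line: "12345","Lee, Pat",12.5,#TRUE#

    FileNbr = FreeFile
    Open "sample.csv" For Input As #FileNbr
    Input #FileNbr, ID, EmpName, Rate, Active
    Close #FileNbr
    Debug.Print EmpName     ' Lee, Pat (the quoted comma is not a delimiter)
End Sub
```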
Writing to a File (continued)
• Report (display-formatted) file
    Print #filenumber, OutputList
  • Print the output list to next line in file unless previous Print ended in comma or semicolon
  • May contain expressions, Spc(n), Tab(n), separated by comma or semicolon
  • Normally, records will be read using the Line Input statement
• Fixed record-length file
    Put #filenumber, {recordnumber}, VariableName
  • Write the data in VariableName to current file location (or to the record at recordnumber, if given)
  • Normally, records will be retrieved using the Get statement
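The report case can be sketched as follows; report.txt, the column position 20, and the sample values are arbitrary choices for illustration.

```vb
Sub PrintReport()
    Dim FileNbr As Integer
    Dim OneLine As String

    FileNbr = FreeFile
    Open "report.txt" For Output As #FileNbr
    Print #FileNbr, "Name"; Tab(20); "Rate"    ' Tab(20) moves to column 20
    Print #FileNbr, "Pat Lee"; Tab(20); 12.5
    Print #FileNbr, "Total", 12.5              ' comma: skip to next print zone
    Close #FileNbr

    ' Read the report back one whole line at a time.
    FileNbr = FreeFile
    Open "report.txt" For Input As #FileNbr
    Do Until EOF(FileNbr)
        Line Input #FileNbr, OneLine
        Debug.Print OneLine
    Loop
    Close #FileNbr
End Sub
```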
Closing a File
• Notifies the operating system that you are done with the file
    Close {{#}filenumberlist}
• Examples
    Close #EmpFile, #TaxFile
    Close
• For output files, the buffers are flushed to ensure that everything gets written to the file.
• To read the same sequential file more than once in a single session, you must close and reopen it each time.
Reading Files with pre-data
[Flowchart: open the file for input (#1), read EndVal, set Count = 1, then read one array element per pass and add 1 to Count until Count > EndVal; close #1]
• When pre-data is included at start of data file or the program knows the number of records, use a For/Next counting loop
• Loop condition compares the loop counter against the number of records to read (EndVal)
    ' use static arrays and literal RefNbr
    ' (the array is named DataArray because Array is a reserved name in VB)
    Open "file" For Input As #1
    Input #1, EndVal
    For Count = 1 To EndVal Step 1
        Input #1, DataArray(Count)
        ' steps to process repeatedly
    Next Count
    Close #1
Reading Files with post-data (sentinel data)
[Flowchart: open the file for input (#1), read TempVar, set Count = 0, then while TempVar is not the sentinel: add 1 to Count, store TempVar in the array, read the next TempVar; close #1]
• Pre-test loop to check for sentinel data
    ' use static arrays and literal RefNbr
    ' (the array is named DataArray because Array is a reserved name in VB)
    Open "file" For Input As #1
    Input #1, TempVar
    Count = 0
    Do Until TempVar = SentinelData
        Count = Count + 1
        DataArray(Count) = TempVar
        ' steps to process repeatedly
        Input #1, TempVar
    Loop
    Close #1
Reading Files with no pre/post data
[Flowchart: open the file for input (#1), set Count = 0, then while EOF(1) is False: add 1 to Count and read one array element; close #1]
• Pre-conditional loop to check EOF
    ' use static arrays and literal RefNbr
    ' (the array is named DataArray because Array is a reserved name in VB)
    Open "file" For Input As #1
    Count = 0
    Do Until EOF(1)
        Count = Count + 1
        Input #1, DataArray(Count)
        ' steps to process repeatedly
    Loop
    Close #1
** This is the most likely approach when files are used
How does reading a CSV file work?
• Open statement
  • Record pointer set to start of file (before first record)
• Each time an Input statement is processed
  • For each variable listed
    • Type of data in field must be compatible with the data type of the corresponding variable where it will be stored
    • Data at file's record pointer is read and stored in corresponding, named field (variable)
    • Record pointer moves to the start of the next field (whether on same line or next line)
• Close statement to release file to OS
VB Statements to Read CSV File
    ' Use dynamic array and FreeFile function
    ' fEmp must already be dimensioned (for example, ReDim fEmp(0 To 0))
    ' before the loop, or UBound fails
    Dim EmpCount As Integer
    Dim CsvInFile As Integer
    CsvInFile = FreeFile    ' Returns 1 if no other file is open.
    Open "a:\sample.csv" For Input As #CsvInFile
    Do Until EOF(CsvInFile)
        EmpCount = UBound(fEmp) + 1
        ReDim Preserve fEmp(0 To EmpCount) As EmpType
        Input #CsvInFile, fEmp(EmpCount).ID, _
            fEmp(EmpCount).Name, fEmp(EmpCount).Rate, _
            fEmp(EmpCount).Hours
        ' optional code to process employee's data
    Loop
    Close #CsvInFile
How does writing a CSV file work?
• Open statement
  • New file created in stated path with given name
  • Output mode
    • Record pointer is set to the start of the file
  • Append mode
    • Record pointer is set to the end of the file
• Each time a Write statement is processed
  • For each variable listed
    • Data in memory (variable) is written at record pointer
    • Strings enclosed in double quotes; Booleans and dates enclosed in #
    • Comma separator used unless last variable in list
  • Carriage return and line feed mark the end of the line
• Close statement to release file to OS
How does writing a report file work?
• Open statement
  • Same as writing a CSV file
• Each time a Print statement is processed
  • Start on a new line unless the previous Print ended in a comma "," or semicolon ";"
  • For each expression listed
    • Data is written at current file pointer
    • Expressions separated by semicolons ";" keep file pointer where it left off
    • Expressions separated by commas "," move pointer to next print zone
• Close statement to release file to OS
VB Statements to Write CSV File
    ' Use dynamic array and FreeFile function
    Dim Index As Integer
    Dim EmpCount As Integer
    Dim CsvOutFile As Integer
    CsvOutFile = FreeFile    ' Returns 1 if no other file is open.
    Open "a:\sample.csv" For Output As #CsvOutFile
    EmpCount = UBound(fEmp)
    For Index = 1 To EmpCount Step 1
        ' Index, not EmpCount, selects the employee written on each pass
        Write #CsvOutFile, fEmp(Index).ID, _
            fEmp(Index).Name, fEmp(Index).Rate, _
            fEmp(Index).Hours
    Next Index
    Close #CsvOutFile
Before using Sequential Files
• You must determine or plan the file's structure
  • Number of records
    • To determine type of loop processing & size of array(s) needed
  • Number, order, and type of fields per record
    • To determine the fields (generally on one line)
      • If different types: for the user-defined array
      • If same type: number of columns for 2-d array
  • Pre-data
    • Generally the number of following records (counting loop)
  • Post-data (sentinel data)
    • If present, to flag when the last good data set has been reached
  • If no pre-data or post-data, may use EOF function or write error handler to determine when end of file reached