Matlab Training Session 10: Loading Binary Data

Course Website: http://www.queensu.ca/neurosci/Matlab Training Sessions.htm

Course Outline Term 1 • Introduction to Matlab and its Interface • Fundamentals (Operators) • Fundamentals (Flow) • Importing Data • Functions and M-Files • Plotting (2D and 3D) • Plotting (2D and 3D) • Statistical Tools in Matlab Term 2 9. Term 1 Review 10. Loading Binary Data Weeks 11-14 Topics: Statistics, Creating GUIs, Exponential curve fitting, …

Week 10 Lecture Outline: Loading Binary Data A. Week 5 Review: Importing Text Data B. Binary Encoding C. Binary Data Formats D. Exercise

A. Week 5 Review: Importing Text Data • Basic issue: • How do we get data from other sources into Matlab so that we can play with it? • Other Issues: • Where do we get the data? • What types of data can we import?

Lots of options to load files • load for basics • fscanf for complex • textread for any text • xlsread for Excel worksheets

load • Command opens and imports data from a standard ASCII file into a Matlab variable • Usage: var_name = load('filename') • Restrictions • Every row must contain the same number of values • Data must be ASCII • Data must be numeric only, with no other characters
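The usage above can be sketched as follows. This is a minimal example, assuming a hypothetical file name (demo_ascii.txt) that the script creates itself in the temporary folder so that it is self-contained.

```matlab
% Create a small, purely numeric ASCII file (hypothetical name) to load.
fname = fullfile(tempdir, 'demo_ascii.txt');
fid = fopen(fname, 'w');
fprintf(fid, '1 2 3\n4 5 6\n');   % every row has the same number of columns
fclose(fid);

data = load(fname);               % data is a 2-by-3 matrix of doubles
disp(size(data))                  % 2 rows, 3 columns
delete(fname);                    % remove the temporary file
```

If one row had a different number of values, or the file contained text labels, load would raise an error instead of returning a matrix.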

load • Works for simple, unstructured data • Powerful and easy to use, but limited • Will likely force you to simplify the data file by hand first, which is prone to error • More complex functions are more flexible

File Handling • f* functions are associated with file opening, reading, manipulating, writing, … • Basic Functions of Interest for opening and reading generic files in matlab • fopen • fclose • fseek/ftell/frewind • fscanf • fgetl

fopen • Opens a file in Matlab and returns an identifier that refers to the file of interest • fid = fopen('filepath') • fid is an integer that represents the file (-1 if the file could not be opened) • You can open multiple files and Matlab will assign a unique fid to each

fclose • When you are done with a file, it is a good idea to close it especially if you are opening many files • fclose(fid)
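A minimal sketch of the open, check, close pattern described above. The file name demo_open.txt is a hypothetical placeholder; the script creates the file first so that the example runs on its own.

```matlab
fname = fullfile(tempdir, 'demo_open.txt');   % hypothetical file name
fid = fopen(fname, 'w');                      % create the file so it exists
fprintf(fid, 'hello\n');
fclose(fid);

fid = fopen(fname, 'r');                      % reopen for reading
if fid == -1
    error('Could not open %s', fname);        % fopen returns -1 on failure
end
first_line = fgetl(fid);                      % 'hello'
status = fclose(fid);                         % fclose returns 0 on success
delete(fname);                                % remove the temporary file
```

Checking for fid == -1 right after fopen avoids confusing errors later, when a read function is handed an invalid identifier.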

What is a File? • A specific organization of data • In Matlab it is identified with a fid • Location is specified with a pointer that can be moved around • [Figure: a fid refers to file_name, with a pointer marking the current position in the file]

Moving the Pointer • We already know how to assign a fid (fopen) • To find where the file is pointing: • x = ftell(fid) • To point somewhere else • fseek(fid,offset,origin) • Move the pointer in file fid by offset bytes relative to origin • Origin can be the beginning ('bof'), the current position ('cof'), or the end of the file ('eof') • To point to the beginning • frewind(fid)
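The pointer functions can be tried out on a small file. This is a sketch only: the file name demo_pointer.bin is hypothetical, and it uses fwrite and fread (fread is covered later in this session) simply to have known bytes to point at.

```matlab
fname = fullfile(tempdir, 'demo_pointer.bin');   % hypothetical file name
fid = fopen(fname, 'w');
fwrite(fid, uint8(1:10), 'uint8');    % ten bytes with values 1..10
fclose(fid);

fid = fopen(fname, 'r');
pos_start = ftell(fid);               % 0: the pointer starts at the beginning
fseek(fid, 4, 'bof');                 % move 4 bytes from the beginning of file
b5 = fread(fid, 1, 'uint8');          % reads the fifth byte (value 5)
pos_after = ftell(fid);               % 5: reading advanced the pointer
fseek(fid, -1, 'eof');                % one byte before the end of file
b_last = fread(fid, 1, 'uint8');      % value 10
frewind(fid);                         % back to the beginning
pos_rewound = ftell(fid);             % 0
fclose(fid);
delete(fname);                        % remove the temporary file
```

Note that positions are counted in bytes and start at 0, so an offset of 4 lands on the fifth byte.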

Getting Data • Why move the pointer around? • To get to the place in the file where the data you want begins • fscanf(fid,format,size) • Format • You have to tell Matlab the type of data it should expect in the text file so that it can convert it • '%d', '%f', '%c' • Size • You can specify how to organize the imported data • [m,n]: import the data as m by n; n can be Inf • Be careful: Matlab fills the output column by column, so a wrong size silently scrambles the layout of your data without any warning
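The column-by-column filling mentioned above is easiest to see on a small file. A minimal sketch, assuming a hypothetical file demo_fscanf.txt created by the script itself:

```matlab
fname = fullfile(tempdir, 'demo_fscanf.txt');   % hypothetical file name
fid = fopen(fname, 'w');
fprintf(fid, '1 2 3\n4 5 6\n');   % two rows of three integers
fclose(fid);

fid = fopen(fname, 'r');
A = fscanf(fid, '%d', [3, Inf]);  % values fill A column by column: A is 3-by-2
fclose(fid);
A = A';                           % transpose to recover the 2-by-3 file layout
delete(fname);                    % remove the temporary file
```

Asking for [2, Inf] instead would also succeed without any warning, but the values would end up in the wrong positions. Reading with the number of columns first and transposing afterwards is a common way to avoid that.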

Getting Data • fgetl returns the next line of the file as a character array • You may need to convert these to numbers
>> fid1 = fopen('test1.txt');
>> a_str = fgetl(fid1)
a_str = 1 2
>> a_num = str2num(a_str)
a_num = [1 2]

B. Binary Encoding • All data files are binary encoded • ASCII text format is generally the easiest because it is relatively simple, easy to visualize in a text editor, and is a common output format BUT • ASCII text is not the fastest or the most efficient way of encoding data • Not all data files are ASCII!

B. Binary Encoding • Binary data consists of sequences of 0’s and 1’s • 10101010101010101000010111110111101011 • Depending on the encoding used, individual meaningful values will occur every 4, 8, 16, 32 or 64 bits • For a tutorial on converting between binary and decimal numbers see: http://www.rwc.uc.edu/koehler/comath/11.html

B. Binary Encoding • Binary data consists of sequences of 0’s and 1’s • 1010 1010 1010 1010 1000 0101 1111 • Depending on the encoding used, individual meaningful values will occur every 4, 8, 16 or 32 bits

B. Binary Encoding • Binary data consists of sequences of 0’s and 1’s • 10101010 10101010 10000101 11110111 • Depending on the encoding used, individual meaningful values will occur every 4, 8, 16 or 32 bits

B. Binary Encoding • Binary data consists of sequences of 0’s and 1’s • 1010101010101010 1000010111110111 • Depending on the encoding used, individual meaningful values will occur every 4, 8, 16 or 32 bits

B. Binary Encoding • Each group of bits can represent a value, character, delimiter, command, instruction, etc. • Generally binary data is divided into 8-bit (1 byte) segments • 00000000 = 0 • 11111111 = 255 • IT IS VERY IMPORTANT TO KNOW WHAT FORMAT THE DATA IS IN BEFORE YOU CAN READ IT!
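The byte values above can be checked with Matlab's built-in conversion functions bin2dec and dec2bin. A minimal sketch (the variable names are only illustrative):

```matlab
zero_val = bin2dec('00000000');   % 0
max_val  = bin2dec('11111111');   % 255, the largest value one byte can hold
bits     = dec2bin(133, 8);       % '10000101', padded to 8 binary digits
```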

ASCII ENCODING • ASCII: American Standard Code for Information Interchange (1968). • In ASCII, every character is coded by only seven bits of information. The eighth bit is ignored (it can be a zero or one). • ASCII consists of 128 characters (codes 0 to 127), which include uppercase and lowercase letters, digits, punctuation, spaces and formatting characters • See www.asciitable.com for the full ASCII table

ASCII vs Simple Binary Encoding • ASCII requires 1 byte for every character • Data Table: • 105 124 27 • 101 102 111 • In ASCII, 1 byte is used for every digit, space and carriage return: 23 bytes in total • If this were encoded in a simple 8-bit binary representation, it would use only 11 bytes (1 byte for every number and every delimiter)
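The byte counts for the data table can be reproduced with a short sketch. It assumes one line terminator per row for the ASCII version, and for the binary version it stores only the six values as uint8 with no delimiters at all, which is a stricter variant than the 11-byte count on the slide.

```matlab
values = [105 124 27; 101 102 111];

% ASCII text: digits, spaces and one line terminator per row.
txt = sprintf('%d %d %d\n', values');   % transpose so rows print in order
n_ascii = numel(txt);                   % 23 characters = 23 bytes

% Raw binary: one uint8 per value, no delimiters stored.
n_binary = numel(uint8(values));        % 6 bytes
```

Delimiters are optional in a fixed-width binary file because every value occupies the same number of bytes, so the reader can find value boundaries by counting.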

Binary Precision • The number of bits used to represent a value determines how large or small that value can be • 8 bits: 0 to 255 • 16 bits: 0 to 65535 • 32 bits: 0 to 4294967295 (about 4.2950e+009) • Precision also determines how many decimal places can be represented
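These unsigned ranges follow from 2^n - 1 and can be confirmed with intmax. A minimal sketch:

```matlab
max8  = double(intmax('uint8'));    % 255        = 2^8  - 1
max16 = double(intmax('uint16'));   % 65535      = 2^16 - 1
max32 = double(intmax('uint32'));   % 4294967295 = 2^32 - 1

% Values outside the range are clipped to the nearest limit (saturation).
saturated = uint8(300);             % 255
```

Saturation means that choosing too small a type does not raise an error in Matlab; the stored value is simply clipped.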

C. Binary Formats: Integers and Characters
• 'schar': Signed character; 8 bits
• 'uchar': Unsigned character; 8 bits
• 'int8': Integer; 8 bits
• 'int16': Integer; 16 bits
• 'int32': Integer; 32 bits
• 'int64': Integer; 64 bits
• 'uint8': Unsigned integer; 8 bits
• 'uint16': Unsigned integer; 16 bits
• 'uint32': Unsigned integer; 32 bits
• 'uint64': Unsigned integer; 64 bits
• The first bit denotes the sign if the integer or character is signed.
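The difference between signed and unsigned formats can be illustrated by interpreting the same byte both ways. This sketch uses typecast, which reinterprets the stored bits without changing them:

```matlab
raw = uint8(255);                            % bit pattern 11111111
as_unsigned = double(raw);                   % 255
as_signed = double(typecast(raw, 'int8'));   % -1: the leading 1 marks a negative value
```

The same eight bits therefore give 255 or -1 depending on the format used to read them, which is why the format must be known in advance.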

Readable Binary Data Formats: Floating Point Representation • Used for numbers that require decimal representation (real numbers) • Established by the IEEE (Institute of Electrical and Electronics Engineers) • Encoded in 32 bits (single precision) or 64 bits (double precision) • Single precision (short): 32 bits: 1 bit for the sign, 8 bits for the exponent, and 23 bits for the mantissa. • Double precision (long): 64 bits: 1 bit for the sign, 11 bits for the exponent, and 52 bits for the mantissa.

Readable Binary Data Formats: Floating Point Representation • By default Matlab stores all values with double precision • The functions realmax and realmin return the largest and the smallest positive normalized values that can be represented • 'float32', 'single': Floating-point; 32 bits • 'float64', 'double': Floating-point; 64 bits
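The effect of the shorter mantissa in single precision can be seen with a small sketch (variable names are illustrative only):

```matlab
x_double = 0.1;                            % stored in 64 bits by default
x_single = single(0.1);                    % stored in 32 bits
err = abs(double(x_single) - x_double);    % small but nonzero rounding difference

eps_single = eps('single');                % 2^-23, from the 23-bit mantissa
eps_double = eps;                          % 2^-52, from the 52-bit mantissa
largest = realmax;                         % largest finite double
```

Single precision keeps roughly 7 significant decimal digits and double precision roughly 16, so data saved as 32-bit floats cannot regain the lost digits when loaded into Matlab doubles.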

Specifying Machine Formats • Different computer systems store the bytes of binary data in different orders (byte ordering) • To load binary data recorded on a particular system, Matlab needs to know that system's machine format • The fopen function reports the machine format a file was opened with (the native format unless another was specified when opening the file) • [filename, mode, machineformat] = fopen(fid)

Binary File Machine Formats
• 'ieee-be' or 'b': IEEE floating point with big-endian byte ordering
• 'ieee-le' or 'l': IEEE floating point with little-endian byte ordering
• 'ieee-be.l64' or 's': IEEE floating point with big-endian byte ordering and 64-bit long data type
• 'ieee-le.l64' or 'a': IEEE floating point with little-endian byte ordering and 64-bit long data type
• 'native' or 'n': Numeric format of the machine on which MATLAB is running (the default)
• 'vaxd' or 'd': VAX D floating point and VAX ordering
• 'vaxg' or 'g': VAX G floating point and VAX ordering
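What goes wrong with the wrong byte ordering can be shown on a single 16-bit value. A minimal sketch, assuming a hypothetical file demo_endian.bin that the script writes itself:

```matlab
fname = fullfile(tempdir, 'demo_endian.bin');   % hypothetical file name
fid = fopen(fname, 'w');
fwrite(fid, 1, 'uint16', 0, 'ieee-be');         % bytes on disk: 00 01
fclose(fid);

fid = fopen(fname, 'r');
right = fread(fid, 1, 'uint16', 0, 'ieee-be');  % 1
frewind(fid);
wrong = fread(fid, 1, 'uint16', 0, 'ieee-le');  % 256: same bytes, opposite order
fclose(fid);
delete(fname);                                  % remove the temporary file
```

No error is raised in the second case. The value is simply wrong, so the machine format has to come from knowledge of the system that recorded the data.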

Reading Binary Data • The function fread() performs all binary data reading in matlab • Syntax • A = fread(fid) • A = fread(fid, count) • A = fread(fid, count, precision) • A = fread(fid, count, precision, skip) • A = fread(fid, count, precision, skip, machineformat) • [A, count] = fread(...)

Reading Binary Data • Input Arguments: • count: x: read x elements; Inf: read to end of file; [m,n]: read enough to fill an m-by-n matrix • precision: specifies the input data format, e.g. 'int8', 'int16', 'short', 'long', … (see previous slides) • skip: number of bytes to skip after each value read (bits, if precision is a bit format such as 'ubit4') • machineformat: specifies the machine format, e.g. 'ieee-be', 'ieee-le', … (see previous slides)
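The count, precision and skip arguments can be explored on a small file of known contents. This is a sketch under the assumption of a hypothetical file demo_fread.bin holding the 16-bit integers 1 to 12:

```matlab
fname = fullfile(tempdir, 'demo_fread.bin');   % hypothetical file name
fid = fopen(fname, 'w');
fwrite(fid, 1:12, 'int16');                    % twelve 2-byte integers
fclose(fid);

fid = fopen(fname, 'r');
all_vals = fread(fid, Inf, 'int16');           % 12-by-1 column vector
frewind(fid);
M = fread(fid, [3, 4], 'int16');               % 3-by-4, filled column by column
frewind(fid);
every_other = fread(fid, Inf, 'int16', 2);     % skip 2 bytes (one int16) after each value
fclose(fid);
delete(fname);                                 % remove the temporary file
```

By default fread returns doubles regardless of the precision used to read the file; the precision argument only describes how the bytes on disk are laid out.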

Exercise • Load and plot position data saved in: week10data.rob • This file contains binary position data saved in 32-bit floating point precision • 1. Use the fopen function to determine the machine format • hint: [fname, mode, mformat] = fopen(fid) • 2. Load the data using the fread function • 3. Plot the position • 4. Try loading the data with an incorrect argument to see how this changes/corrupts the data

Exercise Solution
fid = fopen('week10data.rob', 'r')                   % open file for reading
% Determine file format
[fname, mode, mformat] = fopen(fid)
% Format is ieee-le
% Read binary data
pos_data = fread(fid, inf, 'single', 'ieee-le')
plot(pos_data)                                       % plot position data
fclose(fid)                                          % close file
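Step 4 of the exercise can also be tried without the course data file. The sketch below is an illustration only: it writes a synthetic sine wave to a hypothetical stand-in file (demo_positions.bin, not week10data.rob) as 32-bit little-endian floats, then reads it back once correctly and twice with an incorrect argument.

```matlab
fname = fullfile(tempdir, 'demo_positions.bin');      % hypothetical stand-in file
t = linspace(0, 2*pi, 100);
fid = fopen(fname, 'w');
fwrite(fid, sin(t), 'single', 0, 'ieee-le');          % synthetic "position" data
fclose(fid);

fid = fopen(fname, 'r');
good = fread(fid, Inf, 'single', 0, 'ieee-le');       % correct precision and format
frewind(fid);
bad_order = fread(fid, Inf, 'single', 0, 'ieee-be');  % wrong byte order
frewind(fid);
bad_prec = fread(fid, Inf, 'int16', 0, 'ieee-le');    % wrong precision: twice as many values
fclose(fid);
delete(fname);                                        % remove the temporary file
```

Plotting good, bad_order and bad_prec with plot shows a clean sine wave for the first and unrecognizable traces for the other two, even though all three reads complete without an error.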

Getting Help • Help and Documentation • Digital • Accessible Help from the Matlab Start Menu • Updated online help from the Mathworks website: • http://www.mathworks.com/access/helpdesk/help/techdoc/matlab.html • Matlab command prompt function lookup • Built-in Demos • Websites • Hard Copy • Books, Guides, Reference • The Student Edition of Matlab pub. Mathworks Inc.
