Scientific Computing, Winter 2014
Chapter 5: Files and Scripts
Computer Science 121
Files and Scripts
• File (non-technical): (Word) document, image, recording, video, etc.
• File (technical): a named collection of bytes on disk.
ASCII vs. Binary
• “ASCII file” means “file that can be viewed as text by a program (Notepad) that interprets each byte as an ASCII code”.
• A binary file is anything that cannot be viewed that way.
• “JPEG file” means “file that can be viewed as an image by using a program (Photoshop) that interprets the bytes as a JPEG-encoded image”.
• “MP3 file” means “file that can be heard as an audio recording by using a program that interprets the bytes as an MP3-encoded audio stream”.
• “Foo file” means “file whose contents can be experienced by using a program that interprets the bytes as a Foo encoding”.
• XML (eXtensible Markup Language) is an attempt to compromise between binary and ASCII: make all data human-readable.
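To make the byte-interpretation idea concrete, here is a minimal MATLAB sketch that is not part of the original slides. It writes five bytes to a scratch file, reads them back as plain numbers, and then asks char to interpret each number as an ASCII code. The scratch file name (from tempname) and the variable names are assumptions made for illustration.

```matlab
% Write five raw bytes to a scratch file (name chosen by tempname, not from the slides)
fname = [tempname '.txt'];
fid = fopen(fname, 'w');
fwrite(fid, uint8([72 101 108 108 111]));   % the bytes themselves
fclose(fid);

% Read the same file back as plain numbers: no interpretation yet
fid = fopen(fname, 'r');
raw = fread(fid)';          % row vector of byte values
fclose(fid);
delete(fname);              % clean up the scratch file

% Interpreting each byte as an ASCII code is what turns it into "text"
txt = char(raw);
disp(raw)
disp(txt)
```

The same five bytes are "numbers" or "text" depending only on how the reading program chooses to interpret them.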
5.1 Filenames
• General format: name.extension
• For historical reasons, the extension is usually three characters.
• The extension tells the OS what program to use to open the file (MS Word, Excel, Matlab, ...)
Aside: File Deletion
• Q.: What happens when you “delete” a file?
• (Drag OMFG.jpg to trash and empty trash…)
[Figure: directory entries and the data they point to on disk: foo.m → 011010, sort.m → 110101, OMFG.jpg → 000100, hamlet.doc → 111011]
Aside: File Deletion
• A.: What appears to happen...
[Figure: the OMFG.jpg entry and its data 000100 are both gone; foo.m → 011010, sort.m → 110101, hamlet.doc → 111011 remain]
Aside: File Deletion
• A.: What actually happens ...
[Figure: only the OMFG.jpg directory entry is gone; its data 000100 is still on disk, alongside foo.m → 011010, sort.m → 110101, hamlet.doc → 111011]
• Then use WinUnDelete (e.g.) to get back OMFG.jpg
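The deletion aside can be mimicked with a toy model. This is only an illustration under simplifying assumptions, not how any real file system is implemented: the "directory" is a containers.Map from names to block numbers and the "disk" is a cell array of strings. All variable names are hypothetical.

```matlab
% Toy model (an illustration only): the directory maps each name
% to the index of a block on the "disk".
disk = {'011010', '110101', '000100', '111011'};
directory = containers.Map({'foo.m', 'sort.m', 'OMFG.jpg', 'hamlet.doc'}, [1 2 3 4]);

blockOfJpg = directory('OMFG.jpg');   % remember where the data lived
remove(directory, 'OMFG.jpg');        % "delete": only the directory entry goes away

stillListed = isKey(directory, 'OMFG.jpg');   % false: the name is gone
leftover = disk{blockOfJpg};                  % the bytes are still on the "disk"
disp(stillListed)
disp(leftover)
```

In this model an undelete tool only has to find the orphaned block and give it a name again, which works as long as nothing has overwritten it.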
Directory Structure
• Directories (folders) are organized hierarchically (one inside another).
• So we are forced to choose a single organization method (like a library with a card catalog indexed only by author).
• But we can use links (shortcuts) to add additional organization, without copying files.
Pathnames
• A pathname is the “full name” of a directory in a linear form
   • e.g., C:\MyDocuments\cs121\myproj\new\
• A complete filename includes the path
   • e.g., C:\MyDocuments\cs121\myproj\new\myprog.m
• This becomes important because of the ...
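As a small sketch of how a complete filename is put together and taken apart, MATLAB's fullfile and fileparts can be used as follows. The file name myprog.m is borrowed from the slides; the file does not need to exist, and the current working directory stands in for the project folder.

```matlab
% Build a complete filename from the current working directory and a file name
% ('myprog.m' is the script name used in the slides; the file need not exist)
fullName = fullfile(pwd, 'myprog.m');

% Split it back into path, name, and extension
[folder, name, ext] = fileparts(fullName);

disp(fullName)
fprintf('folder: %s\nname: %s\nextension: %s\n', folder, name, ext);
```

Using fullfile rather than gluing strings together with '\' lets the same code work on systems that use a different path separator.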
Working Directory
    >> pwd % print working directory
    ans =
    C:\MATLAB\work
• Without extra effort, we can only access files in our working directory
    >> myprog % run myprog.m script
    ERROR: myprog? LOL!!
Working Directory
• Solutions
   • Make shortcuts from working directory (annoying)
   • Change the working directory (but then files in other directories can't be found):
    >> cd('C:\MyDocuments\cs121\myproj\new\')
    >> myprog
    ERROR: Can't find someOther.m… loser!
   • Use Matlab File menu to add paths: File / Set Path...
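The Set Path menu item has a command-line counterpart, addpath. The sketch below is an assumption-laden illustration rather than anything from the slides: it creates a throwaway folder with tempname, writes a one-line script called hello_cs121.m into it (a hypothetical name), and shows that the script can be run from elsewhere once its folder is on the search path.

```matlab
% Create a throwaway "project" folder containing a one-line script
% (folder and script names are hypothetical, chosen for this sketch)
projDir = tempname;
mkdir(projDir);
fid = fopen(fullfile(projDir, 'hello_cs121.m'), 'w');
fprintf(fid, 'greeting = ''hello from the project folder'';\n');
fclose(fid);

addpath(projDir);     % put the folder on the search path
hello_cs121           % found via the path, even though it is not in pwd
disp(greeting)

rmpath(projDir);      % undo the path change
rmdir(projDir, 's');  % remove the throwaway folder and its contents
```

Unlike cd, addpath leaves the working directory alone, so files in several folders can be visible at once.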
How Matlab Uses Paths
• When we type a name foo into the interpreter, Matlab follows this sequence:
   1. Looks for foo as a variable. If not found, ...
   2. Looks in the current directory for a file named foo.m. If not found, ...
   3. Searches the directories on the MATLAB search path, in order, for foo.bi (built-in function) or foo.m. If not found, ...
   4. Reports ERROR
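Step 1 of this sequence (variables win) can be observed with exist, which returns 1 when a name currently refers to a variable. The sketch below deliberately shadows the sum function with a variable; the variable names holding the results are assumptions made for illustration.

```matlab
kindBefore = exist('sum');    % sum is a function here, so this is neither 0 nor 1

sum = 42;                     % now a variable named sum exists ...
kindWithVar = exist('sum');   % ... and the lookup finds the variable first (1)

clear sum                     % remove the variable again
kindAfter = exist('sum');     % the function is visible once more

fprintf('before: %d, shadowed: %d, after: %d\n', kindBefore, kindWithVar, kindAfter);
```

This is why giving a variable the same name as a script or function makes that script or function unreachable until the variable is cleared.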
5.2 File operators
• File write/read operators allow us to save/restore values from previous Matlab sessions.
• File / Save Workspace As... is the simplest way to do this: it saves everything to a .mat file
• If we want to save/restore specific variables, we can use the save and load commands:
5.2 File operators
    >> a = 'foo'; b = 2; c = pi;
    >> save myvariables a b
    >> clear
    >> load myvariables
    >> who
    Your variables are:
    a  b
• I never use the other syntax:
    >> save('myvariables', 'a', 'b')
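For completeness, here is a minimal sketch of the function syntax, which becomes useful when the file name is itself stored in a variable. The scratch file name comes from tempname and is an assumption of this example, as is loading into a struct instead of directly into the workspace.

```matlab
a = 'foo'; b = 2; c = pi;

matFile = [tempname '.mat'];   % scratch file name, not from the slides
save(matFile, 'a', 'b');       % function syntax: names are passed as strings
clear a b c

s = load(matFile);             % with an output, load returns a struct
names = fieldnames(s);         % only a and b were saved, not c
disp(names)
delete(matFile);               % clean up the scratch file
```

Loading into a struct (s.a, s.b) avoids silently overwriting variables that already exist in the workspace.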
5.3 Importing and Exporting Data
• Often want to get data from other programs (Excel, LabView, text editor) into Matlab, and save data in a format that other programs can read.
• Excel saves data in binary, proprietary (of course!) .xls format
5.3 Importing and Exporting Data
• Generally, other formats will all be text-based (ASCII)
   • .csv : comma-separated values (no commas in vals)
   • .dlm : other delimiter (allows commas in vals)
   • .xml : eXtensible Markup Language (newer)
Spreadsheet data should have all cells filled (“flat format”), or Matlab will get confused:
[Figure: two example spreadsheets, labeled YES and NO]
5.3 Importing and Exporting Data
• The csvread operator allows us to read numerical data, but we need to cut off the header in the file:
   • Remove it by hand from the file:
    >> d = csvread('sunspots-no-header.csv');
   • Specify the number of lines to ignore in csvread:
    >> d = csvread('sunspots.csv', 1); % ignore first line
5.3 Importing and Exporting Data
    >> d = csvread('sunspots.csv', 1)
    d =
    1749  1  58
    1749  2  62.6
    1749  3  70
    etc.
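The following self-contained sketch reproduces the header-skipping call on a tiny stand-in file, since sunspots.csv itself is not included here. The three data rows are the ones shown on the slide; the scratch file name and the third column header ('Sunspots') are assumptions. It uses the documented three-argument form csvread(filename, R1, C1), where both offsets are zero-based.

```matlab
% Write a small CSV with one header line (third column name is assumed)
csvFile = [tempname '.csv'];
fid = fopen(csvFile, 'w');
fprintf(fid, 'Year,Month,Sunspots\n');
fprintf(fid, '1749,1,58\n1749,2,62.6\n1749,3,70\n');
fclose(fid);

% Row offset 1 skips the header; column offset 0 keeps every column
d = csvread(csvFile, 1, 0);
disp(d)

delete(csvFile);   % clean up the scratch file
```

Recent MATLAB releases mark csvread as not recommended and suggest readmatrix instead, but the idea of skipping header lines carries over.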
5.3 Importing and Exporting Data
• The importdata command is useful for heterogeneous data.
• Returns a data structure:
    >> d = importdata('sunspots.csv')
    d =
    data: [2820x3 double]
    textdata: {'Year', 'Month', ...
    colheaders: {'Year', 'Month', ...
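A hedged sketch of working with the structure that importdata returns, again on a tiny stand-in file rather than the real sunspots.csv. The scratch file name, the third column header ('Sunspots'), and the variable names are assumptions for illustration.

```matlab
% Tiny stand-in for sunspots.csv (third column name is assumed)
csvFile = [tempname '.csv'];
fid = fopen(csvFile, 'w');
fprintf(fid, 'Year,Month,Sunspots\n');
fprintf(fid, '1749,1,58\n1749,2,62.6\n1749,3,70\n');
fclose(fid);

d = importdata(csvFile);    % header text and numbers are separated for us
delete(csvFile);            % clean up the scratch file

years = d.data(:, 1);       % numeric part: one row per data line
disp(size(d.data))
disp(d.textdata)            % the header text that csvread could not handle
```

The numeric block lives in d.data, so the rest of a program can treat it exactly like the matrix returned by csvread.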
Non-numerical ASCII Files
• txt files: anything we want to treat as text (ASCII characters)
    >> fid = fopen('mobydick.txt');
    >> s = fread(fid);
    >> fclose(fid)
    >> s
    s =
    32 67 97 ... % need to munge this
Non-numerical ASCII Files
• Treat as strings:
    >> s = char(s') % transpose, textify
    s =
    Call me Ishmael. Some years ago--never mind how long precisely--having little or no money in my purse, and nothing particular to interest me on shore, I thought I would sail about a little and see the watery part of the world....
• textread does this for us, and tokenizes words into a cell array:
    >> s = textread('mobydick.txt', '%s')
    s =
    {'Call', 'me', 'Ishmael.', …
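The read-then-tokenize steps can also be sketched without textread, which newer MATLAB releases discourage. This example is an illustration under stated assumptions: it writes the first sentence of the passage to a scratch file (mobydick.txt itself is not available here), reads it with the '*char' precision so no separate char conversion is needed, and splits on whitespace with strsplit.

```matlab
% Scratch file holding the first sentence only (a stand-in for mobydick.txt)
txtFile = [tempname '.txt'];
fid = fopen(txtFile, 'w');
fprintf(fid, 'Call me Ishmael.');
fclose(fid);

fid = fopen(txtFile, 'r');
s = fread(fid, '*char')';   % '*char' reads the bytes directly as characters
fclose(fid);
delete(txtFile);            % clean up the scratch file

words = strsplit(s);        % split on whitespace into a cell array of words
disp(s)
disp(words)
```

Note that punctuation stays attached to its word ('Ishmael.'), just as in the textread output on the slide.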
5.4 Scripts
• You know most of this stuff already ☺
• You can run a script (e.g., myprog.m) from the interpreter:
    >> myprog
• Tips
   • Don't name any variables myprog
   • Don't use any blank spaces in script names
   • Re-read search path stuff from a few pages back
5.5 Scripts as Computations
• Scripts are (mostly) like typing directly into the interpreter, so variables can get overwritten
• This also means that there is no ans value:
    >> x = myprog
    ERROR: loser trying to execute SCRIPT myprog as a program.
• Nor can we pass arguments:
    >> myprog(7)
    ERROR: My name is Donnie, and you suck at Matlab.
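The overwriting hazard is easy to reproduce. In this sketch, which is not from the slides, a one-line script is written to a throwaway folder and executed with run; the folder, the script name overwrite_demo.m, and the variable names are all hypothetical.

```matlab
% Create a throwaway script that assigns to x (names are hypothetical)
demoDir = tempname;
mkdir(demoDir);
scriptPath = fullfile(demoDir, 'overwrite_demo.m');
fid = fopen(scriptPath, 'w');
fprintf(fid, 'x = 99;\n');
fclose(fid);

x = 1;                 % our own value of x
run(scriptPath);       % the script shares our workspace ...
xAfter = x;            % ... so x has been overwritten
fprintf('x before: 1, x after running the script: %d\n', xAfter);

rmdir(demoDir, 's');   % remove the throwaway folder and its contents
```

Because the script runs in the caller's workspace, any variable name it shares with the caller is fair game.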