CS 109 C/C++ Programming for Engineers with MATLAB
• Administrative
  • Lab sections meeting Thurs/Friday
    • "Working with datasets, part 2"
  • Final project handout now available
    • Part 1: submit dataset
    • Part 2: submit final project
• Topic for Today?
  • working with datasets…
CS 109 -- 23 April 2014
Reading files in MATLAB:
• Does the file contain just numbers? load( )
• Does the file contain mixed data (strings, numbers)? dataset( )
  • e.g. spreadsheet-like data with names and values
• Reading an Excel spreadsheet? xlsread( )
• Reading some other file format?
  • image files? imread( )
  • other format? google to see if MATLAB supports it…
  • use low-level "C" functions: fopen( ), fscanf( ), fclose( )

>> imread('cake.ppm');
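The low-level "C" route from the last bullet can be sketched as follows. This is a minimal illustration, not part of the original slides: the scratch file name and the numbers in it are made up, and the example writes the file first so that it runs on its own.

```matlab
% Write a small numeric file so the example is self-contained.
fname = [tempname '.txt'];            % hypothetical scratch file
fid = fopen(fname, 'w');
fprintf(fid, '%d %d %d\n', [100 98 100; 88 82 90]');   % two lines, three numbers each
fclose(fid);

% Read it back with the C-style functions.
fid = fopen(fname, 'r');
if fid == -1
    error('Could not open %s', fname);
end
values = fscanf(fid, '%d');           % column vector, numbers in file order
fclose(fid);
delete(fname);

% fscanf returns one long column; reshape it to one row per line of the file.
scores = reshape(values, 3, [])';
disp(scores);
```

Checking the result of fopen( ) before reading is worth the extra lines: a missing file otherwise shows up later as a confusing fscanf( ) error.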
The dataset( ) function
• can read files with different types
• can read files with differing numbers of values per line

data.txt:
  Name, Ex1, Ex2, Ex3      <-- header row
  Tejas, 100, 98, 100      <-- data rows
  Venky, 88, 82            <-- missing data
  Hong, 100, 100, 100
  Kaiser, 60, 59, 61
  . . .

>> dataset('File', 'data.txt', 'Delimiter', ',')

The resulting dataset has the columns Name, Ex1, Ex2, Ex3.
Datasets are not matrices…
• The dataset( ) function yields a "dataset", not a matrix
• Some advantages, e.g. use column names! But beware ( )…
• Using ( ) with a dataset yields another dataset

data.txt:
  Name, Ex1, Ex2, Ex3
  Tejas, 100, 98, 100
  Venky, 88, 82
  Hong, 100, 100, 100
  Kaiser, 60, 59, 61
  . . .

Wrong: ( ) indexing, so mean( ) fails
>> data = dataset('File', 'data.txt', 'Delimiter', ',');
>> exam1 = data(:, 2);     % exam1 is column 2:
>> mean(exam1)
Error: undefined function 'sum' for input arguments of type 'dataset'
>> whos('exam1')
… MATLAB reports exam1 is a dataset …

Right: the column name yields the data as a vector
>> exam1 = data.Ex1        % column name yields data as a vector:
>> whos('exam1')
… MATLAB reports exam1 is an 8x1 double (i.e. column vector) …
>> mean(exam1)
ans = 88.1250
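The three indexing styles can be compared side by side on a dataset built in memory. This is a sketch under assumptions: it needs the Statistics Toolbox that provides dataset( ), it uses only the four rows visible on the slide (so the mean differs from the 8-row file), and the variable names are illustrative.

```matlab
% Build a small dataset in memory from the four rows shown on the slide.
names = {'Tejas'; 'Venky'; 'Hong'; 'Kaiser'};
ex1   = [100; 88; 100; 60];
data  = dataset({names, 'Name'}, {ex1, 'Ex1'});

asDataset = data(:, 2);   % ( ) keeps the dataset wrapper
asVector  = data.Ex1;     % dot + column name gives the raw column
oneName   = data{4, 1};   % { } extracts a single element

fprintf('class of data(:, 2): %s\n', class(asDataset));
fprintf('class of data.Ex1:   %s\n', class(asVector));
fprintf('mean of Ex1: %.2f, name in row 4: %s\n', mean(asVector), oneName);
```

Calling class( ) on a variable is a quick alternative to whos( ) when a function such as mean( ) complains about its input type.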
What if you need data from multiple columns? Subset?

data.txt:
  Name, Ex1, Ex2, Ex3
  Tejas, 100, 98, 100
  Venky, 88, 82
  Hong, 100, 100, 100
  Kaiser, 60, 59, 61
  . . .

>> data = dataset('File', 'data.txt', 'Delimiter', ',');
>> exams = data(:, 2:4);       % ==> dataset
>> exams = double(exams);      % ==> matrix
>> mean(exams)
ans = 88.1250 89.5000 NaN

>> where = isnan(exams);       % logical index of NaN locations:
>> exams(where) = 0;           % set every NaN to 0:
>> mean(exams)
ans = 88.1250 89.5000 80.1250

>> nanmean(exams)
ans = 88.1250 89.5000 91.5714
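The gap between the last two results comes from the divisor: replacing NaN with 0 still divides by the full row count, while nanmean( ) divides by the number of scores actually present. A minimal sketch in base MATLAB, using only the four rows visible on the slide (so the numbers differ from the 8-row file) and illustrative variable names:

```matlab
% Exam scores for the four visible rows; Venky's Ex3 is missing.
exams = [100  98 100;
          88  82 NaN;
         100 100 100;
          60  59  61];

missing = isnan(exams);          % logical mask of the NaN locations
zeroed  = exams;
zeroed(missing) = 0;             % treat a missing exam as a score of 0

meanZeroed  = mean(zeroed);              % divides every column by 4
counts      = sum(~missing);             % scores actually present per column
meanSkipped = sum(zeroed) ./ counts;     % same idea as nanmean(exams)

disp(meanZeroed);
disp(meanSkipped);
```

Which version is right depends on the grading policy: a missed exam that counts as zero calls for the first, an excused exam for the second.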
Searching datasets works like matrices…
• Example:
  • Output names of students who have failed an exam…
• Use { } when accessing a single element of a dataset…

data.txt:
  Name, Ex1, Ex2, Ex3
  Tejas, 100, 98, 100
  Venky, 88, 82
  Hong, 100, 100, 100
  Kaiser, 60, 59, 61
  . . .

>> data = dataset('File', 'data.txt', 'Delimiter', ',');
>> where = data.Ex1 < 60 | data.Ex2 < 60 | data.Ex3 < 60;
>> students = data(where, 'Name');       % copy from Name column:
>> [rows, cols] = size(students);        % how many rows matched?
>> for i = 1:rows
     fprintf('This student failed an exam: %s\n', students{i, 1});
   end
This student failed an exam: Kaiser
This student failed an exam: Das
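The same search can also be written by applying the logical index directly to the Name column, which gives a plain cell array of names. This is a sketch under assumptions: it needs the Statistics Toolbox for dataset( ), it builds the dataset in memory from the four visible rows rather than reading data.txt, and the variable names are illustrative.

```matlab
% Four visible rows from data.txt; Venky's missing Ex3 is stored as NaN.
names = {'Tejas'; 'Venky'; 'Hong'; 'Kaiser'};
ex1 = [100; 88; 100; 60];
ex2 = [ 98; 82; 100; 59];
ex3 = [100; NaN; 100; 61];
data = dataset({names, 'Name'}, {ex1, 'Ex1'}, {ex2, 'Ex2'}, {ex3, 'Ex3'});

% Logical index: true for every row with at least one score below 60.
where  = data.Ex1 < 60 | data.Ex2 < 60 | data.Ex3 < 60;
failed = data.Name(where);        % cell array of the matching names

for i = 1:numel(failed)
    fprintf('This student failed an exam: %s\n', failed{i});
end
```

Any comparison with NaN is false, so a student with a missing score is not reported as failing that exam; use isnan( ) if missing exams should be flagged too.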
In-class exercise…
• Output the name of the state with the largest total rainfall in 2013?
• Hint #1: look up the sum( ) function; it can sum columns or rows…

states.txt:
  id,name
  1,Alabama
  2,Arizona
  3,Arkansas
  . . .
  50,Wyoming

rain2013.txt:
  id,jan,feb,mar,apr,may,jun,jul,aug,sep,oct,nov,dec
  1,3.18,2.57,2.22,7.95,6.47,3.12,2.19,2.52,1.93,5.69,2.94,1.54
  2,2.16,1.39,2.17,2.63,4.32,1.07,3.78,6.06,1.61,3.21,1.04,2.09
  3,0.71,3.46,2.32,5.73,5.32,7.41,5.44,3.94,3.89,2.24,3.65,2.51
  . . .
  50,3.88,0.60,4.72,3.87,2.26,5.75,2.12,1.36,3.20,2.31,1.28,2.04

>> rainfall = dataset('File', 'rain2013.txt', 'Delimiter', ',');
>> states = dataset('File', 'states.txt', 'Delimiter', ',');
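Hint #1 can be illustrated on a tiny matrix. This is only a sketch of how sum( ) and max( ) work, not a solution to the exercise: it uses just the January to March values of the three states shown above, typed in by hand, and the variable names are illustrative.

```matlab
% Jan-Mar rainfall for the first three states listed above (one row per state).
rain  = [3.18 2.57 2.22;
         2.16 1.39 2.17;
         0.71 3.46 2.32];
names = {'Alabama'; 'Arizona'; 'Arkansas'};

perMonth = sum(rain);        % default: sums each column (one total per month)
perState = sum(rain, 2);     % second argument 2: sums each row (one total per state)

% max( ) can also return the position of the largest value.
[largest, row] = max(perState);
fprintf('Largest total in this sample: %s (%.2f)\n', names{row}, largest);
```

For the real exercise the rainfall columns first have to be pulled out of the dataset and converted with double( ), since sum( ) does not accept a dataset.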