Introduction to MATLAB Ethan Katz-Bassett ethan@cs November, 2007
What is MATLAB?
• MATrix LABoratory
• Numerical computing environment
• Mathematical functions
• Visualizations
• Programming language
What is MATLAB good for?
• Easy matrix manipulation/ data exploration
• Implementation of numerical algorithms
• Plotting of functions/data
• User interfaces??
• Interfacing with programs in other languages?
Some things I’ve used it for
• Plotting graphs
• Plotting geographic data on maps
• Writing various numerical algorithms
• Solving linear programs
• Defining and solving semidefinite programs on lots of variables
• K-means and other clustering
MATLAB in the Department
• /projects/matlab/bin/matlab
• Includes many toolboxes (signal processing, optimization, statistics, …), but not all
• Limited number of licenses
• /projects/matlab/etc/lmstat -A
• MATLAB compiler (mcc)?
• Licenses for Mac betas available (at least last January). See Mac-users archives
• Some people have Windows copies
• Licenses available for purchase: $100 plus
Ways to use it
• Interactive interpreter in Command Window
• Writing functions/ scripts as .m files
• Passing scripts from the Linux shell
MATLAB Language
    diary('~ethan/matlab_tutorial/diary.txt');
    diary on;
• Looks “clean,” like math
• Dynamically typed
• Assign without declaring type, type can change
• Semicolons suppress output
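A minimal sketch of the last three bullets, dynamic typing and output suppression. The variable names are illustrative and do not come from the original slides.

```matlab
x = 5;            % no declaration needed; x is a double
class(x)          % no semicolon, so the result ('double') is echoed
x = 'five';       % the same name can later hold a char array
c = class(x);     % the semicolon suppresses the echo; c now holds 'char'
```

Leaving the semicolon off while exploring and adding it once a line works is a common way to use the Command Window.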
Vectors/ matrices I
    >> B = [ 1 2 3 4; 5 6 7 8; 9 10 11 12; 13 14 15 16]
    B =
         1     2     3     4
         5     6     7     8
         9    10    11    12
        13    14    15    16
    >> B(2,3)
    ans =
         7
    >> B(3,:)
    ans =
         9    10    11    12
    >> B(2:4, 3:4)
    ans =
         7     8
        11    12
        15    16
    >> zeros(1,4);
    >> ones(2,2);
Vectors/ matrices II
    >> A = 1:4
    A =
         1     2     3     4
    >> A'
    ans =
         1
         2
         3
         4
    >> C = repmat(A,3,2)
    C =
         1     2     3     4     1     2     3     4
         1     2     3     4     1     2     3     4
         1     2     3     4     1     2     3     4
Vectors/ matrices III
    >> D = sum(B)
    D =
        28    32    36    40
    >> power(B(1:2,3:4),2)
    ans =
         9    16
        49    64
    >> E = [A;B];        % concatenation
    >> B(2,:) = []
    B =
         1     2     3     4
         9    10    11    12
        13    14    15    16
    >> (1:3)*(4:6)'      % matrix product
    VS
    >> (1:3).*(4:6)      % element-wise product
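The contrast at the end of this slide, `*` versus `.*`, can be sketched as follows. The names `u`, `v`, `inner`, `elementwise`, and `outer` are illustrative.

```matlab
u = 1:3;
v = 4:6;

inner = u * v';        % (1x3) * (3x1): matrix product, the scalar 32
elementwise = u .* v;  % same-size operands multiplied entry by entry: [4 10 18]
outer = u' * v;        % (3x1) * (1x3): a 3x3 matrix

% u * v would raise an error, because the inner dimensions (3 and 1) disagree.
```

The same dotted/undotted distinction applies to `/` and `^`.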
Vectors/ matrices continued
• What MATLAB is optimized for
• Everything is a matrix! (sort of)
• Multi-variate data
• Most functions work on matrices (more on this later)
Loading and saving
• save/load to save/load workspace or variables/structures in .mat binary format
    save filename
    save('filename','var1','var2',…)
    load filename
    load filename X Y Z
• dlmwrite/dlmread to write/read matrix to/from delimited ASCII file
    dlmwrite(filename,M,'D');
    M=dlmread(filename,'D');
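A round-trip sketch of both mechanisms, assuming scratch files in the temporary directory. The file names come from `tempname` and the matrix from `magic`; neither appears in the original slides. Recent MATLAB releases recommend `writematrix`/`readmatrix` over `dlmwrite`/`dlmread`, but the older functions shown on the slide still work.

```matlab
M = magic(4);                    % sample 4x4 integer matrix

% Binary .mat round trip
matFile = [tempname '.mat'];     % hypothetical scratch file
save(matFile, 'M');
S = load(matFile);               % with an output argument, load returns a struct

% Delimited ASCII round trip, space as the delimiter
txtFile = [tempname '.txt'];     % hypothetical scratch file
dlmwrite(txtFile, M, ' ');
M2 = dlmread(txtFile, ' ');

sameEverywhere = isequal(M, S.M, M2);   % true for this integer matrix

delete(matFile);
delete(txtFile);
```

Note that `dlmwrite` writes a limited number of significant digits by default, so non-integer data may not survive the text round trip exactly.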
Graphing
    >> x = 0:pi/20:2*pi;
    >> y = sin(x);
    >> plot(x,y,'b-x');       % blue line, X's
    >> hold;                  % keep lines
    >> z = cos(x);
    >> plot(x,z,'r:+');       % dotted red, +'s
    >> hold;                  % release lines
    >> data=importdata('data_unsorted.txt');
    >> plot(sort(data),(1:128)/128);   % cdf
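A hedged variation on the CDF plot that does not hard-code the sample count of 128. The random data is only a stand-in for `data_unsorted.txt`, and the second curve is added purely to show `hold on`/`hold off`, which set the hold state explicitly where a bare `hold` toggles it.

```matlab
data = randn(128, 1);            % assumption: stand-in for data_unsorted.txt
sorted = sort(data);
n = numel(sorted);
frac = (1:n)' / n;               % fraction of samples <= each sorted value

figure;
plot(sorted, frac, 'b-');        % empirical CDF
hold on;                         % keep the first curve
plot(sorted, 1 - frac, 'r:');    % complementary CDF on the same axes
hold off;                        % later plots replace the axes contents again
xlabel('value');
ylabel('fraction of samples');
legend('CDF', '1 - CDF');
```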
Control flow I
    if any(B>5)     % NOT if B>5
        'true'
    else
        'false'
    end;
    for b=B'        % iterate over the columns of B' (the rows of B)
        2*b'
    end;
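A short sketch of why the condition matters. `if` treats an array as true only when every element is nonzero, and `any` on a matrix works column by column, so flattening with `B(:)` is one way to get a single answer. The 2x4 matrix here is a made-up example, not the `B` from the earlier slides.

```matlab
B = [1 2 3 4; 5 6 7 8];          % hypothetical example matrix

perColumn = any(B > 5);          % one result per column: [0 1 1 1]
anyElement = any(B(:) > 5);      % scalar: true, some element exceeds 5
allElements = all(B(:) > 5);     % scalar: false, not every element does

if perColumn                     % false here, because the first column has no hit
    branch = 'taken';
else
    branch = 'not taken';
end
```

For the 4x4 `B` used on the earlier slides every column contains a value above 5, so `any(B>5)` happens to behave as intended there.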
You know those for loops I showed? Never use them!
Or at least avoid them when possible. MATLAB is meant for matrices: vectorize code
    P = [];
    for a=A          % slow
        P = [P (a>0)];
    end
    P2 = A>0;        % fast
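When a loop cannot be avoided, preallocating the result at least removes the repeated array growth. The sketch below compares that middle ground with the vectorized form; the input `A` is random test data, and the names `P`, `P2`, and `k` are illustrative.

```matlab
A = randn(1, 10000);             % hypothetical input row vector

% Loop with preallocation: still element by element, but no regrowth of P
P = false(size(A));
for k = 1:numel(A)
    P(k) = A(k) > 0;
end

% Vectorized: one comparison over the whole array
P2 = A > 0;

sameResult = isequal(P, P2);
```

Wrapping each variant in `tic`/`toc` is a quick way to measure the difference on a given machine and MATLAB version.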
Original code:
    numAss = 0;        % num assigned
    numAssCorr = 0;    % and should be
    for i=1:numel(Actual)
        if Assigned(i) == 1
            numAss = numAss + 1;
            if Actual(i) == 1
                numAssCorr = numAssCorr + 1;
            end;
        end;
    end
    precision = numAssCorr/numAss;
Vectorized code:
    numAss = sum(Assigned);
    numAssCorr = sum( Assigned & Actual );
    precision = numAssCorr/numAss;
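A small check that the loop and the vectorized form agree, assuming `Assigned` and `Actual` are 0/1 vectors of equal length. The sample values are invented for illustration.

```matlab
Assigned = [1 0 1 1 0 1];        % hypothetical 0/1 labels: item was assigned
Actual   = [1 0 0 1 1 1];        % hypothetical 0/1 labels: item should be assigned

% Loop version
numAss = 0;
numAssCorr = 0;
for i = 1:numel(Actual)
    if Assigned(i) == 1
        numAss = numAss + 1;
        if Actual(i) == 1
            numAssCorr = numAssCorr + 1;
        end
    end
end
precisionLoop = numAssCorr / numAss;

% Vectorized version
precisionVec = sum(Assigned & Actual) / sum(Assigned);
```

Here four items are assigned and three of those are correct, so both versions give 0.75. The vectorized form relies on the 0/1 encoding: `sum(Assigned)` counts assignments only if every assigned entry is exactly 1.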
In:
    m = num of clusterings
    n = num points to be clustered, n>>m
    E = m X n set of clusterings, where E(i,j) = cluster of point j in clustering i
Out:
    C = n X n similarity matrix, where C(j,k) = number of i with E(i,j)=E(i,k)
Naïve code (or, non-MATLAB code):
    C = zeros(n,n);
    for i = 1:m
        for j = 1:n
            for k = 1:n
                C(j,k) = C(j,k) + (E(i,j)==E(i,k));
            end;
        end;
    end
Can reorder for loops without affecting performance
Vectorize, try 1: do all comparisons for one point at once:
    C = [];                            % should hoist allocation
    for clusters=E                     % get clusters for one point
        rep = repmat(clusters,1,n);    % replicate clusters
        C = [C; sum(rep==E)];          % compare to other points
    end
Much better (3.3x faster on 20x1000)! Can we improve?
Vectorize, try 2: n>>m, so iterate over m, not n
    C = zeros(n,n);
    for clustering=E'                  % get one clustering
        rep = repmat(clustering,1,n);  % create copies of clustering
        C = C + (rep==rep');           % compare within clustering
    end
Turned out to be slower on 20x1000, but consider changes like this.
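Before timing the variants it is worth confirming that they compute the same matrix. The sketch below runs the naive loop and both vectorized attempts on a small random ensemble; the sizes, the three-label `randi` data, and the names `C1`, `C2`, `C3` are assumptions made for illustration. Try 1 is written with the allocation hoisted, as its own comment suggests.

```matlab
m = 4;                           % hypothetical number of clusterings
n = 30;                          % hypothetical number of points
E = randi(3, m, n);              % hypothetical labels: cluster ids 1..3

% Naive triple loop
C1 = zeros(n, n);
for i = 1:m
    for j = 1:n
        for k = 1:n
            C1(j,k) = C1(j,k) + (E(i,j) == E(i,k));
        end
    end
end

% Try 1: one point (column of E) at a time, allocation hoisted
C2 = zeros(n, n);
row = 0;
for clusters = E
    row = row + 1;
    rep = repmat(clusters, 1, n);      % m x n copies of this point's labels
    C2(row, :) = sum(rep == E, 1);     % agreements with every other point
end

% Try 2: one clustering (row of E) at a time
C3 = zeros(n, n);
for clustering = E'
    rep = repmat(clustering, 1, n);    % n x n copies of this clustering
    C3 = C3 + (rep == rep');           % 1 where two points share a cluster
end

allAgree = isequal(C1, C2, C3);
```

The explicit dimension in `sum(rep == E, 1)` keeps try 1 correct even when `m` is 1, where a plain `sum` would collapse the row to a scalar.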
Scripts and Functions
• Save as .m files (and make sure in path)
• See ~ethan/matlab_tutorial/ensemble[1-3].m
_____
    function [C,time] = ensemble2( m, n, E )
    tic;
    …                % code from earlier slide
    time=toc;
_______
    [C2,t2]=ensemble2( 20, 1000, Ensemble);
Scripting MATLAB from Linux Shell
2 options I’ve used:
1) /projects/matlab/bin/matlab -nojvm -nodesktop -nosplash -nodisplay < myscript.m
   • .m file specifies input and output values/filenames
2) /projects/matlab/bin/matlab -nojvm -nodesktop -nosplash -nodisplay -r "A=3+3;dlmwrite('test.txt',A,' ');quit"
Scripting MATLAB from Shell, II
ensemble2_wrap.m:
    Dims = load('dims.txt');
    E = load('ensemble.txt');
    C = ensemble2(Dims(1),Dims(2), E);
    dlmwrite('similarity2.txt',C,' ');
    quit;
__________________________________
~ethan/matlab_tutorial$ /projects/matlab/bin/matlab -nojvm -nodesktop -nosplash -nodisplay < ensemble2_wrap.m
Scripting MATLAB from Shell, III
ensemble2_wrap2.m:
    function [] = ensemble2_wrap2(dimsfn, ensfn, outfn)
    Dims = load(dimsfn);
    E = load(ensfn);
    C = ensemble2(Dims(1),Dims(2),E);
    dlmwrite(outfn,C,' ');
_____________________________________
~ethan/matlab_tutorial$ /projects/matlab/bin/matlab -nojvm -nodesktop -nosplash -nodisplay -r "ensemble2_wrap2('dims.txt','ensemble.txt','similarity2.txt');quit"
Weirdness and limitations
• One-based indexing hard coded
• Parentheses used for calling a function AND for indexing an array
• No namespace resolution, so same code on different machines (with different path variables) can produce different results
• Many functions have different behavior for matrices vs vectors (so “base case” broken)
• No references
  • Limits data structures (linked list)
  • All functions “call by value”
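Two of these points can be sketched briefly: the matrix-versus-vector behavior of `sum`, and value semantics on assignment. The values and names are made up for illustration.

```matlab
X = [1 2 3; 4 5 6];

colSums = sum(X);                % matrix input: sums each column, [5 7 9]
rowTotal = sum(X(1,:));          % row-vector input: sums along the row, 6
safeSums = sum(X(1,:), 1);       % explicit dimension: still column sums, [1 2 3]

a = [1 2 3];
b = a;                           % b behaves as an independent copy
b(1) = 99;                       % a is still [1 2 3]
```

Passing the dimension explicitly is one way to keep code that loops over a shrinking matrix from changing behavior when only one row is left.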
MATLAB Functions
• Tons implemented
• Sometimes hard to find what you need…
• But great documentation once you do:
    help functionname
    http://www.mathworks.com/access/helpdesk/help/helpdesk.html
• Many additional found online