Running MATLAB Computations on the CCR u2 Cluster Cynthia Cornelius cdc@ccr.buffalo.edu March 30, 2009
What is the u2 cluster? • The u2 cluster is a group of 1056 computers on a private network. • These compute nodes are Dell Xeon dual-processor machines running the Red Hat Linux operating system. • A batch queuing system schedules user applications to run on the compute nodes. • Users submit jobs to the scheduler. • Users may also run interactive applications through the scheduler.
What is the hardware layout? • Front-end machine • Accessible from the UB network. • The front-end machine is used for file transfer, editing files, compiling and debugging codes, submitting jobs, and very short computations. • Compute nodes • Accessible only within the cluster. • Most of the nodes have 2 GB of memory. • 64 compute nodes have 4 GB. • 32 compute nodes have 8 GB. • Users request nodes from the batch queuing system.
What is the hardware layout? • Storage • User home directories are accessible on all compute nodes. • User quotas are 2 GB. • Temporary storage is available. • Compute nodes have 60 GB of local disk space. • Networks • Accessible from within the cluster. • All compute nodes and the front-end machine are connected by a gigabit Ethernet network. • 786 compute nodes are connected by Myrinet, a high-speed fiber network.
Why use the u2 cluster? • The u2 cluster provides tremendous computing power for any scientific computation. • Typically, hundreds of jobs are running at one time. • The average wait time for jobs requiring one compute node is less than one hour. • Single-processor jobs usually wait 10-15 minutes. • The job throughput allows scientists to complete computations in days instead of months. • Machine status • Estimated wait time
Why use the u2 cluster? • Machine status
Why use the u2 cluster? • Estimated wait time
How do I use the cluster? • Log in to the u2 front-end. • Transfer files to u2. • Run MATLAB on u2: • interactively on the front-end machine • interactively on a compute node • as a batch job on a compute node • as a batch job with a compiled m-file • Compiled m-files do not require a license to run. • The Parallel Computing Toolbox is available. • Run up to 4 labs (workers) on a compute node.
Login and file transfer • u2 is only accessible from the UB network. • Required software for all users: • Secure Shell to log in • Secure File Transfer to upload files to u2 • X-Display to allow the MATLAB graphical interface to display • VPN to connect to the UB network from off campus • Download and install UBVPN • Log out of u2 using “logout” or “exit”
Login and file transfer • ssh, sftp, and X11 are usually already installed on Linux/Unix machines. • Login: ssh -X u2.ccr.buffalo.edu • ssh -Y u2.ccr.buffalo.edu • ssh -X username@u2.ccr.buffalo.edu • File transfer: sftp u2.ccr.buffalo.edu • put uploads a file to u2. • get downloads a file from u2. • mget and mput transfer multiple files. • man pages are available for ssh and sftp: • man ssh
Login and file transfer • Typically PuTTY, X-Win32 and WinSCP must be installed on Windows machines. • Download and install PuTTY • Download and install X-Win32 • Download and install WinSCP • Login: launch X-Win32, then PuTTY • enter u2.ccr.buffalo.edu and your username • enable X11 forwarding • File transfer: launch WinSCP • enter u2.ccr.buffalo.edu and your username • drag-and-drop interface
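The put and get commands above can also be collected in a batch file, which sftp reads with its -b option. The following is a minimal sketch that only writes such a batch file and prints the command that would use it. The file names (magicsquare.m, output.txt, u2-transfer.batch) and "username" are placeholders for illustration and do not come from the slides.

```shell
#!/bin/sh
# Write an sftp batch file. All file names here are hypothetical placeholders.
cat > u2-transfer.batch <<'EOF'
put magicsquare.m
get output.txt
EOF

# Print the command that would run the transfer (replace "username").
# Batch mode is non-interactive, so it needs key-based authentication.
echo "sftp -b u2-transfer.batch username@u2.ccr.buffalo.edu"
```

Running the printed command from a machine on the UB network would upload the first file and download the second in one step.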
Login and file transfer • Login from Windows machine.
Login and file transfer • Login to u2 with X-Display
Login and file transfer • Verify X-Display
Login and file transfer • File transfer from Windows machine.
Cluster Environment • The u2 cluster runs the Red Hat Linux operating system. • command line interface • Using the u2 cluster requires knowledge of a few basic UNIX commands. • short list of UNIX commands • The CCR Linux/UNIX Reference Card provides a more extensive list. • More information on the user environment • Windows users should check the notes.
Starting MATLAB • Log in to u2 with X-Display enabled. • Verify X-Display with xclock. • cd to the working directory. • List available MATLAB installations: • module avail matlab • Load the matlab module: • module load matlab/R2008a • Launch MATLAB: • matlab • There is a 30-minute CPU limit for applications running on the u2 login machine.
Starting MATLAB • Launch MATLAB
Running on the U2 Cluster • The compute nodes are assigned to user jobs by the PBS (Portable Batch System) scheduler. • Jobs can be interactive or in batch mode. • All jobs wait for nodes to be assigned by the PBS scheduler. • Interactive jobs will wait at the prompt. • All jobs are listed in the queue. • Estimated start times are available for jobs in the queue.
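Execution Model Schematic • [Figure: $USER runs qsub myscript on the login node; the scheduler (pbs_server) decides whether the job can run (Yes/No); when it runs, the prologue, myscript, and the epilogue execute on the assigned nodes (node1, node2, ..., nodeN) listed in $PBS_NODEFILE.]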
Submitting an interactive job • qsub -I -X -q debug -lnodes=1:ppn=2 -lwalltime=01:00:00 • -I requests an interactive job. • -X enables X-Display. • -q debug requests the debug queue. • The debug queue has 32 dedicated nodes from 9am-5pm M-F, with a maximum run time of 1 hour. • The default queue (ccr) has a maximum of 72 hours. • -lnodes=1:ppn=2 requests 1 node and both of its processors. • -lwalltime=01:00:00 requests 1 hour.
Sample interactive job • qsub -I -X -q debug -lnodes=1:ppn=2 -lwalltime=01:00:00
Sample interactive job • qsub -I -X -lnodes=1:ppn=2 -lwalltime=04:00:00
PBS Commands • Submit a job: • qsub pbs-script (or -I and options for an interactive job) • List jobs: • qstat -an • qstat -an -u username • qstat -an jobid • showq • Delete a job: • qdel jobid • Show estimated start time for a job: • showstart jobid • Show nodes that are currently free: • showbf -S
PBS Commands • qstat -an
PBS Commands • showq
PBS Commands • qstat -an -u username
PBS Commands • qstat -an jobid
PBS Commands • showstart jobid • estimated start time for the job
PBS Commands • showbf -S • shows currently free nodes
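The PBS commands above can be chained, because qsub prints the identifier of the new job on standard output. The sketch below wraps that sequence in a helper; the function name submit_and_check is hypothetical, and the guard at the end means nothing is submitted on a machine without the PBS commands or without a file named pbs-script.

```shell
#!/bin/sh
# Submit a PBS script, then show its queue status and estimated start time.
# Usage: submit_and_check pbs-script   (hypothetical helper name)
submit_and_check() {
    script=$1
    # qsub prints the new job's identifier on standard output.
    jobid=$(qsub "$script") || return 1
    echo "Submitted job: $jobid"
    qstat -an "$jobid"     # status and assigned nodes for this job
    showstart "$jobid"     # estimated start time for this job
}

# Only attempt a submission where qsub exists and the script file is present.
if command -v qsub >/dev/null 2>&1 && [ -f pbs-script ]; then
    submit_and_check pbs-script
fi
```

If the job needs to be cancelled, qdel with the same job identifier removes it from the queue.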
PBS Variables • $PBS_O_WORKDIR - directory from which the job was submitted. • By default, a PBS job starts from the user’s $HOME directory. • $PBSTMPDIR - reserved scratch space, local to each host (this is a CCR definition, not part of the PBS package). • This scratch directory is created in /scratch and is unique to the job. • The $PBSTMPDIR is created on every compute node running a particular job. • $PBS_NODEFILE - name of the file containing a list of nodes assigned to the current batch job. • Used to allocate parallel tasks in a cluster environment
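As a rough illustration of how these variables are typically used at the top of a job script, the sketch below changes to the submission directory and summarizes the node file. The function name describe_job and the fallback values ($PWD, /tmp, /dev/null) are assumptions added so the fragment also runs outside a PBS job; inside a job the scheduler sets the variables itself.

```shell
#!/bin/sh
# Report the PBS variables a job script typically relies on.
# The fallbacks are assumptions so the sketch also runs outside a PBS job.
describe_job() {
    workdir=${PBS_O_WORKDIR:-$PWD}        # directory the job was submitted from
    scratch=${PBSTMPDIR:-/tmp}            # per-job scratch space (CCR-specific)
    nodefile=${PBS_NODEFILE:-/dev/null}   # file listing the assigned nodes

    # A PBS job starts in $HOME, so move to the submission directory first.
    cd "$workdir" || return 1

    # Count the entries in the node file and the distinct host names.
    nentries=$(wc -l < "$nodefile" | tr -d ' ')
    nnodes=$(sort -u "$nodefile" | wc -l | tr -d ' ')

    echo "Working directory: $workdir"
    echo "Scratch directory: $scratch"
    echo "Node file lists $nentries entries on $nnodes distinct nodes"
}

describe_job
```

Large temporary files would normally be written under the scratch directory rather than the home directory, since home directories have a 2 GB quota.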
Interactive MATLAB job • Log in to u2 with X-Display enabled. • Use qsub to request a node: • qsub -I -X -lnodes=1:ppn=2 -lwalltime=01:00:00 • Check X-Display from the compute node with xclock. • cd to the working directory. • Load the module for MATLAB: • module avail matlab • module load matlab/R2008a • Launch MATLAB: • matlab
Batch job • Create the m-file of MATLAB commands that you want to execute: • (examples from MATLAB documentation) • This will create a 3x3 magic square: • n = 3 • m = magic(n) • Create a PBS script file to submit to the scheduler. • Submit the job: • qsub pbs-script
Batch job • PBS script to run the m-file
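The script shown on this slide is not reproduced in the text, so the following is only a plausible sketch of such a PBS script, not the author's original. It writes a file named pbs-script using the resource requests and module name that appear elsewhere in these slides. The m-file name magicsquare.m, the job name, and the output file name are hypothetical, and some sites require extra setup before the module command works in batch shells.

```shell
#!/bin/sh
# Write a minimal PBS script for running an m-file in batch mode.
# This is a sketch: magicsquare.m and the job name are hypothetical.
cat > pbs-script <<'EOF'
#!/bin/bash
#PBS -l nodes=1:ppn=2
#PBS -l walltime=01:00:00
#PBS -N magicsquare
#PBS -j oe

# PBS starts jobs in $HOME, so change to the submission directory.
cd "$PBS_O_WORKDIR" || exit 1

# Load the MATLAB module named earlier in these slides.
module load matlab/R2008a

# Run the m-file without the desktop and save what MATLAB prints.
matlab -nodisplay -nosplash < magicsquare.m > magicsquare.out
EOF
```

The resulting file is submitted with qsub pbs-script, and the MATLAB output ends up in magicsquare.out in the submission directory.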
Batch job • qsub and view output file
Batch job • view output file
Batch job • The problem with running MATLAB in batch mode is that a license may not be available when the job starts on a compute node. • Currently there are 4 MATLAB licenses for the u2 cluster. • You can check the status of the MATLAB licenses with the following: • lmstat -a -c /util/matlab/etc/license.dat • The best solution is to compile the m-file. The resulting executable does not require a license at runtime.
Compiling the m-file • The m-file must be converted into a function: • function m = function-name • This is added to the beginning of the m-file. • This is a simple example, but usually that is all it takes to convert the m-file to a function. • The new m-file function: • function m = mymagicsquare • n = 3; • m = magic(n)
Compiling the m-file • mcc options: • Generate C stand-alone application: • -m • Verbose: • -v • No runtime Java libraries: • -R nojvm -R nojit • Help with mcc: • mcc -help • The compilation will produce a number of files including an executable and a run script that will be used in the pbs-script.
Compiling the m-file • Command to compile the m-file: • mcc -m -v -R nojvm -R nojit m-file
Compiling the m-file • Executable and run script are created
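A PBS script for the compiled program might then look like the sketch below, which writes a file named pbs-compiled. Several details are assumptions rather than facts from the slides: that the run script is named run_mymagicsquare.sh, that it takes the MATLAB runtime directory as its first argument, and the placeholder value of MCRROOT, which must be replaced with the real location on u2.

```shell
#!/bin/sh
# Write a PBS script that runs the compiled magic-square program.
# Assumptions: the run script is named run_mymagicsquare.sh and takes the
# MATLAB runtime directory as its first argument; MCRROOT is a placeholder.
cat > pbs-compiled <<'EOF'
#!/bin/bash
#PBS -l nodes=1:ppn=2
#PBS -l walltime=01:00:00
#PBS -j oe

# PBS starts jobs in $HOME, so change to the submission directory.
cd "$PBS_O_WORKDIR" || exit 1

# Placeholder: set this to the MATLAB installation or runtime directory.
MCRROOT=/path/to/matlab

# The compiled program does not need a MATLAB license at runtime.
./run_mymagicsquare.sh "$MCRROOT" > mymagicsquare.out
EOF
```

Submitting this file with qsub pbs-compiled avoids the license contention described for batch jobs that start MATLAB directly.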