Introduction to PSC Computing Systems
Alex Ropelewski
ropelews@psc.edu
MARC: Developing Bioinformatics Programs
July 17-28, 2006

Computers Available for Biomedical Use
• PSC operates two platforms exclusively for biomedical use:
  • A 20 compute node Opteron cluster
    • Contains one dual-CPU 1.4 GHz AMD Opteron processor per node
    • 4 Gbytes of memory per node
    • Two front ends: CODON and BIOINFORMATICS
  • A 64 processor SMP machine
    • 64 1.15 GHz EV7 processors
    • 256 Gbytes of shared memory
    • Machine is called JONAS
National Resource for Biomedical Supercomputing - An NIH Supported Research Center

Computers Available for General Use (Including Biomedical)
• bigben, a Cray XT3 MPP machine with 2068 compute processors
• lemieux, an HP Alphaserver cluster comprising 750 4-processor compute nodes
• rachel, an SMP machine with 64 1.15 GHz EV7 processors and 256 Gbytes of shared memory
• ben, an HP Alphaserver cluster comprising 64 4-processor, 4-Gbyte compute nodes
• Front end machines running Linux and VMS
• A file archiver

Access to PSC High Performance Computing Systems
• Access for academic research and coursework use is through a grant process.
• To apply for a grant visit:
  • http://www.psc.edu/nrbsc/resources/
• One grant per project
• Additional users can be added to a grant

Consulting
• All active PSC users have access to PSC consulting resources:
  • 800-221-1641
    • Phones are staffed Monday - Friday, 9 a.m. to 8 p.m., and Saturday, 9 a.m. to 4 p.m. (EST).
    • For best service, call for critical problems.
  • remarks@psc.edu
• There is also documentation available at www.psc.edu

General Policies
• The PSC has policies on computing-related topics such as:
  • Passwords
  • File retention after grant expiration
  • Email addresses
• To review these policies please see:
  • http://www.psc.edu/general/policies/policyoverview.html

Passwords
• Computer security depends heavily on keeping passwords secret
• Most machines use a common Kerberos password:
  • Must be at least 6 characters long.
  • Passwords longer than 8 characters can prevent you from logging in to certain machines.

Selecting Secure Passwords
• Do NOT
  • simply add numbers to words that can be found in a dictionary, such as "helper01", "amoeba1", "1license"
  • simply substitute "1" for "l", "0" for "o", or "1" for "i" in common words to get passwords like "he1per", "am0eba", or "11cense"
• Creating good passwords:
  • Use the first letter of each word in an uncommon sentence or phrase that you can easily remember:
    • I married Sandie on July 2nd in Greentree (ImSoJ2iG)
    • My 4th grade teacher was Sister Cyrilla (M4gtwSC)

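The first-letter technique can be sketched as a small shell function. This is only an illustration: first_letters is a hypothetical helper name, not a PSC tool, and it takes the first character of every whitespace-separated word, so it reproduces the first example phrase but not hand-tuned variations such as "4g" for "4th grade".

```shell
# Build a password candidate from the first character of each word in a phrase.
# first_letters is a hypothetical helper name used only for this illustration.
first_letters() {
    result=""
    for word in $1; do
        # cut -c1 keeps only the first character of the word
        first=$(printf '%s' "$word" | cut -c1)
        result="${result}${first}"
    done
    printf '%s\n' "$result"
}

candidate=$(first_letters "I married Sandie on July 2nd in Greentree")
echo "$candidate"
```

For the phrase shown, the function prints ImSoJ2iG, which is 8 characters long.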
Connecting and Transferring Files
• Connect to the PSC machines using ssh
  • http://www.psc.edu/general/net/ssh/ssh.html
• Transfer files between PSC and your home institution using kftp, scp or sftp

Opteron Cluster
• Contains bioinformatics software and databases
• To log into the cluster, ssh to:
  • bioinformatics.psc.edu
  • codon.psc.edu
• The cluster uses a UNIX operating system
• SLURM is used to run serial and parallel programs on the cluster's nodes

SLURM scripts
• A SLURM script is a file containing a series of instructions for the computer
• SLURM scripts are submitted by the user and run when the system has resources available to run the script
• SLURM scripts can run parallel programs or serial programs
• A SLURM script will be created for you for sequence analysis codes when you run the program makseq

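To make "a series of instructions" concrete, here is a minimal sketch of what such a script file might contain, assuming it is an ordinary sh script. The contents are hypothetical and are not the output of makseq; the sort step is only a stand-in for a real analysis program.

```shell
#!/bin/sh
# Hypothetical batch script: each line is an instruction, run in order
# on whichever node the scheduler assigns to the job.
echo "Job started on $(uname -n)"

# Stand-in for a real analysis step; a real script would call a program here.
sorted=$(printf 'GATTACA\nACGT\nTTAGGC\n' | sort)
echo "$sorted"

echo "Job finished"
```

Because the file is plain shell, it can be tried out directly with sh before it is submitted to the scheduler.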
SLURM commands
• srun - submit a script file to the SLURM scheduling queue
• squeue - show status of the SLURM scheduling queue
• scancel - remove a running job

SLURM - srun
  % srun -b -o test.log test.d
  srun: jobid 3197 submitted
  % squeue
  JOBID PARTITION NAME     USER     ST TIME       NODES NODELIST(REASON)
  2773  all       pgy347_t jshen3   R  3-02:13:43 1     operon20
  3194  all       test.a   ropelews R  2:21       1     operon11
  3195  all       test.b   ropelews R  2:21       1     operon13
  3196  all       test.c   ropelews R  2:21       1     operon14
  3197  all       test.d   ropelews R  2:10       1     operon16

SLURM - scancel
  % squeue
  JOBID PARTITION NAME     USER     ST TIME       NODES NODELIST(REASON)
  2773  all       pgy347_t jshen3   R  3-02:13:43 1     operon20
  3194  all       test.a   ropelews R  2:21       1     operon11
  3195  all       test.b   ropelews R  2:21       1     operon13
  3196  all       test.c   ropelews R  2:21       1     operon14
  3197  all       test.d   ropelews R  2:10       1     operon16
  % scancel 3195
  % squeue
  JOBID PARTITION NAME     USER     ST TIME       NODES NODELIST(REASON)
  2773  all       pgy347_t jshen3   R  3-02:14:35 1     operon20
  3194  all       test.a   ropelews R  3:13       1     operon11
  3196  all       test.c   ropelews R  3:13       1     operon14
  3197  all       test.d   ropelews R  3:02       1     operon16

UNIX
• To use UNIX for sequence analysis, one needs to become familiar with three basic areas:
  • General information on UNIX
  • UNIX commands and syntax
  • A text editor (such as vi, emacs, pico)
• This talk presents the minimum that one needs to know in those areas

General Information
• Commands are entered through a command interpreter called a "shell":
  • sh, csh, ksh, tcsh
• Shells can have different built-in commands and different command syntax
• Core UNIX commands work the same regardless of shell
• Commands are case sensitive
• General command syntax is: command -options parameters
• Commands can be placed in special startup files, such as .login, .cshrc and .profile, which are executed automatically at login or when a shell starts

UNIX File and Directory Structure
• Hierarchical (absolute)
• No special filename format
• Filenames are case sensitive
• A single dot . refers to the current directory
• A double dot .. refers to the parent directory
• $HOME refers to the login directory

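The dot, double-dot and $HOME names can be seen in a short sh session. This is a sketch only: the directory names project and data are made up for the illustration, and everything happens in a scratch directory created with mktemp so the real login directory is untouched.

```shell
# Work in a scratch directory so nothing in the real home directory changes.
start=$(pwd)
scratch=$(mktemp -d)
mkdir -p "$scratch/project/data"
cd "$scratch/project/data"

here=$(cd . && pwd)        # "." is the current directory
parent=$(cd .. && pwd)     # ".." is the parent directory
echo "current: $here"
echo "parent:  $parent"
echo "login directory: $HOME"

# Return to where we started and clean up.
cd "$start"
rm -r "$scratch"
```

The first path printed ends in project/data and the second ends in project, one level up the hierarchy.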
Special Characters
• Wildcard characters: * ? [letters]
• Home/user directory: ~ ~user
• IO redirection: < stdin; > stdout; >& stdout + stderr
• Append output to a file: >>
• Place job in background: &
• Redirect output from a command as input into another command (pipe): |
• Suspend a job: [control] z
• Interrupt a running command: [control] c

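A short session exercising several of these characters follows. It is a sketch written for sh rather than csh, so the combined stdout and stderr redirection that csh spells >& is written here as > file 2>&1; the file names are invented for the illustration.

```shell
start=$(pwd)
scratch=$(mktemp -d)
cd "$scratch"

printf 'ACGT\n' > a.seq          # ">" sends stdout to a file, overwriting it
printf 'TTGA\n' > b.seq
printf 'notes\n' > readme.txt

cat a.seq >> all.seq             # ">>" appends instead of overwriting
cat b.seq >> all.seq

count=$(ls *.seq | wc -l | tr -d ' ')     # "*" wildcard; "|" pipes ls into wc
single=$(ls ?.seq | wc -l | tr -d ' ')    # "?" matches exactly one character
lines=$(wc -l < all.seq | tr -d ' ')      # "<" reads stdin from a file

# Capture both stdout and stderr in one file (the sh form of csh ">&").
ls a.seq missing.seq > out.log 2>&1 || true
both=$(cat out.log)

sleep 1 &                        # "&" runs the command in the background
wait                             # block until background jobs finish

echo "seq files: $count, one-letter names: $single, lines in all.seq: $lines"
cd "$start"
rm -r "$scratch"
```

The log file ends up holding both the normal listing of a.seq and the error message about the file that does not exist.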
Basic UNIX Commands
• kpasswd (passwd) - Change your password
• ls - List files in a directory
• more - Display contents of a file
• cp - Duplicate files
• rm, rmdir - Remove a file or directory
• mkdir - Create a directory
• cd - Change directory
• pwd - Show current directory
• man - Find UNIX command usage information

Basic UNIX commands - kpasswd
• kpasswd (passwd) - Change Kerberos password
  % kpasswd
  ropelews@PSC.EDU's Password:
  New password:
  Verifying password - New password:
  %

Basic UNIX commands - ls
• ls - List files in a directory
  • -l Long format
  • -a Show hidden files
  • -F Tag files with "/", "*", or "@"
  % ls
  a.doc a.cpr a.out FILE

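The effect of these options can be checked in a scratch directory. The sketch below uses made-up file names: -F tags the directory with "/", the executable with "*" and the symbolic link with "@", while -a reveals the file whose name starts with a dot.

```shell
start=$(pwd)
scratch=$(mktemp -d)
cd "$scratch"

mkdir sub                                  # a directory
printf '#!/bin/sh\necho hi\n' > run.sh
chmod +x run.sh                            # an executable file
ln -s run.sh link                          # a symbolic link
touch .hidden plain.txt                    # a hidden file and an ordinary file

plain=$(ls)          # default listing: .hidden is not shown
all=$(ls -a)         # -a also shows .hidden, "." and ".."
tagged=$(ls -F)      # -F appends "/", "*" or "@" to mark the file type
ls -l                # -l prints permissions, owner, size and date

echo "$tagged"
cd "$start"
rm -r "$scratch"
```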
Basic UNIX commands - more
• more - View contents of a file by page
  % more file.f
        program intro
        integer I, J, K
        real rr,vv,cc
        parameter (I = 5)
        :
        :

Basic UNIX commands - cp
• cp - Duplicate files
  % ls
  a.dat x.dat
  % cp x.dat xcopy.dat
  % ls
  a.dat x.dat xcopy.dat

Basic UNIX commands - rm
• rm, rmdir - Remove a file or a directory
  • -i Inquire before removing
  • -r Recursive remove
  % ls
  x.dat xcopy.dat z.file
  % rm *.dat
  % ls
  z.file

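The difference between rmdir and rm -r can be shown with a small sketch using invented directory names: rmdir only removes an empty directory, while rm -r removes a directory together with everything beneath it. The -i option is left out because it waits for a typed answer.

```shell
start=$(pwd)
scratch=$(mktemp -d)
cd "$scratch"

mkdir -p results/run1
touch results/run1/x.dat results/notes.txt
mkdir empty

rmdir empty                      # succeeds: the directory is empty

if rmdir results 2>/dev/null; then
    rmdir_result="removed"
else
    rmdir_result="refused"       # rmdir will not delete a non-empty directory
fi

rm -r results                    # removes the directory and all of its contents
remaining=$(ls | wc -l | tr -d ' ')

echo "rmdir on non-empty directory: $rmdir_result; entries left: $remaining"
cd "$start"
rm -r "$scratch"
```

Since rm -r does not ask for confirmation, it is worth double-checking the path before running it.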
Basic UNIX commands - directory
• Directory navigation commands
  • mkdir - Create a directory
  • cd - Change directory
  • pwd - Show current directory
  % mkdir sub1
  % mkdir $HOME/sub2
  % cd sub1
  % pwd
  /usr/ue/2/ropelews/sub1
  % cd $HOME/sub2
  % pwd
  /usr/ue/2/ropelews/sub2

Basic UNIX commands - man
• man - Find UNIX command information
  • man -k <keyword> - Find topics available
  • man <command> - Show command information
  % man -k directory
  mkdir (1) - make directories
  rm (1) - remove files or directories
  rmdir (1) - remove empty directories
  % man rmdir
  RMDIR(1)              User Commands              RMDIR(1)
  NAME
         rmdir - remove empty directories
  SYNOPSIS
         rmdir [OPTION]... DIRECTORY...
  DESCRIPTION
         Remove the DIRECTORY(ies), if they are empty.
  :

UNIX Text Editors
• emacs - GNU UNIX editor
• vi - Traditional UNIX editor
• pico - A simple editor
• To use full-screen capabilities, the terminal type usually needs to be set to "vt100":
  • setenv TERM vt100; tset vt100

Which Editor Should You Use?
• Use the editor that you are most familiar with!
• emacs
  • Powerful, works on UNIX and some non-UNIX systems
  • Moderately easy to master
• vi
  • Powerful, will be on every UNIX system
  • Not intuitive, fairly difficult to master
• pico
  • Simple, intuitive, easy to learn

emacs
• To edit a file named <filename> enter:
  • emacs <filename>
• To navigate:
  • <arrow keys> - Move cursor 1 space
  • <delete> - Delete character
• To quit with or without saving:
  • <ctrl> X <ctrl> C
  • Then answer Y or N
• For more information see:
  • http://www.gnu.org/software/emacs/

vi
• To edit a file named <filename> enter:
  • vi <filename>
• vi has two modes: "navigation" mode (default) and "insertion" mode
• To insert text, one must be in "insertion" mode. Several keys (i, a, o) will place you into insertion mode.
• To leave insertion mode, hit the [esc] key.

vi (continued)
• Commonly used vi keys:
  [arrows] - Move cursor            dd    - delete line
  h - Move cursor left              dl    - delete letter
  l - Move cursor right             dw    - delete word
  k - Move cursor up                [esc] - stop insertion
  j - Move cursor down              :wq   - write then quit
  i - insert at cursor              :q!   - quit without saving
  a - insert after cursor
  o - insert below line

pico
• Based on the editor in the Pine email program
• To edit a file named <filename> enter:
  • pico <filename>
• To navigate:
  • <arrow keys> - Move cursor 1 space
  • <delete> - Delete character
• To quit with or without saving:
  • <ctrl> X
  • Then answer Y or N