Introduction to Perl Thaddeus Aid IT Learning Programme University of Oxford 15/04/2014
About the course • In these three sessions we will cover: • Basic Programming Concepts • Perl Syntax • Data input, manipulation, and output • Provide a foundation to learn more Perl or another programming language • Provide a starting point for your own programs • This class is designed to provide a supported learning environment • Go at your own pace
What the course is not • A complete review of Perl • An in-depth introduction to programming • A “taught” course
What is Perl? • High-level. Strong abstraction from the details of the computer, some natural language elements, easy-to-use, etc. • General-purpose. People have done almost everything imaginable with a computer using Perl! • Interpreted. Perl translates source code into an efficient intermediate representation which is then immediately executed. • Dynamic. Perl does at runtime many things that lower-level languages do during compilation. • Free Software. Perl is available for use without charge.
A brief history of Perl • Before Perl: • C/C++/Fortran/COBOL: Compiled languages that were platform specific • Shell Scripts: Basic automation of Operating System tasks • Utilities (AWK/SED/GREP): Used to process text files • Systems administrators needed a simpler way to combine these tasks • Perl was born in 1987 • Perl 5 (the current version) was released in 1994
What is Perl used for? • Text processing. Designed as a Unix-based system for report processing (systems administration). • Web. Perl was one of the few programming languages at the time suitable for processing the highly textual content of the web. • Science. “Genomics” meant we needed a tool to process data involving DNA sequences. • Many enthusiasts have created extensions for Perl that allow its use in almost any domain, but check that it really is the right language for you!
Executing a Perl program • What do you need? • Source Code – a text file containing the instructions for the computer • The Perl Interpreter – perl (available at http://www.perl.org) • (Optional) Data Files • To execute, type: perl myprogram.pl • “.pl” is the standard extension for a Perl program, in the same way that “.doc” and “.exe” are standard extensions for other file types • No executable program will be generated
Alternatives to Perl • R/MATLAB/Octave/SciPy: Mathematical Computing. In Perl v5, this is currently done via an external module (PDL) but some basic functionality will be directly incorporated into v6. • Python. Very powerful and popular alternative to Perl. • Compiled languages. Speed and maximum performance.
Session Plan • Session 1 • Hello World! – Your first Perl program • Variables (Part 1) – Scalars and Arrays (1d and 2d) • Conditional Statements (if/else) • Loops (foreach/while) • Session 2 • File handling • Regular Expressions (RegEx) • Variables (Part 2) – Hashes • Session 3 • Functions
Hello World! • Comments – “#” Notes to the programmer from the programmer • Often you will need to remind yourself of what is happening, or to explain something to a future programmer • #!/usr/bin/perl – A special comment on the first line that tells the operating system which program should run the file • Pragma – “use” Sets environmental conditions for the program • Functions – “something()” Reusable sections of code • Sometimes the () is omitted for special functions such as “print” • End of line – “;” The semicolon marks the end of a command
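A minimal sketch of the pieces listed on this slide, for illustration only. The variable name $greeting is made up for this example, and the path /usr/bin/perl is an assumption that may differ on your machine.

```perl
#!/usr/bin/perl
# A comment: Perl ignores this line; it is here for human readers.
use strict;      # pragma: variables must be declared before use
use warnings;    # pragma: report likely mistakes while running

# $greeting is an illustrative name, not part of the course material.
my $greeting = "Hello World!\n";

print $greeting;     # print is a function whose brackets may be omitted
print($greeting);    # the same call written with brackets
```

Save it as hello.pl and run it with: perl hello.pl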
Escape Characters • Some characters must be written with special codes inside a string • \n – New Line • \t – Tab • \\ – Normal Backslash • \" – Needed to put a double quote inside a double-quoted string • \' – Needed to put a single quote inside a single-quoted string • More information at: http://perldoc.perl.org/perlrebackslash.html
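The escape codes above could be tried out as follows. This is only a sketch; all variable names and the sample text are invented for illustration.

```perl
use strict;
use warnings;

my $two_lines = "first line\nsecond line\n";     # \n starts a new line
my $columns   = "name\tage\n";                   # \t inserts a tab
my $path      = "C:\\temp\n";                    # \\ gives one backslash
my $quoted    = "Thad said \"Hi Everyone!\"\n";  # \" inside double quotes
my $single    = 'It\'s here';                    # \' inside single quotes

print $two_lines, $columns, $path, $quoted, $single, "\n";
```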
Variables (Part 1) – Scalars • Numbers • Integers • Floating Point Numbers • Strings • Chunks of text • “this is a string” • “this is another string” • “This string has a number 182939” • “Thad said \“Hi Everyone!\””
Scalars (cont.) • To create a Scalar variable use “$” • $variable = "something"; • To print a Scalar use the print function • print $variable; • print "$variable\n";
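A short sketch of creating and printing scalars. The names and values are illustrative, and the "my" keyword (needed once "use strict" is switched on) is an addition not shown on the slide.

```perl
use strict;
use warnings;

my $count = 42;        # an integer
my $price = 9.99;      # a floating point number
my $name  = "Thad";    # a string

print $name;                             # prints the value with no new line
print "\n";
my $summary = "$name has $count items at $price each";
print "$summary\n";                      # scalars are replaced inside "..."
```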
Numeric Operations • Perl can perform all the usual arithmetic operations (+, -, *, /) on numeric scalars • “%” – Modulo gives the remainder after a division • “**” – Raises a number to a power (exponentiation) • sqrt(), sin(), and cos() are built in; more mathematics, such as tan(), is available through extra libraries such as Math::Trig
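The operators above might be exercised like this. It is a sketch with made-up variable names; Math::Trig is a module shipped with Perl and is used here only to show one way of getting tan().

```perl
use strict;
use warnings;
use Math::Trig;    # standard module; provides tan() among others

my $sum       = 7 + 3;      # 10
my $product   = 7 * 3;      # 21
my $quotient  = 7 / 2;      # 3.5 (division is not rounded)
my $remainder = 7 % 3;      # 1
my $power     = 2 ** 10;    # 1024
my $root      = sqrt(16);   # 4
my $tangent   = tan(0);     # 0

print "$sum $product $quotient $remainder $power $root $tangent\n";
```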
String Operations • Perl is a very powerful tool for the manipulation of strings • “.” concatenates strings together • In strings surrounded by double quotes, scalar names are automatically replaced by their values • Strings surrounded by single quotes are taken literally: scalar names are not replaced • There are a number of functions that will modify your string
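A sketch of the string operations described above. The variable names are invented; uc(), length(), and substr() are built-in Perl functions chosen here as examples of "functions that will modify your string".

```perl
use strict;
use warnings;

my $first  = "Perl";
my $second = "course";

my $joined  = $first . " " . $second;     # "Perl course"
my $double  = "Welcome to the $joined";   # $joined is replaced by its value
my $single  = 'Welcome to the $joined';   # taken literally, nothing replaced

my $shout   = uc($joined);                # "PERL COURSE"
my $size    = length($joined);            # 11
my $start   = substr($joined, 0, 4);      # "Perl"

print "$double\n$single\n$shout $size $start\n";
```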
User Input • Scalars can be used to store information that is given to the program while it is running. This information can come from the user or from a file. For now we will focus on user input.
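One possible way to read a line of input is sketched below. The helper read_answer and the simulated input are assumptions made so the example runs unattended; in a real program you would pass \*STDIN to read from the keyboard.

```perl
use strict;
use warnings;

# read_answer is a hypothetical helper: it reads one line from a
# filehandle and removes the trailing new line with chomp.
sub read_answer {
    my ($fh) = @_;
    my $line = <$fh>;
    return "" unless defined $line;    # nothing was entered
    chomp $line;
    return $line;
}

# Simulated typing so the sketch runs without a keyboard.
my $typed = "Thad\n";
open(my $fake_keyboard, "<", \$typed) or die "Cannot open input: $!";

print "What is your name? ";
my $name = read_answer($fake_keyboard);   # use read_answer(\*STDIN) for real input
print "Hello, $name!\n";
```

The chomp matters: without it the stored name would still end in "\n".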
Arrays • Arrays are ordered lists of scalars • Arrays can mix numbers and strings in the same list • A whole array is written with the “@” symbol; a single element is a scalar, so it is written with “$” and an index starting at 0, e.g. $array[0]
Arrays (Cont.) • Starting with an array of 5 scalars • “shift” – takes off the first scalar • “pop” – takes off the last scalar • “push” adds a new scalar at the end of the array
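The array operations above can be sketched as follows, starting from an array of 5 scalars. The array contents and names are illustrative.

```perl
use strict;
use warnings;

my @items = (1, "two", 3.5, "four", 5);   # numbers and strings mixed
my $count = scalar @items;                # 5 elements

my $first = shift @items;    # takes off 1;  leaves ("two", 3.5, "four", 5)
my $last  = pop @items;      # takes off 5;  leaves ("two", 3.5, "four")
push @items, "six";          # adds at end:  ("two", 3.5, "four", "six")

print "First element is now $items[0]\n";   # single element uses $ and [0]
print "@items\n";                           # whole array, space separated
```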
2D Arrays • Members of an array can themselves be arrays (strictly, references to arrays) • This is different to a 2D array in a language like C: each row can have a different length • These are known as “jagged arrays” • You can have higher dimensionality if needed
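A small sketch of a jagged array, assuming the common square-bracket notation for the inner arrays. The name @grid and its contents are made up for illustration.

```perl
use strict;
use warnings;

# Each [ ... ] is an inner array; the rows have different lengths.
my @grid = (
    [1, 2, 3],
    [4, 5],
    [6],
);

my $value     = $grid[1][0];            # row 1, column 0 -> 4
my $row_count = scalar @grid;           # 3 rows
my $row0_size = scalar @{ $grid[0] };   # 3 items in the first row

push @{ $grid[2] }, 7;                  # the last row becomes (6, 7)

foreach my $row (@grid) {
    print "@{$row}\n";                  # print each row on its own line
}
```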
Conditionals • Programs often need to make decisions during execution • This is handled with if/else • if a test is true -> do something • else -> do something different • This also introduces the idea of a Boolean true/false test • if ($name eq "Thad") – this may or may not be true • if (9 > 0) – this is always true • if (9 < 0) – this is never true • if ($a + $b == 100) – perform the calculation then test against condition
Conditionals (Cont.) • Conditionals control the direction that the program follows • It is possible to combine more than one test in a single if statement, using “&&” (and) or “||” (or)
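A sketch of if/elsif/else with combined tests. The score thresholds and variable names are invented purely for illustration.

```perl
use strict;
use warnings;

my $score = 72;
my $verdict;

# Two tests joined with && : both must be true.
if ($score >= 70 && $score <= 100) {
    $verdict = "distinction";
}
elsif ($score >= 50) {
    $verdict = "pass";
}
else {
    $verdict = "fail";
}

my $name = "Thad";
my $known = "no";
# Two tests joined with || : either may be true. Strings use eq, numbers use ==.
if ($name eq "Thad" || $name eq "Thaddeus") {
    $known = "yes";
}

print "$verdict, known: $known\n";
```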
Loops • Loops are used to repeat a section of code an arbitrary number of times. For example: I want to check each employee record in a company to see if the employee’s salary is > £50,000 • The classic example is the “for” loop, which starts a counter at a number, changes it in some way at the end of each pass, and continues while a conditional statement is true. • The “foreach” loop improves on the “for” loop: it steps through an array and offers each scalar in turn as a variable for testing. • The final loop that we will encounter in this course is the “while” loop. The while loop uses a conditional statement to determine if the loop should execute.
For Loops • The “for” loop requires three parts to execute • The start condition $i = 0 • The conditional statement $i < 20 • The step statement $i += 1
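Using the three parts named on this slide, a "for" loop might look like the sketch below. The running total is an invented task to give the loop something to do.

```perl
use strict;
use warnings;

my $total = 0;

# start: $i = 0; test: $i < 20; step: $i += 1
for (my $i = 0; $i < 20; $i += 1) {
    $total += $i;          # add 0, 1, 2, ... 19
}

print "Sum of 0 to 19 is $total\n";   # 190
```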
Foreach Loops • Given a range or an array of items, the “foreach” loop will step through each item and offer a special scalar $_ containing the current item.
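A sketch of "foreach" using the special scalar $_ with both an array and a range. The names and values are illustrative only.

```perl
use strict;
use warnings;

my @names = ("Ann", "Bob", "Cat");
my $greeted = 0;

foreach (@names) {
    print "Hello $_\n";    # $_ holds the current item
    $greeted += 1;
}

my $sum = 0;
foreach (1..5) {           # a range: 1, 2, 3, 4, 5
    $sum += $_;
}

print "Greeted $greeted people, sum is $sum\n";
```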
Foreach (Cont.) • Instead of using $_ you can name the loop scalar to reduce confusion.
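The salary check mentioned earlier could be sketched with a named loop scalar. The salary figures and variable names are made up for illustration.

```perl
use strict;
use warnings;

my @salaries = (42000, 51000, 67000);
my $high_earners = 0;

# $salary is the named loop scalar, used in place of $_
foreach my $salary (@salaries) {
    if ($salary > 50000) {
        $high_earners += 1;
    }
}

print "$high_earners salaries are above 50000\n";
```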
While Loops • The “while” loop checks a conditional statement before executing its code. Once the end of the block has been reached, it returns to the top of the code block and checks the conditional statement again. The loop stops as soon as the check is false.
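A minimal "while" sketch: keep doubling a number until it reaches 1000. The task and names are invented for illustration.

```perl
use strict;
use warnings;

my $value     = 1;
my $doublings = 0;

# The test is checked before every pass, including the first.
while ($value < 1000) {
    $value = $value * 2;
    $doublings += 1;
}

print "Reached $value after $doublings doublings\n";   # 1024 after 10
```

If the test never becomes false the loop runs forever, so make sure something inside the block changes the value being tested.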
Additional Help • When in doubt use Google or another search engine. There are millions of code examples, questions, and solutions on the web and there is no reason not to use them (with attribution if appropriate) • If you have a question that you can’t solve with Google then try asking the community at Stack Overflow • Or find a Perl community on the web: there are a great many places that a new programmer can go for help
Practical Session • Please make your way to the computers. You will need to set up your keyboards; the instructions are given in your course booklet. • Please feel free to ask questions • Go at your own pace • If you don’t finish in class: Perl is available for free at http://www.perl.org and is simple to install on Windows • Linux and Mac should already have Perl included • Please feel free to email me questions during the week: aid@stats.ox.ac.uk
Introduction to Perl Part 2 Thaddeus Aid IT Learning Programme University of Oxford
Review • Any questions from last week?
Review • We have covered • Hello World! • Variables • $scalars • @arrays • Conditional Statements • if • elsif • else • Loops • for • foreach • while
This Session • Basic File Handling • File Reading (open, <) • File Writing (open, >) • Regular Expressions • Text searching and manipulation • Variables (Part 2) • %hashes
Reading a File • We will only be dealing with simple text file handling. • We will be using the open function, plus a few more that are covered in the book. • The redirection symbols are inherited from the Unix command line • < redirect input from a file • > redirect output to a file • >> redirect output to a file in append mode • open(FILEHANDLE, "Redirection Symbol", "Filename"); • open(INPUT, "<", "somedata.txt");
Reading a File (Cont.) • Reading the entire file into an array for processing.
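Reading a whole file into an array might look like the sketch below. To keep it runnable without a data file, it opens a string in place of a file; the sample text and the names $contents, $in, and @lines are assumptions. It also uses a "my" filehandle instead of the bareword INPUT style, which is a stylistic choice rather than a requirement.

```perl
use strict;
use warnings;

# Stand-in for a text file; with a real file, replace \$contents
# with a filename such as "somedata.txt".
my $contents = "alpha\nbeta\ngamma\n";

open(my $in, "<", \$contents) or die "Cannot open input: $!";
my @lines = <$in>;      # in list context, <$in> reads every line at once
close($in);

chomp(@lines);          # remove the new line from every element
my $line_count = scalar @lines;

print "Read $line_count lines, the second is '$lines[1]'\n";
```

This is convenient for small files; very large files are usually better read one line at a time.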
Reading a File (Cont.) • Reading a file one line at a time.
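Reading one line at a time can be sketched with a "while" loop. As before, a string stands in for the file so the example runs on its own; the sample data and variable names are invented.

```perl
use strict;
use warnings;

# Stand-in for a text file; swap \$contents for a real filename.
my $contents = "alpha\nbeta\ngamma\n";
open(my $in, "<", \$contents) or die "Cannot open input: $!";

my $line_count = 0;
my $characters = 0;

# <$in> returns one line per pass and undef at the end of the file.
while (my $line = <$in>) {
    chomp $line;
    $line_count += 1;
    $characters += length($line);
    print "$line_count: $line\n";
}
close($in);
```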
Reading a File (Cont.) • A more advanced example • Skipping header lines • Replacing carriage returns with new lines • Splitting a line of input
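The three steps listed on this slide could be combined as in the sketch below. The tab-separated sample data, the column meanings, and every variable name are assumptions made for illustration; a string again stands in for the file.

```perl
use strict;
use warnings;

# Stand-in for a Windows-style text file with one header line.
my $contents = "name\tage\r\nAnn\t34\r\nBob\t27\r\n";
open(my $in, "<", \$contents) or die "Cannot open input: $!";

my $header = <$in>;            # read the header line and ignore it

my @names;
my $total_age = 0;

while (my $line = <$in>) {
    $line =~ s/\r\n?/\n/;      # turn a carriage return into a plain new line
    chomp $line;
    my ($name, $age) = split(/\t/, $line);   # split the line on tabs
    push @names, $name;
    $total_age += $age;
}
close($in);

print "Names: @names, total age: $total_age\n";
```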
Bringing it all together • As a recommendation, never write to the file that you have read from.
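A sketch that reads one file and writes the result to a different file, in line with the recommendation above. The file names, sample contents, and the use of the standard File::Temp module to get a scratch directory are all choices made for this illustration.

```perl
use strict;
use warnings;
use File::Temp qw(tempdir);    # standard module: gives a scratch directory

my $dir      = tempdir(CLEANUP => 1);
my $in_name  = "$dir/input.txt";
my $out_name = "$dir/output.txt";     # a different file from the input

# Create some sample input so the sketch is self-contained.
open(my $setup, ">", $in_name) or die "Cannot create $in_name: $!";
print $setup "apple\nbanana\ncherry\n";
close($setup);

open(my $in,  "<", $in_name)  or die "Cannot read $in_name: $!";
open(my $out, ">", $out_name) or die "Cannot write $out_name: $!";

my $number = 0;
while (my $line = <$in>) {
    chomp $line;
    $number += 1;
    print $out "$number: " . uc($line) . "\n";   # numbered, upper case
}
close($in);
close($out);

# Read the output back to see what was written.
open(my $check, "<", $out_name) or die "Cannot read $out_name: $!";
my @written = <$check>;
close($check);
print @written;
```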
Regular Expressions • Regular Expressions are a very powerful way to manipulate text files. • Match – Find a substring in your string • Translate – Replace one set of letters with another • Substitute – A more powerful replacement command
Matching (Cont.) • A simple example of matching
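A simple match might be sketched as follows, using the =~ operator to test whether a pattern occurs in a string. The sentence and variable names are illustrative.

```perl
use strict;
use warnings;

my $sentence = "The quick brown fox";
my $has_brown  = "no";
my $has_purple = "no";

if ($sentence =~ /brown/) {      # true: "brown" appears in the string
    $has_brown = "yes";
}
if ($sentence =~ /purple/) {     # false: "purple" does not appear
    $has_purple = "yes";
}

print "brown: $has_brown, purple: $has_purple\n";
```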
Matching (Cont.) • A more complex matching example
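A more involved match could use character classes, repetition counts, and brackets to capture parts of the text. This is only one possible example; the record text and names are invented.

```perl
use strict;
use warnings;

my $record = "Sample 42 collected on 2014-04-15";
my ($year, $month, $day);

# \d matches a digit, {4} means exactly four of them,
# and each ( ) captures the matched text into $1, $2, $3.
if ($record =~ /(\d{4})-(\d{2})-(\d{2})/) {
    ($year, $month, $day) = ($1, $2, $3);
}

my $is_sample = "no";
if ($record =~ /^sample/i) {     # ^ anchors to the start, i ignores case
    $is_sample = "yes";
}

print "Year $year, month $month, day $day, sample: $is_sample\n";
```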
Substitutions • A simple substitution example
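A sketch of substitution with s///, plus the translate operator tr/// for comparison. The sample text, the DNA string, and the variable names are illustrative.

```perl
use strict;
use warnings;

my $text = "I like cats. cats are great.";

my $once = $text;
$once =~ s/cats/dogs/;                    # replaces the first match only

my $all = $text;
my $changes = ($all =~ s/cats/dogs/g);    # g replaces every match; returns the count

my $dna = "AACG";
$dna =~ tr/ACGT/TGCA/;                    # translate letter by letter: A->T, C->G, ...

print "$once\n$all ($changes changes)\n$dna\n";
```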
Variables: Part 2 - Hashes • Hashes (known as associative arrays or maps in other languages) are the third basic data structure in Perl. • We store data in a hash using a key and value pair. • The key acts as an index into the hash where a value is stored. • In Perl, hashes are specified using the % symbol.
Hashes (Cont.) • Hashes are like an array but use a “key” instead of an “index”.
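The comparison between arrays and hashes can be sketched like this. The names and ages are made-up sample data.

```perl
use strict;
use warnings;

my @list = ("Ann", "Bob");               # array: looked up by index
my %age  = ("Ann" => 34, "Bob" => 27);   # hash: looked up by key

$age{"Cat"} = 41;                        # add a new key and value

my $first_name = $list[0];               # square brackets for an array
my $anns_age   = $age{"Ann"};            # curly brackets for a hash

print "$first_name is $anns_age\n";
```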
Hashes (Cont.) • An example of a hash table. • Getting all the keys can be done with the “keys” function. • You can loop through the keys array like any other array. • Hash keys are not stored in sorted order, so “keys” returns them in no particular order.
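Looping over a hash with "keys" might look like the sketch below. The hash contents are invented, and the order of the printed lines can differ from run to run.

```perl
use strict;
use warnings;

my %capital = (
    "France" => "Paris",
    "Italy"  => "Rome",
    "Spain"  => "Madrid",
);

my @countries = keys %capital;     # all the keys, in no particular order
my $how_many  = scalar @countries;

foreach my $country (@countries) {
    print "$country: $capital{$country}\n";
}
```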
Hashes (Cont.) • “exists” checks to see if a key is present in the hash. • “sort” will sort an array, for example the array returned by “keys”.
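A sketch of "exists" and "sort" used with a hash. The data and names are illustrative; the numeric sort at the end is an extra detail included because plain "sort" compares items as text.

```perl
use strict;
use warnings;

my %capital = ("France" => "Paris", "Italy" => "Rome", "Spain" => "Madrid");

my $has_italy  = "no";
my $has_norway = "no";
if (exists $capital{"Italy"})  { $has_italy  = "yes"; }
if (exists $capital{"Norway"}) { $has_norway = "yes"; }

my @sorted = sort keys %capital;              # keys in alphabetical order
foreach my $country (@sorted) {
    print "$country: $capital{$country}\n";
}

my @as_text    = sort (10, 9, 100);               # text order: 10, 100, 9
my @as_numbers = sort { $a <=> $b } (10, 9, 100); # numeric order: 9, 10, 100
```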
A Real World Example • This is a program that I wrote to translate 22 files containing phylogenetic information about humans from one version of the genome to another version of the genome (hg18 -> hg19)
Exercises • Please do the exercises in Chapters 3-5. • Go at your own pace. • Please ask questions if you get stuck.
Introduction to Perl Part 3 Thaddeus Aid IT Learning Programme University of Oxford