Perl Programming for Biologists, Second Edition
Part 1: 9/11/2007
Yannick Pouliot, PhD
Bioresearch Informationist
Lane Medical Library & Knowledge Management Center
Class Requirements
• You must
  • have wireless access
  • have the admin password to your machine
To Do
• Please download all class materials from http://lane.stanford.edu/howto/index.html?id=_2796 into C:\course
Class Focus
• Creating, writing and reading Excel files
• Reformatting data files for input to an analysis program
• Writing to and reading from a database such as MS Access or another locally installed relational database, as well as from databases available on the Internet
And remember: Ask LOTS OF QUESTIONS
Cautions
• All examples pertain to MS Office 2003
  • Examples still work in MS Office 2007
  • However, the Perl modules used here do not work with documents saved in MS Office 2007 formats
• All examples pertain to Perl 5.x, not 6.x
  • Versions 5 and 6 are NOT compatible
  • Version 5 is far more common, so this is not much of an issue
So Why Perl?
• Perl = Practical Extraction and Report Language
• Free
• Very widely used
  • Especially in the biological community
• Very flexible and portable
• Not the only language of this type
  • E.g., Python
• Not the absolute easiest
  • … but pretty easy
• Not suited for everything
  • E.g., for ultra-fast mathematically-oriented code, C is still best
Today’s session:
- Installing and understanding what is required to run Perl
- Understanding the basics of a Perl program
Components to Install & Configure
• Perl itself
  • More accurately, the Perl interpreter
  • We’ll use ActiveState Perl 5.8.x (ActivePerl)
  • www.activestate.com/store/freedownload.aspx?prdGuid=81fbce82-6bd5-49bc-a915-08d58c2648ca
• Additional Perl modules
  • Module = extra functions not part of the interpreter
  • Described at Comprehensive Perl Archive Network (CPAN)
• Open Perl IDE
  • IDE = integrated development environment:
    • Editor to write/edit your program
    • Debugger to find bugs
    • A compiler/interpreter to run your program from within the IDE
  • sourceforge.net/project/showfiles.php?group_id=23334&release_id=91440
• Configuring the ODBC manager (next week)
  • Part of Windows
  • Allows different programs to interact with databases on your machine or anywhere on the Web via a single “doorway”
What is an Interpreter?
• A program that translates one instruction into the computer’s language and executes it before proceeding to the next instruction
  • In other words, code is compiled and executed one instruction at a time
• Perl is usually used in interpreted mode
  • Can also be compiled once (= faster)
Installing Perl from ActiveState
• Go to www.activestate.com/store/freedownload.aspx?prdGuid=81fbce82-6bd5-49bc-a915-08d58c2648ca
• We’ll be downloading Perl 5.8.x.x:
  • Select Windows MSI package for Windows X86
  • Run the installer
  • Install under c:\Perl
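Once the installer has finished, a quick way to confirm that the interpreter works is to run a very small script. The sketch below is only an illustration (it is not one of the class files, and the helper name `perl_version_string` is made up); it relies solely on Perl's built-in variables `$^V`, `$^X` and `$^O`.

```perl
use strict;
use warnings;

# Return the version of the running Perl interpreter as a dotted string.
# $^V is a built-in variable; "%vd" is the format documented for printing it.
sub perl_version_string {
    return sprintf("%vd", $^V);
}

print "Perl version: ", perl_version_string(), "\n";
print "Interpreter:  $^X\n";   # path to the perl executable that is running
print "Running on:   $^O\n";   # operating system name as Perl sees it
```

If the installation under c:\Perl succeeded, the interpreter path printed should point into that folder.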
Installing Additional Perl Modules
• The fountain of all things Perl: CPAN = Comprehensive Perl Archive Network
  • http://www.cpan.org/
• What does a module look like?
• Why modules?
• PPM for downloading & installing modules
• What modules are in MY Perl?
The PPM Module: Installing Perl Modules the Easy Way
• Perl modules can be downloaded and installed manually from CPAN (hard)
• They can also be installed via the Perl Package Manager: PPM (easy)
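One way to answer "what modules are in MY Perl?" for a specific module is to try loading it and see whether that succeeds. The following is a minimal sketch, not part of the class materials: the helper name `module_is_installed` is hypothetical, and the module names in the list are only examples (File::Spec ships with Perl; the other two are real CPAN modules that may or may not be the ones used in this class).

```perl
use strict;
use warnings;

# Return 1 if the named module can be loaded by this Perl installation, else 0.
sub module_is_installed {
    my ($module) = @_;
    # Convert Some::Module into the path Some/Module.pm that require expects.
    (my $file = "$module.pm") =~ s{::}{/}g;
    # eval traps the error that require raises when the module is missing.
    return eval { require $file; 1 } ? 1 : 0;
}

# Example module names only; replace with the modules you actually need.
for my $module ('File::Spec', 'Spreadsheet::ParseExcel', 'Win32::OLE') {
    printf "%-26s %s\n", $module,
        module_is_installed($module) ? 'installed' : 'not found';
}
```

Any module reported as not found is a candidate for installation through PPM.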
Installing an environment to run and edit Perl: Integrated Development Environment (IDE)
Why an IDE?
• IDE = integrated development environment:
  • Editor to write/edit your program
  • Debugger to find bugs
  • A “runner” (compiler/interpreter) to run your program from within the IDE
• IDEs provide features that make writing and debugging easier
  • E.g., automatic code highlighting
• We’ll use Open Perl IDE
  • Free, open source, portable
  • sourceforge.net/project/showfiles.php?group_id=23334&release_id=91440
• IDE: Definition, description
• For our Mac friends: Affrus
Installing Open Perl IDE
1. Go to sourceforge.net/project/showfiles.php?group_id=23334&release_id=91440 and download the code
2. Create folder Program Files/OpenPerlIDE
3. Unzip into Program Files/OpenPerlIDE
4. Update Path (under System Properties, Advanced, Environment Variables, System Variables)
→ this makes it possible to run Open Perl IDE from anywhere on your machine
BREAK
Example Short Program
• Start Open Perl IDE
• Load Simple1.pl
• Run Simple1.pl
Learning by Example
• Simple2.pl
Exploring Perl’s Major Language Elements
• Norman Matloff’s introduction to Perl: http://heather.cs.ucdavis.edu/~matloff/Perl/PerlIntro.pdf
• Perl language reference
  • http://en.wikipedia.org/wiki/Perl#Data_types
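As a quick orientation before following those links, here is a small sketch of Perl's three basic data types (scalar, array, hash) in a biological setting. It is an illustration only, not one of the class files; the variable names and the `reverse_complement` helper are made up for this example, and the helper assumes its input contains only the letters A, C, G and T.

```perl
use strict;
use warnings;

# Scalar ($): a single value, here a string.
my $gene = 'GENE1';

# Array (@): an ordered list, indexed from 0.
my @bases = ('A', 'C', 'G', 'T');

# Hash (%): key/value pairs, here mapping each base to its complement.
my %complement = (A => 'T', C => 'G', G => 'C', T => 'A');

# Return the reverse complement of a DNA string (assumes only A, C, G, T).
sub reverse_complement {
    my ($dna) = @_;
    # Split into letters, reverse their order, look each one up, and rejoin.
    return join '', map { $complement{$_} } reverse split //, uc $dna;
}

print "Gene: $gene\n";
print "Bases in the alphabet: ", scalar(@bases), "\n";
print "Reverse complement of ATGC: ", reverse_complement('ATGC'), "\n";
```

The sigil in front of each name (`$`, `@`, `%`) tells you which of the three types you are looking at.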
Additional Key Books/Resources
• Learning by example: Perl Cookbook
• Perl Programming for Biologists
• Perl Quick Reference Guide
• My favorite: Perl Quick Reference
Going Further: Programming Tips
• Plan your program
  • Write down how you intend to process the data in more-or-less plain language
  • Goal: making sure that it really does make sense
  • Hacking doesn’t really pay…
• Have documentation handy
  • ActivePerl documentation (searchable)
  • Perl language reference → eBooks: help served on a silver platter
  • Lane FAQs
• When you’re stuck: Search the Web
  • Google can answer almost any programming question
  • … though quality documentation is still best
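To make the "plan your program" tip concrete, the sketch below writes the plan as plain-language comments first and then fills in the code, using the kind of file reformatting this class focuses on (tab-delimited text to comma-separated text). Everything in it is an assumption for illustration: the helper name `tab_line_to_csv`, the column names and the sample rows are made up and do not come from the class files.

```perl
use strict;
use warnings;

# Plan, in plain language:
#   1. Take each line of tab-delimited text.
#   2. Skip blank lines.
#   3. Split the line into fields on tabs.
#   4. Quote any field that contains a comma or a double quote.
#   5. Join the fields with commas and print the result.

# Convert one tab-delimited line into one comma-separated line (steps 3 to 5).
sub tab_line_to_csv {
    my ($line) = @_;
    chomp $line;
    my @fields = split /\t/, $line, -1;   # -1 keeps trailing empty fields
    for my $field (@fields) {
        if ($field =~ /[",]/) {
            $field =~ s/"/""/g;           # double any embedded quotes
            $field = qq{"$field"};        # then wrap the field in quotes
        }
    }
    return join ',', @fields;
}

# Made-up sample data standing in for a real input file.
my @lines = (
    "probe_id\tgene\tdescription\n",
    "\n",
    "probe_1\tGENE1\tkinase, receptor type\n",
);

for my $line (@lines) {
    next if $line =~ /^\s*$/;             # step 2: skip blank lines
    print tab_line_to_csv($line), "\n";
}
```

Writing the numbered plan before the code makes it easier to check that each step really does make sense, and the comments stay behind as documentation.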
Excel3.pl: Introducing Object Programming
• Purpose: From an Excel worksheet that lists public identifiers for DNA sequences associated with genes, the program retrieves:
  • UniGene cluster ID
  • Gene symbol
  • NCBI Gene ID
  • … and writes the result into another Excel worksheet
• Mix of procedural and object programming
• Relevant links:
  • http://www.ncbi.nlm.nih.gov/sites/entrez?db=unigene&orig_db=unigene
  • Entrez Utilities
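The code of Excel3.pl is not reproduced here, so the following is only a sketch of what "a mix of procedural and object programming" can look like in Perl. The `GeneRecord` class, its methods and the record values are all hypothetical placeholders; no Excel module or NCBI lookup is involved.

```perl
use strict;
use warnings;

# Object part: a tiny class holding the three values retrieved per sequence.
package GeneRecord;

# Constructor: build a hash of fields and bless it into the class.
sub new {
    my ($class, %args) = @_;
    my $self = {
        unigene_id => $args{unigene_id},
        symbol     => $args{symbol},
        gene_id    => $args{gene_id},
    };
    return bless $self, $class;
}

# Accessor method: return the gene symbol stored in the object.
sub symbol { my ($self) = @_; return $self->{symbol} }

# Return the record as one tab-separated row, ready to write to a worksheet.
sub as_row {
    my ($self) = @_;
    return join "\t", @{$self}{qw(unigene_id symbol gene_id)};
}

package main;

# Procedural part: an ordinary loop that creates objects and calls methods.
# The values below are placeholders, not real database records.
my @records = (
    GeneRecord->new(unigene_id => 'Hs.000001', symbol => 'GENE1', gene_id => 1001),
    GeneRecord->new(unigene_id => 'Hs.000002', symbol => 'GENE2', gene_id => 1002),
);

for my $record (@records) {
    print $record->as_row, "\n";
}
```

The arrow (`->`) is what marks a method call on a class or object; objects supplied by Perl modules are used with the same syntax.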
Assignments
• Look at code for Example3.pl
• Modify it, break it
• Write down at least one question so we can talk about it next week
eBooks Rule