Welcome to lecture 5: Object-Oriented Programming in Perl. IGERT-Sponsored Bioinformatics Workshop Series. Michael Janis and Max Kopelevich, Ph.D., Dept. of Chemistry & Biochemistry, UCLA.
We’ve been cruising! (Whew!) • Two weeks to go after this lecture. We’ll have a bit of payoff: now we’ll start using some of our knowledge! (Don’t give up!)
Last time… • We covered a bit of material… • Try to keep up with the reading; it’s all in there! • We’ve covered subroutines and modules… • Now we’ll cover OOP in Perl • We’ll create classes of our own to use • We’ll take our previous example of a biological problem (gene finding) and see how OOP can help us… • We are preparing for the introduction to BioPerl next week, which is object-oriented!
We have been dealing with increasingly complex data structures. A result is that we always need to be concerned about the data state. Let’s turn this around and see if there is a better way to think about biological data • We’d like to keep the data redundancy to a minimum (only one strand needed?) • Reduces errors • Reduces space • Easier to maintain / update the information (just need a function) • Increasing our data means increasing the complexity of our data structures • We might like to define ways of interacting with the data; an API-like approach (like we saw with our fasta file handler example…) • We could then concentrate on *what to do* with the data rather than *how to get at* the data…
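The "only one strand needed" point can be sketched in a few lines: store the forward strand and derive the other one with a function whenever it is needed. This is only an illustration; the subroutine name revcom and the example sequence are assumptions, not code from the course materials.

```perl
use strict;
use warnings;

# Hypothetical helper: derive the reverse complement on demand
# instead of storing both strands.
sub revcom {
    my ($dna) = @_;
    my $rc = reverse $dna;           # scalar context: reverses the string
    $rc =~ tr/ACGTacgt/TGCAtgca/;    # complement each base
    return $rc;
}

my $forward = 'ATGGCC';              # made-up example sequence
my $reverse = revcom($forward);
print "$forward\n$reverse\n";
```

If the forward strand is ever corrected, the reverse strand is automatically right the next time revcom() is called; there is no second copy to keep in sync.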
Reversing the way we think about data (and the way we program)
The way we have been working so far:
• Access the data
• Do something
• Put the data into a structure
• Retrieve the data
• Update…
• Need to do something else?
[Diagram: separate functions moving data out of one data structure and into another]
Maybe we can tie it all together
DATA AND FUNCTIONS
• Can be a data type of our own creation (beyond scalars, arrays, hashes)
• Retrieve the data
• Access the data via functions associated with the data
• Interface remains constant
• We don’t have to worry about the code (modPerl again!)
AND…
• Can have different data types stored together (hashes, scalars, arrays)
• We treat all of the data types and functions together as a NEW DATA TYPE
• We can use the new data type in ANY type of data structure we wish to build
Defining our own data types
DATA AND FUNCTIONS
• Data: sequence data, ontology data, promoter regions data, expression analysis data
• Functions: interactions, homologies, enrichment, pattern recognition
• An entire genome with annotation, microarray correlations, and built-in functions for positional analysis, pathway analysis, …
• Our new data type can be stored in any other data structure (like an array of genomes [0] [1] [2] [3], each with the same functionality possible)
This is what OOP promises! We start thinking about the functionality of our data. It’s another layer of abstraction, but it makes our lives easier as programmers… WHAT IS AN OBJECT? • A data structure bundled with functions to set, access, and process that data structure • A new data type! • A rigorous way of organizing code • All (most) interactions are defined and must obey certain rules • A module (as a class) • Instead of importing a series of subroutines that are called directly, these modules define a series of object types that you can create and use • A level of abstraction (data that logically belongs together) that lets us focus on using the object
Perl Object Syntax Perl objects are special references that come bundled with a set of functions that know how to act on the contents of the reference. • For example, there may be a Sequence class definition. • Internally, the Sequence object is an instance of the Sequence class definition • It’s a hash reference that has keys that point to the DNA string, the name and source of the sequence, and other attributes. • The object is bundled with functions that know how to manipulate the sequence, such as revcom(), translate(), subseq(), etc.
Perl Object Syntax
When talking about objects, the bundled functions are known as methods. This terminology derives from the granddaddy of all object-oriented languages, Smalltalk.
• You invoke a method using the -> operator, a syntax that looks a lot like getting at the value that a reference points to.
• For example, if we have a Sequence object stored in the scalar variable $sequence, we can call its methods like this:
    $reverse_complement = $sequence->revcom();
    $first_10_bases = $sequence->subseq(1,10);
    $protein = $sequence->translate;
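To make those calls concrete, here is a minimal sketch of a toy Sequence class that supports revcom() and subseq(). It is not BioPerl and not the course code: the class body, the _seq key, and the example sequence are all assumptions for illustration, and subseq() is assumed to use 1-based, inclusive coordinates so that subseq(1,10) returns the first 10 bases.

```perl
use strict;
use warnings;

package Sequence;    # hypothetical toy class, not the BioPerl one

sub new {
    my ($class, %arg) = @_;
    return bless { _seq => $arg{seq} }, $class;
}

# Reverse complement of the stored DNA string
sub revcom {
    my ($self) = @_;
    my $rc = reverse $self->{_seq};
    $rc =~ tr/ACGTacgt/TGCAtgca/;
    return $rc;
}

# Assumed convention: 1-based, inclusive start and end
sub subseq {
    my ($self, $start, $end) = @_;
    return substr($self->{_seq}, $start - 1, $end - $start + 1);
}

package main;

my $sequence = Sequence->new(seq => 'ATGGCCAAGTTTGA');   # made-up sequence
my $reverse_complement = $sequence->revcom();
my $first_10_bases     = $sequence->subseq(1, 10);
print "$reverse_complement\n$first_10_bases\n";
```

Notice that the caller never touches $sequence->{_seq} directly; everything goes through the methods.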
Parts of an object
• An empty class (data definition); a class is a package, so it’s modular!
• Attributes: parts of the data structure, usually stored as keys of a hash
• Methods: functions that belong to the class
• We create a scalar reference using a method called “new”
• The scalar $obj is of type class; it is a new instance of our class
    # interaction with code
    # (main part of program here)
    # $obj = $class->new();
    # $obj->method();
We already know how to build a class; we’ll use our key/value pairs and subroutines
• An empty class (data definition); a class is a package, so it’s modular!
• The class definition is a package (module)
• We create an instance of the class, as a reference to an anonymous hash we define
• We pass arguments as @_
• Our methods are just subroutines (we’ve seen them before)
    package exClass;
    use strict;
    use warnings;

    # Constructor: the class name arrives first, then the named arguments
    sub new {
        my ($class, %arg) = @_;
        return bless {
            _name     => $arg{accession},
            _organism => $arg{organism},
        }, $class;
    }

    # Accessors: $_[0] is the object itself
    sub get_name     { $_[0]->{_name} }
    sub get_organism { $_[0]->{_organism} }

    1;
Building a class: attributes
The parts of the internal data structure
• For example: name of the organism, DNA sequence, exon/intron boundaries, …
• Passed as a hash list of arguments
• Set when the object is instantiated by a method (subroutine) called a constructor (we usually call it “new”)
• By convention, internal to the class and preceded with “_” to denote this
• Should only access class (object) data through methods!!!
    sub new {
        my ($class, %arg) = @_;
        return bless {
            _name     => $arg{accession},   # attribute keys start with "_"
            _organism => $arg{organism},
        }, $class;
    }
Building a class: constructor
The method that sets the attributes
• For example: name of the organism, DNA sequence, exon/intron boundaries, …
• Initializes an object
• Marks the object as a member of the class (an ‘instance’ of the class definition)
• We pass arguments as @_; the class name is automatically passed as the first argument; our hash of arguments follows
    sub new {
        my ($class, %arg) = @_;             # class name first, then key/value pairs
        return bless {
            _name     => $arg{accession},
            _organism => $arg{organism},
        }, $class;
    }
Building a class: bless
Creates an object of the class definition from a given data structure (usually a hash)
• Takes two arguments:
• An anonymous hash (a reference to an unnamed hash)
• The name of the class for which the object will be marked
• We return this to a scalar variable which is a reference to the object.
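A small sketch of what bless actually does, using ref() to look at the reference before and after. The one-method exClass here is a cut-down stand-in for the class on the earlier slides, and the accession value is just the example string used in this lecture.

```perl
use strict;
use warnings;

package exClass;
sub get_name { $_[0]->{_name} }

package main;

my $data = { _name => 'AC00243' };    # a plain anonymous hash
print ref($data), "\n";               # prints HASH

my $obj = bless $data, 'exClass';     # mark the hash as an exClass object
print ref($obj), "\n";                # prints exClass
print $obj->get_name, "\n";           # method calls now find exClass::get_name
```

bless does not copy anything: $obj and $data are the same reference, which now "knows" which package to search for its methods.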
Building a class: accessors
Methods (subroutines) which return values of the class attributes (attribute/value pairs; the key/value pairs in our hash)
• We pass arguments as @_; the first argument is therefore the object
    sub get_name { $_[0]->{_name} }
• The call to the object accessor method:
    my $name = $obj->get_name;   # returns the value stored under _name
Building a class: mutators
Methods (subroutines) which change or update values of the class attributes (attribute/value pairs; the key/value pairs in our hash)
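The slides show accessors but no mutator, so here is a minimal sketch of one added to the example class. The method name set_organism and its choice to return the new value are assumptions made for illustration; they are not part of the course code.

```perl
use strict;
use warnings;

package exClass;

sub new {
    my ($class, %arg) = @_;
    return bless {
        _name     => $arg{accession},
        _organism => $arg{organism},
    }, $class;
}

sub get_organism { $_[0]->{_organism} }

# Hypothetical mutator: update one attribute and return the new value
sub set_organism {
    my ($self, $organism) = @_;
    $self->{_organism} = $organism;
    return $self->{_organism};
}

package main;

my $obj = exClass->new(accession => 'AC00243', organism => 'Homo sapiens');
$obj->set_organism('Mus musculus');
print $obj->get_organism, "\n";
```

Because the update goes through a method, this is the natural place to add validation later without changing any code that uses the class.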
Using Objects
Before you can start using objects, you must load their definitions from the appropriate module(s).
• This is just like loading subroutines from modules; you use the use statement in both cases.
• For example, if we want to load our “exClass” class definitions, we load the appropriate module, which in this case is called exClass (or lib::exClass, or whatever file hierarchy you’ve imposed).
    use exClass;
• Now you'll probably want to create a new object.
• There are a variety of ways to do this, and details vary from module to module, but most modules, including ours, do it using the new() method:
    use exClass;
    my $obj = exClass->new(
        accession => "AC00243",
        organism  => "Homo sapiens",
    );
Passing Arguments to Methods
When you call object methods, you can pass a list of arguments, just as you would to a regular function.
• We’ve seen this a number of times; for example, using the substr function.
• As methods get more complex, argument lists can get quite long and may have dozens of optional arguments. To make this manageable, many object-oriented modules use a named parameter style of argument passing that looks like this:
    my $result = $object->method(-arg1=>$value1,-arg2=>$value2,-arg3=>$value3);
• We utilize the (->) arrow notation:
• We saw this with references
• Used on an object to call a method in the class
• Perl automatically passes the object (or the class name, for a call like exClass->new) as the first argument to the method
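On the receiving side, a method written in the named-parameter style usually just copies the rest of @_ into a hash and fills in defaults. The sketch below assumes a hypothetical class exSeq with -seq, -start and -end parameters; none of these names come from the course code.

```perl
use strict;
use warnings;

package exSeq;    # hypothetical class used only for this example

sub new {
    my ($class, %arg) = @_;
    return bless { _seq => $arg{-seq} }, $class;
}

# Named parameters: -start and -end are both optional
sub subseq {
    my ($self, %arg) = @_;
    my $start = defined $arg{-start} ? $arg{-start} : 1;
    my $end   = defined $arg{-end}   ? $arg{-end}   : length($self->{_seq});
    return substr($self->{_seq}, $start - 1, $end - $start + 1);
}

package main;

my $seq   = exSeq->new(-seq => 'ATGGCCAAGT');
my $codon = $seq->subseq(-end => 3);                 # -start falls back to 1
my $mid   = $seq->subseq(-end => 6, -start => 4);    # order does not matter
print "$codon\n$mid\n";
```

The caller only names the arguments it cares about, in any order, which is exactly why long argument lists stay manageable.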
We already know how to use a class; references and arguments
• The class definition is a package (module) in a location
• We create an instance of the class, as a reference to an anonymous hash we define
• We pass arguments to the subroutine new
• We call an accessor method (subroutine)
    #!/usr/bin/perl -w
    use strict;
    use lib "/home/mako/devel/lib";   # where exClass.pm lives
    use exClass;

    # Constructor call: arguments are passed to the subroutine new
    my $obj = exClass->new(
        accession => "AC00243",
        organism  => "Homo sapiens",
    );

    # Accessor call: returns the value stored under _name
    my $name = $obj->get_name;
Passing Arguments to Methods; a BioPerl example
As a practical example, Bio::PrimarySeq->new() actually takes multiple optional arguments that allow you to specify the alphabet, the source of the sequence, and so forth. Rather than create a humongous argument list which forces you to remember the correct position of each argument, Bio::PrimarySeq lets you create a new Sequence this way:
    use Bio::PrimarySeq;
    my $sequence = Bio::PrimarySeq->new(
        -seq              => 'gattcgattccaaggttccaaa',
        -id               => 'oligo23',
        -alphabet         => 'dna',
        -is_circular      => 0,
        -accession_number => 'X123',
    );
Perl Object Syntax Don't be put off by this syntax! • $sequence is really just a hash reference! • you can get its keys using keys %$sequence • you can look at the contents of the "_seq_length" key by using $sequence->{_seq_length}, and so forth. • the syntax $sequence->translate is just a fancy way of writing translate($sequence), except that the object knows what module the translate() function is defined in.
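You can check all of this with the small exClass from earlier in the lecture, reproduced here in cut-down form so the example runs on its own. The fully qualified call exClass::get_name($obj) is the plain-subroutine spelling of the method call.

```perl
use strict;
use warnings;

package exClass;

sub new {
    my ($class, %arg) = @_;
    return bless {
        _name     => $arg{accession},
        _organism => $arg{organism},
    }, $class;
}

sub get_name { $_[0]->{_name} }

package main;

my $obj = exClass->new(accession => 'AC00243', organism => 'Homo sapiens');

my @keys = sort keys %$obj;            # the object is still a hash reference
print "@keys\n";                       # prints _name _organism
print $obj->{_name}, "\n";             # direct access works (but use methods!)
print exClass::get_name($obj), "\n";   # ordinary subroutine call
print $obj->get_name, "\n";            # method call, same result
```

Peeking inside like this is handy for debugging, but regular code should stick to the methods so the class is free to change its internals.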
Back to our task (geneFinding)
I’ve re-written the fasta file reader and some associated functions as part of our gene finding programming exercise in object-modular Perl
• It’s on the website (http://www.chem.ucla.edu/~mjanis/readFasta.pm)
• Three tasks:
• Create a file hierarchy for storage of your library files and implement my code (using the example code for using the class readFasta)
• Comment the readFasta class at every line, describing what each component of the class (constructor, accessors, etc.) is doing
• The comments I’ve made in the code point out that the code is actually unfinished, although it will run as is. Complete the code and adapt the existing gene finding subroutines we’ve used as methods for the class readFasta