430 likes | 583 Views
Perl/DBI - accessing databases from Perl. Dr. Andrew C.R. Martin martin@biochem.ucl.ac.uk http://www.bioinf.org.uk/. Aims and objectives. Understand the need to access databases from Perl Know why DBI? Understand the structure of DBI
E N D
Perl/DBI - accessing databases from Perl Dr. Andrew C.R. Martin martin@biochem.ucl.ac.uk http://www.bioinf.org.uk/
Aims and objectives • Understand the need to access databases from Perl • Know why DBI? • Understand the structure of DBI • Be able to write a Perl/DBI script to read from or write to a database • PRACTICAL: write a script to read from a database
Send request for page to web server Pages RDBMS CGI Script External Programs Why access a database from Perl? Web browser Web server CGI can extract parameters sent with the page request
Why access a database from Perl? Populating databases • Need to pre-process and re-format data into SQL • Intermediate storage during data processing Reading databases • Good database design can lead to complex queries (wrappers) • Need to extract data to process it
Why Perl/DBI? Many relational databases available. • Commercial examples: • Oracle • DB/2 • SQLServer • Sybase • Informix • Interbase • Open source examples: • PostgreSQL • mySQL
Why Perl/DBI? • Databases use a common query language: • ‘structured query language’ (SQL) • Queries can easily be ported between different database software • Minor variations in more advanced features • Proprietary extensions
Why Perl/DBI? • Can call command-line interface from within your program. $result = `psql -tqc “SELECT * FROM table” `; @tuples = split(/\n/, $result); foreach $tuple (@tuples) { @fields = split(/\|/, $tuple); } Inefficient: new process for each database access
Why Perl/DBI? • Databases generally provide own APIs to allow access from programming languages • e.g. C, Java, Perl • Proprietary APIs all differ • Very difficult to port software between databases • Standardized APIs have thus become available
Why Perl/DBI? • Perl/DBI is the standardized API for Perl • Easy to port Perl scripts from one database to another
DBI and ODBC • ODBC (Open DataBase Connectivity) • Consortium of vendors in early 1990s • SQL Access Group • October 1992 & 1993, draft standard: • ‘Call Level Interface’ (‘CLI’ - an API) • Never really adopted • Microsoft‘embraced and extended’ it to create ODBC
DBI and ODBC • ‘DBPerl’ designed as database interface specifically for Perl4 • September 1992 (i.e. pre-ODBC) • Just before release, Perl5 announced with OO facilities • DBPerl modified to support OO and loosely modelled on CLI standard • This became DBI
Standard SQL syntax Dodged this issue! Standard error codes Check $DBI::err or $DBI::errstr (DBI provides methods for standard errors, but drivers don’t use them) Many attributes and options to tweak underlying driver Very limited control Meta-data on database structure Tables and types only DBI and ODBC ODBC DBI
DBI and ODBC • There is an ODBC driver for DBI • DBI can be used to access any ODBC database
Oracle driver mySQL driver PostgreSQL driver Sybase driver DBI Architecture DBI
DBI Architecture • Multi-layer design Perl script Database Independent DBI DBD Database API Database Dependent RDBMS
DBI Architecture Oracle DBD::Oracle DBD::Oracle mySQL Perl Script DBI Perl Script Perl Script DBD::mysql DBD::Oracle DBD::Pg DBD::Oracle PostgreSQL
DBI Architecture Returns a list of installed (DBD) drivers @drivers = DBI->available_drivers();
DBI Architecture DBD::Oracle Driver Handle DBD::pg Driver Handle Database Handle Database Handle Database Handle Statement Handle Statement Handle Statement Handle Statement Handle Statement Handle Statement Handle Statement Handle
DBI Architecture Driver Handles • References loaded driver(s) • One per driver • Not normally referenced in programs • Standard variable name $drh
DBI Architecture Database Handles • Created by connecting to a database • References a database via a driver • Can have many per database (e.g. accessing different user accounts) • Can access multiple databases • Standard variable name $dbh $dbh = DBI->connect($datasource, ... );
DBI Architecture Statement Handles • Created by ‘preparing’ some SQL • Can have many per database handle (e.g. multiple queries) • Standard variable name $sth $sth = $dbh->prepare($sql);
DBI Architecture DBD::pg Driver Handle Database Handle $dbh = DBI->connect($datasource, ... ); Statement Handle $sth = $dbh->prepare($sql);
Parse Statement Pass statement to database engine Pass statement to database engine if valid Encapsulate as DBI statement handle Encapsulate Statement execute() Maintain cursor into results SQL Preparation Perl script DBI and Driver Database prepare() $sth Execute Statement fetchrow_array()
Create a ‘handle’ to access the database: $dbh = DBI->connect($datasource, $username, $password); The DBD module and d/b to be used Accessing DBI from Perl • Must have the DBI package and appropriate DBD package installed • DBD::Oracle, DBD::Pg, DBD::mysql, etc. use DBI;
Format varies with database module Optional; supported by some databases. Default: local machine and default port. Data sources $dbh = DBI->connect($datasource, $username, $password); $dbname = “mydatabase”; $dbserver = “dbserver.cryst.bbk.ac.uk”; $dbport = 5432; $datasource = “dbi:Oracle:$dbname”; $datasource = “dbi:mysql:database=$dbname;host=$dbserver”; $datasource = “dbi:Pg:dbname=$dbname;host=$dbserver;port=$dbport”;
Username and password $dbh = DBI->connect($datasource, $username, $password); • $username and $password also optional • Only needed if you normally need a username/password to connect to the database. • Remember CGI scripts run as a special web-server user. • Generally, ‘nobody’ or ‘apache’. • Database must allow access by this user • or specify a different username/password
SQL commands with no return value • SQL commands other that SELECT don’t return values • may return success/failure flag • number of entries in the database affected • For example: • creating a table • inserting a row • modifying a row
SQL commands with no return value • e.g. insert a row into a table: • From Perl/DBI: INSERT INTO idac VALUES (‘LYC_CHICK’, ‘P00698’) $sql = “INSERT INTO idac VALUES (‘LYC_CHICK’, ‘P00698’)”; $dbh->do($sql);
Columns placed in an array • Could also have been placed in a list: ($id, $ac) = $dbh->selectrow_array($sql); SQL commands that return a single row • Sometimes, can guarantee that a database query will return only one row • or you are only interested in the first row $sql = “SELECT * FROM idac WHERE ac = ‘P00698’”; @values = $dbh->selectrow_array($sql);
SQL commands that return a multiple rows • Most SELECT statements will return many rows • Three stages must be performed: • preparing the SQL • executing it • extracting the results
SQL commands that return a multiple rows $sql = “SELECT * FROM idac”; $sth = $dbh->prepare($sql); if($sth->execute) { while(($id, $ac) = $sth->fetchrow_array) { print “ID: $id AC: $ac\n”; } } (Can also obtain array or hash reference) NB: statement handle / fetchrow_array rather than db handle / selectrow_array
SQL commands that return a multiple rows • If you need to stop early you can do: $sql = “SELECT * FROM idac”; $sth = $dbh->prepare($sql); if($sth->execute) { for($i=0; $i<10; $i++) { if(($id, $ac) = $sth->fetchrow_array) { print “ID: $id AC: $ac\n”; } } $sth->finish; }
SQL commands that return a multiple rows • A utility method is also available to print a complete result set: $sql = “SELECT * FROM idac”; $sth = $dbh->prepare($sql); if($sth->execute) { $nrows = $sth->dump_results; } (Mostly useful for debugging)
Repeated SQL calls • Often want to repeat essentially the same query, but with some different value being checked. • For example: foreach $ac (‘P00698’, ‘P00703’) { $sql = “SELECT * FROM idac WHERE ac = ‘$ac’”; @values = $dbh->selectrow_array($sql); print “@values\n”; } (using special option for 1-row returns)
Repeated SQL calls • Could also be do: foreach $ac (‘P00698’, ‘P00703’) { $sql = “SELECT * FROM idac WHERE ac = ‘$ac’”; $sth = $dbh->prepare($sql); $sth->execute; while(@values = $sth->fetchrow_array) { print “@values\n”; } } i.e. don’t use special option for 1-row returns $dbh->selectrow_array($sql)
Parameter Number Variable to bind Repeated SQL calls • Increase in performance by ‘binding’ a variable: $sql = “SELECT * FROM idac WHERE ac = ?”; $sth = $dbh->prepare($sql); foreach $ac (‘P00698’, ‘P00703’) { $sth->bind_param(1, $ac); $sth->execute; while(@values = $sth->fetchrow_array) { print “@values\n”; } }
Repeated SQL calls NOTE: • Performance increase depends on database and driver • Although strings normally enclosed in single inverted commas, the bound variable is not quoted. • If you have a number which you need to be treated as a string, then you do: $sth->bind_param(1, 42, SQL_VARCHAR);
Summary • DBI provides a standard API • It does not standardize the SQL • DBI is an older standard than ODBC • They can be used together and they are both evolving • Basic 3-step process: • prepare / execute / fetch • Shortcut calls for no return or 1-row return • Many other functions available