Chapter 16 Web Pages And CGI Scripts Department of Biomedical Informatics University of Pittsburgh School of Medicine http://www.dbmi.pitt.edu
The Content of this Lecture
1. Python web programming basics: 1) urllib2 module 2) cgi module 3) information for building your own web page at Pitt
2. Book tasks: 1) Grabbing web pages 2) A simple CGI script for searching the neoplasm classification
urllib2 module The urllib2 module defines functions and classes which help in opening URLs. urllib2.Request(url) creates a request object; url should be a string containing a valid URL. urllib2.urlopen() opens a URL; its input can be either a URL string or a Request object, and it returns a file-like object that can be read.
Examples
>>> import urllib2
>>> req = urllib2.Request(url='http://www.python.org/')
>>> f = urllib2.urlopen(req)
>>> print f.read(100)
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtm
>>>
Examples
>>> f = urllib2.urlopen('http://faculty.dbmi.pitt.edu/day/Bioinf2012')
>>> print f.read(50)
<html> <head> <title>Index of /day/Bioinf2012/<
>>>
urllib2 module exceptions urllib2.URLError: the handlers raise this exception when they run into a problem, such as a host that cannot be reached. It is a subclass of IOError. urllib2.HTTPError: raised when the server answers with an HTTP error status. It is a subclass of URLError.
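The subclass relationships matter when writing except clauses: a handler for URLError also catches HTTPError, so the more specific class has to be listed first. A quick check, assuming Python 3, where the urllib2 exceptions live in urllib.error and IOError is an alias of OSError:

```python
import urllib.error

# HTTPError derives from URLError, which derives from IOError (OSError in Python 3).
http_is_url_error = issubclass(urllib.error.HTTPError, urllib.error.URLError)
url_is_io_error = issubclass(urllib.error.URLError, IOError)

print(http_is_url_error, url_is_io_error)
```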
Algorithm for Grabbing Web Pages • Import the module that makes the HTTP requests (urllib2). • Make the HTTP request. • If the request returns the Web page, print the page. Otherwise, print an error message.
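The three steps above can be sketched as a single function. This is a minimal sketch rather than the book's script: it assumes Python 3, where urllib2 was split into urllib.request and urllib.error, and the name grab_page and its max_bytes parameter are made up for illustration.

```python
import urllib.error
import urllib.request


def grab_page(url, max_bytes=None):
    """Return (True, page text) on success or (False, error message) on failure."""
    try:
        response = urllib.request.urlopen(url)
    except urllib.error.HTTPError as err:
        # HTTPError is a subclass of URLError, so it is caught first.
        return False, "HTTP error %d for %s" % (err.code, url)
    except urllib.error.URLError as err:
        return False, "Could not reach %s: %s" % (url, err.reason)
    try:
        data = response.read() if max_bytes is None else response.read(max_bytes)
    finally:
        response.close()
    # Undecodable bytes are replaced rather than raising an error.
    return True, data.decode("utf-8", errors="replace")
```

Calling grab_page('http://www.python.org/', max_bytes=100) and printing the returned text corresponds to the first interactive example; on failure the second element of the result is the error message to print instead.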
The cgi Module A CGI (Common Gateway Interface) script is invoked by an HTTP server, usually to process user input submitted through an HTML <FORM>. The cgi module is used to implement CGI scripts. The contents of an HTML form are passed to the CGI program as a string, which the script accesses through the FieldStorage class of the cgi module.
The FieldStorage Class
getvalue() is a member function of the FieldStorage class. It takes a field name as input and returns the value of that field; an optional second argument gives the value to return when the field is missing. For example:
form = cgi.FieldStorage()
message = form.getvalue("tx", "(no message)")
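The lookup-with-default behavior can be tried without a web server. The sketch below is a stand-in, not the cgi module itself: the cgi module was removed in Python 3.13, so it uses urllib.parse.parse_qs on a query string of the kind a GET form produces. The helper get_field and the sample query strings are made up for illustration.

```python
from urllib.parse import parse_qs


def get_field(query_string, name, default=None):
    """Return the first value of a form field, or default if the field is absent."""
    fields = parse_qs(query_string)  # maps each field name to a list of values
    values = fields.get(name)
    return values[0] if values else default


# "+" in a query string stands for a space.
message = get_field("tx=lung+carcinoma&bx=SUBMIT", "tx", "(no message)")
missing = get_field("bx=SUBMIT", "tx", "(no message)")
print(message)
print(missing)
```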
The cgitb module
This module provides an exception handler that displays a detailed traceback report in the browser when a script fails. It is good practice to include the following statements when you develop a new CGI script:
import cgitb
cgitb.enable()
Information for Building Your Own Web Page at Pitt You can build your own Web site at Pitt. The instructions for doing this are contained in the following two files: afs_web.pdf Html_inst.pdf These two files were uploaded to our website.
A simple CGI script for searching the neoplasm classification
1. Create a very simple Web page consisting of a simple form (see the figure below). This form contains a text box and a "submit" button for taking user input. The source code of this form should contain the URL (Uniform Resource Locator, or Web address) linked to the folder (cgi-bin) where your CGI script can be found.
<form name="sender" method="GET" action="http://gweel.dbmi.pitt.edu/cgi-bin/neopull.py">
<br><center><input type="text" name="tx" size=38 maxlength=48 value="">
<input type="submit" name="bx" value="SUBMIT"></center>
</form>
A simple CGI script for searching the neoplasm classification
2. Upload your Web page to a folder on a webserver designated for this class. This folder will be publicly accessible through the Web.
3. Create a script and upload it to another folder called cgi-bin on the webserver. The cgi-bin folder is also designated for the class. The web address of the script must be the same as the address in the HTML source code of your Web page. In my example, it should be: "http://gweel.dbmi.pitt.edu/cgi-bin/neopull.py"
A simple CGI script for searching the neoplasm classification
The algorithm of the script you will upload in step 3 starts at step 4.
4. Capture the string sent by the user through your Web page. Place the text into a string object (message).
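import cgi
import re
import sys

form = cgi.FieldStorage()
message = form.getvalue("tx", "(no message)")
# re.match anchors at the start and $ at the end, so the whole
# string must consist of letters and spaces
term_check = re.match(r'[A-Za-z ]+$', message)
if not term_check:
    print "<br>Only alphabetic letters and spaces are permitted in the query box"
    print "</body></html>"
    sys.exit()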
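The check is meant to reject any query that contains a character other than a letter or a space. A minimal sketch of that rule, assuming Python 3 and using re.fullmatch so the pattern has to cover the whole string; the helper name is_valid_query and the sample queries are made up for illustration. The last line shows why anchoring matters: a search anchored only at the end accepts a string with a disallowed prefix.

```python
import re

ALLOWED = re.compile(r"[A-Za-z ]+")


def is_valid_query(message):
    """Return True only if every character is an ASCII letter or a space."""
    return ALLOWED.fullmatch(message) is not None


accepted = is_valid_query("lung carcinoma")
rejected_digits = is_valid_query("123 carcinoma")
rejected_default = is_valid_query("(no message)")

# A search anchored only at the end still finds " carcinoma" and lets the digits through.
loose_accepts_digits = re.search(r"[A-Za-z ]+$", "123 carcinoma") is not None
```

Because the validated term is later used as a search pattern and echoed back into the page, restricting it to letters and spaces also keeps regular-expression metacharacters and HTML markup out of the query.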
A simple CGI script for searching the neoplasm classification
5. Print out the HTML header of the web page that will be returned to the user.
print "<br>Your query term is " + message + "<br>"
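A CGI script's output has to begin with an HTTP header line such as Content-Type, followed by a blank line, before any HTML. The slide does not show that part of the script, so the following is only a sketch of what the start of the response might look like, assuming Python 3; the function name build_page_start and the exact opening tags are assumptions.

```python
import html


def build_page_start(message):
    """Return the HTTP header, a blank line, and the opening of the HTML page."""
    header = "Content-Type: text/html\n\n"  # the blank line ends the header section
    body = "<html><body>\n<br>Your query term is " + html.escape(message) + "<br>\n"
    return header + body


page_start = build_page_start("lung carcinoma")
print(page_start)
```

html.escape is redundant for a term already limited to letters and spaces, but it keeps the function safe if it is ever called with unvalidated text.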
A simple CGI script for searching the neoplasm classification
6. Open up a file called "neoself" which contains the neoplasm classification information.
7. Go through each line of this file to look for the terms that match the "message" entered by the user. When a matched term is found, print the line that contains the term.
8. Print a line to tell the user that the work is done.
in_text = open("../data/neoself", "r")
for line in in_text:
    query_match = re.search(message, line)
    if query_match:
        # show each "|"-separated field of the matching line on its own row
        line = re.sub(r'\|', "<br>", line)
        print "<br>" + line + "<br>"
in_text.close()
sys.exit()
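The search loop can be tried without the web server or the neoself file. This is a sketch under stated assumptions: it uses Python 3, the function name search_classification is made up, and the two sample lines are invented stand-ins, not real entries from the classification. It also differs from the script in one respect: the term is passed through re.escape so that it is matched literally.

```python
import re


def search_classification(lines, message):
    """Yield an HTML fragment for every line that contains the query term."""
    pattern = re.compile(re.escape(message))  # match the term literally
    for line in lines:
        if pattern.search(line):
            # Replace each "|" field separator with an HTML line break.
            yield "<br>" + line.rstrip("\n").replace("|", "<br>") + "<br>"


# Hypothetical "|"-separated lines standing in for the contents of neoself.
sample_lines = ["alpha tumor|class one\n", "beta lesion|class two\n"]
hits = list(search_classification(sample_lines, "tumor"))
for fragment in hits:
    print(fragment)
```

Because the function accepts any iterable of lines, an open file object can be passed in place of the sample list.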