Remote Procedure Calling. Dr. Andrew C.R. Martin martin@biochem.ucl.ac.uk http://www.bioinf.org.uk/
Aims and objectives • Understand the concepts of remote procedure calling and web services • To be able to describe different methods of remote procedure calls • Understand the problems of ‘screen scraping’ • Know how to write code using LWP and SOAP
What is RPC? • A network-accessible interface to application functionality using standard Internet technologies (Diagram: RPC across the network to a web service.)
Why do RPC? • distribute the load between computers • access to other people's methods • access to the latest data
Ways of performing RPC • screen scraping • simple CGI scripts (REST) • custom code to work across networks • standardized methods (e.g. CORBA, SOAP, XML-RPC)
Web services • RPC methods which work across the internet are often called “Web Services” • Web Services can also be self-describing (WSDL) and provide methods for discovery (UDDI)
Screen scraping • Extracting content from a web page (Diagram: the screen scraper contacts the web server over the network.)
Fragile procedure... (Diagram: the provider renders its data as a web page with visual markup, so the semantics are lost; the consumer extracts data from the page, an error-prone step that yields partial data and extraction errors.)
Fragile procedure... • Trying to interpret semantics from display-based markup • If the presentation changes, the screen scraper breaks
Web servers… (Diagram: the web browser sends a request for a page to the web server, which draws on pages, CGI scripts, an RDBMS and external programs.)
Screen scraping Straightforward in Perl • Perl LWP module • easy to write a web client • Pattern matching and string handling routines
Example scraper… • A program for secondary structure prediction • Want a program that: • specifies an amino acid sequence • provides a secondary structure prediction
Example scraper...
#!/usr/bin/perl -w
use LWP::UserAgent;
use strict;

my($seq, $ss);

# Sequence joined from several pieces so that no whitespace is sent to the server
$seq = "KVFGRCELAAAMKRHGLDNYRGYSLGNWVCAAKFESNFNTQATNRNTDGSTDY" .
       "GILQINSRWWCNDGRTPGSRNLCNIPCSALLSSDITASVNCAKKIVSDGNGMNAWVAWRNR" .
       "CKGTDVQAWIRGCRL";

if(($ss = PredictSS($seq)) ne "")
{
    print "$seq\n";
    print "$ss\n";
}
Example scraper… • NNPREDICT web server at http://alexander.compbio.ucsf.edu/~nomi/nnpredict.html • CGI script at http://alexander.compbio.ucsf.edu/cgi-bin/nnpredict.pl
Example scraper… • Program must: • connect to web server • submit the sequence • obtain the results and extract data • Examine the source for the page…
<form method="POST" action="http://alexander.compbio.ucsf.edu/cgi-bin/nnpredict.pl">
  <b>Tertiary structure class:</b>
  <input TYPE="radio" NAME="option" VALUE="none" CHECKED> none
  <input TYPE="radio" NAME="option" VALUE="all-alpha"> all-alpha
  <input TYPE="radio" NAME="option" VALUE="all-beta"> all-beta
  <input TYPE="radio" NAME="option" VALUE="alpha/beta"> alpha/beta
  <b>Name of sequence</b>
  <input name="name" size="70">
  <b>Sequence</b>
  <textarea name="text" rows=14 cols=70></textarea>
</form>
Example scraper… • option: 'none', 'all-alpha', 'all-beta', or 'alpha/beta' • name: optional name for the sequence • text: the sequence
Example scraper…
sub PredictSS
{
    my($seq) = @_;
    my($url, $post, $webproxy, $ua, $req, $result, $ss);

    # If behind a firewall
    # $webproxy = 'http://user:password@wwwcache.rdg.ac.uk:8080';
    $webproxy = "";

    # CGI script to access
    $url = "http://alexander.compbio.ucsf.edu/cgi-bin/nnpredict.pl";

    # Values passed to CGI script
    $post = "option=none&name=&text=$seq";

    $ua     = CreateUserAgent($webproxy);      # Create a LWP-based connection
    $req    = CreatePostRequest($url, $post);  # Create the post request
    $result = GetContent($ua, $req);           # Connect and get the returned page

    if(defined($result))
    {
        $ss = GetSS($result);
        return($ss);
    }
    else
    {
        print STDERR "connection failed\n";
    }
    return("");
}
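PredictSS interpolates $seq straight into the POST body. That is fine for an unbroken run of amino-acid letters, but form data is normally URL-encoded, and a sequence pasted with spaces or line breaks would otherwise be sent as it stands. The following is a minimal sketch of one way to build the body using only core Perl, assuming ASCII input. The helper names UrlEncode and BuildPostBody are hypothetical and are not part of the original script.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: percent-encode everything except the
# characters that RFC 3986 lists as unreserved (ASCII input assumed).
sub UrlEncode
{
    my($text) = @_;
    $text =~ s/([^A-Za-z0-9\-_.~])/sprintf("%%%02X", ord($1))/ge;
    return($text);
}

# Hypothetical helper: build the body for the nnpredict form.
# The field names (option, name, text) are those in the form source.
sub BuildPostBody
{
    my($option, $name, $seq) = @_;
    $seq =~ s/\s+//g;    # drop whitespace left over from line wrapping
    return(join('&',
                'option=' . UrlEncode($option),
                'name='   . UrlEncode($name),
                'text='   . UrlEncode($seq)));
}

print BuildPostBody('alpha/beta', '', "KVFGRCELAAAMKRHG LDNYRGYSLGNWV\n"), "\n";
```

With something like this in place, the $post line in PredictSS could call BuildPostBody('none', '', $seq) instead of interpolating the sequence directly.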
<HTML><HEAD>
<TITLE>NNPREDICT RESULTS</TITLE>
</HEAD>
<BODY bgcolor="F0F0F0">
<h1 align=center>Results of nnpredict query</h1>
<p><b>Tertiary structure class:</b> alpha/beta
<p><b>Sequence</b>:<br>
<tt>
MRSLLILVLCFLPLAALGKVFGRCELAAAMKRHGLDNYRGYSLGNWVCAAKFESNFNTQA<br>
TNRNTDGSTDYGILQINSRWWCNDGRTPGSRNLCNIPCSALLSSDITASVNCAKKIVSDG<br>
NGMNAWVAWRNRCKGTDVQAWIRGCRL<br>
</tt>
<p><b>Secondary structure prediction <i>(H = helix, E = strand, - = no prediction)</i>:<br></b>
<tt>
----EEEEEEE-H---H--EE-HHHHHHHHHH--------------HHHHHH--------<br>
------------HHHHE-------------------------------HH-----EE---<br>
---HHHHHHH--------HHHHHHH--<br>
</tt>
</body></html>
Example scraper…
sub GetSS
{
    my($html) = @_;
    my($ss);

    $html =~ s/\n//g;                      # Remove return characters
    if($html =~ /^.*<tt>(.*)<\/tt>.*$/)    # Match the last <tt>...</tt>
    {
        $ss = $1;                          # Grab the text within <tt> tags
        $ss =~ s/<br>//g;                  # Remove the <br> tags
        return($ss);
    }
    return("");                            # No <tt> block found
}
If the authors changed the presentation of the results, this might break!
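The extraction step can be tried without contacting the server by running it on a saved copy of the results page. The sketch below is an illustration rather than the original code: it uses a shortened copy of the nnpredict HTML, and the sub name ExtractLastTT is hypothetical. It collects every <tt> block with a non-greedy match, keeps the last one, and strips any remaining whitespace.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical variant of GetSS: collect every <tt>...</tt> block
# and return the last one with <br> tags and whitespace removed.
sub ExtractLastTT
{
    my($html) = @_;
    my @blocks = ($html =~ /<tt>(.*?)<\/tt>/gis);
    return("") if(!@blocks);

    my $ss = $blocks[-1];
    $ss =~ s/<br\s*\/?>//gi;
    $ss =~ s/\s+//g;
    return($ss);
}

# Shortened copy of the nnpredict results page
my $html = <<'__EOF';
<p><b>Sequence</b>:<br>
<tt>
MRSLLILVLCFLPLAALGKV<br>
</tt>
<p><b>Secondary structure prediction</b>
<tt>
----EEEEEEE-H---H--E<br>
</tt>
</body></html>
__EOF

print ExtractLastTT($html), "\n";
```

This is still tied to the page layout. If the prediction moved out of the last <tt> block, the scraper would return the wrong text without reporting any error.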
Wrappers to LWP • CreateUserAgent() • CreatePostRequest() • GetContent() • CreateGetRequest()
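The slides name these four wrappers but do not show their bodies. The following is a sketch of how they might be written on top of LWP::UserAgent and HTTP::Request. The sub names come from the slides; everything inside them (the agent string, the timeout, the choice of proxied schemes) is an assumption made for illustration.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use LWP::UserAgent;
use HTTP::Request;

# Create a user agent, optionally routed through a web proxy
sub CreateUserAgent
{
    my($webproxy) = @_;
    my $ua = LWP::UserAgent->new;
    $ua->agent("ExampleScraper/0.1");    # arbitrary identification string
    $ua->timeout(30);                    # seconds; an assumed value
    $ua->proxy(['http', 'ftp'], $webproxy) if($webproxy ne "");
    return($ua);
}

# Build a POST request carrying URL-encoded form data
sub CreatePostRequest
{
    my($url, $params) = @_;
    my $req = HTTP::Request->new(POST => $url);
    $req->content_type('application/x-www-form-urlencoded');
    $req->content($params);
    return($req);
}

# Build a GET request with the parameters appended to the URL
sub CreateGetRequest
{
    my($url, $params) = @_;
    return(HTTP::Request->new(GET => "$url?$params"));
}

# Send the request; return the page body, or undef on failure
sub GetContent
{
    my($ua, $req) = @_;
    my $res = $ua->request($req);
    return($res->content) if($res->is_success);
    return(undef);
}

# Build (but do not send) a request, to show the pieces fitting together
my $demo = CreatePostRequest("http://alexander.compbio.ucsf.edu/cgi-bin/nnpredict.pl",
                             "option=none&name=&text=KVFGRCELAAAMKRHG");
print $demo->method, " ", $demo->uri, "\n";
```

Returning undef from GetContent matches the defined($result) check in PredictSS.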
Pros and cons Advantages • 'service provider' doesn’t do anything special Disadvantages • screen scraper will break if format changes • may be difficult to determine semantic content
Simple CGI scripts • REST: Representational State Transfer • http://en.wikipedia.org/wiki/REST
Simple CGI scripts Extension of screen scraping • relies on service provider to provide a script designed specifically for remote access Client identical to screen scraper • but guaranteed that the data will be parsable (plain text or XML)
Simple CGI scripts Server's point of view • provide a modified CGI script which returns plain text • May be an option given to the CGI script
Simple CGI scripts • 'Entrez programming utilities' http://eutils.ncbi.nlm.nih.gov/entrez/query/static/eutils_help.html • I have provided a script you can try to extract papers from PubMed
Simple CGI scripts • Search using EUtils is performed in 2 stages: • a specified search string returns a set of PubMed IDs • the results for each of these PubMed IDs are then fetched in turn
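The two stages can be sketched as follows. This is not the script mentioned above, which is not shown in these slides. It assumes the esearch.fcgi and efetch.fcgi endpoints of NCBI's E-utilities; the sub names, the search term and the sample reply are made up for illustration, and the code only builds URLs and parses a canned reply rather than contacting NCBI.

```perl
#!/usr/bin/perl
use strict;
use warnings;

my $eutils = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils";

# Stage 1: URL that searches PubMed and returns matching IDs as XML
sub BuildSearchUrl
{
    my($term, $max) = @_;
    $term =~ s/([^A-Za-z0-9\-_.~])/sprintf("%%%02X", ord($1))/ge;
    return("$eutils/esearch.fcgi?db=pubmed&retmax=$max&term=$term");
}

# Pull the PubMed IDs out of the ESearch XML reply
sub ExtractIds
{
    my($xml) = @_;
    my @ids = ($xml =~ /<Id>(\d+)<\/Id>/g);
    return(@ids);
}

# Stage 2: URL that fetches one record as plain text
sub BuildFetchUrl
{
    my($id) = @_;
    return("$eutils/efetch.fcgi?db=pubmed&id=$id&rettype=abstract&retmode=text");
}

# Made-up reply in the shape ESearch returns; the IDs are placeholders
my $reply = "<eSearchResult><IdList><Id>111</Id><Id>222</Id></IdList></eSearchResult>";

print BuildSearchUrl("lysozyme structure", 10), "\n";
foreach my $id (ExtractIds($reply))
{
    print BuildFetchUrl($id), "\n";
}
```

Each URL could then be passed to CreateGetRequest() and GetContent(). Because the reply is XML intended for programs, the parsing does not depend on how a web page happens to be laid out.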
Custom code • Generally used to distribute tasks on a local network • Code is complex • low-level OS calls • sample on the web
Custom code • Link relies on IP address and a 'port' • Ports tied to a particular service • port 80 : HTTP • port 22 : ssh • See /etc/services
Custom code • Generally a client/server model: • server listens for messages • client makes requests of the server (Diagram: the client sends a request message to the server; the server returns a response message.)
Custom code: server • Server creates a 'socket' and 'binds' it to a port • Listens for a connection from clients • Responds to messages received • Replies with information sent back to the client
Custom code: client • Client creates a socket • Connects it to the IP address and port of the server • Sends data to the server • Waits for a response • Does something with the returned data
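The server and client steps can be sketched with the core IO::Socket::INET module. This is a minimal illustration, not the sample code referred to earlier: the sub names and the one-line text protocol are assumptions, and the server handles a single request and then exits. The demo runs both ends on the local machine by forking.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use IO::Socket::INET;

# Server side: create a socket, bind it to a port and listen.
# LocalPort 0 asks the OS for any free port (an assumption made so
# the demo never collides with a real service).
sub CreateServer
{
    my $server = IO::Socket::INET->new(LocalAddr => '127.0.0.1',
                                       LocalPort => 0,
                                       Proto     => 'tcp',
                                       Listen    => 5,
                                       ReuseAddr => 1)
        or die "Cannot create server socket: $!";
    return($server);
}

# Server side: accept one client, read its message, send a reply
sub ServeOneRequest
{
    my($server) = @_;
    my $client  = $server->accept() or die "accept failed: $!";
    my $message = <$client>;
    chomp($message);
    print $client "Received: $message\n";
    close($client);
}

# Client side: connect to the server, send data, wait for the response
sub SendRequest
{
    my($host, $port, $message) = @_;
    my $socket = IO::Socket::INET->new(PeerAddr => $host,
                                       PeerPort => $port,
                                       Proto    => 'tcp')
        or die "Cannot connect: $!";
    print $socket "$message\n";
    my $response = <$socket>;
    close($socket);
    chomp($response);
    return($response);
}

# Demo: run the server in a child process and query it from the parent
my $server = CreateServer();
my $port   = $server->sockport();
my $pid    = fork();
die "fork failed: $!" if(!defined($pid));
if($pid == 0)
{
    ServeOneRequest($server);
    exit(0);
}
close($server);
print SendRequest('127.0.0.1', $port, "KVFGRCELAAAMKRHG"), "\n";
waitpid($pid, 0);
```

The listening socket is created before the fork, so the client cannot try to connect before the port is open. A real server would loop over accept() and would need an agreed message format, which is the kind of detail the standardized methods take care of.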
Standardized methods • Various methods, e.g. • CORBA • XML-RPC • SOAP • Will now concentrate on SOAP...
Advantages of SOAP (Diagram: web service and application client, each layered as application code, platform- and language-specific code, and platform- and language-independent code.)
Advantages of SOAP • Information encoded in XML • Language independent • All data are transmitted as simple text (Diagram: two applications exchanging an XML message.)
Advantages of SOAP • Normally uses HTTP for transport • Firewalls allow access to the HTTP protocol • Same systems will allow SOAP access (Diagram: the SOAP request travels in an HTTP POST; the SOAP response travels in the HTTP response.)
Advantages of SOAP • W3C standard • Libraries available for many programming languages
XML encoding
Which of these is correct?

<phoneNumber>01234 567890</phoneNumber>

<phoneNumber>
  <areaCode>01234</areaCode>
  <number>567890</number>
</phoneNumber>

<phoneNumber areaCode='01234' number='567890' />

<phoneNumber areaCode='01234'>567890</phoneNumber>
SOAP XML encoding • Must define a standard way of encoding data: • Type of data being exchanged (defined by SOAP message format) • How it will be expressed in XML (defined by SOAP message format) • How the information will be exchanged (defined by various transport protocols)
SOAP messages • SOAP Envelope • SOAP Header (optional), containing header blocks • SOAP Body, containing the message body
Example SOAP message
<s:Envelope xmlns:s="http://www.w3.org/2001/06/soap-envelope">   <!-- SOAP Envelope -->
  <s:Header>                                                     <!-- SOAP Header -->
    <m:transaction xmlns:m="soap-transaction"
                   s:mustUnderstand="true">                      <!-- Header block -->
      <transactionID>1234</transactionID>
    </m:transaction>
  </s:Header>
  <s:Body>                                                       <!-- SOAP Body -->
    <n:predictSS xmlns:n="urn:SequenceData">                     <!-- Message body -->
      <sequence id='P01234'>
        SARTASCWIPLKNMNTYTRSFGHSGHRPLKMNSGDGAAREST
      </sequence>
    </n:predictSS>
  </s:Body>
</s:Envelope>
Example SOAP message • Header block • Specifies data must be handled as a single 'transaction' • Message body • contains a sequence simply encoded in XML • Perfectly legal, but more common to use special RPC encoding
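To make the structure of the message concrete, the sketch below assembles the same envelope by hand in core Perl, leaving out the optional header. It is an illustration only: the helper names XmlEscape and BuildPredictSSEnvelope are hypothetical, the namespaces are copied from the example message, and in practice a SOAP library would normally build the envelope.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Escape the characters that are special in XML text and attributes
sub XmlEscape
{
    my($text) = @_;
    $text =~ s/&/&amp;/g;
    $text =~ s/</&lt;/g;
    $text =~ s/>/&gt;/g;
    $text =~ s/'/&apos;/g;
    $text =~ s/"/&quot;/g;
    return($text);
}

# Hypothetical helper: wrap a sequence in a SOAP envelope with no header
sub BuildPredictSSEnvelope
{
    my($id, $sequence) = @_;
    $id       = XmlEscape($id);
    $sequence = XmlEscape($sequence);

    return <<"__EOF";
<s:Envelope xmlns:s="http://www.w3.org/2001/06/soap-envelope">
  <s:Body>
    <n:predictSS xmlns:n="urn:SequenceData">
      <sequence id="$id">$sequence</sequence>
    </n:predictSS>
  </s:Body>
</s:Envelope>
__EOF
}

print BuildPredictSSEnvelope('P01234',
                             'SARTASCWIPLKNMNTYTRSFGHSGHRPLKMNSGDGAAREST');
```

The resulting text is what would be sent as the body of an HTTP POST to the service.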
The RPC ideal • Ideal situation: $ss = PredictSS($id, $sequence); (Diagram: the client sends a request message to the server; the server returns a response message.)
Subroutine calls Only important factors • the type of the variables • the order in which they are handed to the subroutine
SOAP type encoding SOAP provides standard encoding for variable types: • integers • floats • strings • arrays • hashes • structures • …
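As a rough picture of what type encoding means, the sketch below tags each parameter with an XML Schema type through an xsi:type attribute, in the style of SOAP's encoding rules. It is a hand-written illustration and not a SOAP library: the sub names, the type-guessing rules and the windowSize and threshold parameters are assumptions, only scalars are handled, and the xsi and xsd namespace declarations are omitted.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Guess an XML Schema type from the look of a Perl scalar
sub GuessXsdType
{
    my($value) = @_;
    return('xsd:int')   if($value =~ /^[+-]?\d+$/);
    return('xsd:float') if($value =~ /^[+-]?(?:\d+\.\d*|\.\d+|\d+)(?:[eE][+-]?\d+)?$/);
    return('xsd:string');
}

# Encode one named parameter as a typed XML element
sub EncodeParam
{
    my($name, $value) = @_;
    my $type = GuessXsdType($value);
    $value =~ s/&/&amp;/g;
    $value =~ s/</&lt;/g;
    $value =~ s/>/&gt;/g;
    return("<$name xsi:type=\"$type\">$value</$name>");
}

# Parameters are emitted in the order the subroutine expects them
my @params = ([id         => 'P01234'],
              [sequence   => 'SARTASCWIPLKNMNTYTRSFGHSGHRPLKMNSGDGAAREST'],
              [windowSize => 17],
              [threshold  => 0.5]);

foreach my $param (@params)
{
    print EncodeParam(@$param), "\n";
}
```

With the type and the order carried in the message, the receiving end has the two pieces of information that matter for a subroutine call.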