Designing and Implementing Web Data Services in Perl
Michael McClennen

[Diagram: a Client sends a Request to a Server, which consults a Data Store and returns a Response]
What is "REST"?
• REST is a set of architectural principles for the World Wide Web
• Developed by Roy Fielding, one of the Web's principal architects
• Stands for "REpresentational State Transfer"
• No consensus about exactly what it means in practice

REST: original principles
• Separation of client and server by a uniform interface
• Intermediate servers (e.g. proxies or caches) may be interposed arbitrarily
• All client-server interactions are stateless
• Data is composed of resources, each identified by a URI
• Server sends a representation of a resource
• Clients can manipulate the resource by means of the representation
• Representations are self-describing
• Client state transitions depend upon information embedded in representations (HATEOAS)

REST: in practice
• One protocol layer, generally HTTP
  • no extra layers (such as SOAP) on top of it
  • headers and status codes are used as designed
• Resources are identified by URIs
  • individual resources
  • all resources matching particular criteria
• Client-server interactions are stateless
  • with the possible exception of authentication
[Diagram: the Client sends an HTTP Request to the Server; the Web Data Service (API) passes a Query (Operation) to the Data Store, receives the Result, and returns an HTTP Response to the Client]

Web Data Service (API)
• Parse HTTP requests
• Validate parameters
• Talk to the backend data store
• Assemble representations of data
• Serialize representations in JSON, XML, …
• Set HTTP response headers
• Generate appropriate error messages
• Provide documentation about itself
What makes a good Web Data Service, from the point of view of the USER?
• Well designed
• Well documented
• Flexible
• Consistent
• Responsive
Example: Wikipedia API
http://en.wikipedia.org/w/api.php?action=query&list=allpages&apfrom=Perl&aplimit=50&format=json
“List 50 pages whose title starts with ‘Perl’, in JSON format”
Example: Wikipedia API
http://en.wikipedia.org/w/api.php?    Base URL
action=query                          Specify type of operation
list=allpages                         Specify operation
apfrom=Perl                           Query parameter
aplimit=50                            Specify size of result set
format=json                           Specify result format

Example: Wikipedia API
http://en.wikipedia.org/w/api.php?    Base URL
action=query                          Specify type of operation
list=allpages                         Specify operation
apfrom=Perl                           Query parameter
aplimit=50                            Specify size of result set
format=xml                            Specify result format

Example: Wikipedia API
http://en.wikipedia.org/w/api.php?    Base URL
action=query                          Specify type of operation
list=allpages                         Specify operation
apfrom=Perl                           Query parameter
aplimit=5                             Specify size of result set
format=xml                            Specify result format

Example: Wikipedia API
http://en.wikipedia.org/w/api.php?    Base URL
action=query                          Specify type of operation
list=foobar                           Specify operation
apfrom=Perl                           Query parameter
aplimit=5                             Specify size of result set
format=xml                            Specify result format

Example: Wikipedia API
http://en.wikipedia.org/w/api.php?    Base URL
action=query                          Specify type of operation
list=foobar                           Specify operation
apfrom=Perl                           Query parameter
aplimit=5                             Specify size of result set
format=json                           Specify result format

Example: Wikipedia API
http://en.wikipedia.org/w/api.php?    Base URL
action=query                          Specify type of operation
list=allpages                         Specify operation
apfrom=Perl                           Query parameter
aplimit=5                             Specify size of result set
format=json                           Specify result format
foo=bar                               * bad parameter *

Example: Wikipedia API
http://en.wikipedia.org/w/api.php     Base URL only
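The Wikipedia requests above are all the same base URL plus a list of name=value pairs, so a client can assemble them mechanically. The following is a minimal sketch using only core Perl; the helper names `uri_escape` and `build_request_url` are made up for this illustration and are not part of any Wikipedia client library.

```perl
use strict;
use warnings;

# Percent-encode everything except the RFC 3986 "unreserved" characters.
sub uri_escape {
    my $text = "$_[0]";          # copy and stringify the argument
    utf8::encode($text);         # escape UTF-8 bytes, not characters
    $text =~ s/([^A-Za-z0-9\-._~])/sprintf('%%%02X', ord($1))/ge;
    return $text;
}

# Join a base URL with an ordered list of name => value pairs.
sub build_request_url {
    my ($base, @params) = @_;
    my @pairs;
    while (@params) {
        my ($name, $value) = splice(@params, 0, 2);
        push @pairs, uri_escape($name) . '=' . uri_escape($value);
    }
    return @pairs ? $base . '?' . join('&', @pairs) : $base;
}

# The first request from the slides: 50 pages starting at "Perl", as JSON.
my $url = build_request_url(
    'http://en.wikipedia.org/w/api.php',
    action  => 'query',
    list    => 'allpages',
    apfrom  => 'Perl',
    aplimit => 50,
    format  => 'json',
);
print "$url\n";
```

Changing a single pair (for example `format => 'xml'` or `aplimit => 5`) reproduces the other variations shown in the slides.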
Example: Google Feed API
https://ajax.googleapis.com/ajax/services/feed/find?v=1.0&q=Perl
“List all feeds whose title contains ‘Perl’”

Example: Google Feed API
https://ajax.googleapis.com/ajax/services/    Base URL
feed/find?                                    Specify operation
q=Perl                                        Query parameter
v=1.0                                         Protocol version

Example: Google Feed API
https://ajax.googleapis.com/ajax/services/feed/load?v=1.0&q=http://www.perl.com/pub/atom.xml&num=10
“Show the most recent 10 entries from the feed http://www.perl.com/pub/atom.xml”

Example: Google Feed API
https://ajax.googleapis.com/ajax/services/    Base URL
feed/load?                                    Specify operation
q=http://www.perl.com/pub/atom.xml            Query parameter
v=1.0                                         Protocol version
num=10                                        Size of result set

Example: Google Feed API
https://ajax.googleapis.com/ajax/services/    Base URL
feed/load?                                    Specify operation
q=http://www.perl.com/pub/atom.xml            Query parameter
v=1.0                                         Protocol version
num=NOMNOMNOM                                 * bad value *

Example: Google Feed API
https://ajax.googleapis.com/ajax/services/    Base URL
feed/load?                                    Specify operation
q=http://www.perl.com/pub/atom.xml            Query parameter
v=1.0                                         Protocol version
numm=10                                       * bad parameter *

Example: Google Feed API
https://ajax.googleapis.com/ajax/services/    Base URL
feed/load?                                    Specify operation
q=http://www.perl.com/pub/atom.xml            Query parameter
                                              * missing version *

Example: Google Feed API
https://ajax.googleapis.com/ajax/services/    Base URL only

Example: Google Feed API
Documentation is at: http://developers.google.com/feed/v1/jsondevguide
What makes a good Web Data Service CODEBASE, from the point of view of the PROGRAMMER?
• Easy to implement
• Easy to document
• Easy to maintain
• Low overhead
Basic data service procedure
• Parse URL
• Determine operation and result format
• Validate and clean the parameter values
• Get data from the backend (using the parameter values)
• Serialize the data in the selected format
• Set HTTP response headers appropriately
• If anything goes wrong, generate an error response
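The procedure above can be sketched as one function that takes a URL path and a parameter hash and returns a status, headers, and body. This is an illustration in plain Perl with core modules only, not how Web::DataService or Dancer implement it; the `%TAXA` data, the operation table, and `handle_request` are all hypothetical.

```perl
use strict;
use warnings;
use JSON::PP;

# Hypothetical in-memory data standing in for the backend data store.
my %TAXA = (
    1 => { id => 1, name => 'Felis' },
    2 => { id => 2, name => 'Canis' },
);

# Hypothetical operations, keyed by URL path.
my %OPERATIONS = (
    'taxa/single' => sub {
        my ($p) = @_;
        my $record = $TAXA{ $p->{id} };
        return $record ? [$record] : [];
    },
    'taxa/broken' => sub { die "database connection lost\n" },
);

# One serializer and content type per result format.
my %FORMATS = (
    json => {
        content_type => 'application/json',
        serialize    => sub { JSON::PP->new->canonical->encode({ records => $_[0] }) },
    },
    txt => {
        content_type => 'text/plain',
        serialize    => sub { join '', map { sprintf("%s\t%s\n", $_->{id}, $_->{name}) } @{ $_[0] } },
    },
);

sub handle_request {
    my ($path, $params) = @_;
    my $response = eval {
        # Parse the URL path; determine operation and result format.
        my ($op, $suffix) = $path =~ m{^/?(.+)\.(\w+)$}
            or die [400, "No format suffix in '$path'"];
        my $handler = $OPERATIONS{$op}  or die [404, "Unknown operation '$op'"];
        my $format  = $FORMATS{$suffix} or die [415, "Unknown format '$suffix'"];

        # Validate and clean the parameter values.
        my $id = $params->{id} // '';
        $id =~ /^[1-9][0-9]*$/
            or die [400, "Parameter 'id' must be a positive integer"];

        # Get data from the backend, serialize it, and set the headers.
        my $records = $handler->({ id => $id });
        return {
            status  => 200,
            headers => { 'Content-Type' => $format->{content_type} },
            body    => $format->{serialize}->($records),
        };
    };
    return $response if $response;

    # If anything goes wrong, generate an error response. Deliberate errors
    # carry their own status; anything else becomes a generic 500.
    my ($status, $message) = ref($@) eq 'ARRAY' ? @{$@} : (500, 'Internal server error');
    return {
        status  => $status,
        headers => { 'Content-Type' => 'text/plain' },
        body    => "$message\n",
    };
}
```

For example, `handle_request('taxa/single.json', { id => 1 })` yields a 200 response with a JSON body, while an unknown path or format yields 404 or 415 without ever touching the data store.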
Introducing Web::DataService
• On CPAN as Web::DataService
• Built on top of Dancer
• You define operations, parameter rules, and output blocks, and it handles the rest
• Complete enough for real use
• Documentation still incomplete
• Needs collaborators, testers, users

Important early decisions
• Which framework to use
• How to validate parameter values
• How to organize your parameter space
• How to handle output formats
• How to implement the response procedure
• How to handle versioning
• How to report errors
• How to handle documentation

Decisions that can wait
• Which HTTP server to use
• Which backend framework to use
• Strategies for caching and other performance enhancements

Plan for these from the start:
• Multiple output formats
• Multiple output vocabularies
• Multiple protocol versions
• Auto-generated documentation

Decision 1: which framework?
• Dancer 1
• Dancer 2
• Mojolicious
• Web::DataService
Decision 2: parameter values
• How will the parameter values be validated and cleaned?
• Recommendation: use HTTP::Validate
    use HTTP::Validate qw(:keywords :validators);   # provides define_ruleset and POS_VALUE

    define_ruleset('1.1:taxa:specifier' =>
        # Look up a taxon by name (wildcards allowed)
        { param => 'name', valid => \&TaxonData::validNameSpec, alias => 'taxon_name' },
            "Return information about the most fundamental taxonomic name",
            "matching this string. The C<%> and C<_> characters may be used",
            "as wildcards.",
        # Look up a taxon by numeric identifier
        { param => 'id', valid => POS_VALUE, alias => 'taxon_id' },
            "Return information about the taxonomic name corresponding to this",
            "identifier.",
        # The two lookups are mutually exclusive
        { at_most_one => ['name', 'id'] },
            "You may not specify both C<name> and C<id> in the same query.");
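To make concrete what a ruleset like this enforces, here is a plain-Perl sketch of the same three checks: per-parameter validation with aliases, rejection of unknown parameters, and an "at most one" constraint. It does not use HTTP::Validate and does not reproduce its API; the rule table, the validation patterns, and `check_params_sketch` are assumptions made for illustration.

```perl
use strict;
use warnings;

# Hypothetical rule table, loosely modeled on the ruleset above.
my @RULES = (
    { param => 'name', alias => 'taxon_name', valid => sub { $_[0] =~ /^[A-Za-z%_. ]+$/ } },
    { param => 'id',   alias => 'taxon_id',   valid => sub { $_[0] =~ /^[1-9][0-9]*$/ } },
);
my @AT_MOST_ONE = ('name', 'id');

# Returns (\%clean_values, \@error_messages).
sub check_params_sketch {
    my ($raw) = @_;
    my (%clean, %known, @errors);

    for my $rule (@RULES) {
        my @names = ($rule->{param}, $rule->{alias});
        $known{$_} = 1 for @names;

        # Accept the parameter under its own name or its alias.
        my ($given) = grep { exists $raw->{$_} } @names;
        next unless defined $given;

        # Clean the value (trim whitespace), then validate it.
        my $value = $raw->{$given};
        $value =~ s/^\s+|\s+$//g;
        if ($rule->{valid}->($value)) {
            $clean{ $rule->{param} } = $value;
        }
        else {
            push @errors, "bad value '$value' for parameter '$given'";
        }
    }

    # Report parameters that no rule recognizes.
    push @errors, map { "unknown parameter '$_'" }
                  grep { !$known{$_} } sort keys %$raw;

    # Enforce the mutual-exclusion constraint.
    my @present = grep { exists $clean{$_} } @AT_MOST_ONE;
    push @errors, "you may not specify both '$present[0]' and '$present[1]'"
        if @present > 1;

    return (\%clean, \@errors);
}
```

A request with `numm=10` or `id=NOMNOMNOM` would be reported here as an error, which is the behavior the earlier API examples probe with deliberately bad parameters.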
Decision 3: parameter space
• How will users specify which operation to do?
  • http://exmpl.com/service/some/thing?…
  • http://exmpl.com/service?op=something&…

Decision 4: output formats
• How will users specify the output format?
  • http://exmpl.com/service/something.json?…
  • http://exmpl.com/service?…&format=json…
• Recommendation: separate the definition of output fields from output formats
    $ds->define_block('1.1:taxa:basic' =>
        { output => 'taxon_no', dwc_name => 'taxonID', com_name => 'oid' },
            "A positive integer that uniquely identifies this taxonomic name",
        { output => 'record_type', com_name => 'typ', com_value => 'txn',
          dwc_value => 'Taxon', value => 'taxon' },
            "The type of this record. By vocabulary:", "=over",
            "=item pbdb", "taxon", "=item com", "txn", "=item dwc", "Taxon", "=back",
        { set => 'rank', if_vocab => 'pbdb,dwc', lookup => \%RANK_STRING },
        { output => 'rank', dwc_name => 'taxonRank', com_name => 'rnk' },
            "The rank of this taxon, ranging from subspecies up to kingdom",
        { output => 'taxon_name', dwc_name => 'scientificName', com_name => 'nam' },
            "The scientific name of this taxon",
        { output => 'common_name', dwc_name => 'vernacularName', com_name => 'nm2' },
            "The common (vernacular) name of this taxon, if any",
        { set => 'attribution', if_field => 'a_al1', from_record => 1,
          code => \&generateAttribution },
        …
    );
Web::DataService provides:
• Web::DataService::Plugin::JSON
• Web::DataService::Plugin::XML
• Web::DataService::Plugin::Text
• you can add your own
• Output is delegated to the appropriate module based on the selected format
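The benefit of keeping field definitions separate from formats is that each formatter can be written once and reused for every block and vocabulary. The sketch below shows the idea in plain Perl: the vocabulary names (pbdb, com, dwc) and field labels are borrowed from the block definition above, but the `@FIELDS` structure, `render_record`, and `serialize` are hypothetical and are not the Web::DataService plugin interface.

```perl
use strict;
use warnings;
use JSON::PP;

# One entry per output field, with its label in each vocabulary.
my @FIELDS = (
    { field => 'taxon_no',   pbdb => 'taxon_no',   com => 'oid', dwc => 'taxonID' },
    { field => 'taxon_name', pbdb => 'taxon_name', com => 'nam', dwc => 'scientificName' },
    { field => 'rank',       pbdb => 'rank',       com => 'rnk', dwc => 'taxonRank' },
);

# Turn one record into an ordered list of [label, value] pairs,
# skipping fields that have no value.
sub render_record {
    my ($record, $vocab) = @_;
    return map  { [ $_->{$vocab}, $record->{ $_->{field} } ] }
           grep { defined $record->{ $_->{field} } } @FIELDS;
}

# Formatters know nothing about individual fields.
my %FORMATTERS = (
    json => sub {
        my ($records, $vocab) = @_;
        my @out;
        for my $record (@$records) {
            my %labeled = map { @$_ } render_record($record, $vocab);
            push @out, \%labeled;
        }
        return JSON::PP->new->canonical->encode({ records => \@out });
    },
    txt => sub {
        my ($records, $vocab) = @_;
        my @lines = join("\t", map { $_->{$vocab} } @FIELDS);   # header row
        for my $record (@$records) {
            push @lines, join("\t", map { $record->{ $_->{field} } // '' } @FIELDS);
        }
        return join("\n", @lines) . "\n";
    },
);

# Delegate to the formatter for the selected format.
sub serialize {
    my ($format, $records, $vocab) = @_;
    my $formatter = $FORMATTERS{$format} or die "Unknown format '$format'\n";
    return $formatter->($records, $vocab);
}
```

Adding a new format means adding one entry to `%FORMATTERS`; adding a new field or vocabulary means touching only `@FIELDS`.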
Decision 5: procedure
• How will you handle the basic request-response procedure?
• Recommendation: specify a set of attributes for each operation, and use a single body of code to handle operation execution
    $ds->define_path({ path => 'taxa',
                       class => 'TaxonData',
                       output => '1.1:taxa:basic',
                       doc_title => 'Taxonomic names' });

    $ds->define_path({ path => 'taxa/single',
                       allow_format => 'json,csv,tsv,txt,xml',
                       allow_vocab => 'com,pbdb,dwc',
                       method => 'get',
                       doc_title => 'Single taxon' });

    $ds->define_path({ path => 'taxa/list',
                       allow_format => 'json,csv,tsv,txt,xml',
                       allow_vocab => 'com,pbdb,dwc',
                       method => 'list',
                       doc_title => 'Lists of taxa' });
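One way to read these definitions is as a table of attributes that a single execution routine consults for every request. The sketch below illustrates that pattern in plain Perl. It assumes that a path falls back to its parent for attributes it does not set (as 'taxa/single' appears to do for `class` above); that assumption, the `TaxonDataSketch` class, `path_attr`, and `execute` are all hypothetical and not taken from Web::DataService.

```perl
use strict;
use warnings;

# Hypothetical stand-in for the data class named in the path definitions.
package TaxonDataSketch {
    sub get  { my ($class, $params) = @_; return "single taxon $params->{id}" }
    sub list { my ($class, $params) = @_; return "list of taxa" }
}

# Attribute table: one entry per path.
my %PATHS = (
    'taxa'        => { class => 'TaxonDataSketch', output => '1.1:taxa:basic' },
    'taxa/single' => { allow_format => 'json,csv,tsv,txt,xml', method => 'get' },
    'taxa/list'   => { allow_format => 'json,csv,tsv,txt,xml', method => 'list' },
);

# Look up an attribute on a path, falling back to parent paths.
sub path_attr {
    my ($path, $attr) = @_;
    while (1) {
        my $node = $PATHS{$path};
        return $node->{$attr} if $node && defined $node->{$attr};
        $path =~ s{/[^/]+$}{} or return;   # strip last component, or give up
    }
}

# The single body of code that executes any operation.
# Returns (status, result-or-message).
sub execute {
    my ($path, $format, $params) = @_;

    my $method = path_attr($path, 'method')
        or return (404, "No operation at '$path'");

    my %allowed = map { $_ => 1 } split /,/, path_attr($path, 'allow_format') // '';
    return (415, "Format '$format' is not allowed for '$path'")
        unless $allowed{$format};

    my $class = path_attr($path, 'class');
    return (200, $class->$method($params));
}
```

Adding an operation then means adding one table entry and one method, with no new request-handling code.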
Decision 6: versioning
• How will users specify which protocol version?
  • http://exmpl.com/service/some/thing?…&v=1.0
  • http://exmpl.com/service1.0/some/thing?…
• Recommendation: make your users specify a version from the very beginning
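Either URL style can be resolved to a (version, operation) pair before any other processing, and a request with no version can be refused outright. The following sketch assumes the `/service` path prefix from the example URLs and a made-up list of supported versions; `resolve_version` is a hypothetical helper.

```perl
use strict;
use warnings;

# Hypothetical list of protocol versions this service understands.
my %SUPPORTED = map { $_ => 1 } qw(1.0 1.1);

# Accepts "/service1.0/some/thing" or "/service/some/thing" with v=1.0.
# Returns (version, operation path) or dies with a client-facing message.
sub resolve_version {
    my ($path, $params) = @_;
    my ($version, $operation);

    if ($path =~ m{^/service([0-9]+\.[0-9]+)/(.+)$}) {
        ($version, $operation) = ($1, $2);              # version in the path
    }
    elsif ($path =~ m{^/service/(.+)$}) {
        ($version, $operation) = ($params->{v}, $1);    # version as a parameter
    }
    else {
        die "Unrecognized path '$path'\n";
    }

    die "You must specify a protocol version\n"
        unless defined $version && length $version;
    die "Unsupported protocol version '$version'\n"
        unless $SUPPORTED{$version};

    return ($version, $operation);
}
```

Requiring the version from day one means that a later, incompatible version can be introduced without breaking any client that is already in use.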
Decision 7: error reporting
• Recommendation: report errors in JSON if that format was selected
• Recommendation: use the HTTP status codes
  • 400 Bad Request
  • 404 Not Found
  • 415 Unsupported Media Type
  • 500 Internal Server Error
• Recommendation: if your code throws an exception, report a generic message to the client rather than the exception text
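These three recommendations fit in one small routine. The sketch below assumes that deliberate errors are raised as `[code, message]` array references and that anything else is an unexpected exception; that convention, the JSON field names `status_code` and `errors`, and `error_response` itself are illustrative choices, not part of any framework.

```perl
use strict;
use warnings;
use JSON::PP;

# Standard reason phrases for the status codes recommended above.
my %REASON = (
    400 => 'Bad Request',
    404 => 'Not Found',
    415 => 'Unsupported Media Type',
    500 => 'Internal Server Error',
);

# $error is either [code, message] for a deliberate, client-facing error,
# or anything else (such as an exception string) for an unexpected failure.
# Returns (status code, headers, body).
sub error_response {
    my ($format, $error) = @_;

    # Default to a generic message so internal details never leak.
    my ($code, $message) = (500, 'A server error occurred');
    if (ref($error) eq 'ARRAY' && exists $REASON{ $error->[0] // 0 }) {
        ($code, $message) = @$error;
    }
    # A real service would log the original $error here.

    my ($type, $body);
    if ($format eq 'json') {
        $type = 'application/json';
        $body = JSON::PP->new->canonical->encode({
            status_code => $code,
            errors      => [$message],
        });
    }
    else {
        $type = 'text/plain';
        $body = "$code $REASON{$code}: $message\n";
    }
    return ($code, { 'Content-Type' => $type }, $body);
}
```

A client that asked for JSON can then parse the error body with the same code path it uses for successful responses.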
Decision 8: documentation
• Recommendation: auto-generate documentation as much as possible
• Recommendation: a request using the base URL with no parameters should return the main documentation page
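Because the ruleset and block definitions shown earlier already interleave rules with documentation strings, the same data can drive both validation and the documentation page. Here is a minimal sketch of that idea which walks such a list and emits a POD item list. The example ruleset is abbreviated and `document_ruleset` is a hypothetical function, not the generator that HTTP::Validate or Web::DataService provide.

```perl
use strict;
use warnings;

# A ruleset in the style shown earlier: each rule hash is followed by the
# documentation strings that describe it. The contents are illustrative.
my @RULESET = (
    { param => 'name', alias => 'taxon_name' },
        "Return information about the taxonomic name matching this string.",
    { param => 'id', alias => 'taxon_id' },
        "Return information about the taxonomic name corresponding to this",
        "identifier.",
);

# Attach each run of strings to the preceding rule, then emit a POD list.
sub document_ruleset {
    my @items = @_;

    my @entries;
    for my $item (@items) {
        if (ref($item) eq 'HASH') {
            push @entries, { rule => $item, doc => [] };
        }
        elsif (@entries) {
            push @{ $entries[-1]{doc} }, $item;
        }
    }

    my $pod = "=over\n\n";
    for my $entry (@entries) {
        my $rule = $entry->{rule};
        next unless defined $rule->{param};
        my $alias = defined $rule->{alias} ? " (alias: $rule->{alias})" : '';
        my $text  = join ' ', @{ $entry->{doc} };
        $pod .= "=item $rule->{param}$alias\n\n$text\n\n";
    }
    return $pod . "=back\n";
}

print document_ruleset(@RULESET);
```

Since the documentation is derived from the rules themselves, it cannot drift out of date when a parameter is added or renamed.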
Other recommendations
• Recommendation: know the HTTP protocol
  • Status codes (400, 404, 500, 301, etc.)
  • CORS ("Access-Control-Allow-Origin")
  • Cache-Control
  • Content-Type
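As a small illustration of the headers listed above, the sketch below builds a header set for a successful response. The specific values are assumptions suited to a public, read-only data service (any origin allowed, caching controlled by a `max_age` option); `response_headers` is a hypothetical helper and the right values for a given service may differ.

```perl
use strict;
use warnings;

# Content types for the formats mentioned in the slides.
my %CONTENT_TYPE = (
    json => 'application/json; charset=utf-8',
    xml  => 'text/xml; charset=utf-8',
    txt  => 'text/plain; charset=utf-8',
);

# Build response headers for a given format.
# Options: max_age => seconds the response may be cached (default 0).
sub response_headers {
    my ($format, %opt) = @_;
    my $type = $CONTENT_TYPE{$format} or die "Unknown format '$format'\n";
    my $max_age = $opt{max_age} // 0;

    return {
        'Content-Type'                => $type,
        # Allow browser scripts from any origin to read the response.
        'Access-Control-Allow-Origin' => '*',
        # Let clients and intermediate caches reuse the response, or not.
        'Cache-Control'               => $max_age > 0 ? "public, max-age=$max_age" : 'no-cache',
    };
}
```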