Cron Jobs, Asynchronous Tasks, and Deployment to Google App Engine http://www.flickr.com/photos/maggiew/6145245962/
What if you want to perform computation even when nobody is visiting your site?
• Options for deploying asynchronous tasks
  • Create a "cron" job that runs on Google App Engine on an automatic schedule
  • Create a "task" that repeatedly calls itself
  • Create a "backend" that can continually run
  • Create a script that runs on a computer elsewhere, that calls your server periodically
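The last option, a script on another machine that calls your server periodically, could be sketched in Java roughly as follows. This is a minimal sketch under stated assumptions: the class `PeriodicCaller`, its helper methods, and the host and path passed on the command line are all hypothetical, and the target URL is assumed to be reachable without an admin login. It needs Java 11 or later for `java.net.http`.

```java
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

final class PeriodicCaller {
    // Joins a base URL and a path with exactly one slash between them.
    // Both arguments are placeholders supplied by the caller (assumptions).
    static URI targetUri(String baseUrl, String path) {
        String base = baseUrl.endsWith("/")
                ? baseUrl.substring(0, baseUrl.length() - 1) : baseUrl;
        String p = path.startsWith("/") ? path : "/" + path;
        return URI.create(base + p);
    }

    // Issues one GET request and returns the HTTP status code.
    static int callOnce(HttpClient client, URI uri) throws Exception {
        HttpRequest request = HttpRequest.newBuilder(uri).GET().build();
        return client.send(request, HttpResponse.BodyHandlers.discarding()).statusCode();
    }

    // Runs 'job' immediately and then at a fixed rate; the caller owns the
    // returned scheduler and is responsible for shutting it down.
    static ScheduledExecutorService schedule(Runnable job, long period, TimeUnit unit) {
        ScheduledExecutorService scheduler = Executors.newSingleThreadScheduledExecutor();
        scheduler.scheduleAtFixedRate(job, 0, period, unit);
        return scheduler;
    }

    public static void main(String[] args) {
        if (args.length < 2) {
            System.out.println("usage: PeriodicCaller <baseUrl> <path>");
            return;
        }
        HttpClient client = HttpClient.newHttpClient();
        URI uri = targetUri(args[0], args[1]);
        schedule(() -> {
            try {
                System.out.println("status " + callOnce(client, uri));
            } catch (Exception e) {
                // Swallow the error so one failed call does not cancel later runs.
                System.err.println("call failed: " + e);
            }
        }, 15, TimeUnit.MINUTES);
    }
}
```

Catching exceptions inside the scheduled job matters here because `scheduleAtFixedRate` stops scheduling a job once it throws.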
Concept of a cron job
• Originally the name of a task-scheduling system provided in Unix
• Now available on many platforms
• General features:
  • Run periodically
  • Run at a certain time
Chronos: personification/god of time http://en.wikipedia.org/wiki/Chronos
Mapping a URL to a servlet: web.xml
<?xml version="1.0" encoding="utf-8" standalone="no"?>
<web-app xmlns="http://java.sun.com/xml/ns/javaee"
         xmlns:web="http://java.sun.com/xml/ns/javaee/web-app_2_5.xsd"
         xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
         version="2.5"
         xsi:schemaLocation="http://java.sun.com/xml/ns/javaee http://java.sun.com/xml/ns/javaee/web-app_2_5.xsd">
  <servlet>
    <servlet-name>Weatherator</servlet-name>
    <servlet-class>edu.oregonstate.eecs.weatherator.WeatheratorServlet</servlet-class>
  </servlet>
  <servlet-mapping>
    <servlet-name>Weatherator</servlet-name>
    <url-pattern>/cron/weatherator</url-pattern>
  </servlet-mapping>
  <security-constraint>
    <web-resource-collection>
      <url-pattern>/cron/*</url-pattern>
    </web-resource-collection>
    <auth-constraint>
      <role-name>admin</role-name>
    </auth-constraint>
  </security-constraint>
</web-app>
• This web.xml file in WEB-INF specifies:
  • There is a servlet called Weatherator
  • Requests to /cron/weatherator should be handled by Weatherator
  • Only admin users (including cron service) can access this URL
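To see why the `/cron/*` security constraint covers `/cron/weatherator`, here is a simplified sketch of how servlet URL patterns are matched. It is an approximation for illustration only, not the container's actual implementation: the class `UrlPatterns` and its `matches` method are hypothetical, and the default pattern `/` is deliberately not handled.

```java
final class UrlPatterns {
    // Simplified servlet-style matching: path-prefix ("/dir/*"),
    // extension ("*.ext"), and exact patterns.
    static boolean matches(String pattern, String path) {
        if (pattern.endsWith("/*")) {
            // "/cron/*" covers "/cron" itself and everything below "/cron/".
            String prefix = pattern.substring(0, pattern.length() - 2);
            return path.equals(prefix) || path.startsWith(prefix + "/");
        }
        if (pattern.startsWith("*.")) {
            // "*.jsp" covers any path ending in ".jsp".
            return path.endsWith(pattern.substring(1));
        }
        return pattern.equals(path);
    }

    public static void main(String[] args) {
        System.out.println(matches("/cron/weatherator", "/cron/weatherator")); // true
        System.out.println(matches("/cron/*", "/cron/weatherator"));           // true
        System.out.println(matches("/cron/*", "/cronjobs"));                   // false
    }
}
```

The servlet mapping uses an exact pattern, while the security constraint uses a path-prefix pattern, so any servlet later mapped under `/cron/` would be protected by the same constraint.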
Setting up a cron job: cron.xml
<?xml version="1.0" encoding="UTF-8"?>
<cronentries>
  <cron>
    <url>/cron/weatherator</url>
    <description>Gather weather alerts</description>
    <schedule>every 15 minutes</schedule>
  </cron>
</cronentries>
Put this cron.xml file in the WEB-INF directory. It indicates that a GET request will be issued to the /cron/weatherator URL approximately every 15 minutes.
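Structure of a typical servlet
package edu.oregonstate.eecs.weatherator;

import java.io.IOException;
import javax.servlet.http.HttpServlet;
import javax.servlet.http.HttpServletRequest;
import javax.servlet.http.HttpServletResponse;

@SuppressWarnings("serial")
public class WeatheratorServlet extends HttpServlet {
    // GET and POST are both routed to the same handler
    public void doGet(HttpServletRequest req, HttpServletResponse resp) throws IOException {
        handleRequest(req, resp);
    }

    public void doPost(HttpServletRequest req, HttpServletResponse resp) throws IOException {
        handleRequest(req, resp);
    }

    // rest of your code (including handleRequest) would go here …
}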
What about this Weatherator??
• Little servlet that…
  • Retrieves XML from National Weather Service showing a list of all weather alerts in Oregon
  • Parses the XML to get the list of alerts
  • Checks to see which alerts are new
  • Sends an email to me for each new alert
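The parse-and-compare steps above can be sketched with the JDK's built-in XML parser. This is not the Weatherator's actual code: the class `AlertFilter`, the element names `entry` and `id`, and the sample XML are assumptions made for illustration, and the retrieval and email steps are left out (the sketch just prints).

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

final class AlertFilter {
    // Returns the text of the first <id> inside each <entry> element.
    // The element names are assumed, not taken from the real feed.
    static List<String> parseAlertIds(String xml) {
        try {
            Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
            NodeList entries = doc.getElementsByTagName("entry");
            List<String> ids = new ArrayList<>();
            for (int i = 0; i < entries.getLength(); i++) {
                NodeList idNodes = ((Element) entries.item(i)).getElementsByTagName("id");
                if (idNodes.getLength() > 0) {
                    ids.add(idNodes.item(0).getTextContent().trim());
                }
            }
            return ids;
        } catch (Exception e) {
            throw new IllegalArgumentException("could not parse alert XML", e);
        }
    }

    // Returns the ids not yet in 'seen' and records them as seen.
    static List<String> newAlerts(List<String> ids, Set<String> seen) {
        List<String> fresh = new ArrayList<>();
        for (String id : ids) {
            if (seen.add(id)) {   // add() returns false for ids already present
                fresh.add(id);
            }
        }
        return fresh;
    }

    public static void main(String[] args) {
        String xml = "<feed>"
                + "<entry><id>alert-1</id><title>Flood watch</title></entry>"
                + "<entry><id>alert-2</id><title>High wind</title></entry>"
                + "</feed>";
        Set<String> seen = new HashSet<>(List.of("alert-1")); // handled on an earlier run
        for (String id : newAlerts(parseAlertIds(xml), seen)) {
            System.out.println("would send email for " + id);
        }
    }
}
```

In a real servlet the `seen` set would have to live in the data store, since instance memory does not survive between cron runs.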
Other uses for asynchronous tasks
• Performing a very long-running computation
  • E.g., Searching a data set for aliens, to help SETI
• Periodic clean-up of your data store
  • E.g., Deleting old entities you don't need any more
• Transforming data in your data store
  • E.g., Combining photos into a huge mosaic
Also useful for modifying many entities… Think about the example we had before…
• How many entities are modified when the servlet is hit by a browser?
• So how many entity groups are modified?
• What if we wanted to modify more than 1 entity?
Transaction trans = pm.currentTransaction();
trans.begin();
Test1Object obj = null;
try {
    // load the existing entity, if there is one
    obj = (Test1Object) pm.getObjectById(Test1Object.class, title);
} catch (JDOObjectNotFoundException doesNotExist) {
    // otherwise create it
    obj = new Test1Object();
    obj.setTitle(title);
}
pm.makePersistent(obj);
trans.commit();
What if your servlet needs to modify entities in more than one group?
• Use transactional tasks
  • GAE has a “queue” of tasks that will be handled asynchronously
  • You can schedule a task to occur if and only if a transaction successfully completes
    • Aka “transactional task”
  • GAE will keep retrying the task till it succeeds
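The two guarantees above (a task is enqueued only if the transaction commits, and a failing task is retried until it succeeds) can be modeled in plain Java. This is an in-memory illustration only, not the App Engine API: `TransactionalTaskDemo`, `TaskQueue`, and `Transaction` are hypothetical names, and a task signals failure by returning `false`.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;
import java.util.function.BooleanSupplier;

final class TransactionalTaskDemo {
    // A queue whose tasks return true on success and false to ask for a retry.
    static final class TaskQueue {
        private final Deque<BooleanSupplier> pending = new ArrayDeque<>();

        void add(BooleanSupplier task) { pending.addLast(task); }

        int size() { return pending.size(); }

        // Runs tasks until none remain, re-queuing failures; returns total attempts.
        int drain() {
            int attempts = 0;
            while (!pending.isEmpty()) {
                BooleanSupplier task = pending.removeFirst();
                attempts++;
                if (!task.getAsBoolean()) {
                    pending.addLast(task); // keep retrying until it succeeds
                }
            }
            return attempts;
        }
    }

    // Buffers tasks and hands them to the queue only on commit.
    static final class Transaction {
        private final TaskQueue queue;
        private final List<BooleanSupplier> buffered = new ArrayList<>();

        Transaction(TaskQueue queue) { this.queue = queue; }

        void enqueue(BooleanSupplier task) { buffered.add(task); }

        void commit() { buffered.forEach(queue::add); buffered.clear(); }

        void rollback() { buffered.clear(); } // tasks are dropped with the transaction
    }

    public static void main(String[] args) {
        TaskQueue queue = new TaskQueue();

        Transaction failed = new Transaction(queue);
        failed.enqueue(() -> true);
        failed.rollback();
        System.out.println("queued after rollback: " + queue.size()); // 0

        int[] calls = {0};
        Transaction ok = new Transaction(queue);
        ok.enqueue(() -> ++calls[0] >= 3); // fails twice, succeeds on the third try
        ok.commit();
        System.out.println("queued after commit: " + queue.size());   // 1
        System.out.println("attempts: " + queue.drain());             // 3
    }
}
```

Because a task may run more than once, its work should be safe to repeat.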
Very common pattern
Query = new query(“select keys of root entities”)
If (cursor != null) slide query to cursor
Restrict query to return 1 root entity at a time
Result = execute query, thereby retrieving 1 root
Start a transaction
    Modify entities in this root
    Slide cursor forward to next root entity
    Schedule task, pass it the cursor
Commit transaction
Example: Modify entities in one group, fire off asynchronous task to handle next group
String cursorPosition = request.getParameter("cursor");
Query query = pm.newQuery("select id from " + Test3JDOEmployer.getName());
if (cursorPosition != null) {
    // for first group, cursorPosition == null; else == next one to handle
    Map<String, Object> tmpMap = new HashMap<String, Object>();
    tmpMap.put(JDOCursorHelper.CURSOR_EXTENSION, Cursor.fromWebSafeString(cursorPosition));
    query.setExtensions(tmpMap);
}
query.setRange(0, 1); // do one entity group per hit to this jsp
List<String> toUpdate = (List<String>) (query.execute());
if (toUpdate != null && toUpdate.size() > 0) {
    Transaction trans = pm.currentTransaction();
    trans.begin();
    // get the root of the entity group
    Test3JDOEmployer employer = (Test3JDOEmployer) (pm.getObjectById(Test3JDOEmployer.class, toUpdate.get(0)));
    // … modify entities in the group
    Cursor cursor = JDOCursorHelper.getCursor(toUpdate);
    cursorPosition = cursor.toWebSafeString();
    TaskOptions task = TaskOptions.Builder.withUrl("/lectures/pathtothisjsp.jsp");
    task = task.param("cursor", cursorPosition);
    QueueFactory.getDefaultQueue().add(task);
    trans.commit();
}
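The control flow of this example (handle one group per hit, then schedule a follow-up task that carries the cursor) can be traced without App Engine using a small simulation. Everything here is a stand-in chosen for illustration: `CursorChainDemo` is a hypothetical class, an integer index plays the role of the cursor, a list of strings plays the role of the root entities, and a `Deque` plays the role of the task queue.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

final class CursorChainDemo {
    // One "hit": handles the root at 'cursor' and returns the next cursor,
    // or -1 when the query finds nothing and the chain should stop.
    static int handleOne(List<String> roots, int cursor, List<String> processed) {
        if (cursor >= roots.size()) {
            return -1;                        // empty result: no follow-up task
        }
        processed.add(roots.get(cursor));     // "modify entities in the group"
        return cursor + 1;                    // slide the cursor forward
    }

    // Runs the whole chain and returns the roots in the order they were handled.
    static List<String> runChain(List<String> roots) {
        List<String> processed = new ArrayList<>();
        Deque<Integer> queue = new ArrayDeque<>();
        queue.add(0);                         // first hit starts at the beginning
        while (!queue.isEmpty()) {
            int next = handleOne(roots, queue.removeFirst(), processed);
            if (next >= 0) {
                queue.add(next);              // schedule the task, pass it the cursor
            }
        }
        return processed;
    }

    public static void main(String[] args) {
        System.out.println(runChain(List.of("EmployerA", "EmployerB", "EmployerC")));
        // prints [EmployerA, EmployerB, EmployerC]
    }
}
```

As in the example, each hit touches only one entity group, so each transaction stays within the one-group limit, and the final task finds no more roots and simply ends the chain.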
A brief word about "backends"
• Backend = an instance of Google App Engine that has no deadline
  • Ordinary requests must finish in 30-60 sec, and even a cron job or a task must finish within 10 minutes
  • Backend instances can take arbitrarily long
• Options for running backends
  • Dynamic backends: start up & shut down automatically, depending on load
  • Resident backends: stay on until you shut them down
Summary: Important cloud concepts we've investigated using Google App Engine
• Web application architecture
• Data store: non-RDBMS support for entities
  • High scalability, potential inconsistency, no joins
  • Entity group: entities with a common root
  • Transaction: can only modify one entity group
• Asynchronous tasks, cron jobs, backends
• A few security issues: login, sessions, validating inputs, hashing passwords, escaping outputs