Tutorial for Web Mining Project: Cloud Computing Platform
Introduction • In the MIS 510 project, your team is required to create a web business, with a complete website and business functionality for specific customers, using either the Google App Engine or the Amazon EC2 platform. • Because Google App Engine and Amazon EC2 have distinct interfaces, service features, and pricing policies, this tutorial gives instructions for each platform in turn.
Google App Engine Tutorial. Written by Jonathan Jiang, updated by Julian Guo.
Overview • A cloud platform for publishing web applications. • Simple, web-based application management console. • Developers can focus on application logic without worrying about hardware, system administration, scalability, etc. • Supports Java, Python, and Go.
Guideline • 0. Preparation • 1. Create a Google Web Application Project • 2. Debug, Run and Deploy • 3. Interaction with User • 4. Use Cloud Database • 5. Pricing
0. Preparation • 0.1. Sign up for a Google App Engine account: • https://appengine.google.com/start • 0.2. Download the App Engine SDK • http://code.google.com/appengine/downloads.html • 0.3. For Java/Eclipse users, it is recommended to download the Eclipse plugins to build, debug and deploy your application. • http://code.google.com/eclipse/docs/download.html
0.1. Sign up for a Google App Engine account: You need to log in to your Gmail account to see this page. Sometimes your xxx@email.arizona.edu account does not work; if so, sign up for a new one. The page shows your application ID; write it down.
Steps 0.2 and 0.3 • Steps 0.2 and 0.3 can be combined in Eclipse: • Help->Install New Software • Type https://dl.google.com/eclipse/plugin/3.7 in the “Work with” field and press Enter. • Then choose the required packages and download them. These are required. Others are optional.
1.1 Create a New Project • Now you should be able to create a Google App Engine project in Eclipse: New->Web Application Project • Type the project name and package you like, then choose the Google SDKs you want to use. Typically you only need ‘Use Google App Engine’ for your SDK.
1.2 File Structure of the Web Application • src/ includes all Java source files for your application. • META-INF/ includes other configuration files. • war/ includes all the files that are deployed and actually used on the server. Images, data, HTML and JSP files are put directly under the war/ folder. • WEB-INF/ includes the libraries used, compiled classes and configuration files.
1.2 File Structure of the Web Application • In the WEB-INF folder, there are two configuration files. • appengine-web.xml • web.xml • The first five lines of appengine-web.xml look like: • Don’t forget to add your registered application ID between the <application> tags. • web.xml is SUPER IMPORTANT. It is mainly responsible for mapping URIs to your servlet classes and web pages. (Examples are provided later.)
2.1. Debug and Run • The Eclipse plugin has already created a Hello World example for you. You can run your project directly and test whether it works. • Right-click the project folder -> Debug As -> Web Application. • In debug mode, Google App Engine creates a server on your local machine, and your project runs on that local server. • If it is running successfully, the console displays a line saying so. • If you use Eclipse, the server is running at http://localhost:8888/ • You can open a web browser and paste the link above to test your project.
2.1. Debug and Run • When the server is running in debug mode, any changes to your project files should be detected automatically by Google App Engine, so you don’t have to rebuild the project (but you still need to refresh the browser to see the changes). • *Don’t over-trust this statement. If you keep encountering the same error, simply rebuilding the project will very likely help. • An exception is web.xml. If you make changes to it, you must rebuild your project.
2.2. Deploy • When you are satisfied with your application, you can deploy it to the cloud environment Google provides so that users all over the world have access to it. • Simply click the ‘Deploy’ icon and enter the account information for your App Engine account. • Now you can visit your application at • http://your-applicationID.appspot.com
3. Interaction with User • Often, you want your application not only to present static information, but also to interact with users. • Your system needs to pass user input from web pages to your Java or Python program. • Here we provide a JSP/Java example of a movie-related web mining application. This example returns a movie’s plot based on the movie name given by the user. • [Diagram labels: Your Application; User Input; Interface (Web Pages / API); Web Mining Component (Server Side Logic); Output]
3.1 Receive User Input • Create form_input.jsp and add the following lines between the <body> </body> tags. • When the user visits form_input.jsp, it shows an input field: • You want to pass the input to your Java servlet (your background program), say, SampleServlet.java
3.1 Receive User Input • You need to configure web.xml to let the system know how to map the form submission URI to the appropriate Java class. The following example shows such a mapping: • http://your-applicationID.appspot.com/processinput -> SampleServlet.class
3.2 Process User Input • Copy the following code to SampleServlet.java • Use the req.getParameter() method to obtain the user input (movie name) and process it in SampleServlet.java. An external API is used to retrieve the movie’s plot from the web.
3.2 Process User Input • Here’s a snippet of the API usage code. The complete sample code is given in ‘samplecode.rar’.
3.3 Return the Output to User • Now you can display the results to the user by adding a line to the designated JSP page. In this example, we use the same JSP page as for user input. Now form_input.jsp should look like: • Try it at http://localhost:8888/form_input.jsp
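The step from user input to API request can be sketched as follows. This is not the code from samplecode.rar: the class name PlotQuery, the base URL and the "title" query parameter are assumptions made for illustration, and the real external API may expect something different. The sketch only shows that the movie name should be validated and URL-encoded before it is appended to a request URL.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;

// Hypothetical helper, not part of the original sample code.
final class PlotQuery {
    private PlotQuery() {}

    // Builds the request URL for a movie-plot lookup.
    // baseUrl and the "title" parameter name are placeholders for
    // whatever external API your project actually calls.
    static String buildUrl(String baseUrl, String movieName) {
        if (movieName == null || movieName.trim().isEmpty()) {
            throw new IllegalArgumentException("movie name is required");
        }
        try {
            // Encode spaces, '&', non-ASCII letters, etc. so the name
            // cannot break the query string.
            return baseUrl + "?title=" + URLEncoder.encode(movieName.trim(), "UTF-8");
        } catch (UnsupportedEncodingException e) {
            // UTF-8 support is required of every Java platform.
            throw new IllegalStateException(e);
        }
    }
}
```

Inside the servlet, the value returned by req.getParameter() for your form field would be passed as movieName, for example PlotQuery.buildUrl("https://api.example.com/plot", name), where the host is a placeholder.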
4. Use Cloud Database • Situations where using a cloud database may help: • Remember user activities. • Store the results of the web mining process to speed up the next inquiry. • Upload a large file that is a component of your application. • …. • In the next slides we show an example of using Google Datastore to save and retrieve users’ comments for movies.
4. Use Cloud Database • Update form_input.jsp to receive user comments:
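The second situation above, storing web mining results to speed up the next inquiry, can be sketched as a small cache. This is only an illustration under stated assumptions: the in-memory map stands in for the cloud database (it is not the Datastore API), the class name PlotCache is hypothetical, and the lambda syntax assumes Java 8 or later.

```java
import java.util.Locale;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;

// Hypothetical cache; the map is a stand-in for a cloud database.
final class PlotCache {
    private final Map<String, String> store = new ConcurrentHashMap<>();
    private final Function<String, String> fetcher;

    // fetcher performs the slow lookup (e.g. the external API call).
    PlotCache(Function<String, String> fetcher) {
        this.fetcher = fetcher;
    }

    // Returns the stored plot if present; otherwise calls the fetcher
    // once with the normalized name and stores the result.
    String plotFor(String movieName) {
        String key = movieName.trim().toLowerCase(Locale.ROOT);
        return store.computeIfAbsent(key, fetcher);
    }
}
```

With this shape, repeated requests for the same movie (ignoring case and surrounding spaces) trigger only one slow lookup. In a deployed app the map would be replaced by reads and writes against the database, since instance memory is not shared or persistent.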
4.1 Google Datastore • 4.1.1 Store Comments • Add this component to SampleServlet.java • (For the complete sample, please refer to samplecode.rar)
4.1 Google Datastore • 4.1.1 Store Comments • Modify SampleServlet.java as: • (For the complete sample, please refer to samplecode.rar)
4.1 Google Datastore • 4.1.1 Store Comments • Modify SampleServlet.java as (cont’d): • (For the complete sample, please refer to samplecode.rar)
4.1 Google Datastore • 4.1.2 Retrieve Comments • Add this component to form_input.jsp, before <html> • (For the complete sample, please refer to samplecode.rar)
4.1 Google Datastore • 4.1.2 Retrieve Comments • Add this component to form_input.jsp, between <body> and </body>. • (For the complete sample, please refer to samplecode.rar)
4. Use Cloud Database • Advantages of Google Datastore: • Google provides data management capacity for you. • Very flexible (schemaless). • Option to view and manage the data online: • Log in to Google App Engine: https://appengine.google.com/, choose your application -> Datastore Viewer • Disadvantages: • The free data storage quota is limited to 1 GB, compared to 10 GB on Amazon EC2. • Datastore is suitable only for small data objects (entities). • To store larger data, Google Blobstore can be used. • http://code.google.com/appengine/docs/java/blobstore/overview.html
5. Pricing • Google App Engine sets a resource usage quota for free applications. • Free Quota for Major Resources • For more details: https://cloud.google.com/products/app-engine/#pricing
5. Pricing For resource usage exceeding the quota, Google charges at the price rates below. Billing Rate for Major Resources
5. Pricing • Costs vary greatly depending on resource usage. The following table lists rough estimates of daily costs for typical apps:
5. Pricing • Suggestions for reducing cost: • Log in to the App Engine Console and set a daily budget. This is the safest way to control your cost, but resource usage exceeding the budget will not be allowed (so your app will throw errors). • Reduce instance hours. • Use Datastore sparingly; it is expensive. • Debug on your local server most of the time (completely free!). Deploy the full version of your app only during the last weeks of MIS 510. • Applying these suggestions will reduce the cost of your project.
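The quota, billing and budget rules in this section amount to simple arithmetic, sketched below. The class name DailyCost is hypothetical, and the numbers in the usage note are made-up illustrations, not Google's actual quotas or rates; check the pricing page for current figures.

```java
// Hypothetical helper illustrating the billing arithmetic.
final class DailyCost {
    private DailyCost() {}

    // Only usage above the free quota is billed, at ratePerUnit.
    static double billable(double used, double freeQuota, double ratePerUnit) {
        double overage = Math.max(0.0, used - freeQuota);
        return overage * ratePerUnit;
    }

    // True if the day's cost fits within the daily budget set in the console.
    static boolean withinBudget(double cost, double dailyBudget) {
        return cost <= dailyBudget;
    }
}
```

For example, with an assumed free quota of 28 units, an assumed rate of $0.05 per unit and 30 units used, DailyCost.billable(30.0, 28.0, 0.05) bills only the 2 units of overage. Usage at or below the free quota costs nothing.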
Amazon Elastic Compute Cloud (Amazon EC2) • Amazon EC2 is a web service that provides resizable compute capacity in the cloud. It is designed to make web-scale computing easier for developers. • Simple web service interface • Complete control of your computing resources • Obtain and boot new server instances quickly • Quickly scale capacity as your computing requirements change • Pay only for capacity that you actually use
Tutorial Guideline • 1. Sign Up for EC2 • 2. Launch an Instance • 3. Connect to Windows Instance • 4. Connect to Unix/Linux Instance • 5. Application Example • 6. Pricing • 7. Resources
1. Sign Up for EC2 • Sign up for an Amazon EC2 account: http://aws.amazon.com/ec2/ • If you have an Amazon shopping account, you can use that account.
2. Launch an Instance • Sign in to the AWS Management Console (choose EC2): http://aws.amazon.com/console/
Create and Download a Key Pair A key pair is a security credential similar to a password, which you use to securely connect to your instance after it's running.
Choose an Amazon Machine Image (AMI) • Amazon Linux • Windows Server 2008 with SQL Server • Red Hat/Ubuntu/Debian Linux • Choosing an AMI is just like choosing a virtual machine. • You can choose 64-bit or 32-bit machines. • Prices differ between machines.
Configure Firewall (create a security group) • Create rules to allow access to the instance. • For a Windows server, we need HTTP port 80, MS SQL port 1433, Remote Desktop port 3389 and HTTP port 8080 (for Tomcat). • For Linux, we need SSH to log in (to use PuTTY and WinSCP).
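Once the security group rules are in place, you can check from your own machine whether a port is actually reachable. The sketch below is an assumption-laden illustration, not an AWS tool: the class name PortCheck is hypothetical, and it only attempts a plain TCP connection with a timeout.

```java
import java.io.IOException;
import java.net.InetSocketAddress;
import java.net.Socket;

// Hypothetical helper for testing firewall rules from the client side.
final class PortCheck {
    private PortCheck() {}

    // Returns true if a TCP connection to host:port succeeds within
    // timeoutMillis; false if it is refused, filtered or times out.
    static boolean isOpen(String host, int port, int timeoutMillis) {
        try (Socket socket = new Socket()) {
            socket.connect(new InetSocketAddress(host, port), timeoutMillis);
            return true;
        } catch (IOException e) {
            return false;
        }
    }
}
```

For a Windows instance you would call PortCheck.isOpen(host, port, 3000) for ports 80, 1433, 3389 and 8080, with host set to your instance's public address. For Linux, SSH normally listens on port 22. A false result can mean either that the rule is missing or that no service is listening on that port yet.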
3. Connect to Windows Instance Go to the AWS Management Console and locate the instance on the Instances page. Right-click the instance and select Get Windows Password.
Get an elastic IP (static IP) • Click “Elastic IP” in “Navigation”. • Click “Allocate New Address”. • Associate the address with your instance. • An elastic address is a desirable resource. You should release the address if you don’t want to associate it with any instance; otherwise, Amazon will charge you money!