Chapter 5 Cloud-Enabling Geospatial Applications Kai Liu, Qunying Huang, and Jizhe Xia
Learning Objectives
• Summarize the common components for geospatial applications
• Summarize the general steps to deploy cloud-enabled geospatial applications, and study the detailed steps using two use cases
Learning Materials • Videos: • Chapter_5-Video_1.mov • Chapter_5_Video_2.mp4 • Scripts, Files and others: • hpc.zip
Learning Modules
• Common Components for Geospatial Applications
  • Server-side scripting
  • Database
  • HPC
• General Steps to Deploy Cloud-Enabled Geospatial Applications
• Use Cases
  • Database-driven web applications
  • Typical HPC applications
• Conclusions and discussions
Common Components for Geospatial Applications: Server-side scripting
• Server-side scripting
  • Executed on the server
• Popular server-side scripting languages
  • ASP
  • PHP
  • JSP
  • Perl and Ruby
Common Components for Geospatial Applications: Spatial database
• Spatial database: a database with spatial plug-ins or data types that enable spatial data analysis, e.g.,
  • PostGIS plug-in of PostgreSQL
  • Oracle Spatial
  • Spatial engine of Microsoft SQL Server
Common Components for Geospatial Applications: HPC
• High Performance Computing (HPC)
• Parallel computing is a popular approach to HPC. There are two task decomposition methods:
  • Domain decomposition
  • Functional decomposition
• Open source HPC solutions, e.g.,
  • Condor
  • MPICH2
  • Hadoop MapReduce
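Domain decomposition splits the data (for example, the rows of a DEM grid) into chunks that the same code processes in parallel. The sketch below is only an illustration: the helper name `row_ranges` and the row-based split are assumptions, not part of the chapter's code, and it assumes there are at least as many rows as tasks.

```shell
#!/bin/sh
# Hypothetical helper: split TOTAL rows into N near-equal contiguous chunks
# and print one "task start_row end_row" line per chunk (rows are 0-based).
row_ranges() {
    total=$1
    n=$2
    base=$((total / n))      # minimum number of rows per task
    extra=$((total % n))     # the first $extra tasks get one extra row
    start=0
    task=0
    while [ "$task" -lt "$n" ]; do
        size=$base
        if [ "$task" -lt "$extra" ]; then
            size=$((base + 1))
        fi
        end=$((start + size - 1))
        echo "$task $start $end"
        start=$((end + 1))
        task=$((task + 1))
    done
}

# Example: 10 rows divided among 3 tasks
row_ranges 10 3
```

Each output line could then drive one task, so that every computing node works on its own band of rows.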
General Steps to Deploy Cloud-enabled Geospatial Applications
• Set up environments
  • Software and libraries (e.g., HTTP server, database server, and JRE)
  • Environment variables (e.g., JAVA_PATH)
• Deploy application
• Customization of the application (e.g., database, storage services, email services, and log services)
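Before deploying, it can help to confirm that the environment is complete. The following is a minimal sketch, not part of the chapter's materials: `missing_tools` and `missing_vars` are hypothetical helper names, and the tool and variable names in the example calls are assumptions to adapt.

```shell
#!/bin/sh
# Print the name of every listed command that is NOT found on PATH.
missing_tools() {
    for tool in "$@"; do
        command -v "$tool" >/dev/null 2>&1 || echo "$tool"
    done
}

# Print the name of every listed variable that is unset or empty.
missing_vars() {
    for name in "$@"; do
        eval "value=\${$name:-}"
        [ -n "$value" ] || echo "$name"
    done
}

# Example check; adjust the names to the application being deployed.
missing_tools apache2 mysql java
missing_vars JAVA_PATH
```

No output means that everything listed was found.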
Database-driven web application
The procedure of deploying a Drupal site onto EC2:
1. Authorize network access
2. Launch an instance with Ubuntu 12.04
3. Log in to the instance
4. Set up environments, e.g., Apache HTTP, DBMSs
5. Transfer Drupal files onto the instance
6. Deploy the Drupal site
7. Customization of the application
8. Create a new AMI from the running instance
• Video: Chapter_5-Video_1.mov
Steps 1, 2 & 3
• Step 1: Port 22 for SSH and port 80 for HTTP should be opened for login and HTTP access.
• Step 2: Drupal is supported by most Linux distributions, but Ubuntu is highly recommended by the Drupal community.
• Step 3: Linux and Mac users can use the ssh command in a terminal to log in to the instance, while Windows users can use PuTTY.
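Step 1 can also be scripted. The sketch below assumes the AWS CLI is installed and configured, which the slides do not cover; the security group ID and source address range are placeholders, and `print_ingress_commands` is a hypothetical helper. It only prints the commands so they can be reviewed before being run.

```shell
#!/bin/sh
# Print (do not run) the AWS CLI commands that open the given TCP ports.
# Arguments: security group ID, source CIDR, then one or more ports.
print_ingress_commands() {
    group_id=$1
    cidr=$2
    shift 2
    for port in "$@"; do
        echo "aws ec2 authorize-security-group-ingress --group-id $group_id --protocol tcp --port $port --cidr $cidr"
    done
}

# Port 22 for SSH and port 80 for HTTP; both IDs below are placeholders.
print_ingress_commands sg-0123456789abcdef0 203.0.113.0/24 22 80
```

Restricting the SSH rule to a known address range, as in the placeholder above, is usually safer than opening port 22 to every address.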
Step 4 Set up Environments
• Install and configure the Apache HTTP server and the MySQL DBMS. The following commands can be used in the console:
$: sudo apt-get update
$: sudo apt-get install apache2
$: sudo apt-get install mysql-server mysql-client
$: sudo tasksel install lamp-server
• Secure the MySQL installation:
$: sudo mysql_secure_installation
Step 4 Set up Environments (cont'd)
• Enable the Apache rewrite module:
$: sudo a2enmod rewrite
• Create a MySQL account:
$: sudo mysqladmin -u root -p create ec2drupal
$: mysql -u root -p
mysql> create database drupal;
mysql> grant all privileges on drupal.* to ec2drupal@localhost identified by 'your_password';
mysql> flush privileges;
mysql> \q
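The account setup can be generated from a small script so the same statements are reused on every instance. This is a sketch under stated assumptions: `drupal_db_sql` is a hypothetical helper, and it uses a separate CREATE USER statement rather than the slide's GRANT ... IDENTIFIED BY form, because MySQL 8.0 no longer accepts the latter.

```shell
#!/bin/sh
# Print SQL that creates a database and a dedicated account for Drupal.
# Arguments: database name, user name, password (all caller-supplied).
drupal_db_sql() {
    db=$1
    user=$2
    password=$3
    cat <<EOF
CREATE DATABASE IF NOT EXISTS $db;
CREATE USER '$user'@'localhost' IDENTIFIED BY '$password';
GRANT ALL PRIVILEGES ON $db.* TO '$user'@'localhost';
FLUSH PRIVILEGES;
EOF
}

# Names taken from the slide; replace your_password before use.
drupal_db_sql drupal ec2drupal your_password
```

The printed statements can be piped into the client, for example `drupal_db_sql drupal ec2drupal your_password | mysql -u root -p`.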
Step 5 Transfer files onto instance
• wget is available on most Linux systems for downloading files from the web:
$: wget http://ftp.drupal.org/files/projects/drupal-7.19.zip
Step 6 Deploy the application
• Install the unzip utility and extract the Drupal files:
$: sudo apt-get install unzip
$: sudo unzip drupal-7.19.zip
• Deploy the application:
$: sudo mv drupal-7.19 /var/www/drupal
$: sudo chown -R www-data:www-data /var/www/drupal
$: sudo service apache2 restart
Step 7 Customization
• Customize Drupal's database and administration settings through its web interface: http://INSTANCEIP/drupal/install.php
• INSTANCEIP should be replaced with the IP address of the target instance.
Step 8 Create a new AMI
• The last step is to create a new AMI based on the running instance. Chapter 4, Section 4.3.1 describes the creation of an AMI. The image can be reused to create multiple virtual instances.
HPC Application Deployment onto the Cloud >> Use DEM interpolation as an example
The process of configuring an HPC system to run DEM interpolation on EC2 (blue color indicates the additional steps for configuring a virtual HPC environment):
1. Authorize network access
2. Launch a cluster instance as the head node
3. Install the middleware packages, e.g., Condor
4. Create a new AMI from the running instance
5. Start another instance from the new AMI as a computing node
6. Configure the middleware on both nodes to enable communication
7. Transfer the DEM data and interpolation code to the head node
8. Run the DEM interpolation
• Video: Chapter_5-Video_2.mp4
HPC Application Deployment onto the Cloud >> Use DEM interpolation as an example
• Step 1: Authorize network access
  • Port 22 for SSH and the ports used by the middleware for communication between the master node and the computing nodes (e.g., 9000-9999 in this case) should be opened.
• EC2 provides cluster instances for running HPC applications. Users can also select High-CPU Instances or High-Memory Cluster Instances, depending on whether the geospatial application is data- or computing-intensive.
• Video: Chapter_5-Video_2.mp4, 0:00 – 1:10
Select instance type based on the CPU, memory, and networking requirements of the geospatial HPC application
HPC Application Deployment onto the Cloud >> Use DEM interpolation as an example
• Step 2: Launch an instance
• Step 3: Install the middleware packages. In this example, Condor is used as the middleware solution.
$ rpm -Uvh http://download.fedoraproject.org/pub/epel/6/i386/epel-release-6-8.noarch.rpm    ## install additional packages
$ yum install yum-plugin-priorities
$ rpm -Uvh http://repo.grid.iu.edu/osg-el6-release-latest.rpm
$ yum install condor
$ touch /etc/condor/condor_config.local    ## create the Condor configuration file
• Video: Chapter_5-Video_2.mp4, 1:10 – 8:46
HPC Application Deployment onto the Cloud >> Use DEM interpolation as an example
• Add the following content to the configuration file "/etc/condor/config.d/local.conf":
## OSG cluster configuration
# List of daemons on the node (Condor central manager requires collector and negotiator,
# schedd required to submit jobs, startd to run jobs)
DAEMON_LIST = MASTER, COLLECTOR, NEGOTIATOR
• Start the Condor service:
$ service condor start    ## Start the service
$ service condor stop     ## Stop the service
HPC Application Deployment onto the Cloud >> Use DEM interpolation as an example Condor Commands
HPC Application Deployment onto the Cloud >> Use DEM interpolation as an example
• Condor needs the Java Development Kit (JDK):
[root@domU-12-31-39-13-DD-FF ~]# chmod a+x jdk-6u14-linux-x64.bin
[root@domU-12-31-39-13-DD-FF ~]# ./jdk-6u14-linux-x64.bin
HPC Application Deployment onto the Cloud >> Use DEM interpolation as an example
• Step 4: Create a new AMI from the running instance
  • This step stores the state of the configured HPC environment in case the master node crashes. In addition, new computing node instances can be launched directly from this AMI, with all software dependencies already installed and configured.
• Step 5: Start another instance
  • Launch another instance from the new AMI as a computing node, following the procedure in step 2 but with the new AMI.
• Video: Chapter_5-Video_2.mp4, 8:46 – 10:56
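Steps 4 and 5 can be scripted as an alternative to the web console. The sketch below assumes a configured AWS CLI, which the slides do not cover; all IDs, the image name, and the instance type are placeholders, and both helper names are hypothetical. The commands are only printed so they can be checked first.

```shell
#!/bin/sh
# Print (do not run) the command that images a configured head node.
# Arguments: instance ID, image name (no spaces).
print_ami_command() {
    instance_id=$1
    image_name=$2
    echo "aws ec2 create-image --instance-id $instance_id --name $image_name"
}

# Print (do not run) the command that starts computing nodes from an AMI.
# Arguments: AMI ID, instance type, number of instances.
print_launch_command() {
    image_id=$1
    instance_type=$2
    count=$3
    echo "aws ec2 run-instances --image-id $image_id --instance-type $instance_type --count $count"
}

# Placeholder values; substitute the real IDs reported by EC2.
print_ami_command i-0123456789abcdef0 condor-node
print_launch_command ami-0123456789abcdef0 c5.large 1
```

In practice the launch command would usually also name a key pair and a security group so the new node can be reached and can talk to the head node.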
HPC Application Deployment onto the Cloud >> Use DEM interpolation as an example
• Step 6: Configure the middleware on all nodes to enable communication
  • On the computing node, the configuration file "/etc/condor/config.d/local.conf" should be changed to:
DAEMON_LIST = MASTER, SCHEDD, STARTD
• Step 7: Transfer the DEM data and interpolation code
  • Use the scp command
• Video: Chapter_5-Video_2.mp4, 10:56 – 18:00
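The two node roles differ only in a few configuration lines, so the file can be generated. This is a sketch, not the chapter's configuration: `condor_local_conf` is a hypothetical helper, the daemon lists are the ones shown on the slides, and the CONDOR_HOST, LOWPORT, and HIGHPORT settings are added here on the assumption that the nodes should find the head node and stay within the 9000-9999 range opened in step 1.

```shell
#!/bin/sh
# Print a Condor local configuration for one node.
# Arguments: role ("head" or "compute"), head node host name or private IP.
condor_local_conf() {
    role=$1
    head_host=$2
    case "$role" in
        head)    daemons="MASTER, COLLECTOR, NEGOTIATOR" ;;
        compute) daemons="MASTER, SCHEDD, STARTD" ;;
        *)       echo "unknown role: $role" >&2; return 1 ;;
    esac
    cat <<EOF
CONDOR_HOST = $head_host
DAEMON_LIST = $daemons
LOWPORT = 9000
HIGHPORT = 9999
EOF
}

# Example for a computing node; 10.0.0.10 is a placeholder address.
condor_local_conf compute 10.0.0.10
```

The output could be redirected to "/etc/condor/config.d/local.conf" on each node before starting the service. Depending on the Condor version, host-based security settings such as ALLOW_WRITE may also need adjusting.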
HPC Application Deployment onto the Cloud >> Use DEM interpolation as an example
Step 8. Run the DEM interpolation
• An example of the submission file:
Universe = java
Executable = interpolate.class
## interpolate is the Java main program, and DEMfile.txt is the input
Arguments = interpolate DEMfile.txt
## input directory
initialdir = dir.$(Process)
## output file
output = ../interpolate.output.$(Process)
error = interpolate.error.$(Process)
log = ../interpolate.log
## Select machines with more than 1024 MB of memory
requirements = (Memory > 1024)
transfer_input_files = MyPoint.class, PngWriter.class, interpolate.class, cutfile.txt
should_transfer_files = ALWAYS
when_to_transfer_output = ON_EXIT
## number of concurrent processes
queue 12
• MyPoint.class, PngWriter.class, and interpolate.class are the Java programs, and DEMfile.txt is the input for each task.
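Because the submission file sets initialdir to dir.$(Process) and queues 12 processes, the directories dir.0 through dir.11 have to exist, with the input files in place, before submission. The sketch below is one possible way to prepare them; `prepare_task_dirs` is a hypothetical helper, and copying the same files into every directory is an assumption (per-task DEM inputs would be placed individually instead).

```shell
#!/bin/sh
# Create dir.0 .. dir.(N-1) under BASE and copy the listed files into each,
# matching "initialdir = dir.$(Process)" combined with "queue N".
# Arguments: base directory, number of tasks, then the files to copy.
prepare_task_dirs() {
    base=$1
    n=$2
    shift 2
    i=0
    while [ "$i" -lt "$n" ]; do
        mkdir -p "$base/dir.$i" || return 1
        for f in "$@"; do
            cp "$f" "$base/dir.$i/" || return 1
        done
        i=$((i + 1))
    done
}

# Example (commented out because it writes to the current directory):
# prepare_task_dirs . 12 MyPoint.class PngWriter.class interpolate.class
```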
HPC Application Deployment onto the Cloud >> Use DEM interpolation as an example
Step 8. Run the DEM interpolation
• The command "condor_submit" can be used to submit the tasks to the cluster using a submission file named "interpolate_submit" as follows:
[root@domU-12-31-39-13-DD-FF ~]# su condor    # use the condor account
[root@domU-12-31-39-13-DD-FF ~]# condor_submit interpolate_submit
• Video: Chapter_5-Video_2.mp4, 18:00 – 21:07
Conclusions and Discussions
• How can the optimal cloud services and configurations be selected?
• Enumerate some other common components for geospatial applications besides server-side scripting, database, and HPC.
• What are the general steps to deploy a geospatial application onto cloud services?
• What is the database service provided by Amazon AWS? How is it used?
• What is the database service provided by Windows Azure? How is it used?
• Can you list some other geospatial applications?