Chapter 8 Cloud-Enabling GEOSS Clearinghouse Kai Liu, Douglas Nebert, Qunying Huang, Jizhe Xia, and Zhenlong Li
Learning Objectives
• Study the GEOSS Clearinghouse background and challenges
• Study how to deploy and optimize the GEOSS Clearinghouse onto cloud services
• Get familiar with GEOSS Clearinghouse
Learning Materials
• Videos:
  • Chapter_8-Video_1.mp4
  • Chapter_8-Video_2.mp4
  • Chapter_8-Video_3.mp4
  • Chapter_8-Video_4.mp4
  • Chapter_8-Video_5.mp4
  • Chapter_8-Video_6.mp4
• Scripts, files, and others:
  • geonetautoscaling.jason
  • geonetwork-2013-08-20.dump.tar.gz
  • geonetwork-2013-08-20.tar.gz
  • geoss-2013-08-20.tar.gz
Learning Modules
• GEOSS Clearinghouse: background and challenges
• Deployment and optimization
  • General steps
  • Special considerations
• System demonstration
  • Local Search
  • Remote Search
• Conclusion and discussions
GEOSS Clearinghouse: Background
• GEOSS
  • Stands for Global Earth Observation System of Systems
  • Supports different Societal Benefit Areas (SBAs), including Agriculture, Biodiversity, Climate, Disasters, Ecosystems, Energy, Health, Water, and Weather
  • Three key components: GEOSS Registry, GEOSS Clearinghouse, and GEO Portal
• GEOSS Clearinghouse
  • Engine of the GEOSS Common Infrastructure (GCI)
GEOSS Clearinghouse: Challenges
• Big data: three "V"s
  • Volume: metadata is harvested from various catalogs
  • Velocity: records are updated frequently
  • Variety: various metadata standards (FGDC CSDGM, Dublin Core, and ISO 19139) and web protocols (CSW, SRU, RSS, WAF, etc.)
• Spatiotemporal search and full-text search
• Concurrent access
General deployment workflow
1. Authorize network access
2. Launch an instance
3. Create an EBS volume
4. Attach the EBS volume to the instance
5. Install packages (e.g., PostgreSQL, PostGIS, Tomcat)
6. Mount the EBS volume
7. Transfer the GEOSS Clearinghouse code and data into the instance
8. Restore the GEOSS Clearinghouse database
9. Configure the servlet for GEOSS Clearinghouse
10. Configure load balancing and scalability
11. Start the service and run tests
12. Create a new AMI from the running instance
Figure: The process of deploying GEOSS Clearinghouse onto Amazon EC2. In the original figure, blue boxes mark the steps that require special consideration.
Steps 1 & 2
• Step 1: Authorize network access
  • port 22 (SSH)
  • port 80 (HTTP)
• Step 2: Launch an instance from a public AMI that includes PostgreSQL and PostGIS, found by searching for "PostgreSQL 8.4 PostGIS 1.5" on the AMI search page
• Video: Chapter_8-Video_1.mp4, 0:00-3:04
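
Steps 1 and 2 are carried out in the AWS web console in the videos. As a rough command-line counterpart to Step 1, the sketch below only prints the calls that would open ports 22 and 80 on a security group; it does not run them. The use of the `aws` CLI, the group name `geoss-clh`, and the CIDR range are all assumptions made for illustration and do not come from the original material.

```shell
#!/bin/sh
# Dry-run sketch: print the AWS CLI calls that open SSH (22) and HTTP (80).
# Assumptions (not from the original deck): the `aws` CLI is used, and the
# group name and CIDR range below are hypothetical placeholders.
open_ports_cmds() {
  # $1 = security group name, $2 = allowed source CIDR
  for port in 22 80; do
    echo "aws ec2 authorize-security-group-ingress --group-name $1 --protocol tcp --port $port --cidr $2"
  done
}

# Example: allow access only from a documentation-only address range.
open_ports_cmds geoss-clh 203.0.113.0/24
```

Printing the commands first makes it easy to review the rule set before applying it. Note that `--group-name` only resolves groups in a default VPC; other VPCs need `--group-id` instead.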
Steps 3, 4, 5 & 6
• These steps customize the instance.
• Steps 3, 4, and 6 are optional; they make the system more reliable and add storage capacity.
• Step 5: Install packages
  • PostgreSQL/PostGIS: already included in the AMI, so they do not need to be installed again
  • Tomcat servlet container (e.g., install Tomcat 7.0.33 to /opt/geoss)
• Video: Chapter_8-Video_1.mp4, 3:04-5:25
Steps 7 & 8
• Step 7: Transfer the GEOSS Clearinghouse (CLH) code and data into the instance
• Step 8: Restore the database
  root@ip-10-189-149-104:/mnt$ chown postgres:postgres geonetwork.dump   # let the postgres user read the dump
  root@ip-10-189-149-104:/mnt$ su postgres
  bash-3.2$ createdb geonetwork                                          # create an empty database
  bash-3.2$ psql geonetwork < geonetwork.dump                            # load the dump into it
• Videos: Chapter_8-Video_1.mp4, 5:25-end; Chapter_8-Video_2.mp4
Step 9: Configure servlet for CLH
• Install the JDK and JRE on the instance (e.g., /usr/bin/java)
• For security, run Tomcat as a virtual (non-login) user (e.g., tomcat)
  groupadd tomcat
  useradd -s /sbin/nologin -g tomcat -d /opt/geoss tomcat
  passwd tomcat
• Redirect port 80 to port 8080, because ports below 1024 can be opened only by root
  iptables -t nat -I PREROUTING -p tcp --dport 80 -j REDIRECT --to-ports 8080
• Video: Chapter_8-Video_3.mp4, 0:00-12:27
Step 9: Configure servlet for CLH (cont'd)
• Keep the iptables rules across reboots by adding the following lines to /etc/network/interfaces
  pre-up iptables-restore < /etc/iptables.rules
  post-down iptables-save > /etc/iptables.rules
• Start CLH automatically at boot by adding the following line to /etc/rc.local (run here as the tomcat user created earlier in this step)
  sudo -u tomcat /opt/geoss/apache-tomcat-7.0.33/bin/startup.sh
• Add the geonetwork service to the Host element in tomcat/conf/server.xml
  <Context path="/geonetwork" docBase="/opt/geoss/apache-tomcat-7.0.33/webapps/geonetwork" crossContext="false" reloadable="false"/>
• Video: Chapter_8-Video_3.mp4
Steps 10, 11 & 12
• Set the URL for remote search
  • Set the remote server host and port
• Configure load balancing and scalability
• Video: Chapter_8-Video_3.mp4, 12:27-end
Special Considerations
• Data backup:
  • Elastic Block Store (EBS) volume
  • Stores data, log files, and the application so that they can be recovered from the volume if the current instance crashes
  • Size can range from 1 GB to 1 TB
Steps for Data Backup
• Step 1: In the web console, create an empty EBS volume, making sure its availability zone is the same as that of the GEOSS Clearinghouse instance
• Step 2: Attach the volume to the running instance
• Step 3: Make a file system on the volume and mount it
  [root@ip-10-189-149-104~] mkfs -t ext3 /dev/sdh   # make a file system
  [root@ip-10-189-149-104~] mkdir /mnt/datavol_1
  [root@ip-10-189-149-104~] mount /dev/sdh /mnt/datavol_1/
• Figure: Create a new EBS volume and attach it to the instance
• Video: Chapter_8-Video_4.mp4
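
The video performs the volume creation and attachment in the web console. For readers who prefer the command line, the following sketch prints (without executing) roughly equivalent AWS CLI calls. The `aws` CLI itself, the 100 GiB size, and the `vol-EXAMPLE` and `i-EXAMPLE` identifiers are hypothetical placeholders, not values from the original material; only the `/dev/sdh` device name is taken from the commands above.

```shell
#!/bin/sh
# Dry-run sketch: print AWS CLI calls that create and attach an EBS volume.
# Assumptions (not from the original deck): the `aws` CLI is used; the size,
# volume ID, and instance ID are placeholders for illustration only.
ebs_backup_cmds() {
  # $1 = size in GiB, $2 = availability zone of the running instance,
  # $3 = volume ID reported by create-volume, $4 = instance ID
  echo "aws ec2 create-volume --size $1 --availability-zone $2"
  echo "aws ec2 attach-volume --volume-id $3 --instance-id $4 --device /dev/sdh"
}

# The zone passed here must match the zone of the instance.
ebs_backup_cmds 100 us-east-1a vol-EXAMPLE i-EXAMPLE
```

In practice the volume ID is only known after `create-volume` returns, so the two calls would be run one after the other rather than generated together.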
Special Considerations
• Load balancing:
  • Configure the load balancing service
  • Video: Chapter_8-Video_5.mp4
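
The load balancer is configured through the web console in the video. As one possible command-line counterpart, the sketch below prints AWS CLI calls for a Classic Load Balancer that forwards HTTP on port 80 to the instance. The `aws elb` commands are an assumption about tooling, and the load balancer name, zone, and instance ID are hypothetical placeholders.

```shell
#!/bin/sh
# Dry-run sketch: print AWS CLI calls that set up a Classic Load Balancer.
# Assumptions (not from the original deck): the `aws` CLI is used; the names
# and IDs below are placeholders for illustration only.
elb_cmds() {
  # $1 = load balancer name, $2 = availability zone, $3 = instance ID
  echo "aws elb create-load-balancer --load-balancer-name $1 --listeners Protocol=HTTP,LoadBalancerPort=80,InstanceProtocol=HTTP,InstancePort=80 --availability-zones $2"
  echo "aws elb register-instances-with-load-balancer --load-balancer-name $1 --instances $3"
}

elb_cmds geoss-clh-lb us-east-1a i-EXAMPLE
```

The instance port is set to 80 here on the assumption that the instance already redirects port 80 to Tomcat on 8080, as configured in Step 9.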
Special Considerations
• Auto-scaling:
  • Use the CloudFormation service to configure auto-scaling through the web console
  • Or configure auto-scaling from the command line using a template
    cfn-create-stack GEOSSClearinghouse --template-file GEOSSClearinghouse.template --region us-east-1 --awsaccesskey=FAKEKEY --awssecretkey=FAKEKEY2 --parameters="KeyName=GeoNet;InstanceType=m1.large"
  • Video: Chapter_8-Video_6.mp4
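
The `cfn-create-stack` command above comes from the older standalone CloudFormation command line tools. As a sketch of how the same stack might be requested with the current `aws` CLI, the helper below converts the `Key=Value;Key=Value` parameter string into the `ParameterKey=...,ParameterValue=...` form and prints the resulting command without running it. The helper name `to_cfn_params` is hypothetical, and the use of the `aws` CLI is an assumption; the stack name, template file, region, and parameter values are the ones shown above.

```shell
#!/bin/sh
# Sketch: translate legacy "Key=Value;Key=Value" parameters into the form
# expected by `aws cloudformation create-stack`, one parameter per line.
# `to_cfn_params` is a hypothetical helper written for this example.
to_cfn_params() {
  # $1 = legacy parameter string; spaces around entries are dropped
  printf '%s\n' "$1" | tr -d ' ' | tr ';' '\n' |
  while IFS='=' read -r key value; do
    if [ -n "$key" ]; then
      printf 'ParameterKey=%s,ParameterValue=%s\n' "$key" "$value"
    fi
  done
}

# Join the converted parameters with spaces and print the full command.
params=$(to_cfn_params "KeyName=GeoNet; InstanceType=m1.large" | paste -s -d ' ' -)
echo "aws cloudformation create-stack --stack-name GEOSSClearinghouse --template-body file://GEOSSClearinghouse.template --region us-east-1 --parameters $params"
```

With the `aws` CLI, credentials are normally supplied through `aws configure` or environment variables rather than on the command line, which keeps keys out of the shell history.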
Main Page
• URL: http://ec2-50-19-223-225.compute-1.amazonaws.com/geonetwork
Local Search: search records through the CLH interface and visualize the map services
• Search results for global "Rain-Use Efficiency"
Remote Search: search records through the CLH remote protocols (CSW, SRU, and RSS)
• Search results for global "Rain-Use Efficiency" from the GEO Portal
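
To make the remote protocols more concrete, here is a minimal sketch that builds a CSW 2.0.2 key-value request URL for the deployed service. The endpoint path `/srv/en/csw` is an assumption based on common GeoNetwork 2.x installations and may differ on a given deployment; the helper name `csw_url` is hypothetical. Only the base URL is taken from the demonstration above.

```shell
#!/bin/sh
# Sketch: build a CSW 2.0.2 request URL for a GeoNetwork-based catalog.
# Assumption (not from the original deck): the CSW endpoint lives at
# <base>/srv/en/csw, as in many GeoNetwork 2.x installations.
csw_url() {
  # $1 = base URL of the geonetwork web application, $2 = CSW operation name
  echo "$1/srv/en/csw?service=CSW&version=2.0.2&request=$2"
}

base="http://ec2-50-19-223-225.compute-1.amazonaws.com/geonetwork"
csw_url "$base" GetCapabilities

# To send the request (needs network access and a running server):
#   curl -s "$(csw_url "$base" GetCapabilities)"
```

A `GetCapabilities` response lists the operations the catalog supports, which is a reasonable first check before a client such as the GEO Portal issues `GetRecords` searches.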
Advantages of hosting CLH on the cloud
• Economic advantages
• Technical advantages:
  • Scalability
  • Highly reliable environments
Discussion Questions
1. What are the general steps of deploying GEOSS Clearinghouse onto the cloud? How do they differ from the general steps in Chapter 5?
2. How do you attach and use an Amazon EBS volume?
3. What kinds of cloud services can be used to balance the system load? Discuss how to use them.
4. What scalable services are provided by AWS? How are they used?
5. Using GEOSS Clearinghouse as an example, explain the technical advantages of cloud-enabled geoscience applications.