GEC21 Experimenter/Developer Roundtable (Experimenter)
Paul Ruth, RENCI / UNC Chapel Hill, pruth@renci.org
ADCIRC Storm Surge Model
• Finite element
• Very high spatial resolution (~1.2M triangles)
• Efficient MPI implementation; scales to thousands of cores
  • Typically uses 256-1024 cores for forecasting applications
• Used for coastal flooding simulations
  • FEMA flood insurance studies
  • Forecasting systems
  • Research applications
ADCIRC Storm Surge Model on GENI
• Slice attributes
  • 38 VMs (152 compute cores)
  • 6 GENI racks
  • Custom image replicated across testbeds
• ExoGENI
  • Groups
  • Storage (1 TB)
• InstaGENI
  • Shared VLANs
  • Xen VMs (4 cores, 4 GB memory, 30 GB storage)
• Inter-domain
  • GENI stitching
  • ExoGENI stitching
Demo GENI Slice
[diagram: InstaGENI and ExoGENI domains connected through OVS, with a Condor scheduler, worker (W) nodes, NFS servers, and iSCSI storage]
Demo GENI Slice
• Not just one slice!
  • 3 slices
  • 1 stitcher call
  • 4 Omni calls
[diagram: same topology as above]
Slice 1: ADCIRC-CORE
• 1 call to the stitcher
• 5 GENI calls by the stitcher: ExoSM, ION, GPO-IG, ILL-IG, WI-IG
[diagram: stitched core slice spanning the ExoGENI and InstaGENI aggregates]
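A minimal sketch of what that one call might look like, assuming the standard GENI stitcher CLI; the slice and RSpec file names here are placeholders:

  # hypothetical names; a single stitcher call fans out into the five
  # aggregate-level GENI calls listed above
  stitcher.py createsliver adcirc-core adcirc-core.rspec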
Shared VLANs
• Perform operational action: create shared VLANs
[diagram: shared VLANs exposed on the InstaGENI side of the core slice]
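A hedged sketch of that operational action via Omni, assuming AM API v3; geni_sharelan is the documented InstaGENI mechanism for sharing a LAN, but the slice name, aggregate nickname, and option-file keys below are assumptions:

  # hypothetical slice/aggregate names; -V3 selects AM API v3, which
  # supports performoperationalaction (poa)
  omni.py -V3 -a wisc-ig performoperationalaction adcirc-core geni_sharelan \
      --optionsfile sharevlan.json
  # sharevlan.json (assumed keys), naming the link to share and the token
  # other slices use to attach to it:
  # {"geni_sharelan_lanname": "lan0", "geni_sharelan_token": "adcirc-vlan"}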
Slice 2: Wisconsin Group
[diagram: Wisconsin worker group (NFS server plus workers) joined to the core slice over the shared VLAN]
Slice 3: Illinois Group
[diagram: Illinois worker group joined to the core slice over the shared VLAN]
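A plausible sketch of the Omni calls that create the two worker-group slices directly at ExoGENI, with placeholder slice and RSpec names:

  # hypothetical names; each worker group is its own slice, created at
  # the ExoGENI SM rather than through the stitcher
  omni.py -a exosm createsliver adcirc-wisc wisconsin-group.rspec
  omni.py -a exosm createsliver adcirc-ill illinois-group.rspec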
Complete Slice(s)
[diagram: final topology, the core slice plus both worker-group slices, all reachable from the Condor scheduler]
Experiences
• Happy that it works at all!
• GENI stitching
  • Failures can cascade
  • Limited to one stitch per slice per ExoGENI site
  • Can restart a complicated ExoGENI slice
• InstaGENI
  • Slow-booting compute nodes
  • Yay! Bigger nodes (4 cores, 30 GB disks)
  • Shared VLANs are awkward
• Tools
  • None of the tools can visualize all aspects of slices
GEC21 Experimenter/Developer Roundtable (Developer)
Victor Orlikowski, Duke University, vjo@duke.edu
ORCA 5
• Recovery
  • Should not be seen by users
  • Restart and redeploy ExoGENI services without affecting running slices
• Storage on bare metal
• Hybrid mode
• Distributed actor registry
RSpec Extension: Storage
• ~5 TB of sliverable storage on most racks
• Adds an iSCSI target to a slice
• Can now be used in RSpecs:

  <node client_id="egS-storage"
        component_manager_id="urn:publicid:IDN+exogeni.net:rcivmsite+authority+am"
        exclusive="false">
    <sliver_type name="storage">
      <storage:storage resource_type="LUN" do_format="true"
                       fs_param="-F -b 1024" fs_type="ext4"
                       mnt_point="/mnt/storage" capacity="100"/>
    </sliver_type>
    <interface client_id="egS-storage:if0"/>
  </node>
RSpec Extension: Groups
• Multiple nodes with a shared specification
  • Same image, instance type, location, networks
• Templated post-boot scripts customize the nodes

  <node client_id="eg-node"
        component_manager_id="urn:publicid:IDN+exogeni.net:bbnvmsite+authority+am"
        exclusive="false">
    <sliver_type name="XOMedium">
      <disk_image name="http://geni-images.renci.org/images/standard/centos/centos6.3-v1.0.11.xml"
                  version="776f4874420266834c3e56c8092f5ca48a180eed"/>
    </sliver_type>
    <nodegroup:nodegroup count="10"/>
  </node>
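For illustration only, a minimal post-boot script sketch assuming ExoGENI's Velocity-style templating; the $self.name variable and the NFS export path are assumptions, not taken from the slides:

  #!/bin/bash
  # assumed template variable: expands to a distinct name per group member
  hostname $self.name
  # assumed shared NFS export mounted by every worker in the group
  mount -t nfs nfs0:/export/adcirc /mnt/adcirc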