
UC Cloud Summit 2011 – LBL Campus Update

UC Cloud Summit 2011 – LBL Campus Update. Krishna Muriki (kmuriki@lbl.gov), High Performance Computing Services (HPCS), IT Division. HPC activities – Condo Cluster Computing: a new cluster support model to achieve flexibility, sharing, and better utilization of hardware.


Presentation Transcript


  1. UC Cloud Summit 2011 – LBL Campus Update
     Krishna Muriki (kmuriki@lbl.gov), High Performance Computing Services (HPCS), IT Division.

  2. HPC activities – Condo Cluster Computing:
     • Model
       • A new cluster support model.
       • Designed to achieve flexibility, sharing, and better utilization of hardware.
       • PIs purchase cluster hardware (nodes, leaf switches, and cables).
       • PIs can purchase additional storage beyond what is provided.
       • Hardware in the condo must be refreshed within 4 years.
       • PIs get free compute time equivalent to their contribution.
     • Making a Condo
       • Hardware is connected to and shared with the institutional cluster, Lawrencium.
       • PI-purchased storage is accessible on all the condo nodes.
       • Scheduling policies are tuned to give faster turnaround time for PI jobs.
     • Advantages
       • Monthly cluster support charges are waived.
       • Flexibility for PIs to use more resources (than purchased) when needed.
       • Easy mechanism to share idle resources with other users in the Lab.

  3. IT Cloud Developments:
     • Evaluating services for the past 3 years
       • Google Apps, including Google Docs and Sites
       • Google Calendar
       • Collaborative services like Manymoon and Smartsheet
     • Business Systems
       • Point and Ship (for managing shipping)
       • Daptiv (Ops project management)
     • All systems leverage IT’s Identity Management infrastructure (SAML/Shibboleth).
     • Future Cloud Developments:
       • Additional Google apps like Google Code, Reader, and Picasa
       • Taleo, a SaaS talent management application
       • Carbonite, a SaaS service for user-managed desktop backups

  4. IT service – Virtual Machine Hosting:
     • VMware-based virtual machine environment
     • Over 100 virtual machines running
     IT service – Cloud Hosting (Amazon EC2 server):
     • Provides computing resources on Amazon’s AWS platform.
     • CentOS AMIs with standard IT monitoring tools
     • Option to create a VPN connection to LBL
     • IT manages the OS and Amazon layers.

  5. Support from Network Infrastructure – ESnet
     • ESnet peers with multiple cloud providers (including Amazon, Google, Microsoft).
     • When possible, we peer in multiple locations (Bay Area, Chicago, etc.), and we're eager to peer with other providers as well.
     • We're interested in pushing advanced network services (including virtual circuits and performance monitoring) into cloud contexts.
     • Multiple DOE-funded scientists are actively researching clouds for computation, services, and storage.
     • Several DOE sites are sourcing cloud services.
     • Questions about ESnet and cloud? Please send email to routing@es.net

  6. Experiments with Amazon EC2 services:
     • Seeking Supernovae in the Clouds: A Performance Study
       – K. Jackson, L. Ramakrishnan, K. Runge, R. Thomas.
       • AWS can be very useful for scientific computing.
       • Porting today requires significant effort.
       • Failures occur frequently, and applications must be able to handle them gracefully.
     • Performance Analysis of HPC applications on the AWS Cloud
       – K. Jackson, L. Ramakrishnan, K. Muriki, S. Cannon, S. Cholia, J. Shalf, H. Wasserman, N. Wright.
       • Data shows that the more communication an application performs, the worse its EC2 performance becomes.
       • The shared nature of the virtualized environment causes significant variability in EC2 performance.
     • Berkeley Lab Contributes Expertise to New Amazon Web Services Offering
       • “When we applied these tests to the new Cluster Compute Instances for Amazon EC2, we found that the new offering performed 8.5 times faster than the previous Amazon instance types.” -- K. Jackson.

  7. Cloud computing for Science
     – G. Bell, K. Jackson, G. Kurtzer, J. Li, K. Muriki, L. Ramakrishnan, J. White.
     • Large scale MPI has a high overhead on EC2.
     • Enables data-intensive science.

  8. HPC Cloud Applied to Lattice Optimization
     – C. Sun, H. Nishimura, S. James, K. Song, K. Muriki, Y. Qin
     • “Increased performance of the recently introduced AWS CCI instances better meet the needs of scientific community, however EC2 may work less well for large-scale parallel applications that depend heavily on memory and interconnect performance.
     • It remains important for researchers to benchmark their particular application and review the local costs when making a decision to use the cloud.”

  9. NERSC Initiatives:
     • Magellan Project Mission:
       • Determine the appropriate role for commercial and/or private cloud computing for DOE/SC midrange workloads.
       • Deploy a test bed cloud to serve the needs of mid-range scientific computing.
       • Evaluate the effectiveness of this system for a wide spectrum of DOE/SC applications in comparison with other platform models.
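
Slide 8 advises researchers to benchmark their own application and review local costs before choosing the cloud, and slide 6 notes significant run-to-run variability on EC2. A minimal Python sketch of such a comparison might look like the following. Every timing, hourly price, and node count here is a made-up placeholder, and the `summarize` helper is hypothetical; none of these come from the presentation or from the cited studies.

```python
from statistics import mean, stdev


def summarize(runtimes_s, hourly_cost_usd, nodes):
    """Summarize repeated runs of one application on one platform.

    runtimes_s      -- wall-clock times of repeated runs, in seconds
    hourly_cost_usd -- assumed cost per node-hour (placeholder value)
    nodes           -- number of nodes used per run
    """
    avg = mean(runtimes_s)
    # Coefficient of variation: run-to-run spread relative to the mean.
    cv = stdev(runtimes_s) / avg
    # Average cost of a single run across all nodes.
    cost = avg / 3600.0 * hourly_cost_usd * nodes
    return {"mean_s": avg, "cv": cv, "cost_per_run_usd": cost}


# Hypothetical timings for the same job on a local cluster and in the cloud.
local_runs = [100.0, 102.0, 98.0, 101.0, 99.0]
cloud_runs = [150.0, 210.0, 160.0, 240.0, 140.0]

# Hypothetical per-node-hour costs; substitute measured local costs.
local = summarize(local_runs, hourly_cost_usd=0.50, nodes=4)
cloud = summarize(cloud_runs, hourly_cost_usd=1.60, nodes=4)

slowdown = cloud["mean_s"] / local["mean_s"]

print(f"local: {local}")
print(f"cloud: {cloud}")
print(f"cloud slowdown factor: {slowdown:.2f}")
```

Reporting the coefficient of variation alongside the mean keeps the variability visible instead of hiding it behind a single average, and the cost-per-run figure puts both platforms on the same footing once real local costs are filled in.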
