RDAV Update

RDAV Update. Phil Andrews, Science Advisory Board Meeting, 20-21 January 2011. Executive summary: Nautilus SGI UltraViolet passed all acceptance criteria and was accepted by NICS/RDAV in September 2010. RDAV resources have gone through TRAC allocations twice and we currently have active users.


Presentation Transcript


  1. RDAV Update
     • Phil Andrews
     • Science Advisory Board Meeting, 20-21 January 2011

  2. Executive summary
     • Nautilus SGI UltraViolet passed all acceptance criteria and was accepted by NICS/RDAV in September 2010.
     • RDAV resources have gone through TRAC allocations twice and we currently have active users.
     • Allocations are lower than expected.
     • Use is lower than expected.
     • Early users are transitioning to TRAC / Director’s Discretionary users.

  3. Hardware status
     • Nautilus: The full SGI UltraViolet has been delivered and accepted:
       • The full UltraViolet machine was delivered and integrated into the NICS infrastructure.
       • There were issues with stability and PCIe performance during the acceptance tests. Those issues have been resolved and the machine accepted.
     • Graphics cards: There are issues that prevent delivery of the GPUs on Nautilus:
       • NVIDIA has decided not to scale their driver to support more than 8 GPUs per single system image. Thus, we cannot deliver the full 16 GPUs on Nautilus.
       • Even worse, there are communication problems on the UltraViolet that cause system stability problems when the GPUs are exercised. Until this is resolved, we have disabled access to the GPUs.
     • Parallel filesystem: The 960 TB GPFS parallel filesystem has been deployed on Nautilus.
       • We continue to explore issues with bandwidth, as we are seeing only ~1 GB/s. It appears to be a design issue with GPFS. We are talking with IBM about these issues.
       • We are working to enable cross-mounting of the parallel filesystem on Kraken for HPC users.
     • Portal: Our portal system is operational and we are working to deploy new capabilities on it.
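
To put the parallel filesystem figures on this slide in perspective, the rough sketch below estimates how long a single pass over the full 960 TB would take at the observed ~1 GB/s. It is a back-of-the-envelope calculation and not part of the original presentation. It assumes decimal units (1 TB = 1000 GB) and a perfectly sustained transfer rate, and the function name `full_scan_seconds` is purely illustrative.

```python
def full_scan_seconds(capacity_tb: float, bandwidth_gb_per_s: float) -> float:
    """Seconds needed to stream `capacity_tb` once at a sustained bandwidth.

    Assumes decimal units (1 TB = 1000 GB) and no protocol or metadata overhead.
    """
    capacity_gb = capacity_tb * 1000.0
    return capacity_gb / bandwidth_gb_per_s


# Figures taken from the slide: 960 TB capacity, ~1 GB/s observed bandwidth.
seconds = full_scan_seconds(960, 1.0)
days = seconds / 86400  # 86400 seconds per day
print(f"{seconds:.0f} s, about {days:.1f} days")
```

Under these assumptions, one full read of the filesystem would take on the order of eleven days, which helps explain why the bandwidth issue is called out.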

  4. Software and environment status
     • Software systems:
       • VisIt has been ported and runs well. It was a major component of our acceptance tests.
       • ParaView porting has begun.
       • Remote visualization systems are deployed and secure: NX, VNC.
       • Workflow systems work well.
       • R runs acceptably, and several packages for exploiting parallelism have been deployed to users.
     • User environment:
       • We continue to explore issues related to job placement, and have deployed a NUMA-aware Torque for scheduling.
       • We are moving batch scheduling and login processes to a separate system to reduce user contention.

  5. Allocated projects

  6. Education, Outreach, and Training activities
     • Presented a tutorial on Nautilus usage for visualization, data analysis, and workflow management at the TeraGrid'10 conference in Pittsburgh.
     • With LLNL and LBNL, taught a full-day class on VisIt at the Supercomputing 2010 conference in New Orleans.
