This deck describes how UCF is maximizing campus participation through the 'STOKES' machine, supporting scientific exploration and the synergy between science-based and human-centered modeling & simulation, with the aims of expanding research capabilities and attracting external users. It covers current system specifications and capabilities, existing programs, and future research areas.
High Performance Computing at UCF
Brian Goldiez, Ph.D.
bgoldiez@ist.ucf.edu
September 2008
Background
• Directed Federal Program
• Funded for 2 Years
• Maximize Campus Participation
• Competitive Procurement (7 Bids)
  • IBM Selected
• Machine Named 'STOKES'
  • After Sir George Gabriel Stokes (Mathematician & Physicist)
• Participate in the HPC Community
  • SURAgrid
  • Supercomputer Conference
UCF Objectives
• Support Scientific Exploration & Interaction
  • Science-Based M&S
  • Human-Centered M&S
  • Synergies Between the Above
• Build a Diverse Community of Users
• Increase System Capabilities
• Increase Research Scope & Funding
• Attract External Faculty & Users
• Become Self-Sufficient in 2010
Current Management Approach
• Research Computing Is a Specialized Field
• Research Computing Needs to Be Professionally Managed (e.g., GSU, Purdue)
• We Have Some Unique Opportunities
  • Interaction in HPC
  • Real-Time Storm Effects on Coastal Areas
  • Crowd Modeling
  • Games (Serious & Entertainment)
• UCF Can Become a Major Player and Be Viable for Funding
• Recommendations:
  • UCF Centrally Facilitate/Manage Research Computing for Improved Efficiency & Use of Resources
  • UCF Designate a Person to Become Active in the SURA HPC Group & Work with Campus Entities on Research Computing
  • Use Existing Grant Resources to Fund the Initial Effort
  • Plan for University & External Support and Growth over the Next 3 Years
Stokes Current Capabilities

                     Current System (90% Utilized)   Expanded System
Processor            Xeon 3 GHz, 64-bit              Xeon 3 GHz, 64-bit
Peak Performance     ~2.2 Tflops                     ~6.4 Tflops
Cores                240                             648
Visualization Nodes  4                               4
Memory               528 GB                          1.424 TB
Storage              22+ TB                          42+ TB
O/S                  RHEL 5.0                        RHEL 5.1
Interconnect         IB 20 Gbps, GigE                IB 20 Gbps, GigE
File System          NFS, ~220 MB/s                  GPFS w/RDMA, ~500 MB/s
Usage Groupings

Science-Based M&S Usage
• Nanotechnology
• Civil Engineering
• Physics
• Batch Processing
• Existing Programs (e.g., MATLAB)
• New Data
• Large Runs
• Segue to Larger Systems

Human-Centered M&S Usage
• IST
• Army
• Partnering Industry
• Interactive, Human-in-the-Loop
• Modeling Human Activity
• Multi-modal I/O
• Multi-user
• No Existing HPC Programs or Data
Interactive Simulation
• Needs
  • Real-time capability using fast processors and high-speed interconnects
  • High fidelity
  • Low-latency/high-bandwidth interconnects
  • Real-time I/O
  • Connection to real-world assets
  • Fixed frame rates (some apps)
• Strategies (a minimal loop is sketched after this list)
  • Message Passing Interface (MPI) or Scalable Link Interface (SLI)
  • Limited shared-memory processing (SMP) or distributed processing
  • Interfaces with sensory processors (e.g., interactive visualization, haptics, …)
  • Scalability in terms of HPC architecture and simulation entities
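As a concrete illustration of the fixed-frame-rate need and the MPI strategy, here is a minimal sketch of a timestep loop that holds all ranks to a common frame budget. The 60 Hz budget, the update_local_entities() placeholder, and the spin-wait are illustrative assumptions, not part of any Stokes software.

```c
/* Minimal sketch of a fixed-frame-rate MPI simulation loop.
 * The 60 Hz budget and update_local_entities() are illustrative
 * assumptions, not part of the Stokes software stack. */
#include <mpi.h>
#include <stdio.h>

#define FRAME_SECONDS (1.0 / 60.0)          /* fixed 60 Hz frame budget */

static void update_local_entities(int rank, int step) {
    (void)rank; (void)step;                 /* placeholder for per-rank entity updates */
}

int main(int argc, char **argv) {
    MPI_Init(&argc, &argv);
    int rank;
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    for (int step = 0; step < 600; ++step) {
        double t0 = MPI_Wtime();

        update_local_entities(rank, step);

        MPI_Barrier(MPI_COMM_WORLD);        /* keep all ranks frame-coherent */

        double elapsed = MPI_Wtime() - t0;
        if (elapsed > FRAME_SECONDS && rank == 0)
            fprintf(stderr, "frame %d overran: %.3f ms\n", step, elapsed * 1e3);
        while (MPI_Wtime() - t0 < FRAME_SECONDS)
            ;                               /* spin out the rest of the frame */
    }

    MPI_Finalize();
    return 0;
}
```

Reporting overruns rather than silently slipping matters for human-in-the-loop work, where a dropped frame is directly perceptible; in a real system the spin-wait would give way to synchronization with the visualization or I/O subsystem.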
Other Considerations
• Let's remember the 'human factor':
  • How will a user interact with an HPC system?
  • How will multiple users interact with an HPC system and maintain coherence of I/O?
  • How will interim results be gathered?
  • How can timely and relevant human-factors experiments be developed to influence the design?
• Get developers involved…
Current Users
• IST
• Physics
• Mathematics
• Chemistry
• Nanoscience
• Civil Engineering
• Mechanical Engineering
• Industrial Engineering
• Electrical & Computer Engineering
• CREOL
• SAIC
• Forterra
Current Human-Centered M&S Research
• Apparently Parallelizable Systems (SAF/Games)
• Approaches to Parallelization (one possible decomposition is sketched after this list)
  • Spatial & Temporal Coherency
  • Performance Assessment & Optimization
• Interaction & Visualization
  • Review the Literature in Scientific Visualization & Computational Steering
  • Leverage Existing Software (e.g., OLIVE, DCV)
  • Consider & Baseline Different Approaches
• LVC Modeling
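To make the spatial-coherency idea concrete, here is a hedged sketch of one possible decomposition for a SAF- or crowd-style model: entities are binned into grid cells, and contiguous strips of cells are assigned to MPI ranks so most neighbor interactions stay on-rank. The strip scheme, the names, and the 1 km cell size are assumptions for illustration, not the project's actual approach.

```c
/* Sketch of spatial decomposition for a SAF/crowd model: entities are
 * binned into grid cells, and each MPI rank owns a contiguous strip of
 * cell columns so most neighbor queries stay local. The 1 km cell size
 * and all names are illustrative assumptions. */
#include <math.h>

typedef struct { double x, y; } Entity;

#define CELL_SIZE_M 1000.0                  /* assumed bound on interaction radius */

/* Map an entity to the rank owning its grid column. */
static int owner_rank(const Entity *e, double world_width_m, int nranks) {
    int ncols = (int)ceil(world_width_m / CELL_SIZE_M);
    int col = (int)(e->x / CELL_SIZE_M);
    if (col < 0) col = 0;
    if (col >= ncols) col = ncols - 1;
    return (int)((long)col * nranks / ncols); /* contiguous columns per rank */
}
```

Temporal coherency complements this: because entities move little between frames, ownership changes slowly, so only entities near strip boundaries need to be mirrored to a neighboring rank each step.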
Possible Areas of Future Research
• Multi-core Programming for M&S Applications (a de-coupling sketch follows this list)
  • Tight Timing Constraints
  • Low Latency
  • I/O-Bound Workloads
• Use of the Cell Processor for M&S
• Multi-World Systems
• LVC Implementations/Experimentation
• Terrain Correlation
• Granular Propagation Mitigation Methods
• Multi-scale Simulations
• Benchmarks
• De-coupling SAF Models
• ????
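As one hypothetical angle on the tight-timing, I/O-bound combination (and on de-coupling generally), the sketch below keeps slow I/O off the timing-critical thread: the simulation publishes a small state snapshot under a briefly held lock, and a second thread performs the slow output outside that lock. All names and the snapshot payload are invented for illustration.

```c
/* Sketch of de-coupling an I/O-bound task from a timing-critical
 * simulation loop: the sim thread only copies a small snapshot under
 * a short-held lock; a separate thread does the slow I/O outside it.
 * Names and the snapshot payload are illustrative assumptions. */
#include <pthread.h>
#include <stdio.h>

typedef struct { int frame; double state; } Snapshot;

static Snapshot shared;                     /* latest published snapshot */
static int running = 1;
static pthread_mutex_t lock = PTHREAD_MUTEX_INITIALIZER;

static void *io_thread(void *arg) {
    (void)arg;
    int last = -1;
    for (;;) {
        pthread_mutex_lock(&lock);
        Snapshot s = shared;                /* cheap copy under the lock */
        int stop = !running;
        pthread_mutex_unlock(&lock);
        if (s.frame != last) {              /* slow I/O stays off the sim thread */
            last = s.frame;
            printf("frame %d: state=%.3f\n", s.frame, s.state);
        }
        if (stop) break;
    }
    return NULL;
}

int main(void) {
    pthread_t io;
    pthread_create(&io, NULL, io_thread, NULL);

    for (int frame = 0; frame < 1000; ++frame) {
        /* timing-critical entity updates would run here */
        pthread_mutex_lock(&lock);          /* tiny, bounded critical section */
        shared.frame = frame;
        shared.state = frame * 0.01;
        pthread_mutex_unlock(&lock);
    }

    pthread_mutex_lock(&lock);
    running = 0;
    pthread_mutex_unlock(&lock);
    pthread_join(io, NULL);
    return 0;
}
```

The point of the design is that the simulation thread's critical section is tiny and bounded, so its timing budget is unaffected by however long the I/O actually takes.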
Getting Involved (Notional, for Discussion)
• Relevance to UCF Interests
  • UCF M&S (Fully Supported)
  • Other UCF (Partially Supported)
• Other Entities (Profit and Non-Profit)
  • With UCF M&S (Fully Supported)
  • With Other UCF (Partially Supported)
• Other Users (Lower Queue Priority)
  • University/Non-Profits (Case by Case)
  • For-Profit Proprietary
    • Provide Funds for Staff
    • Constrained Use of Software
    • Joint Proposals
Issues
• Facilities
  • Power & Cooling Infrastructure
• Obsolescence
• Parallel Programming
• Long-Term Support
  • State Funding?
  • Other Sources?
High Performance Computing for Simulation Training Systems
• Purpose
  • Enhance the University's facilities in the area of HPCC systems
  • Support faculty research on parallel simulation of complex scientific data in Physics, Chemistry, Civil Engineering, and Nanotechnology
  • Study large-scale interactive simulations that require real-time processing of hundreds of entities on complex terrain databases
  • Support RDECOM research on gaming and training-system development such as OneSAF
• Benefits to the Army
  • Establish a capability to address M&S-relevant issues in multi-scale simulation, interactivity, and visualization
  • Offer a unique opportunity to synthesize the research efforts of the various departments at the University by facilitating a shared high-performance computing infrastructure
• Federal and Private Endorsements
  • Project funded and supported by RDECOM and PEO-STRI
  • Association with national supercomputing grids such as the Southeastern Universities Research Association (SURA)
  • Collaboration with private companies such as Forterra Systems
• Deliverables
  • HPCC computing platform with quad-core processors, 4 GB memory, 10 TB storage, high-speed interconnect, and graphics capabilities
  • Scientific studies on using HPC in interactive M&S
The Top 10 Machines, November 2007
(Ranking table not reproduced.) Rmax is measured in TeraFLOPS: one trillion (10^12) floating-point operations per second.
Some Perspective: Computing Power and Capabilities (the Hans Moravec vision)
(Chart not reproduced.)
Areas for Investigation
• Extents of single-image environments
  • Terrain/Environment
  • Interacting entities
• Live, virtual, constructive experimentation
• Scalable simulations
• Multi-scale simulations
  • Control of propagating granularity (a speculative sketch follows this list)
• HPC architectures for interaction
  • Map HPC types to applications
• Techniques for porting interactive applications to HPC platforms
• Tools for interaction
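As a speculative illustration of controlling propagating granularity, the sketch below selects an entity's modeling fidelity from its distance to the nearest interactive participant, so high-resolution modeling does not cascade across the whole multi-scale world. The thresholds, names, and three-level scheme are assumptions, not results from this project.

```c
/* Speculative sketch: choose per-entity model fidelity by distance to
 * the nearest interactive participant, limiting how far fine-grained
 * modeling propagates through a multi-scale world. The thresholds and
 * enum names are illustrative assumptions. */
#include <math.h>

typedef enum { FIDELITY_AGGREGATE, FIDELITY_ENTITY, FIDELITY_FULL } Fidelity;

static Fidelity choose_fidelity(double ex, double ey,
                                const double *px, const double *py, int nplayers) {
    double best = INFINITY;
    for (int i = 0; i < nplayers; ++i) {    /* distance to nearest participant */
        double dx = ex - px[i], dy = ey - py[i];
        double d = sqrt(dx * dx + dy * dy);
        if (d < best) best = d;
    }
    if (best < 500.0)  return FIDELITY_FULL;      /* full physics near players */
    if (best < 5000.0) return FIDELITY_ENTITY;    /* individual entity models */
    return FIDELITY_AGGREGATE;                    /* unit-level aggregates */
}
```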