
Stern Center for Research Computing Update



  1. Stern Center for Research Computing Update
     Norman White
     February 24, 2005

  2. Outline of talk
     • Background
     • Current Status and Plans
     • Feedback from faculty
     • How to submit jobs to the grid
     • Demo of grid engine for those interested

  3. Background
     • Stern Research Computing
       • Research computing has had little attention since Stern signed the WRDS agreement.
       • Several neglected areas:
         • Computationally intensive research
           • Wharton (WRDS) not really appropriate
           • Eureka very slow
           • Desktop not appropriate
         • Rapidly growing demand
         • Desktop computing
           • Faculty offices becoming mini computer centers
         • Software licensing issues

  4. Initial Response
     • Center for Digital Economy Research
       • Citigroup grant for small cluster (grid)
     • Salomon Center
       • Establishes a small staff and facilities for financial databases
     • Collaboration between Salomon Center and CEDER
     • Equipment consolidation in Copy Center
     • Stern Center for Research Computing established

  5. CRC Mission
     • Foster and support computationally based research at Stern
     • Provide Stern with the ability to do cutting-edge research
     • Leverage Stern’s scale and scope

  6. Immediate Goals (now completed)
     • Consolidate existing research computing facilities
     • Provide immediate improvement in capabilities (processing, disk, software, backups)
     • Establish a research computing architecture which integrates existing and new hardware
     • Develop a platform for continued improvement
     • Provide incentives for faculty to participate
     • Support PhD research

  7. Medium-Term Goals
     • Extend architecture to include:
       • Stern desktop support
         • Computation nodes
         • Data access from desktops
       • Labs
       • University facilities
         • Supercomputer on order
     • Provide programming support

  8. The “Team”
     • Faculty Director – Norman White
     • “Virtual Team”
       • Scott Joens – IT and Salomon Center
       • David Frederick – IT
       • Dan Graham – IT
       • Vadim Barkalov – Student
       • …

  9. Current Status
     • Hardware
       • Cluster of machines in Copy Center (~15 Eurekas)
       • GRID – primary research computer
         • Available to all researchers (1.5 times Eureka)
         • Main host for the rest of the machines
         • Sun Grid Engine master host
       • LEDA – 8-processor Linux
         • HPC only (Matlab, …)
         • 5 times as powerful as Eureka
       • Miner
         • HPC only (Matlab, Splus, R, Octave)
       • Total processing power >10 times Eureka
     • High-speed gigabit network backbone
     • Gigabit connection to the rest of Stern
     • Dedicated tape backup unit for research computing

  10. Software
     • Sun Grid Engine running on 2 machines
       • Soon to be rolled out to all machines
     • Matlab license server with 28 licenses
       • Can run on any node, Sun or Linux
     • SAS
       • Sun (GRID) only
     • Splus
       • Sun and Linux
     • Stata
       • Linux
     • CPLEX, GAUSS, Mathematica, R, Octave, Perl, f77, C, Java, …
     • Pine, Pico, emacs

  11. User Files
     • All user home directory files are available on any node.
     • Networked data storage is available on all nodes (~1TB in total, more coming).
     • Home directories are backed up every night.
     • Data is backed up once per week.

  12. Grid Computing …
     • Concept
       • View machines as computing nodes
       • High-speed network connecting machines in a cluster together
       • Support for heterogeneous nodes
         • Speed
         • OS (Solaris, Linux)
         • Software (SAS, Matlab)
         • Disk (need >4GB)
         • Memory (>256MB)
     • 3 types of host machines
       • Submit host
       • Scheduling host (knows what resources each node has)
       • Execution host
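For the curious, a standard Sun Grid Engine installation ships commands that let you inspect these host roles and resources yourself (the exact output, of course, depends on the cluster):

```shell
# List the execution hosts the scheduler knows about,
# with architecture, CPU count, load, and memory.
qhost

# List the hosts that are allowed to submit jobs.
qconf -ss

# Show full queue status, including per-host load and running jobs.
qstat -f
```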

  13. Advantages of Grid Computing
     • The grid scheduler has intelligence:
       • Knows the load on all hosts
       • Knows each host’s resources
       • Knows the availability of hosts
     • Allows dynamic addition of nodes
       • Execution hosts can die and the grid is unaffected
     • Understands grid-wide resources (such as software licenses)
     • Provides an architecture for continuous growth

  14. Who can use the “Stern HPC Grid”?
     • Any researcher who needs to run jobs requiring more than 1 hour of CPU time
     • Most users have already been migrated (even if you don’t know it)
     • All large jobs will HAVE to run on the grid, unless there is some compelling reason not to.

  15. How do I use the “grid”?
     • You create a small shell file to run your job.
     • In the shell file, you tell Sun Grid Engine about your job so it can decide where to run it.
     • At a minimum, you give the job a name and tell SGE how to run your program.
     • Optionally, declare resource needs such as:
       • CPU time (default is 2 hours)
       • Software (Matlab, Splus, SAS, …)
       • Memory (default is 256MB)
       • … (many options)

  16. Example: Matlab job – 100 hours of CPU

     #!/bin/sh
     #$ -N mymatjob
     #$ -l matlab,h_cpu=100:00:00
     matlab < mymatjob.m

     To submit:
       qsub mymatjob.sh
       qstat   (will show you the status of all jobs)
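Along the same lines, a job that needs a different package or more memory can declare those resources too. This is only a sketch: the `sas` resource name, the 1G memory figure, and the email address are illustrative, and the available resource names depend on how the local SGE complexes are configured.

```shell
#!/bin/sh
# Sketch of a long SAS job (resource names are illustrative and
# depend on the local Sun Grid Engine configuration).
#$ -N mysasjob                 # job name, as shown by qstat
#$ -l sas,h_cpu=10:00:00       # needs SAS; up to 10 hours of CPU
#$ -l h_vmem=1G                # request 1 GB of memory (above the 256MB default)
#$ -m e -M you@stern.nyu.edu   # email when the job ends (placeholder address)
sas mysasjob.sas
```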

  17. So what is happening?
     • When you submit your job, Sun Grid Engine matches your needs against the available resources.
     • It then chooses the “best” machine to process your job on,
       i.e., the most lightly loaded machine that matches your requirements.

  18. Why can’t I just log in and run it myself?
     • How would you know which machine has what resources?
     • How could you determine the load?
     • Sun Grid Engine will also:
       • Load-balance across many machines
       • Deliver your output automatically
       • Email you when your job is complete
       • Allow you to have job dependencies
         • E.g., first run job A, then (in parallel) B, C, and D, and then E
       • Manage parallel execution
         • E.g., run this job on 7 different Matlab nodes in parallel
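Both features above can be expressed with standard `qsub` flags. A sketch (the script and job names are illustrative):

```shell
# Job dependencies: A runs first; B, C, and D wait for A and then
# run in parallel; E waits for all three.
qsub -N jobA jobA.sh
qsub -N jobB -hold_jid jobA jobB.sh
qsub -N jobC -hold_jid jobA jobC.sh
qsub -N jobD -hold_jid jobA jobD.sh
qsub -N jobE -hold_jid jobB,jobC,jobD jobE.sh

# Parallel execution via an array job: 7 copies of the same Matlab
# job, each of which can read $SGE_TASK_ID to pick its share of the work.
qsub -t 1-7 -l matlab mymatjob.sh
```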

  19. Advantages
     • Centralized management of all resources
     • Graphical interface (Qmon) to manage and view status
     • (Coming) Web interface for users to submit and monitor jobs

  20. So what about desktop users?
     • Two answers:
       • Is your desktop really the appropriate place to keep your data and do your computing, or are you doing it there because you have to?
       • The new environment should make it more efficient and safer to do your computing on the grid.
     • If you need a Windows environment, we can still offer:
       • Software installation
       • Access to consulting
       • Data storage and backup

  21. Coming this summer
     • More grid nodes??
     • A Windows server for expensive research applications (Authorware, …)
     • ??? (What do you need?)

  22. Comments?
     • What are your needs?
     • What isn’t covered here?
