Sun's Infrastructure Solution for Grid Engine. Jongjun Son Sun Microsystems , korea. http://sun.com/grid. Agenda. Sun's Grid Strategy Sun's Grid Software N1 Grid Engine 6 Technical Overview. Sun's Grid Strategy. Sun's Grid Computing Approach. A flexible and scalable architecture
Sun's Infrastructure Solution for Grid Engine Jongjun Son Sun Microsystems, Korea http://sun.com/grid
Agenda • Sun's Grid Strategy • Sun's Grid Software • N1 Grid Engine 6 • Technical Overview
Sun's Grid Computing Approach • A flexible and scalable architecture • Pools computing resources to solve important problems • Collects unused capacity for better utilization • Architecture for seamless addition of resources • Up to hundreds or thousands of processors and systems • Multi-platform, Multi-OS • Distributed resource management (DRM) • Distributed system and software management A well-designed Grid Computing infrastructure is accessed, used, and managed as a single, unified resource
Supported Platforms N1 Grid Engine: download and try it out free at http://gridengine.sunsource.net
Compute Elements Sun's End-to-end Product Line • Access systems • Thin clients, workstations • Compute nodes • Linux and Solaris Operating Systems • Compact 1U and 2U servers • Blade servers • Larger symmetric multiprocessing (SMP) systems • Sun Fire Superclusters • Pre-configured Grid Computing rack systems (Figure: Sun Fire Compute Grid rack system)
Sun Fire Compute Grid Engineered, Tested, Integrated, Supported • Up to 32 Sun Fire V20z, or up to 10 V40z • Sun Control Station • Sun N1 Grid Engine Software • Up to 2 × 24-port Gigabit Ethernet switches • 48-port Terminal Server • Keyboard/Video/Mouse shelf unit • Sun Rack 1000-38
Software Elements Small to Large Grid Computing Solutions • Global Grid Infrastructure (Service Discovery, Authentication/Authorization, Data Management): OGSA, Globus Toolkit, Avaki, industry standards and partner technologies • Enterprise Grid Infrastructure: Policy Management (N1 Grid Engine), Resource Management (N1 Grid Engine, Solaris™ Resource Manager) • Cluster Grid Infrastructure: System Management (Sun Management Center, Sun Control Station), Data Access (Sun QFS/SamFS, Solaris CacheFS)
N1 Grid Engine 6 Distributed Resource Manager, Job scheduling • Policy management • Owners negotiate usage • 4 different, customizable policy schemes • Exceptions for specific needs • Benefits • Equitable, enforceable sharing between groups • Alignment of resources with business goals
Sun Cluster Grid Manager Unified Remote System and Grid Management • Sun Control Station software • System health and performance monitoring • Pull, push, and automatic provisioning • Deploy both Linux and Solaris x86 images • Integrated grid management module • Manages Sun Grid Engine or Sun Grid Engine, Enterprise Edition • Aggregated Management • Address hundreds of systems individually or in groups • Combined system, software, and grid management
A Complete Solution Proven and Repeatable Reference Architectures • Control Network (Gigabit Ethernet) • Sun Compute Grid rack systems • Sun ONE Grid Engine Servers • Workstations • Sun Cluster Grid Manager • Data Network (Gigabit Ethernet) • Sun StorEdge storage solutions (Direct-attached, NAS, HA-NFS, HPTC SAN)
Grid Scalability from Local to Global Cluster, Enterprise, and Global Grids (Figure: Cluster Grids combine into Enterprise Grids, which connect over the Internet into a Global Grid)
N1 Grid Engine 6 Technical Overview
Agenda • N1 Grid Engine Overview • Architecture • Resource, data access • Application Integration • N1GE6 New features • Accounting & Reporting
N1 Grid Engine Overview Resource Management • Selection of jobs • Simple policies: FIFO, equal share, rank • Sophisticated policies: sharing, urgency, priority, deadline, resource-based, etc. • Selection of resources • System characteristics: CPU, memory, OS, patches, etc. • Status of systems: available memory, load, free disk space, etc. • Status of other resources: licenses, shared storage, other software, etc. (Figure: Grid Engine job script for BLAST: blastall -p blastn -i /nfs/data)
N1 Grid Engine Overview Resource Control • Control of jobs • Suspend, Resume, Kill, Migrate, Restart • Customizable action methods • Manual or automated via policies • Control of resources • Regulate load from Grid jobs based upon resource value thresholds • Control access via permissions, time/date, job type • Allocate systems to jobs based on total resource consumption (e.g., memory, CPUs, disk, etc.)
N1 Grid Engine Overview Resource Accounting • Accounting of jobs • Current resource consumption always monitored • Total detailed consumption recorded at end of job • Includes record of user, department, project, etc. • Accounting of resources • Current usage of resources on hosts always monitored • Information recorded over time: resource utilization of hosts and grid; grid configuration changes
Grid Engine 6 Architecture • Access Tier: Submit Host, Admin Host • Management Tier: Master Host (Qmaster, Schedd), Shadow Host (optional) • Compute Tier: Exec Host (execd) • SGE daemons communicate over TCP/IP
Resources THE HEART OF GRID ENGINE MANAGEMENT • Built-in and custom resources • Static resources: strings, numbers, boolean • Countable resources: e.g., licenses, MB of memory/disk • Measured resources: value provided through Load Sensor • Resources used for • Job resource request: job A needs 1 license and 1 GB • Load/suspend thresholds: suspend jobs if load_avg > 1.5 • Load formulas: send jobs to hosts with least load; out of those, choose hosts with most free memory • Per Host: load_avg, mem_free, OS/patch-level • Global: floating licenses, shared storage
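The three uses of resources on this slide (resource request, suspend threshold, load formula) can be sketched together in a few lines of Python. This is only an illustration of the idea, not Grid Engine code: the `Host` class, `pick_host`, and the `licenses_free` argument are hypothetical names, and the only value taken from the slide is the 1.5 load threshold.

```python
from dataclasses import dataclass

# Threshold taken from the slide's example: suspend jobs if load_avg > 1.5.
SUSPEND_THRESHOLD = 1.5


@dataclass
class Host:
    """Hypothetical per-host resource snapshot (not a Grid Engine type)."""
    name: str
    load_avg: float   # measured resource
    mem_free_mb: int  # measured resource


def over_threshold(host):
    """Load/suspend threshold check for one host."""
    return host.load_avg > SUSPEND_THRESHOLD


def pick_host(hosts, mem_needed_mb, licenses_free, licenses_needed):
    """Return the best host for a job, or None if the request cannot be met.

    Licenses are treated as a global countable resource; memory is per host.
    Load formula: least load first, then most free memory among ties.
    """
    if licenses_free < licenses_needed:
        return None
    candidates = [
        h for h in hosts
        if h.mem_free_mb >= mem_needed_mb and not over_threshold(h)
    ]
    if not candidates:
        return None
    return min(candidates, key=lambda h: (h.load_avg, -h.mem_free_mb))
```

For the slide's "job A needs 1 license and 1 GB", a call would look like `pick_host(hosts, mem_needed_mb=1024, licenses_free=1, licenses_needed=1)`.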
Parallel and Checkpointing Environments • Environment: a set of hosts that is used to support parallel or checkpointing applications • Applications must inherently support parallel/checkpointing execution (Figure: hosts H1 through H7)
Data Access • App binaries and job data: CONFIGURED INDEPENDENTLY • Access from exec hosts: Data Grid, file staging, NFS sharing
Application Integration Methods • General methods: queue/host prolog, starter method, suspend method, resume method, terminate method, queue/host epilog • Parallel methods: parallel start, parallel stop • Checkpointing methods: checkpoint command (run at specified intervals), migration command, requeue job, clean command (Figure: job flow from START to END through these methods)
Integrating applications with Grid Engine • Unmodified/legacy application binaries: integrate using wrapper script • Interactive applications: use pluggable remote mechanisms, e.g., ssh, rsh, telnet (The two approaches above are the most common.) • Grid-ready applications: modify code to use DRM APIs (API recently standardized) • Java applications: JGrid package for low-level coupling (object/method distribution); currently provided separately
N1GE 6 New Features: Architecture • Berkeley DB spooling • Multi-threaded Master Daemon • New communication system • Scalability goals for N1GE 6, per master • Up to 10,000 unique hosts • Up to 500,000 unique jobs (array jobs counted as a single job)
N1GE 6 Supported Platforms (end of CY2004)
N1GE 6 New Features: Scheduler Functionality • Advanced planning capabilities • Resource Reservation with Backfilling • Can reserve any resource, e.g., memory, CPU, license • More sophisticated scheduling algorithms • Management policies matched with business priorities: • Priority, urgency, share tree, category, deadline, etc.
Simple, priority-based scheduling (Figure: timeline of Jobs 1 to 6 over Time, with rows for Global licenses and for CPU and memory on Host 1 and Host 2; a region is marked "Wasted resources")
Scheduling with Resource Reservation (Figure: same timeline layout, Jobs 1 to 6 over Time, rows for Global licenses and for CPU and memory on Host 1 and Host 2)
Resource Reservation with backfilling (Figure: same timeline layout, Jobs 1 to 6 over Time, rows for Global licenses and for CPU and memory on Host 1 and Host 2)
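The difference between reservation with and without backfilling can be sketched with a toy planner. This is a simplified illustration, not the N1GE 6 scheduler: it assumes a single countable resource ("slots"), discrete time steps, and jobs already sorted by priority; `Job` and `plan` are hypothetical names. Each job reserves the earliest window that fits. With backfilling, a lower-priority job may slip into a gap ahead of a higher-priority job as long as it does not disturb that job's reservation; without it, no job may start before the job ranked above it.

```python
from collections import defaultdict
from itertools import count
from typing import NamedTuple


class Job(NamedTuple):
    """Hypothetical job description: slots needed for `duration` time steps."""
    name: str
    slots: int
    duration: int


def plan(jobs, total_slots, backfill=True):
    """Reserve a start time for each job; `jobs` is in priority order.

    Returns a dict mapping job name to its planned start time.
    """
    usage = defaultdict(int)  # time step -> slots already reserved
    starts = {}
    not_before = 0
    for job in jobs:
        if job.slots > total_slots:
            raise ValueError(f"{job.name} can never fit in {total_slots} slots")
        # Scan forward for the first window with enough free slots throughout.
        for t in count(not_before):
            window = range(t, t + job.duration)
            if all(usage[u] + job.slots <= total_slots for u in window):
                break
        for u in range(t, t + job.duration):
            usage[u] += job.slots
        starts[job.name] = t
        if not backfill:
            # Strict ordering: later jobs may not start before this one.
            not_before = t
    return starts
```

With four slots and jobs A (2 slots, 4 steps), B (4 slots, 2 steps), C (2 slots, 3 steps), and D (2 slots, 5 steps) in that priority order, backfilling lets C run alongside A from time 0 while B keeps its reservation at time 4; without backfilling C waits until B has finished.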
Resource Management Policies Resource allocation based upon business priorities • Policy basis includes: cumulative utilization, category priority, time-based priority, resource value, etc. • Powerful, flexible, tunable, easy to configure (Figure: All jobs split into High Priority, Normal Priority, and Low Priority; Dept A: 70 / Dept B: 30 in one class, giving more rights to high priority jobs; Dept A: 50 / Dept B: 50 in the others; Group X: temporary boost)
Policies for Job Prioritization • Priority determines which pending jobs get dispatched • Job priority is calculated from three sub-policies, each normalized to 0.0 < N < 1.0: prio = W_urg × N_urg + W_tix × N_tix + W_psx × N_psx • N_urg = normalized urgency • N_tix = normalized tickets • N_psx = normalized POSIX priority • W_urg, W_tix, W_psx = weighting factors
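The weighted sum on this slide is easy to try out in Python. The sketch below makes two assumptions that are not in the slides: normalization is done by min-max scaling across the pending jobs (which yields values in the closed range 0.0 to 1.0), and the default weights are illustrative placeholders. The function names are hypothetical.

```python
def normalize(values):
    """Min-max scale values into [0, 1]; if all are equal, map each to 0.5.

    Assumption: the slides do not say how normalization is performed.
    """
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.5 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


def priority(n_urg, n_tix, n_psx, w_urg=0.1, w_tix=0.01, w_psx=1.0):
    """Weighted sum of normalized urgency, tickets, and POSIX priority.

    The default weights are illustrative values only.
    """
    return w_urg * n_urg + w_tix * n_tix + w_psx * n_psx


def rank_jobs(jobs, **weights):
    """Order pending jobs by priority, highest first.

    `jobs` maps a job name to raw (urgency, tickets, posix) values.
    """
    names = list(jobs)
    urg = normalize([jobs[n][0] for n in names])
    tix = normalize([jobs[n][1] for n in names])
    psx = normalize([jobs[n][2] for n in names])
    scores = {
        n: priority(u, t, p, **weights)
        for n, u, t, p in zip(names, urg, tix, psx)
    }
    return sorted(names, key=lambda n: scores[n], reverse=True)
```

Changing the weights shifts which sub-policy dominates: with all weight on urgency, the most urgent job is dispatched first regardless of tickets or POSIX priority.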
Cluster Queue (Figure: 5.x Queue compared with 6.x Cluster Queue across hosts A, B, C, D)
N1GE 6 New Features: Analysis / Monitoring / Accounting • Value-add module for doing analysis, monitoring, accounting reports, etc. • Fine-grained resource recording • Stored in an RDBMS in a well-defined schema • Provides built-in capability for reporting, chargeback, etc. • Web-based console tool provided for generating reports, queries, etc.
Why a second, separate DB? • Different access considerations • Standardized access (SQL, ODBC, JDBC) • More powerful database structure • Independent of core system data • Historical data • Derived data (sums, averages ...) • Queries won't affect system performance • Lower requirements on availability
Reporting Architecture • Data flow: Qmaster → Reporting File → Reporting-Writer (raw data, builds derived values) → Reporting-DB • Reporting-Writer: Java application • Loosely coupled to the SGE system via the qmaster-generated reporting file • Stores raw data and pre-processed data in the SQL DB via JDBC
Stored Data • Job-related information: times, user, project, exit status ... • Host- and queue-related information: load information, consumables ... • Sharetree: configured shares, actual shares ... • Precomputed, derived values: sums, averages per host, queue, user, project ...
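As a rough illustration of what "precomputed, derived values" means, the Python sketch below groups raw job records and computes a sum and an average per group. It does not reflect the actual reporting schema or the Java Reporting-Writer: the record fields (`user`, `project`, `cpu_seconds`) and the function name are hypothetical.

```python
from collections import defaultdict
from statistics import mean


def derive_per_key(records, key, field):
    """Group raw records by `key` and return sum and average of `field`.

    `records` is a list of dicts; the field names are hypothetical and do
    not mirror the real reporting database schema.
    """
    groups = defaultdict(list)
    for record in records:
        groups[record[key]].append(record[field])
    return {k: {"sum": sum(v), "avg": mean(v)} for k, v in groups.items()}


# Hypothetical raw job records, as they might arrive from a reporting file.
raw_jobs = [
    {"user": "alice", "project": "blast", "cpu_seconds": 10.0},
    {"user": "alice", "project": "blast", "cpu_seconds": 30.0},
    {"user": "bob", "project": "render", "cpu_seconds": 5.0},
]

per_user = derive_per_key(raw_jobs, "user", "cpu_seconds")
per_project = derive_per_key(raw_jobs, "project", "cpu_seconds")
```

Storing such aggregates next to the raw rows lets report queries read a small summary table instead of scanning every job record.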
ARCo: Accounting and Reporting Console • Web-based tool for displaying data in the reporting DB • Based on Sun Web Console • Ability to create simple and advanced (SQL-based) queries • Generates tables and graphs, exportable as CSV or PDF • Also, command-line report generation
jongjun.son@sun.com http://sun.com/grid