Grids, utility computing and a perspective on the future of IT infrastructure. Washington Area CTO Forum March 31, 2006 Nirav Kapadia nhkapadia@gmail.com. Outline. Characterizing computing grids Grids as intended versus what we see today Common types of grids today
Grids, utility computing and a perspective on the future of IT infrastructure Washington Area CTO Forum March 31, 2006 Nirav Kapadia nhkapadia@gmail.com
Outline • Characterizing computing grids • Grids as intended versus what we see today • Common types of grids today • Putting computing grids to work • Types of problems addressed by today’s grids • Operational considerations in deploying a grid • A perspective on the future of IT infrastructure • Cost pressures and technology commoditization • Grid and utility computing: the technology enablers
Grids came about from a need for large scale, collaborative computing • Scale is measured in terms of users, nodes, organizations, geography, and heterogeneity • A grid in the strict sense of the word involves a large number of heterogeneous, shared resources • Collaboration is measured in terms of resource sharing and interoperability • A key characteristic is the ability to manage across organizational boundaries
Systems for large scale, collaborative computing must meet key criteria • Group A • Scalable with users and resources • Support for heterogeneity • Group B • Support for interoperability • Scalable with geographical distances • Group C • Fully distributed (federated) architecture • Ability to compartmentalize along organizational boundaries • [Figure labels: Broad definition of computing grid; Strict definition of computing grid]
Many commercial grid solutions only meet the broad definition of a grid • Cluster management systems • Typically harness clusters of dedicated servers • Examples include Platform LSF, Sun Grid Engine • CPU-scavenging “master-slave” applications • Typically take advantage of idle desktop cycles • Examples include SETI@Home, distributed.net
Many commercial grid solutions only meet the broad definition of a grid • Application-specific, custom-built grids • Typically built around a key business function • Examples include Acxiom, Oracle offerings
Today, solutions that meet the strict definition of a grid have to be “built” • Grid solutions based on the Globus toolkit • Several vendors have Globus based offerings • Univa Corp is commercializing Globus • Other grid solutions in academia and research • Most are custom-built and target a specific problem • Typically not appropriate for commercial use (today)
Key takeaways • A grid is a distributed computing system that enables large scale, collaborative computing • Scalable across a large number of diverse and geographically dispersed resources • Many commercial “grid solutions” of today do not meet the strict definition of a grid • Limited ability to manage policies and resources across administrative boundaries
Outline • Characterizing computing grids • Grids as intended versus what we see today • Common types of grids today • Putting computing grids to work • Types of problems addressed by today’s grids • Operational considerations in deploying a grid • A perspective on the future of IT infrastructure • Cost pressures and technology commoditization • Grid and utility computing: the technology enablers
Even today’s grids can benefit users with large scale computing needs • High throughput computing (HTC) • Many independent (non-communicating) tasks • Large problems that break up into manageable, independent tasks • High performance computing (HPC) • Large problem that is not decomposable into manageable, independent tasks
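The high throughput pattern above can be illustrated with a minimal sketch, assuming a local thread pool stands in for grid nodes. The names `simulate` and `run_high_throughput` are hypothetical and are not part of any grid product mentioned in this talk. The point is only that each task runs without communicating with the others, so the work can be farmed out in any order.

```python
from concurrent.futures import ThreadPoolExecutor


def simulate(seed: int) -> int:
    """Hypothetical independent task: it never talks to other tasks."""
    # A small deterministic computation standing in for real work.
    return sum((seed * k) % 7 for k in range(1000))


def run_high_throughput(seeds, workers: int = 4):
    """Farm independent tasks out to a worker pool; results keep input order."""
    # Assumption: threads stand in for grid nodes in this sketch.
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(simulate, seeds))


if __name__ == "__main__":
    results = run_high_throughput(range(8))
    print(f"completed {len(results)} independent tasks")
```

An HPC problem would not fit this shape, because its tasks would need to exchange data while running.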
High throughput computing is common in business environments • Large, legacy applications are best served by cluster management systems • Compute-intensive apps are preferable but a mix of compute- and data-intensive apps is manageable • Customizable apps that work on small slices of data work well with CPU-scavenging grids • Apps must be compute-intensive and preferably run within a sandbox
High performance computing is seen more in targeted environments • Applications involving multiple, communicating tasks typically require custom-designed grid environments • Examples include the Oracle grid offering and some test beds built with Globus • Other examples include distributed computing platforms such as PVM and MPI
So… you’re ready to deploy a grid computing environment… • As with any other technology, there are several operational considerations… • Resources on the grid – dedicated or shared? • Access management – who needs access to what? • Data management – how does data get to the grid? • Security model employed by the grid
Resources on the grid: should they be dedicated or shared? • Cluster management systems • Cluster management systems work best with dedicated resources • Condor, from the U of Wisconsin, is a notable exception, but not commercially available • CPU-scavenging grids • As the name implies, resources are shared, and typically involve desktops • A custom screen saver is the most common vehicle for running the grid application
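The shared-resource case can be sketched as a simple idle check, in the spirit of a screen saver that only starts when the desktop is unused. This is an illustration under stated assumptions: `DesktopState`, `may_run_grid_job`, and both thresholds are hypothetical, and real CPU-scavenging systems apply their own policies.

```python
from dataclasses import dataclass


@dataclass
class DesktopState:
    idle_seconds: float  # time since the last keyboard or mouse input
    cpu_load: float      # fraction of CPU in use, from 0.0 to 1.0


def may_run_grid_job(
    state: DesktopState,
    min_idle_seconds: float = 300.0,  # assumed threshold, not from the talk
    max_cpu_load: float = 0.2,        # assumed threshold, not from the talk
) -> bool:
    """Allow grid work only when the desktop owner is not using the machine."""
    return state.idle_seconds >= min_idle_seconds and state.cpu_load <= max_cpu_load


if __name__ == "__main__":
    print(may_run_grid_job(DesktopState(idle_seconds=600, cpu_load=0.05)))
```

A dedicated cluster node needs no such check, which is one reason cluster management systems are simpler to operate on dedicated resources.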
Access management: who needs (gets) access to what? • Cluster management systems • Option #1: jobs run in a guest account • Shared access across jobs • Option #2: accounts for everyone on all machines • Homogeneous uid pool highly recommended • Logins typically disabled • CPU-scavenging grids • Option #1: jobs run with user's privileges • If downloaded by user • Option #2: jobs run in guest account • If set up by administrator • No direct remote user access to desktop
Data management: how does data get to the apps? • Cluster management systems • Transfer user specified files via ftp, scp, etc. • File staging for large data • On demand file transfer (system call traps) • Shared file systems • CPU-scavenging grids • Data embedded within application or retrieved via HTTP/Java call-backs • Limited data, typically no files
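File staging can be sketched as copy in, run, copy out. The example below is a minimal local illustration: `run_with_staging` is a hypothetical helper, and a local copy stands in for the ftp or scp transfer a real cluster management system would perform.

```python
import shutil
import tempfile
from pathlib import Path


def run_with_staging(inputs, job, output_dir):
    """Stage input files into scratch space, run the job, stage results out.

    `job` is a callable taking (input_dir, result_dir); it is a stand-in
    for the application that would run on a grid node.
    """
    output_dir = Path(output_dir)
    output_dir.mkdir(parents=True, exist_ok=True)
    with tempfile.TemporaryDirectory() as scratch:
        in_dir = Path(scratch) / "in"
        out_dir = Path(scratch) / "out"
        in_dir.mkdir()
        out_dir.mkdir()
        for src in inputs:
            # Stage in: a local copy stands in for a remote transfer.
            shutil.copy(src, in_dir / Path(src).name)
        job(in_dir, out_dir)
        staged = []
        for result in sorted(out_dir.iterdir()):
            if result.is_file():
                # Stage out: copy results back before scratch is deleted.
                staged.append(Path(shutil.copy(result, output_dir / result.name)))
        return staged
```

The job only ever sees its scratch directory, which is the property that lets the same application run on whichever node the scheduler picks.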
Security model: user accountability is key today • [Figure: systems positioned by access management (capability control) against opportunities for subversion. Systems shown: distributed.net, SETI@Home, etc.; Globus; Java, PCCs; Condor; LSF, PBS, SGE; Unix; Ideal Grid. Other labels: Custom Applications; Source Code Modifications; Object Code Modifications; Basic system and kernel safeguards; Unchanged Binaries; Application Executable; Application Generation; Application Users; Run Time Environment]
Key takeaways • Today’s commercially available grid solutions primarily target high throughput computing • Cluster management systems and CPU-scavenging grids are the most common • Carefully consider the policy implications of grids in terms of access and data management • More of a concern for grids that span subnets or firewalls
Outline • Characterizing computing grids • Grids as intended versus what we see today • Common types of grids today • Putting computing grids to work • Types of problems addressed by today’s grids • Operational considerations in deploying a grid • A perspective on the future of IT infrastructure • Cost pressures and technology commoditization • Grid and utility computing: the technology enablers
Even as grids take hold, the IT landscape is changing rapidly… • Technology is rapidly being commoditized • Businesses are more willing and able to shop for IT services • In-house IT infrastructure is increasingly seen as complex and rigid © Harvard Business Review
IT infrastructure is already a commodity from a business view • Outsourcing is pervasive; and standards-based, open systems are increasingly common • Cost pressures will continue driving businesses to streamline IT infrastructure • More often than not, customized in-house IT systems stand out for their cost and complexity • Common off-the-shelf solutions provide more value in the absence of direct competitive advantage
In time, economics will drive IT infrastructure out of the enterprise • The technology enablers for this paradigm exist today, but are still nascent • (True) grids offer a way to manage computing resources across organizational boundaries • Utility computing solutions bring together grids, data center automation, and virtualization
The technology implications of these changes are enormous • Computing infrastructure needs to become transparent to end users • Users only interact with applications and data • Policy management needs to be decoupled from system management • Cannot assume users can be held accountable • Components of computing systems need to be less tightly coupled • CPU, OS, data, apps may all be in different, remote locations
A utility computing test bed at Purdue showcases this paradigm • Operating since 1995; now a joint development effort between Purdue and U of Florida • By 2001, allowed 3,000+ users from 30 countries to run ~100 applications in a utility environment • Extensively validated: ~400,000 runs (by 2001); highly peaked usage profile • Powers online simulations in the nanoHUB.org portal for the nanotechnology community
nanoHUB.org: remote access to simulators and compute power • Real users and real usage: >10,687 users • [Figure: nanoHUB infrastructure. Labels: Internet; nanoHUB.org Web site; Remote desktop (VNC); Physical Machine; Virtual Machine; Condor-G; Globus; TeraGrid Cluster; NMI Cluster] • Slide courtesy of Gerhard Klimeck, Network for Computational Nanotechnology
Inside nanoHUB.org • Custom computing environment assembled in real time • [Figure labels: Web Portal; Local Services; Utility Services; PUNCH; Virtual Machine; Application Repositories; OS Repositories; Data Vaults; CPU Farms]
In conclusion… • Today’s commercially available grids provide a valuable but narrow service • More efficient computing in a closed environment; limited support for cross-organizational sharing • In time, grid and utility computing technologies will move IT infrastructure out of the enterprise • Virtualization and data center automation products are visible precursors
Questions? Comments? Email: nhkapadia@gmail.com