DGMA 2008 GRID ACCULTURATION Zaharin Yusoff (Prof. Dr.) President, Multimedia University Eureka Building, USM, Penang 21st October 2008
Grid Acculturation Where to begin? • Parallel computing • Supercomputing • ….. • Distributed computing • High-performance computing • …. • Grid computing • Utility computing • Cloud computing • (inc. Virtualisation) • …….
Grid Acculturation – where to begin? http://linux.sys-con.com/node/587717.... By: Michael Seehan Jul. 25, 2008 10:15 AM We all know that the term “Cloud Computing” is relatively new to the Technology buzz. But just how new is it? For starters, I ran a quick comparison of “Cloud Computing,” “Grid Computing” and “Utility Computing”. The term Grid Computing has been around for a while (even before Google Trends tracking shows it). But as you can see from the graphic above, it is trending downwards. Utility Computing has pretty much remained below the radar in comparison. But, the newcomer Cloud Computing, which made its full entrance into this trend analysis around 2007, is rapidly gaining momentum. 2008 seems to be a pivotal time where it surpassed Grid Computing (and continues to grow).
Grid Acculturation – where to begin? http://hothardware.com/News/Cloud_Computing_The_Future_Takes_Nebulous_Shape/ Monday, June 30, 2008 – by Dave Altavilla …. In the final analysis, there's no question that Cloud Computing, Grid Computing, Utility Computing or whatever else you'd like to call it, is definitely the wave of the future for many applications and usage models. Granted, the average power user or enthusiast will likely still have a powerful desktop or notebook system for many years to come…. “The interesting thing about cloud computing is that we’ve redefined cloud computing to include everything that we already do. I can’t think of anything that isn’t cloud computing with all of these announcements. The computer industry is the only industry that is more fashion-driven than women’s fashion. Maybe I’m an idiot, but I have no idea what anyone is talking about. What is it? It’s complete gibberish. It’s insane. When is this idiocy going to stop? …” http://blogs.wsj.com/biztech/2008/09/25/larry-ellisons-brilliant-anti-cloud-computing-rant/ September 25, 2008, 7:53 pm – Larry Ellison’s Brilliant Anti-Cloud Computing Rant
Table of Contents • Introduction • Generalities (… naive…) • At the School of Computer Sciences, USM • Some Attempts at National Initiatives • Centre for Computational Sciences • 8th Malaysia Plan – 2001-2005 • 9th Malaysia Plan – 2006-2010 • Some Pertinent Questions • Some Input & Questions • Grid Acculturation
Some Terminologies http://en.wikipedia.org/wiki/Grid_computing Parallel computing is a form of computation in which many instructions are carried out simultaneously, operating on the principle that large problems can often be divided into smaller ones, which are then solved concurrently (‘in parallel’). There are several different forms of parallel computing: bit-level parallelism, instruction-level parallelism, data parallelism, and task parallelism ... ‘Distributed’ or ‘grid’ computing in general is a special type of parallel computing which relies on complete computers (with onboard CPU, storage, power supply, network interface, etc.) connected to a network (private, public or the Internet) by a conventional network interface, such as Ethernet. This is in contrast to the traditional notion of a supercomputer, which has many processors connected by a local high-speed computer bus … What distinguishes grid computing from typical cluster computing systems is that grids tend to be more loosely coupled, heterogeneous, and geographically dispersed. Also, while a computing grid may be dedicated to a specialized application, it is often constructed with the aid of general purpose grid software libraries and middleware.
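The principle quoted above (divide a large problem into smaller ones and solve them concurrently) can be illustrated with a minimal data-parallel sketch. The function names `split`, `sum_of_squares` and `parallel_sum_of_squares` are hypothetical and chosen only for illustration. A thread pool is used to keep the example self-contained; for genuinely CPU-bound work in Python one would more likely use a process pool or separate machines.

```python
from concurrent.futures import ThreadPoolExecutor


def split(data, parts):
    """Divide a list into `parts` contiguous chunks of near-equal size."""
    size, extra = divmod(len(data), parts)
    chunks, start = [], 0
    for i in range(parts):
        # The first `extra` chunks receive one additional element.
        end = start + size + (1 if i < extra else 0)
        chunks.append(data[start:end])
        start = end
    return chunks


def sum_of_squares(chunk):
    """The 'smaller problem': work done independently on one chunk."""
    return sum(x * x for x in chunk)


def parallel_sum_of_squares(data, workers=4):
    """Split the data, process the chunks concurrently, combine the results."""
    chunks = split(data, workers)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        partials = list(pool.map(sum_of_squares, chunks))
    return sum(partials)


if __name__ == "__main__":
    print(parallel_sum_of_squares(list(range(10)), workers=3))
```

The same split / compute / combine shape applies whether the workers are threads, processes on one machine, or complete computers on a network; what changes is the cost of moving the chunks and partial results around.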
Grid Computing – Historical Perspective The term Grid Computing originated in the early 1990s as a metaphor for making computer power as easy to access as an electric power grid in Ian Foster and Carl Kesselman's seminal work, "The Grid: Blueprint for a new computing infrastructure". In Malaysia, grid computing also evolved from parallel computing and distributed computing in the early nineties. Along with these came application domains that need high performance computing, such as computational sciences (e.g. crystallography) and bioinformatics …
GRID COMPUTING Using the resources of many computers in a network at the same time, to solve a single problem (http://www.netnw.net.uk/Jargon_Explained/jargon.htm) …. is a form of distributed computing whereby a "super and virtual computer" is composed of a cluster of networked, loosely-coupled computers, acting in concert to perform very large tasks. http://en.wikipedia.org/wiki/Grid_computing The technology has been applied to computationally-intensive scientific, mathematical, and academic problems through volunteer computing, and it is used in commercial enterprises for such diverse applications as drug discovery, economic forecasting, seismic analysis, and back-office data processing in support of e-commerce and web-services.
School of Computer Sciences, USM • 1993: Parallel & Distributed Processing Research Group • Parallel Computing (1992-1994) • Parallel constructs • Parallelisation of sequential programs • Distributed Computing (1993-1996) • Distributed databases • Distributed processing • Grid Computing (1997 ..) • Parallel numerical algorithms for message-passing architectures • Cost Effective High Performance PC Cluster with Virtual Shared Memory • Meta-computing Environment for Computational Sciences • e-Science Grid (back-end engine and grid infrastructure) • Knowledge Grid • Seeding Bioinformatics • …. • 1980s • (UM, UKM, UTM, USM, …) • Parallel computing • Computational Science • ……
APPLICATION Genomic & Proteomic DEVELOPMENT-BASED RESEARCH Molecular Docking for Drug Design Compute Intensive Grid PC Cluster Liquid Crystal Simulation CURRENT RESEARCH AREAS Peer-2-Peer Parallel Iterative & direct solvers - RKS, NMA Grid Intrusion & Detection System - AS Data Intensive Grid Automatic Parallelization of numerical methods Fast Cryptography Protocols - AS Mobile Agents Resource Allocation - CHY, FH & GCS Parallel & Distributed Processing * Parallelization, Dependency, Aliases - RKS, GCS Resource Monitoring - CHY, FH & GCS Distributed shared Memory (replication & consistency) - RKS & MAO Fault Tolerance (algorithm level & grid level) - RA, CHY Parallelising Access to Large Databases (matching, indexing and clustering) – NAR, RA, ZZ R&D&(C) Workshop 28-30 July 2003
Applications Replica Catalog e-Science Portal Replica Management Service File Transfer Job Manager Intrusion Detection Service Scheduler Specific plug-in Bill Directory Service E-Science Grid Framework User Mobile Agent Resources Account Manager Resource Usage Tracking Agent R&D&(C) Workshop 28-30 July 2003
Overall Architecture User Access Visualization Resource Monitoring) Iterative Solver Agent e-Sciences Grid Portal (Dynamic Information Services) Resource Allocation Processed Data Authorized Prediction Platform Type, Operating System, CPU, Memory, Network, File System, Job Status Resource Monitor Event Publication Information Dispatch Agent Directory Service Mobile Agents Facility Invitation/Correction Event Publication Information Resources
Current Scenario at USM MBBS CLUSTER Math Tsunami Modelling Group School of Computer Sciences, USM stealth OTHER RESEARCH IN USM Digital Content at School of Arts School of Computer Sciences, USM Aurora Geographical Information System at School of Humanities CEDEC CEDEC
Proposal for a National Centre for Computational Sciences
National Centre for Computational Sciences 2000: …. Quality Hotel • Inter-university & interdisciplinary Collaboration • About 80 researchers (UM, USM, UKM, UTM, …..) • Computer Science, Chemistry, Biology, … • …. • Ministry of Science • KSU • Science Advisor • ….
8th Malaysia Plan (ICT Sector) – 2001-2005
KB KB SERVICES SERVICES SERVICES KB KB PROCESSORS PROCESSORS PROCESSORS PROCESSORS RM8 – LAYERS DELIVERY INFRASTRUCTURE
ENHANCEMENT OF MSC FLAGSHIP APPLICATIONS SERVICE STANDARDS & PROTOCOLS SCIENCE, ENGINEERING & TECHNOLOGY (SET) SERVICES E-BUSINESS SMART EDUCATION E-HEALTHCARE • Knowledge & Data Acquisition • Computer-Assisted Education & Training • Teaching-Learning Materials • Business Process Engineering • Personalised Information on Education • Networking of Educators • Student-Educator Consultation • Personalised Lifetime Education Plan • Online Education and Open Learning RM8 – SERVICES (inc. Processors & KBs) ESTABLISHMENT OF SERVICES SET SERVICES • Architecture & Software Infrastructure • System Software & Tools • Applications • Business Engineering • Enablers for E-Business • Sophisticated Processors • Socio-Economic Studies • HCI Tools for E-Business • Wellness Maintenance • Healthcare Practitioner Portal • Healthcare Enterprise Modelling
9th Malaysia Plan (Grid Computing) – 2006-2010
Grid Computing Domains • APPLICATIONS • Compute-Intensive • Data Intensive • On-Demand • Collaborative • … • SERVICE MANAGEMENT • Managing Users & Applications • Managing lower level technical components • Utility grid management modules • …. • RESOURCE MONITORING • Detecting faults • Managing faults • …. • RESOURCE ALLOCATION • Service Aggregation • Resource Aggregation • Scheduler (load balancing) • … • GRID DATA WAREHOUSE • Grid Database • Data Replication • …. • TOOLS & ENABLERS • Tools for specific applications • Tools for grid construction • Data warehousing • Grid Algorithms • Mobile Agents & Software agents • Grid Protocols • … • GRID SECURITY • Component security • Data security • …. GRID INFOSTRUCTURE GRID INFRASTRUCTURE (incl. Networks)
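As one illustration of the "Scheduler (load balancing)" item under RESOURCE ALLOCATION, the sketch below assigns jobs to nodes greedily: the most expensive remaining job always goes to the currently least-loaded node. This is a generic textbook heuristic, not the scheduler used in the projects described in these slides. The function name `schedule`, the job names and the cost figures are all hypothetical.

```python
import heapq


def schedule(job_costs, nodes):
    """Greedy load balancing.

    job_costs: dict mapping job name -> estimated cost (e.g. CPU seconds).
    nodes: list of node names.
    Returns a dict mapping node name -> list of jobs assigned to it.
    """
    # Min-heap of (current load, node name); the least-loaded node is on top.
    heap = [(0, name) for name in nodes]
    heapq.heapify(heap)
    assignment = {name: [] for name in nodes}

    # Place the largest jobs first so small jobs can even out the loads later.
    for job, cost in sorted(job_costs.items(), key=lambda kv: kv[1], reverse=True):
        load, name = heapq.heappop(heap)
        assignment[name].append(job)
        heapq.heappush(heap, (load + cost, name))
    return assignment


if __name__ == "__main__":
    jobs = {"a": 5, "b": 3, "c": 2, "d": 2}
    print(schedule(jobs, ["n1", "n2"]))
```

A real grid scheduler would also have to account for heterogeneous node speeds, data location and faults, which is where the resource monitoring and resource aggregation items on this slide come in.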
SET Services compute intensive (bioinformatics, pharmaceuticals, …) Industry data intensive (financial, administrative, …) Middleware Engineering (e.g. aggregator) Knowledge Grid (e.g. e-Science) National Grid Utilities (c.f. TNB, Jaring/TMNet ) ….. Knowledge Dissemination Support …… Computational Resources Users Grid • Scientists • Engineers • Researchers Software Resources Human Data Hardware e.g. High Performance Cluster and etc. Data Source Molecular Database Health Care Information etc Grid Computing Architecture
Satellite Satellite Satellite Satellite Satellite Satellite Center Center Center Center Center Center USM UKM UPM UTM UM Grid Computing Architecture Individual User Company Government Commercialisation or Service to Public A Cluster Centre is inevitable Main Center for Grid Computing Research Cluster others MIMOS
Grid Computing Projects Technology Development
• GRID SECURITY • Intrusion & Prevention Detection (2005-2006) • Fast Cryptography (2006-2007) • Data Security (2006-2007) • Identification & Authentication (2005-2006) • Authorisation & Policy (2006-2007)
• SERVICE MANAGEMENT • Tokens & Metering (2008) • Negotiator (Agent, AI, etc.) (2008-2009) • Search & Optimization Data Set (2007) • Service Resource Discovery/Retrieval (2007-2008) • Service Resource Management (2005-2006) • Accounting/Billing/Service Level Agreement (2005-2006) • Generic Gateway (Portal) (2006-2007) • Grid Human Computer Interface (2008-2009) • Policy Service Management (2008-2009) • Provisioning (license management, etc.) (2008-2009)
• GRID INFOSTRUCTURE • National Data Centres (2005) • Grid Database (2006-2007) • Data Replication (2006-2007) • Grid Storage (2006-2007) • Transaction Management (2007-2008) • Dist. Backup & Recovery (2005-2006) • Parallel Access to Databases (2007-2008) • Content-Based Info Retrieval (2007-2008) • Parallel Data Mining (2008-2009) • Knowledge Engineering (2008-2009)
• RESOURCE ALLOCATION • Resource Aggregation (2006-2007) • Service Aggregation (2007-2008) • Scheduler, Meta-Scheduler, Load Balancer (2006-2007) • Resource Reservation (2007-2008) • Trader/Broker (2007-2008)
• TOOLS & ENABLERS • Distributed Shared Memory (2006-2007) • Parallel Dependencies, Aliases (2006-2007) • Parallel Iterative & Direct Solvers (2005-2006) • Mobile Agents & Software agents (2006-2007) • Grid Protocols (2009-2010) • Interconnection of Clusters (2006-2007) • Algorithm Analysis (2006-2007) • Search Algorithms (Drug Design) (2006-2007) • Grid S’ware Dev Lib (Numerical, Graphics) (2006-2007) • Connectivity/Comms (Master/Slave, P2P) (2005-2006) • Cluster Node Management (2006-2007) • Grid Simulator (2005-2006)
• RESOURCE MONITORING • Fault Management/Tolerance (2007-2008) • Discovery Protocol (2008-2009) • Grid Monitoring Kit (2006-2007) • Grid Sensor (2005-2006)
• GRID INFRASTRUCTURE (incl. Networks) • High Speed Grid (MYREN, IPv6) (2005) • Mobile/Wireless Grid (2006-2007)
Generic Grid Portal Middleware Engineering Toolkits National Grid Utility SET / Industry Applications • COLLABORATIVE MULTIPLE INTELLIGENT SCHEDULER • NON-DOMAIN SPECIFIC INTELLIGENT NEGOTIATOR 2010 2010 • OPTIMIZED SERVICE MATCHER WITH CONTENT AWARENESS • ADVANCED DISTRIBUTED KNOWLEDGE BASE • Global Grid • SCHEDULER WITH MULTI-DIMENSION PREDICTION • DOMAIN ADMINISTRATOR • Knowledge Grid • AGGREGATED MACHINE’S BEHAVIOR PREDICTOR • KNOWLEDGE FILTER 2009 2009 • National Grid • HIGH TRANSPARENCY DISTRIBUTED DATABASE • CLUSTER MACHINE USAGE CLASSIFIER • SERVICE MATCHER • INTELLIGENT COORDINATOR AND COLLABORATOR • SINGLE MACHINE’S BEHAVIOR PREDICTOR • Bioinformatics Grid • SCHEDULER WITH SINGLE DIMENSION PREDICTION • MULTI-PLATFORM ONTOLOGIES AND DESCRIPTOR 2008 2008 • HETEROGENEOUS DISTRIBUTED DATABASE • SERVICE BROWSER • COMPUTE POWER MARKET ii • Financial Grid • MULTI-MACHINE USAGE CLASSIFIER • POLICY ENFORCER • GRID VISUALISER • HETEROGENEOUS MULTI-ALGORITHM SCHEDULER • ADVANCED INTRUSION DETECTOR • DOMAIN SPECIFIC MULTI-ISSUE INTELLIGENT NEGOTIATOR 2007 • MULTIPLE STEP AHEAD PREDICTOR 2007 • HETEROGENEOUS MONITORING SYSTEM • GRID SERVICE TEMPLATE • FAST CRYPTOGRAPHY • MULTI-CRITERIA SCHEDULER • DATASET FILTER & OPTIMISER • SINGLE MACHINE USAGE CLASSIFIER • Campus Grid • TOKEN MANAGER • ONE STEP AHEAD PREDICTOR • CRYPTOGRAPHY 2006 2006 • HOMOGENEOUS MONITORING SYSTEM • COMMUNICATION PROTOCOL • HIGH THROUGHPUT SCHEDULER • INTRUSION DETECTOR • DOMAIN SPECIFIC NEGOTIATOR • COMPUTATIONAL ECONOMY SCHEDULER Roadmap – Technology Development
Grid Computing Projects Applications Development • Life Science Grid • Bioinformatics • Biotechnology • Medical Grid (e.g. Virtual Anatomy, Virtual Surgery) • Pharmaceuticals (e.g. Genetically Modified Gamat / Tongkat Ali) • Agriculture Grid • Environment • Computational Science Grid • Physics (e.g. Nuclear Applications) • Biology • Chemistry (e.g. Liquid Crystals, Molecular Dynamics) • Mathematics (e.g. Modeling) • Computational Engineering Grid • Volumetric Rendering • Social Science Grid • Culture, Heritage & Civilisation Grid • Commercial Grid • Financial (e.g. Forecasting, Banking) • Multimedia • Oil & Gas • Education • E-Learning • Language • Disaster Mitigation • TARGETS • Clustering • Campus Grid • National Grid • Global Grid • Grid Services Provider • National Grid Utility
Main Points Grid computing is much more than the deployment of hardware and software resulting in a higher performing network. It also includes a culture of sharing, of content as well as computational resources. • Another point to look at is whether or not we are asking the appropriate questions of the domain. The goals should not only be of the operational type (such as efficiency and performance), but also of the functional type. Can there be: • a universal grid operating system, • a grid computing language, • formal criteria for usability, and • grid computing as a utility? Such questions (or goals) would not only lead to the corresponding R&D aspirations but will also open up discussions on very pertinent issues that need to be resolved before any implementation.
Some Input (1/2) • There are a number of R&D areas and questions asked in grid computing: • Traditionally, many researchers conduct investigation on resource management, namely on the issues of scalability, heterogeneity, efficiency, availability and transparency. • For high availability and adaptability, IBM would term these as autonomic computing (for self-healing or for auto-configuration). • For transparency, the term cloud computing is used when viewing resources as services. • Grid is viewed as a body or brawn, while an agent is viewed as a brain. Researchers attempt to meet the brain with the brawn, and many are talking about multi-agent systems on the grid.
Some Input (2/2) • Grid computing can be viewed as a super virtual computer, and researchers further explore virtualisation techniques such as VMware, Virtual Organisation Management, etc. • In terms of application domains: • Grid combines with pervasive computing to integrate sensor networks, mobile devices, etc. • Many grid researchers collaborate with application domain experts to jointly develop grid applications such as data grid, computational grid, medical grid, e-science grid, eco-grid, rendering farm, financial grid, etc.
SOME PERTINENT QUESTIONS (1/2) Some questions asked many years ago are still valid…: There should be a universal grid 'Operating System' that makes the underlying infrastructure completely transparent – such an infrastructure should be heterogeneous in nature, in terms of hardware and OS, and independent of geographical and logical domains….. (c.f. Globus..)… There should be a 'grid computing language' that rides on the said OS with all the appropriate data structures and programme constructs – such a language should be independent of the infrastructure beneath the OS, but its compiler/interpreter should be intelligent enough to take full advantage of the configurations available.
SOME PERTINENT QUESTIONS (2/2) Some questions asked many years ago are still valid…: …. There should be clear and formal measures/criteria to determine whether a given problem/application is best implemented in a grid environment or otherwise ... (c.f. coarse/fine grain size/ granularity, total cost of ownership, …. But why not something simpler? e.g. 3-phase power..) Grid computing should be made a public utility (as in electricity, water, etc.) -- and with this should come the means for provision of its services, metering and payment like any other public utility.
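The slide asks for a formal criterion and points to grain size as one candidate. A deliberately simple model of that idea is sketched below. It is an illustrative assumption, not a criterion proposed in the talk: each task costs `compute_seconds` of useful work plus `overhead_seconds` of communication and scheduling when run remotely, and tasks are spread evenly over `nodes`. All names and numbers are hypothetical.

```python
import math


def granularity(compute_seconds, overhead_seconds):
    """Ratio of useful computation to per-task overhead (higher = coarser grain)."""
    return compute_seconds / overhead_seconds


def estimated_speedup(tasks, compute_seconds, overhead_seconds, nodes):
    """Estimated speedup of running `tasks` equal tasks on `nodes` machines.

    Serial time counts only computation; parallel time adds the per-task
    overhead and assumes the tasks run in ceil(tasks / nodes) rounds.
    """
    serial_time = tasks * compute_seconds
    rounds = math.ceil(tasks / nodes)
    parallel_time = rounds * (compute_seconds + overhead_seconds)
    return serial_time / parallel_time


if __name__ == "__main__":
    # Coarse-grained: 10 s of work per 1 s of overhead.
    print(estimated_speedup(100, 10.0, 1.0, 10))
    # Fine-grained: 0.1 s of work per 1 s of overhead.
    print(estimated_speedup(100, 0.1, 1.0, 10))
```

Under this model a coarse-grained workload gets close to a tenfold speedup on ten nodes, while the fine-grained one runs slower than on a single machine. A usable criterion would also need the cost side (total cost of ownership) that the slide mentions.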
Attempts at National Initiatives There have been many attempts at making grid computing a national initiative, where some failed while some succeeded to a certain extent but have arguably not met the original goals and intentions. • Perhaps one of the reasons for this limited success is the lack of understanding of the different roles of the players within and those related to the domain …. • Grid Acculturation
Grid Acculturation – Acculturate Who? • OVERALL • Speak the same language • Win-win-win situation • Researchers (& Students) • Fundamentals (incl. abstraction, …) • Synergy (related domains, critical mass, specialise, …) • Incrementality (core, processors, .. , applications, …) • Heterogeneity (multiple platforms, applications, …) • Educate (others & themselves … security, support, …) • Industry • Less confusion (tone down the hype, …) • Longer term perspectives (patience wins, ..) • Fundamentals (e.g. platform dependence kills, …) • Business model • Decision Makers (Government) • No Big Bang theory (need to make informed decisions, ..) • Technology is not cheap (but no need for Father Christmas, ..) • We do not have to be technology consumers
TERIMA KASIH ARIGATO SHUKRIYA XIE-XIE NI KAMSIAH / MMKOI JABAIINAU NGGO BUTE KABU KOP KUN KAH THANK YOU MERCI GRAZZIE GRACIAS SPASIBA DANKE MANGE TAK NAN DHRI
Computer Networks • APPLICATIONS • Multimedia Conferencing • Distributed Systems • (e.g. Digital Libraries) • … • SERVICE MANAGEMENT • Tokens & Charging • Negotiator • …. • COMMUNICATION MANAGEMENT • Resource Aggregation • Services Aggregation • … • NETWORK MONITORING • Intelligent Network Monitoring • Fault Tolerance • Down-time Management • …. • NETWORK SECURITY • Intrusion Detection • Prevention Systems • Cryptography • …. • TOOLS & ENABLERS • Network Operating Systems • Compression/Decompression • Streaming • .. • COMMUNICATION PROTOCOLS • Wired Protocols (e.g. Fast Ethernet) • Wireless Protocols (e.g. Satellite) • Emerging Protocols (IPV6) • ..
SPECIFIC APPLICATION • ORIENTED • Secure game-play • e-Voting • … SECURITY CONFIDENTIALITY TRUST ABUSE ANALYSIS • Digital Signature • Public key • infrastructure • …. • Man-in-the-Middle (MIM) • DoS/DDoS • Virus/Worm, Spam • Drone Armies • Enterprise level security • Agent-Server Security • Radius/Kerberos • Honeypot/Honeynet • Forensics • Enterprise Audit • Enterprise PenTest Enterprise • Appl. Forensics • Appl. Audit • Appl. Pentest • Database security • Web-based Application Security • SSL, SSH • Buffer Overflow • Format String • Client-side (XST,XSS) • SQL Injection • Phishing • Biometrics • Smart Card • One time password Applications • Cryptography • (inc. encryption, braid) • steganography • Parallelising • crypto operations • Video/Image security • Packet Spoofing • Cryptanalysis • Brute Force • ISN Predictions • Cache Poisoning • Data Forensics • Log/Alert Analysis • False Positive Reduction • Authentication • Non-repudiation • Integrity • Tripwire Data OS (incl. Drivers & Registries, H/W Interfaces) • OS Forensics • OS PenTest • Intrusion Detection • Rootkit • Trojan Horse • OS Fingerprinting • Sniffing • Hijacking • Re-routing • Network security • Mobile IPv6 security • Tunneling • …. • IPSec • VPN • Firewall • Intrusion Prevention • Trusted OS PROTECTION Physical Network
BIOINFORMATICS Wet Lab Experimentation (DNA/Genome Sequencing) • SEQUENCE ANALYSIS • Sequence search • Verification • Cleansing • ‘Parsing • Classification • …. DNA / Genome String of Nucleic Acids (A,T,C,G) Amino Acids (V,S,W, .. – 20) • LITERATURE SEARCH • Meaning-based • Literature • Manager • …. Proteins / Peptides Junk DNA / UNKNOWN GENES (NEW !!) • STRUCTURAL ANALYSIS • Modelling • Visualisation • Matching • Comparisons • Simulation • Transformation/ • Mutation • …. • 1&2D3D • TRANSFn • …. • …. Protein/Peptide Database Junk / New !! Database Virtual Experimentations Virtual Experimentations Riding on GRID Protein-Based Applications DNA-Based Applications Dissemination
Input from MMU There is a lot of work on applications for the GRID -- medical, education, etc. Here in MMU, Dr. Ho Sin Ban and his ROs are looking into some of that. Nithiapidary is looking into increasing the efficiency of programs that have many small-scale jobs by grouping them together. Nathar Shah is beginning a study on how to make writing GRID programs less problematic by using Aspect-Oriented programming. There is also research at other universities into making GRID easier to set up -- making installers, security issues, how to promote participation and prevent cheating, etc. Sin Ban, Nithia, and Nathar, do you have anything to add beyond what I wrote above?
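The job-grouping idea mentioned above (bundling many small jobs so that each submission to the grid carries enough work to justify its overhead) can be sketched as follows. This is a generic batching heuristic written for illustration, not the method under study at MMU. The function name `group_jobs`, the job identifiers and the runtimes are hypothetical.

```python
def group_jobs(job_seconds, target_seconds):
    """Group small jobs into batches of roughly `target_seconds` each.

    job_seconds: list of (job_id, estimated_runtime_seconds) pairs, in
    submission order.
    Returns a list of batches, each a list of job ids. A batch is closed
    as soon as adding the next job would exceed the target, so a single
    job longer than the target still gets a batch of its own.
    """
    batches, current, current_total = [], [], 0.0
    for job_id, seconds in job_seconds:
        if current and current_total + seconds > target_seconds:
            batches.append(current)
            current, current_total = [], 0.0
        current.append(job_id)
        current_total += seconds
    if current:
        batches.append(current)
    return batches


if __name__ == "__main__":
    jobs = [("j1", 2), ("j2", 3), ("j3", 4), ("j4", 1), ("j5", 6)]
    print(group_jobs(jobs, target_seconds=6))
```

With five jobs grouped into three batches, the fixed per-submission cost (queueing, staging, authentication) is paid three times instead of five. The right target size depends on how large that fixed cost is relative to the job runtimes.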