GridCC: Real-time Instrumentations Grids. A real-time interactive Grid to integrate instruments, computational and information resources widely spread on a fast WAN. Francesco Lelli, Istituto Nazionale di Fisica Nucleare, Laboratori Nazionali di Legnaro, Legnaro, Italy.
Overview • The GridCC Project: Introduction • Bringing Instruments into the Grid: the Instrument Element • The GridCC Test-bed: Pilot applications • Instrument Instrumentation • Fast Instrument Communication Channel • Standard Grid Interaction • Performance analysis of the current implementation
General on the GridCC Project • It is a 3-year project, started on 1 September 2004 • Funded by the EU under Framework Programme 6 • 10 partners from 3 EU countries (+ Israel) • About 40 people engaged • www.gridcc.org
The Grid Technologies: extending the limits of a single computer (center) [Diagram labels: User Interface, Grid Gateway, Grid Technologies, Computing Element (x3), Storage Element]
Extending the Grid Concepts • Terrestrial probes to monitor the volcano activity • Satellite views to monitor the volcano • Model calculations and disaster predictions • Control and Monitor Room [Diagram labels: Grid Gateway, Grid Technologies]
The GridCC Project Data for Model Calculations Predictions + Instruments Grid Computational Grid GridCC Project
Instrument Element: global scenario Instrument Element Instrument Element Instrument Element Virtual Control Room Virtual Control Room Computing Element Computing Element Computing Element Storage Elements Storage Elements Storage Element Web Service Interface Exec. Service WfMS WMS User direct Action AgrS Indirect Action Existing Grid Infrastructures
The GridCC Architecture
• Virtual Control Room (VCR): all end user access is via the VCR
• Collaborative Services (CS): users are generally not working alone
• Instrument Elements (IE): the IE is a virtualization of the real physical instrument; of course there may be many IEs
• Compute Elements (CE) and Storage Elements (SE), with advanced reservation: of course there may be many CEs and SEs
• Direct access to the IE, SE (and CE) is possible but often not desirable
• Information and Monitoring Services (IMS): a "fast", all-pervasive messaging system
• Information System (IS): slowly updating information
• Execution Services: more complex workflows are allowed, including advanced reservation and QoS guarantees
• Global Problem Solver: watches (via the IMS) for problems anywhere in the system and acts to resolve them
• Security Services: security is essential to the success of the project
IE Requirements • 1: Provide uniform access to the physical device • 2: Allow standard Grid access to the instruments • 3: Allow cooperation between different instruments that belong to different VOs [Diagram labels: Storage Element, Computing Element, Web Services, Instrument Element, Grid, Sensor Network, Instrument, any protocol or physical connection]
Instrument Element: a Black Box
• Quick answers to the previous slide:
• The VIGS provides a uniform instrumentation interface to the instruments
• The fast communication channel disseminates the acquired information between instruments
• The Data Mover provides a standard Grid interface so that other Grid components, such as the SE and the CE, can access the IE
• The term Instrument Element describes a set of services that provide the interface and implementation needed for the remote control and monitoring of physical instruments.
[Diagram labels: IE, Instrument Grid Interaction, Fast communication channel, Data Mover, VIGS, Instrumentation, Instruments]
IE Key Developers: E. Frizziero (1), M. Gulmini (1,3), F. Lelli (1,2), G. Maron (1), A. Oh (3), A. Petrucci (1), S. Squizzato (1), S. Traldi (1)
(1) Istituto Nazionale di Fisica Nucleare, Laboratori Nazionali di Legnaro (2) Dipartimento di Informatica, Università Ca’ Foscari di Venezia (3) CERN, European Organization for Nuclear Research
Device Virtualization Model
• Parameters hold configuration information
• Attributes hold instrument variables
• The Control Model holds actions
• An XML-based language allows the device to describe itself
• Example, Voltmeter: Parameters: maximum voltage, minimum voltage; Attributes: measured voltage; Commands: perform a measure
[Diagram labels: Instrument, Parameters, Attributes, Control Model, XML Based Language]
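The voltmeter example can be sketched in a few lines of Python. This is only an illustration of the parameters / attributes / commands split, not the GridCC XML description language: the class `VirtualInstrument`, the method `execute_command`, the key names, the placeholder reading of 7.5 and the clamping to the configured range are all assumptions made for this sketch.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict


@dataclass
class VirtualInstrument:
    """Hypothetical in-memory model of a virtualized device."""
    name: str
    # Parameters hold configuration information.
    parameters: Dict[str, float] = field(default_factory=dict)
    # Attributes hold instrument variables.
    attributes: Dict[str, float] = field(default_factory=dict)
    # The control model maps command names to actions.
    commands: Dict[str, Callable[["VirtualInstrument"], None]] = field(default_factory=dict)

    def execute_command(self, command: str) -> None:
        if command not in self.commands:
            raise KeyError(f"unknown command: {command}")
        self.commands[command](self)


def perform_measure(device: VirtualInstrument) -> None:
    raw_reading = 7.5  # placeholder value standing in for a real hardware read
    low = device.parameters["min_voltage"]
    high = device.parameters["max_voltage"]
    # Assumption: readings are clamped to the configured range.
    device.attributes["measured_voltage"] = min(max(raw_reading, low), high)


voltmeter = VirtualInstrument(
    name="Voltmeter",
    parameters={"min_voltage": 0.0, "max_voltage": 5.0},
    commands={"perform_measure": perform_measure},
)
voltmeter.execute_command("perform_measure")
print(voltmeter.attributes)
```

Keeping commands in a dictionary lets a client list the available commands before executing one, in the spirit of the getCommands / executeCommand pair.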
Instrument Instrumentation
• Crucial non-functional requirements:
• The number of instruments could be of the order of 10^6
• Only authorized people should have access to the instruments of a VO
• Instrumentation is not a batch process like a job submission: interactivity is mandatory
• A distributed and hierarchical implementation is mandatory
• The security overhead should be negligible
• VIGS methods: lockInstruments, unlokInstruments, retrieveLoked, getContexts, getInstrumentManagers, getInfo, getIstance, get/Set Parameters, getCommands, executeCommand, getState, getStateMachine, getRemoteExecutionTime, getOneWayCost, getTotalMethodExecutionTime
• We can divide the instrumentation into 3 main parts:
• Direct access to the instruments
• Advance instrument reservation, through interaction with the Agreement Service (AS), in order to achieve (hard) guarantees
• The possibility to predict the execution time of the instrumentation methods under concurrent access (soft guarantees)
Instrumentation method documentation: http://sadgw.lnl.infn.it:2002/IEFacade
Instrument Element Architecture Access Control Manager Data Flow State Flow Error Flow Monitor Flow Control Flow Virtual Instrument Grid Service (VIGS) • The term Instrument Element describes a set of services that provide the needed interface and implementation that enables the remote control and monitoring of physical instruments. Instrument Element create() Inf & Mon Service Resource Service Problem Solver destroy() execute() Data Mover getState() Instrument Manager IMS Proxy Control Manager Data Collector Control Manager Event Processor FSM Engine Input Manager Resource Proxy Real Instruments
Instrument Element Implementations • The IE components are typically implemented on fully equipped machines (e.g. dual-core CPUs, large memory, large disks). This is true for the RS, IMS and PS. For the IM (and DM) there are 2 possibilities, depending on the application type: • IM implemented on a fully equipped machine • IM embedded into the instrument to be controlled [Diagram labels: Instrument Element, Inf & Mon Service, Resource Service, Problem Solver, Access Control Manager, Data Mover, Instrument Manager, IMS, RS, IM (x4), Embedded Web Service]
Instrument Manager
• The IM is composed of 3 main components:
• Control Manager, made of:
  - Input Manager. It handles all the input events of the IM. These include commands from GUIs or other IMs, and error/state/log/monitor messages.
  - Event Processor. It handles all the incoming messages and decides where to send them. It has processing capability.
  - FSM Engine. A finite state machine is implemented.
  - Resource Proxy. It handles all the outgoing connections with the resources, through customizable plug-in modules that interface to the instruments.
• Data Collector. It gets data from the controlled instruments and makes them available to the Data Mover. Local storage of the data is also foreseen.
• IMS Proxy. It receives error/state/log/monitor information from the controlled resources and forwards it to the IMS.
[Diagram labels: Instruments, Control Flow, Data Flow, Monitor Flow, State Flow, Error Flow]
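As a rough illustration of the FSM inside the Control Manager, the sketch below rejects commands that are not allowed in the current state. The state names ("halted", "configured", "running"), the command names and the class `FSMEngine` are hypothetical; the slides do not list the actual state machine.

```python
class FSMEngine:
    """Hypothetical finite state machine driven by incoming commands."""

    def __init__(self, initial_state, transitions):
        self.state = initial_state
        # transitions maps (current_state, command) -> next_state
        self._transitions = dict(transitions)

    def handle(self, command):
        key = (self.state, command)
        if key not in self._transitions:
            raise ValueError(
                f"command {command!r} not allowed in state {self.state!r}"
            )
        self.state = self._transitions[key]
        return self.state


# Assumed transition table, for illustration only.
TRANSITIONS = {
    ("halted", "configure"): "configured",
    ("configured", "start"): "running",
    ("running", "stop"): "configured",
    ("configured", "halt"): "halted",
}

fsm = FSMEngine("halted", TRANSITIONS)
fsm.handle("configure")
fsm.handle("start")
print(fsm.state)
```

A table-driven design keeps the allowed transitions as data, so a client could in principle retrieve the state machine and the current state separately, as getStateMachine and getState suggest.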
Resource Service Architecture
• The Resource Service (RS) handles all the resources of an IE and manages their partitions (if any).
• A resource can be any hardware or software component involved in the IE (instruments, Instrument Managers, IMS components).
• The RS stores the configuration data of the resources and downloads it to the target resource when necessary.
• Resources can be discovered, allocated and queried.
• It is the responsibility of the RS to check resource availability and contention with other active partitions when a resource is allocated for use.
• A periodic scan of the registered resources keeps the configuration database up to date.
• The RS is interfaced to the WMS.
[Diagram labels: Discovery Manager (discovery methods, available resources), Subscribe Manager, Partition & Lock Manager (partition and lock setting methods, partition definitions), Configuration Manager (configuration setting methods, configuration definitions), partition/configuration retrieve methods, RS databases]
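The availability and contention check performed at allocation time can be sketched as follows. This is a minimal in-memory illustration, assuming a single owner per resource; the class name `ResourceService`, its methods and the resource names are invented for the example and are not the RS interface.

```python
class ResourceService:
    """Hypothetical sketch: tracks which partition holds each resource."""

    def __init__(self, resource_names):
        # None means the resource is free.
        self._owner = {name: None for name in resource_names}

    def discover(self):
        """Return the names of the currently free resources."""
        return sorted(n for n, owner in self._owner.items() if owner is None)

    def allocate(self, partition, names):
        """Allocate all requested resources to a partition, or none of them."""
        unknown = [n for n in names if n not in self._owner]
        if unknown:
            raise KeyError(f"unknown resources: {unknown}")
        busy = [n for n in names if self._owner[n] not in (None, partition)]
        if busy:
            raise RuntimeError(f"resources held by another partition: {busy}")
        for n in names:
            self._owner[n] = partition

    def release(self, partition):
        """Free every resource held by the given partition."""
        for n, owner in self._owner.items():
            if owner == partition:
                self._owner[n] = None


rs = ResourceService(["im-cam", "im-sensor", "ims"])
rs.allocate("partition-A", ["im-cam"])
print(rs.discover())
```

Checking every requested resource before assigning any of them avoids leaving a partition half-allocated when one resource is contended.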
Information and Monitor System (IMS)
• The Information and Monitor Service (IMS) collects messages and monitor data coming from Grid resources and supporting services and stores them in a database. There are several types of messages collected from the sub-systems. The messages are catalogued according to their type, severity level and timestamp. Data can be provided in numeric formats, histograms, tables and other forms.
• The IMS collects and organizes the incoming information in a database and publishes it to subscribers. These subscribers can register for specific messages categorized by a number of selection criteria, such as timestamp, information source and severity level.
[Diagram labels: publishers (instrument nodes: Instruments and Instrument Managers) send Errors, Log info, Monitor and State messages to the subscribers]
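The store-then-publish behaviour with selection criteria might look like the sketch below. It assumes a plain list as the "database" and synchronous callbacks; the names `Message`, `InformationAndMonitorService`, the field names and the numeric severity scale are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Message:
    source: str       # information source, e.g. an Instrument Manager name
    kind: str         # "error", "log", "monitor" or "state"
    severity: int     # assumed scale: higher means more severe
    timestamp: float
    text: str


class InformationAndMonitorService:
    """Hypothetical sketch: stores every message, then notifies matching subscribers."""

    def __init__(self):
        self.database = []
        self._subscriptions = []

    def subscribe(self, callback, *, kind=None, source=None, min_severity=0):
        self._subscriptions.append((kind, source, min_severity, callback))

    def publish(self, message):
        self.database.append(message)
        for kind, source, min_severity, callback in self._subscriptions:
            if kind is not None and message.kind != kind:
                continue
            if source is not None and message.source != source:
                continue
            if message.severity < min_severity:
                continue
            callback(message)


ims = InformationAndMonitorService()
alarms = []
ims.subscribe(alarms.append, kind="error", min_severity=3)
ims.publish(Message("im-cam", "log", 1, 100.0, "frame grabbed"))
ims.publish(Message("im-sensor", "error", 4, 101.0, "temperature out of range"))
print(len(ims.database), len(alarms))
```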
Problem Solver
• Step 1: The Control Manager can perform an autonomous recovery action when determining that action is not too costly.
• Step 2: Persistent information can be analyzed in order to extract knowledge.
• Step 3: On-line information can be analyzed in order to detect possible malfunctions.
• Algorithms under evaluation: rule induction, trees, functions, lazy, clusters and associative.
[Diagram labels: Instrument Managers (each with a Control Manager and an IMS Proxy), Pub/Sub, Problem Solver, On-line Analysis, DB, Data Mining Tools, State Flow, Error Flow, Monitor Flow]
Providing QoS over Web Services
Performing a remote method invocation in a given amount of time.
[Diagram: timestamps t0 to t8 mark, in order, the phases of a call across client side, network and service side: serialization, transmission, deserialization, operation execution, then serialization, transmission, deserialization and processing of the reply]
• Crucial times are:
• t3 - t0: One Way Cost
• t4 - t0: Remote Execution Cost
• t7 - t0: Total Method Execution Cost
• Avg = f(Cpu, Inputsize, Outputsize, Algorithm, Key-Factor, Net)
• SDev = F(Cpu, Inputsize, Outputsize, Algorithm, Key-Factor, Net)
• Cpu = machine HD + machine load (client and server side)
• Algorithm = method semantics
• Net = bandwidth + RTT
• Key-Factor = input value that changes the method semantics
• Inputsize, Outputsize = effective type and dimension
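The three crucial times can be computed directly from the timestamps. The sketch below takes the differences exactly as listed on the slide and then estimates Avg and SDev empirically over a few calls; it does not attempt to model the functions f and F. The function name `crucial_times` and the sample timestamps (arbitrary integer time units) are made up for the illustration.

```python
from statistics import mean, stdev


def crucial_times(t):
    """t is a sequence of timestamps where t[i] corresponds to t_i on the slide."""
    if len(t) < 8:
        raise ValueError("timestamps t0..t7 are required")
    return {
        "one_way_cost": t[3] - t[0],
        "remote_execution_cost": t[4] - t[0],
        "total_method_execution_cost": t[7] - t[0],
    }


# Invented timestamps for three invocations of the same method.
samples = [
    [0, 2, 5, 6, 16, 18, 21, 22],
    [0, 2, 5, 6, 18, 20, 23, 24],
    [0, 2, 5, 6, 20, 22, 25, 26],
]

totals = [crucial_times(s)["total_method_execution_cost"] for s in samples]
avg_total = mean(totals)
sdev_total = stdev(totals)  # sample standard deviation
print(crucial_times(samples[0]), avg_total, sdev_total)
```

Comparing timestamps taken on the client with timestamps taken on the service presupposes synchronized clocks; the slides do not say how this is handled.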
Virtualization of Real devices • Each IM represents the virtualization of a device [Diagram labels: Web Cam (Position, Video Streaming), Sensor (Temperature, Max Value, min Value), linked / Unlinked; IE with create(), destroy(), execute(), getState(); IM Cam, IM Sensor, Data Mover, Inf & Mon Service, Resource Service; Data for Model Calculations, Predictions]
Virtualization of Real devices (I) • Each IM represents the virtualization of a device [Diagram labels: Web Cam (Position, Video Streaming), Sensor (Temperature, Max Value, min Value), linked / Unlinked; IE with create(), destroy(), execute(), getState(); IM Cam, IM Sensor, IM Master Controller, Data Mover, Inf & Mon Service, Resource Service; Data for Model Calculations, Predictions]
Virtualization of Real devices (II) • Each instrument is virtualized, and a third IE uses the other IEs in order to accomplish a complex functionality [Diagram labels: Web Cam (Position, Video Streaming), Sensor (Temperature, Max Value, min Value), linked / Unlinked; IE Cam (IM Cam, Data Mover, RS, IMS); IE Sensor (IM Sensor, Data Mover, RS, IMS); IE Master (IM Master Controller, Data Mover, RS, IMS); Data for Model Calculations, Predictions]
Virtualization of Real devices (III) Web Cam Position Max Value min Value Video Streaming Temperature linked Unlinked linked Unlinked IE create() Sensor Proxy Cam Proxy IM Master Controller destroy() Data for Model Calculations execute() Data Mover Inf & Mon Service Resource Service getState() Predictions
Message Oriented Middleware
• Subscribers subscribe to a given Topic/Queue with a subscribe condition
• Publishers publish messages asynchronously to a given Topic/Queue with a given message condition
• Publishers and subscribers can be part of the same program or run on machines distributed over a WAN
• JMS provides a standard API for this communication system
• Many commercial and academic implementations of this API exist in both C/C++ and Java (NaradaBrokering, Sun, IBM, SonicMQ, etc.)
In our case:
• Each instrument can be a data publisher or a data consumer
• For more demanding applications an instrument must send/receive data in a streaming way
[Diagram labels: Topic A, Topic B, Queue Q]
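The difference between a Topic and a Queue can be shown with a small in-process sketch: a topic fans each message out to every subscriber whose condition matches, while a queue hands each message to exactly one consumer. This is not the JMS API; the classes `Topic` and `PointToPointQueue`, their methods, the round-robin choice of consumer and the synchronous delivery are all simplifying assumptions (real middleware delivers asynchronously).

```python
class Topic:
    """Publish/subscribe: every matching subscriber receives the message."""

    def __init__(self):
        self._subscribers = []

    def subscribe(self, callback, condition=lambda message: True):
        self._subscribers.append((condition, callback))

    def publish(self, message):
        for condition, callback in self._subscribers:
            if condition(message):
                callback(message)


class PointToPointQueue:
    """Point-to-point: each message goes to exactly one consumer."""

    def __init__(self):
        self._consumers = []
        self._next = 0

    def add_consumer(self, callback):
        self._consumers.append(callback)

    def send(self, message):
        if not self._consumers:
            raise RuntimeError("no consumer registered")
        # Assumption: consumers are served in round-robin order.
        consumer = self._consumers[self._next % len(self._consumers)]
        self._next += 1
        consumer(message)


topic = Topic()
all_readings, high_readings = [], []
topic.subscribe(all_readings.append)
topic.subscribe(high_readings.append, condition=lambda m: m["value"] > 10)
topic.publish({"value": 5})
topic.publish({"value": 20})

queue = PointToPointQueue()
worker_1, worker_2 = [], []
queue.add_consumer(worker_1.append)
queue.add_consumer(worker_2.append)
for i in range(3):
    queue.send(i)
print(all_readings, high_readings, worker_1, worker_2)
```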
RMM-JMS (main contribution of IBM Haifa Research Lab, Israel)
• RMM-JMS is a JMS implementation on top of our high performance Reliable Multicast Messaging (RMM) layer, which provides one-to-one and one-to-many data delivery or many-to-many data exchange, in a message-oriented middleware point-to-point or publish/subscribe fashion
• The exceptional performance supports remote and distributed control and operation of scientific instruments such as sensors and probes
• Multicast transport for publish/subscribe messaging: supports the JMS Topic-based messaging and API, with matching done at the IP multicast level. The transport is a NACK-based reliable multicast protocol.
• Direct (broker-less) unicast for point-to-point messaging: JMS Queues are implemented over RMM queues. The transport is the TCP protocol.
• Brokered unicast transport for publish/subscribe messaging: the broker receives messages from the producer in either unicast or multicast delivery mode, and sends the messages to the subscribers in either mode
• The broker serves as a bridge in a LAN-WAN-LAN configuration
Performance: message rate, many-to-one
[Figure: two panels, (a) rate for message size 1000 bytes and (b) rate for message size 100000 bytes; x-axis: Number of Publishers (0 to 20); y-axis: msg/sec; series: Min, Max, Avg, SDev]
• Blade center with 12 CPUs and 1 Gb Ethernet switch
• No message loss
• Total throughput: 61 MBytes/sec and 67 MBytes/sec for (a) and (b) respectively
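The quoted throughputs can be converted to message rates with simple arithmetic. The sketch below assumes 1 MByte = 10^6 bytes and ignores protocol overhead, neither of which is stated on the slide, so the results are only rough.

```python
def msgs_per_second(throughput_mbytes_per_s, msg_size_bytes, bytes_per_mbyte=1_000_000):
    """Convert a total throughput into a message rate (assumes decimal MBytes)."""
    return throughput_mbytes_per_s * bytes_per_mbyte / msg_size_bytes


rate_a = msgs_per_second(61, 1_000)     # panel (a): 1000-byte messages
rate_b = msgs_per_second(67, 100_000)   # panel (b): 100000-byte messages
print(rate_a, rate_b)
```

Under these assumptions the throughputs correspond to about 61000 msg/sec for (a) and 670 msg/sec for (b).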
Performance: message rate, one-to-many
[Figure: two panels, rate for message size 1000 bytes and rate for message size 1 byte; x-axis: Number of Subscribers (0 to 30); y-axis: msg/sec; series: Min, Max, Avg, SDev]
• Blade center with 12 CPUs and 1 Gb Ethernet switch
• No message loss
• A peak of over 400000 msg/sec was reached
Performance: round trip time (RTT, latency)
[Figure: RTT; x-axis: message size (1 to 1000000, log scale); y-axis: time in ms (0.01 to 100, log scale); series: Avg, SDev, Ping]
• Two machines with a single publisher and a single subscriber on each one
• Average round trip time computed over 1000 samples
Data Mover
• The task of this element is to get data from the "data collector" of the IM
• Data can be accessed via:
• A Web Service interface, get_data(), for generic data dumps (e.g. slow storage, spy stream, etc.)
• An SRM interface, through which the Grid Storage Element (SE) and the available CEs can access the data
• An HTTP server and TCP/IP raw sockets for high performance ad hoc data transfer
• The Data Mover exposes its methods to the IE web service and can itself be instrumented as an instrument.
[Diagram labels: Instrument Resources, IMs each with a Data Collector, IE, Data Mover]
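A minimal sketch of the pull from the Data Collectors is given below. Only the name get_data() comes from the slide; its parameter, the return format, and the classes `DataCollector` and `DataMover` with their other methods are assumptions for illustration. Transport over Web Services, SRM or raw sockets is left out.

```python
from collections import deque


class DataCollector:
    """Hypothetical buffer filled by an IM with data from its instruments."""

    def __init__(self):
        self._buffer = deque()

    def store(self, record):
        self._buffer.append(record)

    def drain(self):
        """Return all buffered records and empty the buffer."""
        records = list(self._buffer)
        self._buffer.clear()
        return records


class DataMover:
    """Hypothetical sketch: exposes collected data through get_data()."""

    def __init__(self, collectors):
        # collectors maps an IM name to its DataCollector
        self._collectors = dict(collectors)

    def get_data(self, im_name=None):
        names = [im_name] if im_name is not None else sorted(self._collectors)
        return {name: self._collectors[name].drain() for name in names}


cam, sensor = DataCollector(), DataCollector()
mover = DataMover({"im-cam": cam, "im-sensor": sensor})
cam.store(b"frame-0")
sensor.store(21.5)
sensor.store(21.7)
dump = mover.get_data()
print(dump)
```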
Instrument Manager Performances (II) 1 + 2 1 2 3 IM with CMS Instruments 1 3 1 3 Optimized environment
DB Pub/Sub (JMS) TCP/IP IMS Performances Web Service Interface IMS IMS Proxy IMS Proxy …. IMS Proxy Errors/log/states messages (xml and java objs)
Main IE Pilot Applications: Power Grid Power Grid V.O Virtual Control Room Virtual Control Room Solar ... Gas Instrument Manager Instrument Element
Main GridCC Pilot Applications: Control and Monitor of high energy experiments
The CMS Data Acquisition • O(10^4) distributed objects to control, configure and monitor • On-line diagnostics and problem solving capability • Highly interactive system (human reaction time: a fraction of a second) • World-wide distributed monitor and control [Diagram labels: 2 × 10^7 electronics channels, 40 MHz, 100 Hz]
CMS Prototype: IEs at work • GridCC middleware used for the CMS MTCC (Magnet Test and Cosmic Challenge) • 11 Instrument Elements with a hierarchical topology • In this case the instruments are Linux hosts running the CMS on-line software • More than 100 controlled hosts • 25 days to the start of data taking! [Diagram labels: CMS Instrument Elements: TOP, DAQ, Trigger, TTS, FilterFarm, FedBuilder, RuBuilder, Det 1 to Detector 8, GTPe 1, DAQ IE, Instrument Managers]
IDS Intrusion Detection System Pirated machines Domain A 1. Taking Control "zombies" Pirated machines Domain B Target domain X
IDS Intrusion Detection System A DDoS Attack Domain-wise Sources of the attack Sensor Instrument Element Sensor Instrument Element Sensor Instrument Element Target Domain Sensor Instrument Element Sensor Instrument Element
Main GridCC Pilot Applications: Remote Operation of an Accelerator Elettra Synchrotron
The other GridCC pilot applications • Meteorology (Ensemble Limited Area Forecasting) • Device Farm for the Support of Cooperative Distributed Measurements in Telecommunications and Networking Laboratories • Geo-hazards: Remote Operation of a Geophysical Monitoring Network (see first slides) • Medical devices that need a closed loop between the data acquisition and the output result
Conclusion • The GridCC project is integrating instruments into traditional computational/storage Grids. • IEs need a high degree of interaction and interactivity between themselves and the users. • The GridCC IE implementation is currently installed in heterogeneous applications.
Questions? • Thanks for your time • More information: www.gridcc.org • On-line demo at: http://sadgw.lnl.infn.it:2002/IEFacade • Acknowledgement: The GridCC project is supported under EU FP6 contract 511382.
Another GridCC application: Migraine Attack Treatment • 1-minute loop: 1. Data taking 2. Data processing 3. Result visualization and control 4. Action [Diagram labels: EEC, GRID]
The control of the CMS Data Acquisition • Acquire data from a CMS Muon chamber • Submit an analysis job • Retrieve the job result • Move data to a Storage Element [Diagram labels: Drift Tube CMS Subdetector, Virtual Cntr. Room (x2), Storage Element, Computing Element, Supporting Services, Diagnostics]