Building an Information System for a Distributed Testbed
Warren Smith, Texas Advanced Computing Center
Shava Smallen, San Diego Supercomputer Center
FutureGrid Goals
• Provide a high-quality distributed testbed
  • High availability, easy to use, quality documentation, knowledgeable support
• Support a variety of experiments
  • Cloud, grid, high-performance computing, data-intensive
  • Computer and computational science
• Allow rigorous experiments
  • Data gathering
• Support education and training
FutureGrid Overview
• Funded by NSF through September 2014
• PI: Geoffrey Fox of Indiana University
• Diverse and distributed set of resources
  • Compute, storage, network
  • Connected by high-performance networks
• Variety of software environments
  • OpenStack, Nimbus, Eucalyptus
  • Torque/Moab, MPI, OpenMP
  • Hadoop
• Software to support experiments
  • Pre-configured virtual machine images
  • Performance measurement tools
  • Experiment execution tools
FutureGrid Deployment
[Deployment map: FutureGrid sites connected by a dedicated FutureGrid network and the XSEDE network; NID = Network Impairment Device]
Motivation
• Measure a variety of information useful to users
  • Resource configuration and load
  • Software and service descriptions
  • Resource and service status
  • Resource usage
  • Detailed performance monitoring
• Provide this information to users in a consistent way
Approach
• Use existing monitoring tools
  • Many good ones to choose from
• Integrate the information they provide
• Common publishing mechanism
  • Publish/subscribe messaging
• Common storage mechanism
  • SQL database
• Common representation language
  • JavaScript Object Notation (JSON)
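To make the common publishing and storage path concrete, here is a minimal Python sketch of the route a single measurement might take: a JSON record published to a RabbitMQ topic exchange and inserted into a PostgreSQL table. The exchange name, routing key, table schema, and field names are illustrative assumptions, not FutureGrid's actual configuration.

```python
import json

import pika
import psycopg2

# One illustrative measurement record; the field names are assumptions.
record = {
    "resource": "sierra.futuregrid.org",
    "metric": "load_one",
    "value": 0.42,
    "timestamp": "2013-06-01T12:00:00Z",
}

# Publish the JSON document to an AMQP topic exchange on RabbitMQ.
connection = pika.BlockingConnection(pika.ConnectionParameters("localhost"))
channel = connection.channel()
channel.exchange_declare(exchange="monitoring", exchange_type="topic")
channel.basic_publish(
    exchange="monitoring",
    routing_key="ganglia.sierra.load",
    body=json.dumps(record),
)
connection.close()

# Store the same record in a PostgreSQL table (schema is an assumption).
db = psycopg2.connect("dbname=monitoring")
with db, db.cursor() as cur:
    cur.execute(
        "INSERT INTO measurements (resource, metric, value, ts)"
        " VALUES (%s, %s, %s, %s)",
        (record["resource"], record["metric"], record["value"],
         record["timestamp"]),
    )
db.close()
```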
Monitoring Tools
• Inca
  • Periodic user-level tests of software and services
• Information Publishing Framework
  • Static and dynamic information from cluster schedulers and clouds
  • Represents information using GLUE v2
• perfSONAR
  • All-to-all iperf bandwidth measurements
• SNAPP
  • SNMP network data
• Ganglia
  • Detailed node data
• NetLogger
  • Lets users instrument their software and services
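As a rough illustration of the GLUE v2 representation used by the Information Publishing Framework, the sketch below renders a ComputingShare (a batch queue) as JSON. The attribute names come from the GLUE 2 specification, but the values and this particular JSON rendering are assumptions, not the framework's exact output.

```python
import json

# GLUE 2 ComputingShare attributes; values are hypothetical.
computing_share = {
    "Name": "batch",
    "ID": "urn:glue2:ComputingShare:batch.sierra.futuregrid.org",
    "MaxWallTime": 86400,   # seconds
    "RunningJobs": 42,
    "WaitingJobs": 7,
    "UsedSlots": 336,
}
print(json.dumps(computing_share, indent=2))
```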
Architecture
[Architecture diagram: on each resource, tool-specific producers (Inca, Ganglia, Information Publishing Framework, NetLogger, perfSONAR, SNAPP) extract, translate, and publish information as JSON over AMQP to RabbitMQ on administrative servers; an Information Server stores the messages in PostgreSQL; consumers (User Portal, Phantom, Experiment Recorder, user tools) read data via AMQP & JSON or PostgreSQL & JSON]
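On the consumer side of this architecture, a user tool would subscribe to the RabbitMQ exchange and decode each JSON message as it arrives. A minimal pika-based sketch follows; the exchange name and binding key are assumptions matching the publishing sketch above.

```python
import json

import pika

def handle(channel, method, properties, body):
    # Decode the JSON payload and act on it (here, just print it).
    record = json.loads(body)
    print(method.routing_key, record)

connection = pika.BlockingConnection(pika.ConnectionParameters("localhost"))
channel = connection.channel()
channel.exchange_declare(exchange="monitoring", exchange_type="topic")

# Private, auto-named queue bound to all Ganglia messages, for example.
result = channel.queue_declare(queue="", exclusive=True)
channel.queue_bind(exchange="monitoring", queue=result.method.queue,
                   routing_key="ganglia.#")
channel.basic_consume(queue=result.method.queue,
                      on_message_callback=handle, auto_ack=True)
channel.start_consuming()
```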
Performance Experiments
• Ensure the design will meet performance needs before deploying it
• See if the design could meet the needs of other projects
• Emulation environment:
[Diagram: producers on a virtual machine cluster at the Sierra cluster (San Diego) publish through cluster switches over 1 Gb/s and 10 Gb/s links to the messaging service and database on a server farm at Indiana University, which serve the consumers]
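The producer side of such an emulation can be approximated with a simple publishing loop that measures sustained message throughput against the broker. A rough sketch, with the message contents, count, and broker location as arbitrary assumptions:

```python
import json
import time

import pika

N = 100_000
payload = json.dumps({"metric": "load_one", "value": 0.42})

connection = pika.BlockingConnection(pika.ConnectionParameters("localhost"))
channel = connection.channel()
channel.exchange_declare(exchange="monitoring", exchange_type="topic")

# Publish N small JSON messages and report the achieved rate.
start = time.time()
for _ in range(N):
    channel.basic_publish(exchange="monitoring",
                          routing_key="emulation.producer",
                          body=payload)
elapsed = time.time() - start
print(f"{N / elapsed:.0f} messages/second")
connection.close()
```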
Infrastructure Emulation
• Emulate FutureGrid and other infrastructures
• Characterize the producers and consumers of information
Excess Capacity for User Data
[Chart: measured messaging capacity compared with the estimated information loads of FutureGrid, OSG, and XSEDE, each at 1x and 2x scale]
Current Status
• The majority is deployed on FutureGrid
  • SNMP and iperf data are the exceptions
• Used behind the scenes
• Not yet documented and supported for users
Conclusions
• There is a large amount of information of interest to testbed users
  • Generated by a variety of tools
• Providing information in a common way makes it easier to use
  • Common mechanisms
  • Common representation language
• Current technologies have sufficient performance
  • RabbitMQ pub/sub messaging
  • PostgreSQL relational database
• Excess pub/sub capacity can be made available to users
Future Work
• FutureGrid is wrapping up
  • Complete the system and make it user-visible for feedback
• XSEDE information services
  • RabbitMQ deployed
  • Inca/JSON publishing to RabbitMQ
  • Information Publishing Framework & GLUE v2