330 likes | 341 Views
Explore the research and projects in network measurement by Bruce Lowekamp at the College of William and Mary. Learn about GGF's perspective on network measurement, Grid Monitoring Architecture (GMA), and the DAMED Working Group.
E N D
Update on GGF Measurement Activities Bruce Lowekamp The College of William and Mary
My Research • William and Mary Computer Science Department • 14 Faculty • 45 PhD students • 60 Masters students • lots of good undergraduates • Research • Remos: SNMP-based topology and utilization for distributed apps • Wren: leveraging topology and passive measurements for scalable grid network performation measurement • Optimistic grid computing: fine-grained apps on a grid • GGF Network Measurements Working Group
Passive Network Measurement • When an application is running, use passive measurements. When not, use active probes. • Controlled by monitoring system, knows when measurements needed. • Conversion between measurements important.
Importance of Topology in Grids • No one rule governs performance • Real users (and system owners) make bad choices • Grid applications must optimize performance for these environments. • Can we exploit topology knowledge for better • measurements? • application performance?
Outline • GGF’s perspective on network measurement • GMA: Grid Monitoring Architecture • DAMED: Top-N Events • NMWG: Network Characteristics Hierarchy
R ? R R ? R R R R R R network dispersed users R ? ? R R R R R R R R R R VO-A VO-B GGF Perspective • Users of measurements • Application designers • Runtime system designers • Many users, many environments • Grid applications must be flexible, portable
Information Portability • Information must be portable • Each AS/VO may pick its own measurement system • Parts of network aren’t measured • Different parameters • Goal: Application runs, unaware of environment • Information from multiple measurement systems • Should not have to support 10 different performance models
GGF Projects • GMA: Grid Monitoring Architecture • What components do we need? • DAMED: Discovery And Monitoring Event Description • What are the Top-N events we need to support RIGHT NOW? • NMWG: Network Measurements • What does “bandwidth” mean anyway? • Components of this global information service • Boils down to schemas and protocols
Existing Pieces • Many of these components already exist or are in progress: • instrumentation tools • Pablo (UIUC), NetLogger (LBNL), log4j (apache), web100, SNMP, etc. • host and network sensors • too many to list • sensor management tools • JAMM (LBNL) • event publication service • MDS (Globus), NWS (UCSB), R-GMA (RAL),CODE (NASA AMES), Remos(CMU) • event archive service • netarchd (LBNL), NWS (UCSB) • event analysis and visualization tools • lots, but most only work for specific types of events: • NetLogger nlv (LBNL), Probe (Stazi), Autopilot (UIUC), etc. • BUT, all use different event formats and protocols! • no interoperability
Event Publication • To handle potentially huge amounts of event data requires an event publication and subscription service that is: • flexible • highly scalable • provides near real-time access to monitoring data • The Global Grid Forum (GGF) (www.gridforum.org) has defined the “Grid Monitoring Architecture” (GMA), for this purpose. • Several GMA implementations have started to appear • A great deal of work remains to define standard event schemas and event dictionaries for the GMA.
GMA Terminology and Architecture • (Performance) Event: • Typed collection of data with a specific structure • Producer Interface: • makes performance data (events) available • Consumer Interface: • receives performance data (events) • Directory Service: • supports information publication and discovery • must be distributed and/or replicated
DAMED WG • Discovery And Monitoring Event Description Working Group • Chairs • Jennifer Schopf, ANL • James Magowan, IBM • Top-N Metrics
DAMED Charter • Define a basic set of monitoring event descriptions • information (attributes) associated with a particular data element • conventions for the representation of the value associated with it. • Develop standard representations of the most widely used measurement values (the "top N".) • Emergence of a set of conventions and recommendations that will ease the task of defining richer, domain-specific schemas • Damed if we do • Not everyone will be happy • Damed if we don’t • Never reach our goal of seamless interoperability of grids (one big grid e.g. internet)
DAMED Terminology • Events • Event Target • Event Type • Event Name = Target.Type • network.link • delay.TCP • network.link.delay.TCP
Target Types • Targets used in Top-N Events • Host: IP • Process: IP, PID • Disk Partition: /home • Network Link: IP {port},IP {port} • Software: String • Scheduler: IP, String • Not necessarily hierarchical
Event Types • Top-N • CPU Load • System uptime • Disk size • Disk used • TCP available bandwidth • Ping RTT • Traceroute number of hops • Running software status • Packet Loss • Available Memory • Host Architecture • Host OS • Physical Memory
NMWG • Network Measurements Working Group • Chairs: • Brian Tierney (LBNL) • Bruce Lowekamp (W&M) • Richard Hughes-Jones (Manchester) • Goal: • Portability of network measurements • Steps: • Define hierarchy of measurements • Establish mapping of tools<->measurements • Conversion between measurements of same type
Characteristics Hierarchy • Ultimate Goal: Portability of Measurements • Many APIs • Many tools • Natural Grid Development Process • More measurement systems • More measurement tools • More cooperation • More shared deployed infrastructure • Middleware must be able to determine what network performance information is measuring. • How do we share measurement information without discouraging development of new APIs and tools?
How the Nomenclature Helps • Need to classify measurements • What does it measure? Sometimes more important than how. • Not necessarily a new schema • Should be a good schema for network measurements • Not all systems are/should be organized this way • Can be used as annotation in any schema. • Goal is an agreed-upon classification of measurements taken, to allow both current and future measurement methodologies to classify their observations to maximize their portability.
Representing a Measurement • A measurement is represented by two elements: • Characteristic What is being measured. Bandwidth, latency, etc. • Network Entity The part of the network described by the measurement Link, path, host, etc.
Terminology • Network Characteristics • Intrinsic properties of a portion of the network that are related to its performance and reliability • Measurement Methodologies • Means and methods of measuring those characteristics • Observation • An instance of information obtained by applying a measurement methodology. • Note on IETF IPPM RFC2330 • Compatible where possible, but metrics means many things. • Guiding principle: clear meanings, follow standards where defined.
Network Characteristics • “Intrinsic Property” • Property itself, not an observation • Unrelated to how measurement is made • Not a particular number • Packet Loss • Fraction of traffic • Loss patterns • Traffic profile
Measurement Methodology • Technique for recording or estimating a characteristic • Two approaches: • Raw: measuring actual characteristic • Derived: aggregate or estimate from other characteristics • Round trip delay • ping • TCP transmit/ACK pair • two one-way delay measurements • link propagation and queue length data
Observations • Singleton • Smallest possible observation • Sample • Several singletons together • Statistical • Derived from a sample by calculating a statistic • Timestamps, and ranges, are issues with each observation
Network Entities • Attributes must be included. • Nodes and paths can be physical or functional.
Describing Topology • Two different types of topology • Physical: Actual links and nodes • Functional: Derived closeness • Attributes define the Path or Node • Multiple Topologies are Superimposed over physical network
Describing Topology • Paths: Path data follows from source to destination • Unidirectional in most cases • Paths (including hops) may be made of components • Nodes: Hosts and Internal nodes • Physical and Functional graphs not disjoint at edges
Relationship Between Measurements • Can we develop systems that use whatever information is available? • iperf • pathload • QoS support • Need to be able to request measurement of particular characteristic, without regard to what sub-characteristic or tool is used to return the result. • Convert loss pattern to loss rate. • Traffic profile to utilization fraction.
Characterization of Tools • Goal of hierarchy is to make measurements portable. • First step is to agree on what characteristic tools measure. • Some tools measure multiple characteristics, depending on parameters. • Many lists of tools, including E2EPI, our goal is to annotate these lists and produce hierarchy with multiple views.
NMWG Upcoming Work • Taxonomy is nice, but exchanging real data requires a schema, with values for attributes and parameters. • Two steps: • Map tools to taxonomy • Produce schema • Schema step is needed to reach goal of portability.Participants including DAMED members.
Summary of GGF Activities • Focus on two aspects: • System interoperability • Measurement portability • GMA completed • DAMED finishing up Top-N documents • NMWG characteristics hierarchy near release • Need schema to put components together • Portions contributed by: Jennifer Schopf (ANL)James Magowan (IBM), Brian Tierney (LBNL), and Dan Gunter (LBNL)