LHCOPN perfSONAR MDM status. Otto Kreiter, DANTE LHCOPN, Copenhagen, 17.10.2008. Agenda. MoU status Deployment status Training programme Next steps. MoU status - I. Three MoU’s signed by all parties: SURFnet/SARA RAL DFN/DE-KIT Two MoUs partially signed: RENATER/IN2P3 TW-ASGC
LHCOPN perfSONAR MDM status Otto Kreiter, DANTE LHCOPN, Copenhagen, 17.10.2008
Agenda • MoU status • Deployment status • Training programme • Next steps
MoU status - I • Three MoUs signed by all parties: • SURFnet/SARA • RAL • DFN/DE-KIT • Two MoUs partially signed: • RENATER/IN2P3 • TW-ASGC • Gentlemen's agreement: • TRIUMF
MoU status - II • Waiting for last approvals: • CERN • RedIRIS/PIC • In development: • GARR/CNAF • NORDUNET/NDGF • Gentlemen’s agreement expected after CERN signs MoU: • FERMI • BNL
Deployment status • Only two servers will be deployed per site • Deployment split into three phases (3-4 sites per phase) • Deployment started for three Tier-1 sites (Phase 1): • SARA • RAL • DE-KIT • The servers have been ordered • Deployment initiation meetings held with the Tier-1 sites
Deployment Roll Out Steps • Site Survey & Purchase • Site Preparation • Boxes Delivery (Tier-1 Site + GN2 Partner) • HADES (Bee) Server Deployment • MDM (Sun) Server Deployment
Roll Out Schedule [Gantt chart; time axis in weeks, 1 to 14] • Site Survey & Purchase • Site Order (Optional + 3 Weeks, Depends) • Site Preparation • Boxes Delivery (GN2 Partner + Site) • GN2 Partner Configuration (HADES Box) • Shipment to Site • Physical Installation and Configuration Validation • Physical Installation Preparation, Deployment and Configuration (MDM Box) • Configuration Validation
perfSONAR Training (1) • A one-day course. • Will focus on the perfSONAR visualisation tools. • Will provide examples specific to the LHCOPN. • Will include hands-on exercises.
perfSONAR Training (2) • Objectives: • Understand the role of perfSONAR in network monitoring. • Describe the perfSONAR visualisation tools, explaining the purpose of each. • Use the perfSONAR visualisation tools to investigate network issues. • Audience: • Network administrators from the LHCOPN sites.
Next steps • Finalise remaining MoUs/agreements • Deployment stepped up at full speed • Further visualisation and monitoring requirements gathering + enhancements (more later)
Monitoring demo • The alarming and visualisation tools presented are not necessarily in their final shape • Your help in shaping them will be highly valued • Context: a potential debugging process • Some demos are live, others are off-line
The call! A researcher reports to the network operator that his/her data flow cannot exceed 400 Mbps between Tier-1 and Tier-0 • He/she can carry out the following procedure himself/herself • The NOC of the Tier-1 will start investigating – they can use the example procedure
Alarm function - EPCC Florian Scharinger - EPCC University of Edinburgh
User input
Hop  DNS                              IP
0    LHC T-1                          10.10.10.2
1    iucc.rt1.fra.de.geant2.net       62.40.125.121
2    so-6-2-0.rt1.gen.ch.geant2.net   62.40.112.21
3    LHC T-0                          11.11.11.2
Weather map questions • 15/10: -1- OK to enquire with LHC-OPN about providing circuit information to the E2EMon for the link to Taiwan (no E2E link) and two links in the US (CBF) (DFN to provide the exact list of circuits). • 09/10: -2- OK to enquire with LHC-OPN if the current plans for the weathermap are acceptable. • xx/10: -1- OK to enquire with LHC-OPN if the current plans for the alarming are acceptable. • 15/10: -2- OK to enquire with LHC-OPN about the following: as the E2E circuit is terminated on the first site equipment (which can be a switch and not a router), would it be possible to get from the OPN a topology for each site showing everything between the termination of the E2E circuit and the IP router? • We need the router(s), their IP addresses relevant for the OPN, the interface names and the E2E circuit ID. We need to be able to associate the E2E circuit ID with the interface of the router it is connected to. • If VLANs or multiple logical interfaces are used over the circuits between two sites, the VLAN IDs and their mapping to the router interface should be indicated. • If there are switches between the routers and the logical E2E circuit termination, they should be indicated. The mapping between the E2E circuit ID, the router interface and the site-local L2 channel used to link the router to the E2E circuit should be indicated (e.g. if a VLAN is used, indicate the VLAN ID and the E2E circuit ID it links to which router interface). • We need that information to design our weathermap and the mapping between circuits. • 15/10: -3- OK to enquire with LHC-OPN if the OPN is OK to show the status of the E2E circuit and not of the whole path to the router (e.g. in the case the circuit terminates on a switch and not on the router - see -3-).
• Note: We could provide the information up to the router, but that requires work on our side and on theirs to provide the status of the L2 circuits between the termination of the E2E circuit and the router interface. • 15/10: -4- OK to enquire with LHC-OPN if they plan to do some trunking (multiple 10 Gbps circuits put together as a logical circuit with a single IP address on both sides). • We would prefer not to, as it greatly complicates our task. • 15/10: -5- OK to enquire with LHC-OPN how frequently the mapping between an E2E circuit and a router interface would change. This can involve any of the following changes: moving the E2E circuit to another piece of equipment, changing the router interface, changing the router interface IP address. • In the case there are switches between the termination of the E2E circuit and the router (see -3-), indicate an estimate of the change frequency of the mapping between the E2E ID, the site L2 circuit ID and the router interfaces. • 15/10: -6- OK to enquire with LHC-OPN if the operational people can update the service desk in a timely fashion about any changes related to their E2E circuit or the router interface to which the circuit is linked (e.g. circuit terminated on a different interface, new circuit, etc). The SD will update the file accordingly, so that the change can be taken into consideration by the weathermap. • OPN to be notified that, as the process is manual, there would be some mismatch for a few hours between the infrastructure and the weathermap. • 15/10: -7- Only if -3- indicates that there is some active equipment in between the E2E circuit termination and the router interface: OK to enquire with LHC-OPN if they plan to use internally static L2 circuits or dynamic ones (e.g. GFRP). We are expecting static methods of circuit provisioning. • 15/10: -8- OK to enquire with OPN if the following is acceptable: decouple the E2E active measurements from the L3 and L2 measurements.
The weathermap would be presented with a single link between two sites, even if there are multiple ones. When clicking on it, the status of each circuit would be presented as well as the utilisation of the IP interfaces, etc. The delay and throughput would also be presented, but not mapped to a particular circuit. • 15/10: -9- OK to enquire with OPN, if they have multiple circuits between two sites, whether they plan to have a primary and an unused backup or to perform load balancing (and which kind: per packet or per flow).
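The mapping asked for in these questions (E2E circuit ID, router interface, optional VLAN ID) and the proposed single-link view per site pair can be sketched as a small data structure. This is only an illustration under stated assumptions: the `CircuitMapping` record, its field names, the `aggregate_links` helper and the "up" / "degraded" / "down" / "unknown" status values are all hypothetical and are not taken from the perfSONAR or E2EMon tools.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class CircuitMapping:
    """One row of a hypothetical mapping file maintained by the service desk."""
    e2e_circuit_id: str              # E2E circuit ID
    site_a: str                      # one end of the circuit
    site_b: str                      # other end of the circuit
    router: str                      # router on the IP side
    interface: str                   # router interface the circuit is linked to
    vlan_id: Optional[int] = None    # site-local L2 channel, if a VLAN is used
    status: str = "unknown"          # assumed values: "up", "down", "unknown"


def aggregate_links(mappings):
    """Group circuits by unordered site pair and derive one status per pair."""
    by_pair = defaultdict(list)
    for m in mappings:
        by_pair[tuple(sorted((m.site_a, m.site_b)))].append(m)

    links = {}
    for pair, circuits in by_pair.items():
        states = {c.status for c in circuits}
        if states == {"up"}:
            summary = "up"
        elif "up" in states:
            summary = "degraded"     # assumed label: some circuits up, not all
        elif states == {"down"}:
            summary = "down"
        else:
            summary = "unknown"
        # The per-circuit list is what a click on the single link would show.
        links[pair] = {"status": summary, "circuits": circuits}
    return links
```

Keeping the per-circuit records under each site pair matches the idea that the map shows one link while the detail view lists every circuit; delay and throughput are deliberately left out, since they would not be mapped to a particular circuit.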
Next steps • General monitoring feedback mechanism • LHCOPN monitoring feedback group • 3-4 tech. people involved in the LHCOPN operations • LHCOPN Portal • Central secure access for all the tools and monitoring information