Investigating Network Performance – A Case Study Ralph Spencer, Richard Hughes-Jones, Matt Strong and Simon Casey The University of Manchester G2 Technical Workshop, Cambridge, Jan 2006
Very Long Baseline Interferometry eVLBI – using the Internet for data transfer
GRS 1915+105: a 15 solar-mass black hole in an X-ray binary. MERLIN observations of the receding jet component; 600 mas corresponds to 6000 A.U. at 10 kpc.
Sensitivity in Radio Astronomy • Noise level $\Delta S \propto 1/\sqrt{B\,t}$, where B = bandwidth, t = integration time • High sensitivity requires large bandwidths as well as large collecting area, e.g. Lovell, GBT, Effelsberg, Camb. 32-m • Aperture synthesis needs signals from individual antennas to be correlated together at a central site • Need for interconnection data rates of many Gbit/s
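To put numbers on the bandwidth argument, here is a worked example (added for illustration, assuming the observing bandwidth scales with the recorded data rate):

```latex
\frac{\Delta S_{1\,\mathrm{Gb/s}}}{\Delta S_{512\,\mathrm{Mb/s}}}
  = \sqrt{\frac{512}{1024}} \approx 0.71
```

Doubling the data rate from the current 512 Mb/s Mk5 recording to 1 Gb/s eVLBI lowers the noise by about 30%, the same gain as doubling the integration time.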
New instruments are making the best use of bandwidth: • eMERLIN 30 Gbps • Atacama Large mm Array (ALMA) 120 Gbps • EVLA 120 Gbps • Upgrade to European VLBI: eVLBI 1 Gbps • Square Kilometre Array (SKA) many Tbps
The European VLBI Network (EVN) • Detailed radio imaging uses antenna networks over 100s–1000s of km • Currently uses disk recording at 512 Mb/s (Mk5) • A real-time connection allows greater response, reliability and sensitivity • Hence the need for Internet-based eVLBI
[Map: EVN–NREN connectivity – Onsala (Sweden, Gbit link via Chalmers University of Technology, Gothenburg), Torun (Poland, Gbit link), Jodrell Bank (UK), Westerbork (Netherlands, dedicated Gbit link), Cambridge (UK, MERLIN, DWDM link) and Medicina (Italy), converging on Dwingeloo]
Testing the Network for eVLBI • Aim: obtain the maximum bandwidth compatible with VLBI observing systems in Europe and the USA • First sustained data-flow tests in Europe: iGRID 2002, 24–26 September 2002, Amsterdam Science and Technology Centre (WTCW), The Netherlands • “We hereby challenge the international research community to demonstrate applications that benefit from huge amounts of bandwidth!”
iGRID2002 Radio Astronomy VLBI Demo • Web-based demonstration sending VLBI data • A controlled stream of UDP packets at 256–500 Mbit/s • Production network: Manchester – SuperJANET – GÉANT – Amsterdam • Dedicated lambda: Amsterdam – Dwingeloo
[Diagram: “The Works” – RAID0 disc → ring buffer → UDP data stream (n-byte packets separated by a wait time) → ring buffer → RAID0 disc, with TCP control and a web interface]
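The throughput knob in this architecture is the wait time inserted between UDP sends. Below is a minimal sketch of such a paced sender in Python; it is not the project's actual code, and the address, port, packet size and spacing are illustrative placeholders:

```python
import socket
import time

# All values are illustrative stand-ins, not parameters from the talk
DEST = ("192.0.2.10", 14000)   # receiver host:port (TEST-NET placeholder address)
PACKET_SIZE = 1472             # UDP payload that fits a standard 1500-byte MTU
SPACING_US = 15                # wait time between frames, in microseconds
N_PACKETS = 100_000

sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
payload = bytearray(PACKET_SIZE)
next_send = time.perf_counter()

for seq in range(N_PACKETS):
    payload[0:8] = seq.to_bytes(8, "big")  # sequence number for loss/re-ordering checks
    sock.sendto(payload, DEST)
    next_send += SPACING_US / 1e6
    # Busy-wait to the next slot: time.sleep() is far too coarse at this scale
    while time.perf_counter() < next_send:
        pass
```

At this spacing the offered application rate is 1472 × 8 / 15 ≈ 785 Mbit/s; the wire-rate relation after the next slide makes the bookkeeping explicit.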
[Plot: UDP Manchester–UvA GigE, 19 May 02 – received wire rate (Mbit/s) vs transmit time per frame (µs) for packet sizes from 50 to 1472 bytes]
UDP Throughput on the Production WAN • Manc–UvA SARA 750 Mbit/s • SJANET4 + GÉANT + SURFnet • 75% of the Manchester access link • Manc–UvA SARA 825 Mbit/s
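The shape of these throughput-vs-spacing curves follows from simple arithmetic: the offered wire rate is the on-wire frame size divided by the transmit time per frame, saturating at the GigE line rate. As a sketch (the ~66 bytes of UDP/IP/Ethernet framing is my assumption about how the wire rate was counted):

```latex
R_{\mathrm{wire}} \approx \frac{8\,(L_{\mathrm{payload}} + 66\ \mathrm{bytes})}{\Delta t},
\qquad \text{e.g. } \frac{8 \times (1472 + 66)}{13\ \mu\mathrm{s}} \approx 946\ \mathrm{Mbit/s}
```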
How do we test the network? • Simple connectivity test from telescope site to correlator (at JIVE, Dwingeloo, The Netherlands, or MIT Haystack Observatory, Massachusetts): traceroute, bwctl • Performance of link and end hosts: UDPmon, iperf • Sustained data tests: vlbiUDP (under development) • True eVLBI data from a Mk5 recorder: pre-recorded (Disk2Net) or real-time (Out2Net) Mk5s are 1.2 GHz P3s with StreamStor cards and 8-pack exchangeable disks, 1.3 TBytes of storage; capable of 1 Gbps continuous recording and playback. Made by Conduant to a Haystack design.
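UDPmon-style tools measure more than raw rate: sequence numbers let the receiver count lost and out-of-order packets, the quantities quoted in the test results below. Here is a minimal receiver sketch in Python that pairs with the sender sketch above (not the actual UDPmon implementation; port and timeout are illustrative):

```python
import socket
import time

sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
sock.bind(("0.0.0.0", 14000))   # same illustrative port as the sender sketch
sock.settimeout(5.0)            # stop once the sender has gone quiet

received = nbytes = lost = reordered = 0
expected = 0
t_first = t_last = None

try:
    while True:
        data, _ = sock.recvfrom(65535)
        t_last = time.perf_counter()
        if t_first is None:
            t_first = t_last
        seq = int.from_bytes(data[0:8], "big")
        received += 1
        nbytes += len(data)
        if seq > expected:
            lost += seq - expected   # gap in sequence numbers: packets missing
        elif seq < expected:
            reordered += 1           # late arrival, previously counted as lost
            lost -= 1
        expected = max(expected, seq + 1)
except socket.timeout:
    pass

if t_first is not None and t_last > t_first:
    rate = nbytes * 8 / (t_last - t_first) / 1e6
    print(f"{received} packets, {rate:.1f} Mbit/s, "
          f"{lost} lost, {reordered} re-ordered")
```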
[Diagram: telescope connections to the JIVE correlator – Westerbork (Netherlands), Onsala (Sweden), Effelsberg (Germany, expected end 2006), Torun (Poland), Jodrell Bank (UK, lit now), Cambridge (UK, MERLIN) and Medicina (Italy), at rates from 155 Mb/s up to 2 × 1 Gb/s; several links still marked uncertain]
eVLBI Milestones • January 2004: disk-buffered eVLBI session: three telescopes at 128 Mb/s for the first eVLBI image; On–Wb fringes at 256 Mb/s • April 2004: three-telescope, real-time eVLBI session: fringes at 64 Mb/s; first real-time EVN image at 32 Mb/s • September 2004: four-telescope real-time eVLBI: fringes to Torun and Arecibo; first EVN eVLBI science session • January 2005: first “dedicated light-path” eVLBI: ?? Gbyte of data from the Huygens descent transferred from Australia to JIVE at ~450 Mb/s
20 December 2004 • Connection of JBO to Manchester by 2 × 1 GE • eVLBI tests between Poland, Sweden, the UK and the Netherlands at 256 Mb/s • February 2005 • TCP and UDP memory-to-memory tests at rates up to 450 Mb/s (TCP) and 650 Mb/s (UDP) • Tests showed inconsistencies between Red Hat kernels; rates of only 128 Mb/s obtained on 10 Feb • Haystack (US) – Onsala (Sweden) runs at 256 Mb/s • 11 March 2005: science demo • JBO telescope winded off (high winds); a short run on a calibrator source was done
Summary of EVN eVLBI tests • Regular tests with eVLBI Mk5 data every ~6 weeks • 128 Mbps OK; 256 Mbps often; 512 Mbps Onsala–JIVE occasionally • But not JBO at 512 Mbps – WHY NOT? (NB using jumbo packets, 4470 or 9000 bytes) • Note the correlator can cope with large error rates, up to ~1% • But high throughput is needed for sensitivity • Implications for protocols, since TCP throughput is very sensitive to packet loss (see the estimate below)
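How sensitive? The standard Mathis et al. estimate for steady-state TCP throughput, rate ≈ (MSS/RTT) · √(3/2) / √p, makes the point. A quick back-of-envelope calculation (packet size and RTT taken from the JBO iperf figures later in the talk; the loss rates are illustrative):

```python
from math import sqrt

def mathis_rate_bps(mss_bytes, rtt_s, loss):
    """Mathis et al. (1997) steady-state TCP throughput estimate, in bit/s.
    Ignores the line-rate cap, so low-loss results can exceed 1 Gbit/s."""
    return (mss_bytes * 8 / rtt_s) * sqrt(1.5) / sqrt(loss)

MSS = 4420    # packet size quoted for the JBO iperf tests later in the talk
RTT = 0.015   # ~15 ms JBO-JIVE round-trip time, also quoted later

for loss in (1e-6, 1e-4, 1e-2):
    print(f"loss {loss:.0e}: ~{mathis_rate_bps(MSS, RTT, loss) / 1e6:.0f} Mbit/s")
```

At the ~1% loss the correlator itself can tolerate, a single TCP stream on the JBO–JIVE path would carry only ~30 Mbit/s: hence the interest in loss-tolerant, rate-controlled UDP transport.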
UDP Throughput Oct–Nov 2003, Manchester–Dwingeloo production network • UDPmon measurements: throughput vs packet spacing, packet loss, sender and receiver CPU kernel load • Manchester: 2.0 GHz Xeon; Dwingeloo: 1.2 GHz PIII • Near wire rate, 950 Mbps • 4th-year project: Adam Mathews, Steve O’Toole
ESLEA • Packet loss causes low throughput in TCP/IP • Congestion will result in routers dropping packets: use switched light paths! • Tests with the MB-NG network, Jan–Jun 05 • JBO connected to JIVE via UKLight in June (thanks to John Graham, UKERNA) • Comparison tests between the UKLight connection JBO–JIVE and the production network (SJ4 + GÉANT)
[Slide: ESLEA project partners and collaborators, including CCLRC (The Council for the Central Laboratory of the Research Councils)] • £1.1M, 11.5 FTE • Funded by EPSRC GR/T04465/01 • www.eslea.uklight.ac.uk
Tests on the UKLight switched light-path, Manchester–Dwingeloo • Throughput as a function of inter-packet spacing (2.4 GHz dual-Xeon machines) • Packet loss only for small packet sizes • Maximum-size packets reach full line rate with no loss, and there was no re-ordering (plot not shown)
Tests on the production network, Manchester–Dwingeloo • Throughput • Small (0.2%) packet loss was seen • Re-ordering of packets was significant
e-VLBI at the GÉANT2 Launch, Jun 2005 [Map: Jodrell Bank (UK), Medicina (Italy) and Torun (Poland) connected to Dwingeloo via DWDM links]
UDP Performance: 3 Flows on GÉANT • Throughput: 5-hour run, 1500-byte MTU • Jodrell → JIVE: 2.0 GHz dual Xeon to 2.4 GHz dual Xeon, 670–840 Mbit/s • Medicina (Bologna) → JIVE: 800 MHz PIII to Mk5 (623) 1.2 GHz PIII, 330 Mbit/s, limited by the sending PC • Torun → JIVE: 2.4 GHz dual Xeon to Mk5 (575) 1.2 GHz PIII, 245–325 Mbit/s, limited by security policing (>600 Mbit/s → 20 Mbit/s)? • Throughput over a 50-min period shows a ~17-min periodicity
18-Hour Flows on UKLight, Jodrell – JIVE, 26 June 2005 • Throughput: Jodrell → JIVE, 2.4 GHz dual Xeon to 2.4 GHz dual Xeon, 960–980 Mbit/s • Traffic routed through SURFnet • Packet loss: only 3 groups with 10–150 lost packets each; no packets lost the rest of the time • Packet re-ordering: none
Recent Results 1: • iGRID 2005 and SC 2005: global eVLBI demonstration • Achieved 1.5 Gbps across the Atlantic using UKLight • 3 VC-3-13c (~700 Mbps) SDH links carrying data across the Atlantic from the Onsala, JBO and Westerbork telescopes • 512 Mbps K4–Mk5 data from Japan to the USA • 512 Mbps Mk5 real-time interferometry between the Onsala, Westford and Maryland Point antennas, correlated at Haystack Observatory • Used VLSR technology from the DRAGON project in the US to set up light paths
[Photos: JBO Mk2, Westerbork array, Onsala 20-m, Kashima 34-m]
Recent Results 2: • Why can Onsala achieve 512 Mbps from Mk5 to Mk5, even transatlantic? • Its Mk5 is identical to JBO’s, and the link is longer • iperf TCP, JBO Mk5 to Manchester: rtt ~1 ms, 4420-byte packets, 960 Mbps • iperf TCP, JBO Mk5 to JIVE: rtt ~15 ms, 4420-byte packets, 777 Mbps • Not much wrong with the networks! (see the window-size check below) • CPU monitoring shows one Mk5 at 94.7% kernel usage and 1.5% idle, the other at 96.3% kernel usage and 0.06% idle – no CPU left! • Likelihood is that the Onsala Mk5 has a marginally faster CPU – right at the critical point for 512 Mbps transmission • Solution: better motherboards for the Mk5s – about 40 machines to upgrade!
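A sanity check on those iperf numbers (my note, not from the talk): sustaining rate R over a path of round-trip time T needs a TCP window of at least the bandwidth-delay product R × T. A quick calculation with the rates and RTTs from the bullets above:

```python
def bdp_bytes(rate_mbps, rtt_ms):
    """Bandwidth-delay product: the minimum TCP window needed to fill the pipe."""
    return rate_mbps * 1e6 / 8 * (rtt_ms / 1e3)

print(f"JBO-Manchester (960 Mbit/s, ~1 ms):  {bdp_bytes(960, 1) / 1e3:.0f} kB")
print(f"JBO-JIVE       (777 Mbit/s, ~15 ms): {bdp_bytes(777, 15) / 1e6:.2f} MB")
```

The JIVE path needs roughly 1.5 MB of socket buffer; with that configured, the loss-free path delivers close to line rate, consistent with “not much wrong with the networks”.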
The Future: • Regular eVLBI tests in the EVN continue • Testing the Mk5 StreamStor interface <-> network interaction • Test upgraded Mk5 recording devices • Investigate alternatives to TCP/UDP: DCCP, vlbiUDP, tsunami, etc. • ESLEA comparing UKLight with the production network • EU’s EXPReS eVLBI project starts March 2006 • Connection of the 100-m Effelsberg telescope in 2006 • Protocols for distributed processing • Onsala–JBO correlator test link at 4 Gbps in 2007 • eVLBI will become routine in 2006!
VLBI Correlation: a GRID Computation task [Diagram: controller/data concentrator feeding processing nodes]