Physical Buildout of the OptIPuter at UCSD
What Speeds and Feeds Have Been Deployed Over the Last 10 Years
[Figure: Performance per Dollar Spent. Axes: Doublings (ticks 7, 10, 13) and Number of Years (0 to 10). Series: Endpoint Speed, Uplink Speed. Labels: 10Mb, 1000Mb, 10000Mb, DWDM Capability (16 - 32 x 10000Mb), Wiglaf, Rockstar, OptIPuter Infrastructure. Source: Scientific American, January 2001]
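One way to read the "Doublings" axis is as the base-2 logarithm of the speed-up over the 10Mb starting point. The sketch below assumes that reading of the chart; the function name `doublings` and the 10Mb baseline default are ours, not taken from the slide.

```python
import math

def doublings(speed_mb: float, baseline_mb: float = 10.0) -> float:
    """Number of times baseline_mb must double to reach speed_mb."""
    return math.log2(speed_mb / baseline_mb)

# Endpoint speeds labeled on the chart, in Mb.
for label, speed in [("1000Mb", 1000), ("10000Mb", 10000)]:
    print(f"{label}: {doublings(speed):.1f} doublings")
```

Under this assumption, 1000Mb and 10000Mb land at roughly 7 and 10 doublings, which is close to two of the axis ticks on the chart.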
UCSD is Prototyping a Campus-Scale OptIPuter
[Figure: The UCSD OptIPuter Deployment. Dedicated fibers between sites link Linux clusters. Equipment labels: Juniper T320 (0.320 Tbps backplane bandwidth), Cisco 6509, Chiaro Estara, 8 – 10GigE. External connection: to CENIC and NLR. Sites: SDSC, SDSC Annex, Calit2, Preuss High School, JSOE Engineering, CRCA, SOM Medicine, 6th College, Phys. Sci - Keck, Collocation Node M, Earth Sciences, SIO. Scale: ½ mile. Source: Phil Papadopoulos, SDSC; Greg Hidley, Cal-(IT)2]
A Different Kind of Experimental Infrastructure • UCSD Campus Infrastructure • A campus-wide experimental apparatus • Different kinds of cluster endpoints (scaling in the usual dimensions) • Compute • Storage • Visualization • 300+ nodes available for experimentation (ia32, Opteron, Linux) • 7 different labs • Clusters and network can be allocated and configured by the researcher at the lowest level • Machine SW configuration: OS (kernel, networking modules, etc.), middleware, OptIPuter system software, application software • Root access given to researchers when needed • As close to chaos as we can get • Networks • Packet-oriented network. 10 Gbps/site. Multiple 10GigE where needed • Adding lambda capability (Quartzite: Research Instrumentation Award)
What’s Coming Soon? • 10 GigE Switching • Force10 E1200. Initially with sixteen 10GigE connections • Expansion is $6K/port + optics ($2K for grey, $5K for DWDM) • Line cards and grey optics are here. Awaiting chassis • Force10 S50 Edge Switches • 48-port GigE + two 10GigE uplinks ~ $10K with grey optics • 10 GigE NICs • Neterion • PCI-X (Intel OEM) with XFP (just received) • Myrinet 10G (PCI Express): ready to place order • DWDM • On order: four 10GigE XFPs, 40 km, channels 31 and 32 (2 each). • Delayed: expect arrival in March (sigh). • Following NASA’s lead on the DWDM hardware (very good results on Dragon) • Arrived: two 8-channel Mux/DeMux from Finisar • DWDM Switching • Expect wavelength-selective switch this summer.
What’s Changing II • “Center Switching Complex” moving to Calit2 • Should be done by end of March • A modest number of endpoints for OptIPuter research will be added • A larger number of “production” resources (e.g. CAMERA) will be added • Increasing emphasis on longer-haul connections • Connections to UCI
Quartzite: Reconfigurable Networking • NSF Research Instrumentation, Papadopoulos, PI • Packet network is great • Give me bigger and faster of what I already know • Even though TCP is challenged on big pipes • What about lambdas? And switching lambdas? • Existing Fiber Plant is fixed. • Want to Experiment with different topologies? -> “buy” a telecom worker to reconnect cables as needed • Quartzite: Research Instrumentation Award (Started 15 Sep) • Hybrid Network “Switch stack” at our Collocation Point • Packet Switch • Transparent Optical Switch • Allows us to physically build new topologies without physical rewiring • Wavelength-Selective Switch • Experimental device from Lucent
Quartzite: DWDM • $10K/switch • $5K/XFP + $2K/Channel (Mux/demux) = $14K/Connected Pair • Cheap uncooled lasers • 0W (passive) optical splitters/combiners • 0.8nm spacing for DWDM • 1GigE, 10GigE bonded or separate • Single fiber pair • www.aurora.com • www.optoway.com • www.fibredyne.com
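The $14K figure works out if each end of a connection needs its own XFP and its own mux/demux channel, so a connected pair costs 2 x ($5K + $2K). The sketch below assumes that per-end reading; the constant and function names are illustrative, not from the slides.

```python
XFP_COST = 5_000      # DWDM XFP optic, assumed to be needed at each end
CHANNEL_COST = 2_000  # mux/demux cost per channel, assumed per end

def connected_pair_cost(pairs: int = 1) -> int:
    """Cost in dollars of `pairs` DWDM-connected pairs (two ends per pair)."""
    per_end = XFP_COST + CHANNEL_COST
    return 2 * per_end * pairs

print(connected_pair_cost())   # 14000
print(connected_pair_cost(8))  # 112000
```

The second call prices eight such pairs, which is only an illustration of scaling the per-pair figure and not a budget stated in the talk.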
UCSD Quartzite Core at Completion (Year 5 of OptIPuter) • Funded 15 Sep 2004 • Physical HW to enable OptIPuter and other campus networking research • Hybrid network instrument • Reconfigurable network and endpoints
Scalable and Automated Network Mapping for OptIPuter/Quartzite Network. OptIPuter AHM Meeting, San Diego, CA, January 17, 2006. Praveen Jagadishprasad and Hassan Elmadi (Calit2, UCSD); Phil Papadopoulos and Mason Katz (SDSC).
Motivation • Management • Inventory • Troubleshooting • Programming the network • Ability to view and manipulate the network as a single entity. • Aid network reconfiguration in a heterogeneous network • Experimental networks have a high degree of reconfiguration • Glimmerglass-based physical changes • VLAN-based logical topology changes • Final goal is to automate the reconfiguration process. • Focus on switch/router configuration process
Automated Discovery • Minimal input needed. • One gateway might be sufficient • SNMP-based discovery • Not tied to a vendor protocol • Tested with Cisco, HP, Dell, Extreme, etc. • Almost all major vendors support SNMP • Fast • Discovery process is highly threaded • 3 minutes for the UCSD OptIPuter network (~600 hosts and 20 switches) • Framework-based • Extensible to include MIBs for specific switch/router models. For example • Cisco VLANs • Extreme trunking
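The combination of a single seed gateway and a highly threaded discovery process can be sketched as a breadth-first crawl in which every device on the current frontier is probed in parallel. This is a minimal illustration, not the authors' implementation: `FAKE_NETWORK` and `query_neighbors` are hypothetical stand-ins for real SNMP queries, which are not shown here.

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical stand-in for the network: device -> neighbors it reports.
FAKE_NETWORK = {
    "gateway": ["switch-a", "switch-b"],
    "switch-a": ["gateway", "host-1", "host-2"],
    "switch-b": ["gateway", "switch-a", "host-3"],
    "host-1": [],
    "host-2": [],
    "host-3": [],
}

def query_neighbors(device):
    """Placeholder for an SNMP query; here it only reads FAKE_NETWORK."""
    return FAKE_NETWORK.get(device, [])

def discover(seed, max_workers=16):
    """Breadth-first discovery from one seed, probing each frontier in parallel."""
    seen = {seed}
    frontier = [seed]
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        while frontier:
            next_frontier = []
            # pool.map runs the probes concurrently and yields results in order.
            for neighbors in pool.map(query_neighbors, frontier):
                for name in neighbors:
                    if name not in seen:
                        seen.add(name)
                        next_frontier.append(name)
            frontier = next_frontier
    return seen

print(sorted(discover("gateway")))
```

Threads suit this workload because each probe spends most of its time waiting on the network, so many probes can be outstanding at once.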
Design for discovery and mapping • Phase 1 (Layer 3) • Router discovery • Subnet discovery • Phase 2 (Layer 2) • Switch discovery • Host discovery • Switch <---> Host mapping • IP ARP mapping • Phase 3 • Network mapping • Form integrated map through novel algorithms • Area of research • Phase 4 • Web-based Viz • Database storage
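The Phase 2 switch-to-host mapping can be illustrated by joining an ARP table (IP to MAC) against per-switch forwarding tables (MAC to port). The slides do not describe how the mapping is computed, so this join is an assumption based on a common technique; all addresses, switch names, and the `map_hosts` helper are hypothetical.

```python
# Hypothetical sample data; real values would come from SNMP queries.
arp_table = {  # IP -> MAC, as learned from a router
    "10.0.0.11": "00:11:22:33:44:01",
    "10.0.0.12": "00:11:22:33:44:02",
    "10.0.0.13": "00:11:22:33:44:03",
}
forwarding_tables = {  # switch -> {MAC: port}
    "switch-a": {"00:11:22:33:44:01": 1, "00:11:22:33:44:02": 2,
                 "00:11:22:33:44:03": 48},
    "switch-b": {"00:11:22:33:44:03": 5, "00:11:22:33:44:01": 48,
                 "00:11:22:33:44:02": 48},
}
uplink_ports = {"switch-a": {48}, "switch-b": {48}}  # ports facing other switches

def map_hosts(arp, fdb, uplinks):
    """Return {ip: (switch, port)} for hosts seen on a non-uplink (edge) port."""
    mapping = {}
    for ip, mac in arp.items():
        for switch, table in fdb.items():
            port = table.get(mac)
            # A MAC seen on an uplink port lives behind another switch.
            if port is not None and port not in uplinks[switch]:
                mapping[ip] = (switch, port)
    return mapping

mapping = map_hosts(arp_table, forwarding_tables, uplink_ports)
for ip, (switch, port) in sorted(mapping.items()):
    print(f"{ip} -> {switch} port {port}")
```

Filtering out uplink ports matters because every switch on the path learns a host's MAC, but only the edge switch sees it on a host-facing port.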
Future work • Reliable discovery of logical topology (VLANs) • Automate generation of switch/router configs • Use physical topology information to aid config generation • Fixed templates for each switch/router model • Templates are extended depending on the configuration needed • Batch configuration of switches/routers • Support custom VLANs with only end-host specification • Constructing a spanning tree of end hosts and intermediate switches/routers • Schedule dependencies for step-by-step configuration • Physical topology information essential
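To show why physical topology information is essential for building a VLAN from an end-host list alone, the sketch below finds the switches and links that would need configuring by taking the union of shortest paths from the first host to each other host. This is one simple approach, not the planned algorithm; the topology and all names are hypothetical, and the resulting tree is not guaranteed to be minimal.

```python
from collections import deque

# Hypothetical physical topology: node -> directly connected nodes.
TOPOLOGY = {
    "host-1": ["edge-a"],
    "host-2": ["edge-b"],
    "host-3": ["edge-b"],
    "edge-a": ["host-1", "core"],
    "edge-b": ["host-2", "host-3", "core"],
    "edge-c": ["core"],
    "core": ["edge-a", "edge-b", "edge-c"],
}

def shortest_path(graph, src, dst):
    """Breadth-first search; returns the node list from src to dst, or None."""
    parent = {src: None}
    queue = deque([src])
    while queue:
        node = queue.popleft()
        if node == dst:
            path = []
            while node is not None:
                path.append(node)
                node = parent[node]
            return path[::-1]
        for nxt in graph[node]:
            if nxt not in parent:
                parent[nxt] = node
                queue.append(nxt)
    return None

def vlan_links(graph, hosts):
    """Union of links on shortest paths from the first host to every other host."""
    links = set()
    root = hosts[0]
    for host in hosts[1:]:
        path = shortest_path(graph, root, host)
        if path is None:
            raise ValueError(f"{host} is unreachable from {root}")
        links.update(frozenset(pair) for pair in zip(path, path[1:]))
    return links

links = vlan_links(TOPOLOGY, ["host-1", "host-2"])
switches = sorted({n for link in links for n in link if not n.startswith("host-")})
print(switches)  # ['core', 'edge-a', 'edge-b']
```

Switches off the path (here `edge-c`) are left untouched, which is the point of deriving the configuration set from the topology rather than configuring every device.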
OptIPuter Network Inventory Management – Logical View • Logical topology adds a VLAN table to the physical topology tables. • A VLAN is composed of trunks. • Each trunk can be a single or multiple port-to-port connection between the same set of switches • Schema supports retaining VLAN id when modifying trunks and vice versa. [Figure: logical topology graph (single VLAN)]
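A schema with these properties can be sketched by keeping VLANs, trunks, and the port-to-port links inside a trunk in separate tables, joined through an association table. The actual schema is not given in the slides, so every table and column name below is an assumption; SQLite is used only to keep the example self-contained.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE vlan  (vlan_id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE trunk (trunk_id INTEGER PRIMARY KEY, switch_a TEXT, switch_b TEXT);
-- One row per port-to-port connection inside a trunk.
CREATE TABLE trunk_link (
    trunk_id INTEGER REFERENCES trunk(trunk_id),
    port_a INTEGER,
    port_b INTEGER
);
-- Which trunks make up which VLAN.
CREATE TABLE vlan_trunk (
    vlan_id INTEGER REFERENCES vlan(vlan_id),
    trunk_id INTEGER REFERENCES trunk(trunk_id)
);
""")
conn.execute("INSERT INTO vlan VALUES (100, 'experiment')")
conn.execute("INSERT INTO trunk VALUES (1, 'edge-a', 'core')")
conn.executemany("INSERT INTO trunk_link VALUES (1, ?, ?)", [(47, 1), (48, 2)])
conn.execute("INSERT INTO vlan_trunk VALUES (100, 1)")

# Modify the trunk by dropping one port pair; the VLAN row and id are untouched.
conn.execute("DELETE FROM trunk_link WHERE trunk_id = 1 AND port_a = 48")

rows = conn.execute("""
SELECT v.vlan_id, t.switch_a, t.switch_b, COUNT(l.port_a)
FROM vlan v
JOIN vlan_trunk vt ON vt.vlan_id = v.vlan_id
JOIN trunk t ON t.trunk_id = vt.trunk_id
JOIN trunk_link l ON l.trunk_id = t.trunk_id
GROUP BY v.vlan_id, t.trunk_id
""").fetchall()
print(rows)  # [(100, 'edge-a', 'core', 1)]
```

Because the VLAN id lives in its own table, a trunk can gain or lose port pairs, or a VLAN can be renumbered, without rewriting the other side of the relationship.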
Look at Parallel Data Serving • 128 node Rockstar Cluster (Same as SC2003 Build) • 1 SCSI Drive/File Server Node • [Figure: four boxes labeled "8 Lustre Clients" and eight boxes labeled "10 Lustre File Servers", attached to four 48-port GigE switches, each with a 10GigE uplink]
Basic Performance • 4, 8, 16, and 32 nodes reading the same 32 GB file • Under these ideal circumstances, able to read more than 1.4 GB/sec from disk • Writing a different 10 GB file from each node: about 700 MB/s
Why a Hybrid Structure • Create different physical topologies quickly • Change whether a site/node is connected via packet, lambda, or a hybrid combination • Want to understand the practical challenges in different circumstances • Circuits don’t scale in the Internet sense • Packet switches will be congested for long-haul • Real QoS is unreachable in the ossified Internet • The engineering compromise is likely a hybrid network • Packet paths always exist (Internet scalability argument) • Circuit paths on demand • Think private high-speed networks, not just point-to-point
Summary • OptIPuter is addressing a subset of the research needed for figuring out how to waste (I mean utilize) bandwidth • Work at multiple levels of the software stack – protocols, virtual machine construction, storage retrieval • Trying to understand how lambdas are presented to applications • Explicit? • Hidden? • Hybrid? • Building an experimental infrastructure as large as our budget will allow • OptIPuter is already international in scale at 10 gigabit. • Approximating the Terabit Campus with Quartzite