Richard Cavanaugh, University of Florida
HENP SIG, Spring Internet2 Meeting
Outline
• The Project
• Science Drivers
• UltraLight Network
• Grid-enabled Analysis
• Summary
The Project
• UltraLight is
  • A four-year, $2M NSF ITR funded by MPS
  • Application-driven network R&D
• Two primary, synergistic activities
  • Network "Backbone": perform network R&D / engineering
  • Applications "Driver": system services R&D / engineering
• Ultimate goal: enable physics analysis and discoveries which could not otherwise be achieved
Partnership (NSF/PHY-EPP): Caltech, UF, UM, FIU, FNAL, SLAC et al.
• A new class of integrated information systems
  • Includes the network as an actively managed resource
• Based on a "hybrid" packet-switched and circuit-switched optical network infrastructure
  • Ultrascale protocols (e.g. FAST) and dynamic optical paths
• Monitor, manage and optimize the use of the network and Grid systems in real time
  • Using a set of agent-based intelligent global services
• Built on top of an already-existing, developing software infrastructure in round-the-clock operation:
  • MonALISA, GEMS, GAE/Clarens
• Exceptional support from the Cisco Advanced Research and Technology Infrastructure (ARTI) group & Calient
• Exceptional NLR, CENIC, Internet2/Abilene, ESnet support
UltraLight Activities
TEAMS: physicists, computer scientists, network engineers
• High Energy Physics application services
  • Integrate and develop physics applications into the UltraLight fabric: production codes, Grid-enabled analysis, user interfaces to the fabric
• eVLBI application services
  • Integrate and develop eVLBI applications into the UltraLight fabric: specific protocols and bandwidth-management techniques
• Global system services
  • Critical "upperware" software components in the UltraLight fabric: monitoring, scheduling, agent-based services, etc.
• Network engineering
  • Routing, switching, dynamic path construction, operations, management
• Testbed deployment and operations
  • Including optical network, compute cluster, storage, kernel and UltraLight system software configurations
• Education and outreach
Evolving Quantitative Science Requirements for Networks (DOE High Performance Network Workshop)
History of Bandwidth Usage: One Large Network, One Large Research Site
• ESnet accepted traffic 1990–2004 (W. Johnston): exponential growth since '92; the annual growth rate increased from 1.7x to 2.0x per year in the last 5 years
• SLAC traffic ~400 Mbps (L. Cottrell); growth in steps (ESnet limit): ~10x per 4 years
• July 2005: 2 x 10 Gbps links, one for production and one for research
• Projected: ~2 Terabits/s by ~2014 (a quick back-of-the-envelope look at such growth rates follows below)
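The projection above is a simple compound-growth extrapolation. As an illustration, the short Python sketch below applies the two annual growth factors quoted on the slide (1.7x and 2.0x) over a ten-year span; the baseline value, year and units are hypothetical placeholders, since the actual chart data is not reproduced here.

# A minimal sketch of the exponential-growth extrapolation behind the slide's
# projection.  The baseline value and year are illustrative placeholders, not
# figures taken from the original ESnet chart.

def project_traffic(baseline, baseline_year, target_year, annual_factor):
    """Extrapolate traffic assuming a fixed year-on-year growth factor."""
    years = target_year - baseline_year
    return baseline * annual_factor ** years

if __name__ == "__main__":
    # Hypothetical baseline: 10 (arbitrary units) in 2004.
    for factor in (1.7, 2.0):   # the two annual growth rates quoted on the slide
        projected = project_traffic(10, 2004, 2014, factor)
        print(f"growth {factor}x/year -> {projected:,.0f} units by 2014 "
              f"({projected / 10:,.0f}x the 2004 level)")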
101 Gigabit Per Second Mark!
• 2.95 Gbps to/from Rio de Janeiro
• 1.0 Gbps to Seoul
• Reached 101 Gbps at the end of the demo; unstable end-systems noted
Source: SC04 Bandwidth Challenge committee
MonALISA: SC04 Monitoring
• Monitoring SCinet, NLR, Abilene, LHCNet, CHEPREO, international NRNs and 9000 Grid nodes simultaneously
The UltraLight Network
• UltraLight has a non-standard core network with dynamic links and varying bandwidth interconnecting our nodes
  • An optical hybrid global network
• The core of UltraLight will dynamically evolve with available resources on other backbones, such as NLR, HOPI, Abilene or ESnet
• The main resources for UltraLight:
  • LHCNet (IP, L2VPN, CCC)
  • Abilene (IP, L2VPN)
  • ESnet (IP, L2VPN)
  • Cisco NLR wave (Ethernet)
  • HOPI NLR waves (Ethernet; provisioned on demand)
• UltraLight nodes: Caltech, SLAC, FNAL, UF, UM, StarLight, CENIC PoP at LA, CERN
UltraLight Network Engineering
• GOAL: determine an effective mix of bandwidth-management techniques for the LHC application space, particularly:
  • Best-effort and "scavenger" traffic using "effective" protocols
  • MPLS with QoS-enabled packet switching
  • Dedicated paths arranged with TL1 commands, GMPLS
• PLAN: develop and test the most cost-effective combination of network technologies:
  • Exercise UltraLight applications on NLR, Abilene and campus networks, as well as LHCNet and our international partners
  • Progressively enhance Abilene with QoS support to protect production traffic
  • Incorporate emerging NLR and RON-based lightpath and lambda facilities
  • Deploy and study ultrascale protocol stacks (such as FAST), addressing issues of performance and fairness
  • Use MPLS/QoS and other forms of bandwidth management, and adjustments of optical paths, to optimize end-to-end performance among a set of virtualized disk servers (a simple illustration of such a traffic-class decision follows below)
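As an illustration of the bandwidth-management mix described above, the sketch below maps a transfer request onto one of the three classes (best-effort/scavenger, MPLS/QoS, dedicated lightpath) from the required average rate. The thresholds, class names and data structure are illustrative assumptions, not UltraLight policy.

# A minimal, hypothetical sketch of the kind of decision the slide describes:
# matching a data transfer to one of the three bandwidth-management classes.
# Thresholds and class names are illustrative assumptions, not UltraLight policy.

from dataclasses import dataclass

@dataclass
class TransferRequest:
    size_gb: float        # volume to move
    deadline_s: float     # time available

def choose_service_class(req: TransferRequest) -> str:
    required_gbps = req.size_gb * 8 / req.deadline_s
    if required_gbps < 1:
        return "best-effort / scavenger"          # ordinary IP path is enough
    if required_gbps < 5:
        return "MPLS path with QoS guarantees"    # protect against congestion
    return "dedicated lightpath (TL1 / GMPLS)"    # only a circuit can meet it

print(choose_service_class(TransferRequest(size_gb=500, deadline_s=3600)))
# -> roughly 1.1 Gbps needed, so an MPLS/QoS path in this toy policy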
UltraLight Network Infrastructure Elements
• Trans-US 10G λs riding on NLR, plus CENIC, FLR, MiLR
  • LA - CHI (2 waves): HOPI and Cisco Research waves
  • CHI - JAX (Florida Lambda Rail)
• Dark fiber Caltech - L.A.: 2 x 10G waves (one to WAN In Lab); 10G wave L.A. to Sunnyvale for the UltraScience Net connection
• Dark fiber with 10G wave: StarLight - Fermilab
• Dedicated wave StarLight - Michigan Light Rail
• SLAC: ESnet MAN to provide 2 x 10G links (from July): one for production and one for research
• Partner with advanced research & production networks
  • LHCNet (StarLight - CERN), Abilene/HOPI, ESnet, NetherLight, GLIF, UKLight, CA*net4
  • Intercontinental extensions: Brazil (CHEPREO/WHREN), GLORIAD, Tokyo, AARNet, Taiwan, China
UltraLight Testbed
Now to the driving application work…
Grid Analysis Environment
UltraLight Analysis Architecture
• Analysis clients talk standard protocols (HTTP, SOAP, XML-RPC) to a "Grid Services Web Server"
• A simple Web service API allows simple or complex analysis clients
• Typical clients: ROOT, Web browser, …
• The Clarens portal hides complexity
  • Discovery, ACL management, certificate-based access
• Key features: global scheduler, catalogs, monitoring, Grid-wide execution service
• Server-side components (from the architecture diagram): scheduler, catalogs, metadata, virtual data, replica, fully-abstract / partially-abstract / fully-concrete planners, applications, data management, monitoring, execution priority manager, Grid-wide execution service
• (A minimal client-side sketch follows below)
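To make the "thin client, standard protocols" point concrete, here is a minimal sketch of an analysis client calling a Clarens-style Grid Services Web Server over XML-RPC from Python. The endpoint and the catalog/execution method names are hypothetical; only the XML-RPC mechanics (and the optional system.listMethods introspection call) are standard.

# A minimal sketch of a thin analysis client talking XML-RPC to a Clarens-style
# "Grid Services Web Server".  The host name and the catalog/execution method
# names are hypothetical; a real Clarens deployment exposes its own namespaces.

import xmlrpc.client

# Hypothetical endpoint for a Grid Services Web Server.
server = xmlrpc.client.ServerProxy("https://gridserver.example.org/clarens")

# XML-RPC introspection (supported by many servers) lists the available calls.
print(server.system.listMethods())

# Hypothetical catalog query and job submission, illustrating the pattern of
# simple remote calls hidden behind the portal.
datasets = server.catalog.find("dataset LIKE 'Higgs*'")
job_id = server.execution.submit({"dataset": datasets[0], "executable": "analysis.C"})
print("submitted job", job_id)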
Peer-to-Peer System
• Allow a "peer-to-peer" configuration to be built, with associated robustness and scalability features:
  • Discovery of services
  • No single point of failure
  • Robust file download (a toy failover sketch follows below)
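A toy sketch of the robustness claim: the client asks peers in turn until one answers, so no single peer is a point of failure. Peer names and the lookup call are invented for illustration.

# A toy sketch of peer-to-peer service discovery with failover.
# Peer addresses and the discovery call are hypothetical.

import random

PEERS = ["peer1.example.org", "peer2.example.org", "peer3.example.org"]

def discover(service_name, peers=PEERS):
    """Try peers in random order; return the first advertisement found."""
    for peer in random.sample(peers, len(peers)):
        try:
            return query_peer(peer, service_name)   # hypothetical remote call
        except ConnectionError:
            continue                                # peer down: try the next one
    raise LookupError(f"no peer could locate {service_name}")

def query_peer(peer, service_name):
    # Stand-in for a real Clarens/XML-RPC lookup; here it simply pretends
    # peer1 is unreachable, to show the failover path.
    if peer == "peer1.example.org":
        raise ConnectionError
    return {"service": service_name, "endpoint": f"https://{peer}/{service_name}"}

print(discover("file-catalog"))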
UltraLight Application Services: Make UltraLight Available to Physics Applications & Their Environments
• Application frameworks augmented to interact effectively with the Global Services (GS)
• GS interact in turn with the storage access & local execution service layers
• Applications provide hints to high-level services about their requirements
• Interfaced also to managed network and storage services
• Allows effective caching and pre-fetching; opportunities for global and local optimization of throughput
GAE and UltraLight: Make the Network an Integrated Managed Resource
• Unpredictable multi-user analysis
• Overall demand typically fills the capacity of the resources
• Real-time monitoring systems for networks, storage, computing resources, …: end-to-end (E2E) monitoring
• Elements (from the diagram): application interfaces, request planning, network planning, monitoring, network resources
• Support data transfers ranging from the (predictable) movement of large-scale (simulated and real) data, to the highly dynamic analysis tasks initiated by rapidly changing teams of scientists
MonALISA: Monitoring Agents using a Large Integrated Services Architecture
• Fully distributed, no single point of failure!
• Able to dynamically register & discover; services are self-describing
• Based on a multi-threaded engine; very scalable
• Code updates: automatic & secure
• Dynamic configuration for services; secure admin interface
• Active filter agents: process data, application-specific monitoring (a hedged filter-agent sketch follows below)
• Mobile agents: decision support, global optimisations
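As a hedged illustration of an "active filter agent", the sketch below keeps a sliding window of link-utilisation samples and publishes only a derived average plus an alarm flag. This is not MonALISA code; the class, fields and thresholds are invented.

# A hedged sketch of what an active filter agent does conceptually: it sits in
# the monitoring stream, processes raw values, and emits only the derived,
# application-specific quantities (here, a moving average and an alarm).

from collections import deque

class LinkLoadFilter:
    """Keep a sliding window of link-utilisation samples and flag congestion."""

    def __init__(self, window=10, alarm_threshold=0.9):
        self.samples = deque(maxlen=window)
        self.alarm_threshold = alarm_threshold

    def process(self, utilisation):
        self.samples.append(utilisation)
        average = sum(self.samples) / len(self.samples)
        return {
            "avg_utilisation": average,
            "alarm": average > self.alarm_threshold,
        }

agent = LinkLoadFilter()
for sample in (0.2, 0.5, 0.95, 0.97, 0.99):
    print(agent.process(sample))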
MonALISA Example: Monitoring Network Topology, Latencies, Routers
MonALISA Services Integrated with Network Services
• Dedicated modules to monitor and control optical switches
• Used to control:
  • CALIENT switch @ CIT
  • GLIMMERGLASS switch @ CERN
• ML agent system used to create global paths (a toy path-construction sketch follows below)
• The algorithm can be extended to include prioritisation and pre-allocation
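A toy sketch of the global-path idea: given a switch topology (which a real agent would obtain from monitoring), find an end-to-end sequence of cross-connects. The topology here is invented, and a breadth-first search stands in for the real algorithm, which would also weigh load, priority and pre-allocation.

# A toy sketch of global optical-path construction over a monitored topology.
# The adjacency data is invented for illustration.

from collections import deque

TOPOLOGY = {            # hypothetical switch adjacency
    "CIT":       ["StarLight"],
    "StarLight": ["CIT", "CERN", "FNAL"],
    "FNAL":      ["StarLight"],
    "CERN":      ["StarLight"],
}

def find_path(src, dst, topology=TOPOLOGY):
    """Breadth-first search: returns the shortest hop-count path, or None."""
    queue = deque([[src]])
    seen = {src}
    while queue:
        path = queue.popleft()
        if path[-1] == dst:
            return path
        for nxt in topology[path[-1]]:
            if nxt not in seen:
                seen.add(nxt)
                queue.append(path + [nxt])
    return None

print(find_path("CIT", "CERN"))   # -> ['CIT', 'StarLight', 'CERN']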
(Physics) Analysis on the Grid: Move from Existing Components to a Coherent System
• Catalogs to select datasets; resource & application discovery
• Schedulers guide jobs to resources
• Policies enable "fair" access to resources
• Robust (large) data(set) transfer
• Feedback to users (e.g. status of their jobs)
• Crash recovery of components (identify and restart)
• Provide secure, authorized access to resources and services
• Workflow diagram components: client application, steering, dataset service, discovery, catalogs, planner/scheduler, job submission, execution, storage management, monitor information, data transfer, policy
• UltraLight core: data transfer, planning/scheduling, (sophisticated) policy management at the VO level, integration (a hedged orchestration sketch follows below)
• More sophisticated components & services in years 3-4
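A hedged sketch of how the pieces listed above could be chained into one coherent request: catalog lookup, scheduling, submission, monitoring and a crude resubmit on failure. Every function is a named placeholder, not the actual UltraLight/GAE API.

# A hedged sketch chaining the components into one analysis request.
# All service calls are placeholders so the example runs end to end.

def run_analysis(dataset_query, executable):
    datasets = catalog_lookup(dataset_query)            # catalogs select datasets
    site = schedule(datasets)                           # scheduler guides job to a resource
    job_id = submit(site, datasets, executable)         # job submission / execution
    while (status := monitor(job_id)) == "running":     # feedback to the user
        pass
    if status == "failed":
        job_id = submit(site, datasets, executable)     # crude crash recovery: resubmit
    return job_id

# Placeholder implementations standing in for the real Grid services.
def catalog_lookup(q):       return [f"{q}-file-001"]
def schedule(datasets):      return "UF-Tier2"
def submit(site, ds, exe):   return f"{site}/job/42"
def monitor(job_id):         return "done"

print(run_analysis("Higgs-sim", "analysis.C"))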
Clarens Web-Service Backbone: The Glue Which Holds Everything Together
• X509 cert-based access
• Good performance
• Access control management
• Remote file access
• Dynamic discovery of services on a global scale
• Available in Python and Java
• Easy to install and part of the VDT distribution:
  wget -q -O - http://hepgrid1.caltech.edu/clarens/setup_clump.sh |sh
  export opkg_root=/opt/openpkg
• Interoperability with other web service environments, such as Globus, through SOAP
• Interoperability with MonALISA monitoring
Many GAE Services Integrated with Network Services
• Core Clarens services (including a shell service and remote file access)
• File catalog service (contains dataset information)
• Sphinx scheduler (UFL): service-based scheduler
• Job submission: BOSS (collaboration with INFN)
• ROOT Clarens client
• Caves (UFL): analysis code and command sharing environment
• Steering service: first prototype of a steering service
• Discovery service
• UltraLight focuses on integration and sophisticated automated decisions based on monitoring information
Example: Remote Data File Access via Clarens
Example: Sphinx Scheduling Service
• Functions as a nerve centre
• Data warehouse: policies, account information, grid weather, resource properties and status, request tracking, workflows, etc.
• Applies data-mining methods
• Flexible framework:
  • Client (request/job submission): Clarens Web Service, Grid clients
  • Scheduling service: Clarens Web Service, MonALISA monitoring repository
  • Grid resource: MonALISA monitoring service, Grid services
• (Diagram: Grid clients send requests through the Clarens WS backbone to the recommendation engine; Grid resources report via the MonALISA monitoring backbone. A toy recommendation sketch follows below.)
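A toy sketch of the recommendation-engine idea: rank candidate sites using the kind of data-warehouse inputs the slide lists (resource status, queue "grid weather", policy share). The scoring rule and site numbers are invented; Sphinx's real data-mining logic is more sophisticated.

# A toy recommendation engine ranking sites from a hypothetical monitoring snapshot.

SITES = [   # invented numbers, for illustration only
    {"name": "Caltech", "free_cpus": 120, "queue_wait_s": 300,  "policy_share": 0.4},
    {"name": "UF",      "free_cpus": 60,  "queue_wait_s": 30,   "policy_share": 0.3},
    {"name": "FNAL",    "free_cpus": 500, "queue_wait_s": 1800, "policy_share": 0.3},
]

def score(site, job_cpus):
    if site["free_cpus"] < job_cpus:
        return float("-inf")                      # cannot run here at all
    # Prefer short queues, then weight by the VO's policy share at the site.
    return site["policy_share"] / (1 + site["queue_wait_s"])

def recommend(job_cpus):
    return max(SITES, key=lambda s: score(s, job_cpus))["name"]

print(recommend(job_cpus=100))   # -> 'Caltech' with these invented numbers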
UltraLight Plans
• UltraLight envisions a 4-year program to deliver a new, high-performance, network-integrated infrastructure:
  • Phase I will last 12 months and focus on deploying the initial network infrastructure and bringing up the first services
  • Phase II will last 18 months and concentrate on implementing all the needed services and extending the infrastructure to additional sites (we are entering this phase starting approximately this summer)
  • Phase III will complete UltraLight and last 18 months; the focus will be on a transition to production in support of LHC physics and eVLBI astronomy
Summary
• For many years the Wide Area Network has been the bottleneck; this is no longer the case in many countries
  • Deployment of a data-intensive Grid infrastructure is now possible!
• Recent Internet2 Land Speed Record (I2 LSR) results show for the first time ever that the network can be truly transparent; throughputs are limited by end-hosts
  • The challenge has shifted from getting adequate bandwidth to deploying adequate infrastructure to make effective use of it!
• UltraLight promises to deliver the critical missing component for future eScience: the integrated, managed network
  • A next-generation network and Grid system
  • Extends and augments existing Grid computing infrastructures (currently focused on CPU/storage) to include the network as an integral component