Center for Experimental Research in Computer Systems (CERCS), Spring 2007 IAB Meeting. Karsten Schwan, Calton Pu, Douglas Blough, Sudhakar Yalamanchili. IUCRC CERCS: NSF Industry/University Cooperative Research Center.
Mission
Lead the innovation of new information and computing technologies, construct the interactive information systems of the future, and create the intellectual capital that can advance these technologies and fuel future advances.
[Figure labels: Enterprise; Embedded; Scientific/Grid; Remote access to Information System; Information anytime, anywhere; Timeliness, Quality, Security, Robustness]
Strategic Thrusts - Highlights
• Scientific/Technical Computing -- Dynamic Data Management and
  • GT: IHPCL Laboratory (e.g., new cluster machines, including substantial Intel donations for education and for the multicore computing initiative)
  • DOE: ORNL, Sandia: High Performance I/O initiative; involvement with startups
  • Cisco (MPI and IB QoS); Dell; HP/Intel (Gelato, Itanium donations); IBM (VMM power management, Cell SUR grant); RNet communication processor design
  • News: Multicore Focus (HP, IBM, Intel); Ongoing I/O Initiative; Virtualization in HPC (ORNL, Sandia, UNM)
Strategic Thrusts - Highlights
• Enterprise Computing -- Autonomic/Adaptive and Service-Oriented Systems:
  • IBM, Intel, TCS (autonomic and critical enterprise systems; dynamic content distribution/event-based systems; SOA; virtualization: hypervisor scaleout, I/O virtualization, trusted passages, metering, power management, failure diagnosis and fault containment)
  • HP (deployment and management, system monitoring, risk-based control, stream data mining)
  • Worldspan (runtime behavior detection and QoI, virtualization)
  • LogicBlox (dynamic code generation for efficient data access)
  • Delta, Raytheon (policy/performance robustness, runtime behavior modeling)
  • Cisco, Intel (network data services, heterogeneous multicore, IB network virtualization)
  • News/Outreach: IBM SUR (joint with Ohio State), OSU industry partners, OSU IAB meeting; exploring new links: Benchmark, Earthlink, McKesson, NSF CRI
[Figure labels: Airport LAN; FAA Flight Data; Gate Readers; capture, display, transport, filter, transform; High Performance Computing; Cluster Computing; Real-Time Information Transport; Real-Time Information Processing; Wide-area Transport; Storage; Databases; Scalable Robust Services; Real-time Decision Tools; Optimization; Visualization; Real-time Situation Assessment; Operational Flight Displays; Passenger paging and response; Crew and Equipment Status; Baggage Displays; Baggage Status]
Strategic Thrusts - Highlights
• Embedded Systems/Architecture
  • Boeing (testing, software correctness)
  • Intel (in-vehicle computing and lightweight methods for system virtualization, system-level power management, computer architecture)
  • Motorola (middleware for pervasive and mobile applications)
  • Federal: pervasive applications (transportation, robotics), upcoming Cyberphysical Systems program
  • IBM, Intel (network processors and heterogeneous multicore)
  • Sony (gaming applications)
  • News: Korea program in Embedded Systems, Samsung educational program, Robotics Center liaison, Sensor (OSU) and MobiEMU testbeds
[Figure labels: image quality, end-to-end delay, jitter, loss rate, throughput, response times]
CERCS Personnel
• Faculty
  • Mustaque Ahamad, Mostafa Ammar, Doug Blough, Constantinos Dovrolis, Greg Eisenhauer, Richard Fujimoto, Ada Gavrilovska, Alexander Gray, Mary Jean Harrold, Hsien-Hsin Lee, Wenke Lee, Ling Liu, Gabriel Loh, Pete Manolios, Alex Orso, Henry Owen, Santosh Pande, Milos Prvulovic, Calton Pu, Kishore Ramachandran, Jay Ramanathan (Ohio State), Rajiv Ramnath (Ohio State), George Riley, David Schimmel, Karsten Schwan, Olin Shivers, Matthew Wolf, Hongyuan Zha, Sudhakar Yalamanchili, Ellen Zegura
• Research Staff
  • Steve Ferenci, David Hilley
  • Supported by DARPA, DOE, NSF, (CoC), (ECE)
• Associated Faculty/Researchers
  • David Bader, Tucker Balch (Robotics), Patrick Bridges (UNM), Robert Butera, Steve DeWeerth, Irfan Essa, Phil Hutto, Byron Jeff (Clayton State), Scott Klasky (ORNL), Kang Li, Sung Kyu Lim, Arthur Maccabe (UNM), Vincent Mooney, Jeff Nichols (ORNL), Krishna Palem, Kalyan Perumalla (ORNL), Jeff Vetter (ORNL), Patrick Widener (UNM)
Industrial Relations
• IUCRC CERCS Center
  • Contributors (GT): Boeing, Cisco, Delta, DOE, HP, IBM, Intel, LogicBlox, TCS, Worldspan
  • Industry Workshops and Industrial Advisory Board
  • Joint initiatives, e.g., TIE grant with UFL, expansion to Ohio State (joint curriculum/facility efforts), planned expansion to UNM
• Internship Program
  • Amazon, ATT, Cisco, Delta, (DoCoMo), DOE, Google, HP, IBM, Intel, Microsoft, Motorola, NetApp, Radisys, TCS, VMware, Worldspan
• Evolving relationships:
  • ATT, DoCoMo, Microsoft, Motorola, NetApp, Netronome, Raytheon, RNet, VMware, Xilinx
Overview - Current Industry Engagements
• One slide per ongoing project
• Federal projects (e.g., joint work with DOE ORNL) are elided, so few HPC efforts are described
• Same order: HPC, Enterprise, Embedded
HPC: Cisco - Infiniband-based Research
Gavrilovska, Schwan, Wolf
• Mechanisms for delivering end-to-end QoS levels in challenging settings:
  • Multi-core nature of future HPC nodes
  • I/O limitations in high-performance infrastructures
  • End-to-end virtualized environments
• Two main efforts currently under investigation:
  • Data virtualization: 'Datatap' mechanism on top of the low-level IB verb interface to (1) extract data from the IB infrastructure, (2) provide middleware mechanisms that support dynamic extensions for service-oriented applications, (3) perform dynamic, resource-aware routing and data distribution to meet application QoS requirements.
  • Platform virtualization: (1) integrate x86-based virtualization solutions into Infiniband settings, (2) develop mechanisms for end-to-end QoS for VM-to-VM interactions through improved, dynamic resource management and scheduling.
• Additional effort: RNET, a high-end NIC for science applications
Enterprise: Elba Project - HP Labs
Calton Pu
• Apply code generation techniques to automate large system deployment, measurement, evaluation, and management
• Collaborative work (1 faculty: Pu; 1 industry: Sahai; 6 PhD students, 4 MS students, 1 undergraduate)
• 8 published papers in 2 years, several more in the pipeline
• Current work (step 4 in the figure): evaluation of a 3-tier benchmark (RUBiS) using generated scripts (millions of lines of deployment, measurement, and analysis scripts)
[Figure: Automated System Mgmt cycle: (1) Design, (2) Code Generation, (3) Deployment, (4) Evaluation & Analysis, (5) Reconfiguration]
Enterprise: Robust Delivery of Quality Data - Worldspan
Karsten Schwan, Mohamed Mansour, Jay Lofstead
• Problem: Complex GDS with potentially unanticipated behaviors
• Example: Variable search times due to caching effects
• Solution: Runtime behavior detection, model construction, and mitigation
• Specific approach: Mitigation via request reordering
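The detect, model, and mitigate loop above can be illustrated with a small sketch. This is not the project's implementation: the `LatencyModel` class, the per-key running mean, and the route-style request keys are all assumptions made for illustration. The idea shown is only that observed service times feed a simple model, and pending requests are reordered so those predicted to be fast (for example, likely cache hits) are served first.

```python
from collections import defaultdict


class LatencyModel:
    """Hypothetical model: running mean of observed service time per request key."""

    def __init__(self, default=1.0):
        self.default = default            # prediction for keys never seen before
        self.total = defaultdict(float)   # summed service time per key
        self.count = defaultdict(int)     # number of observations per key

    def observe(self, key, seconds):
        """Record one measured service time (the runtime behavior detection step)."""
        self.total[key] += seconds
        self.count[key] += 1

    def predict(self, key):
        """Predicted service time: mean of observations, or the default."""
        n = self.count[key]
        return self.total[key] / n if n else self.default


def reorder(pending, model):
    """Mitigation step: serve requests with the shortest predicted time first.

    sorted() is stable, so requests with equal predictions keep arrival order.
    """
    return sorted(pending, key=model.predict)


# Illustrative data only; the keys and timings are made up.
model = LatencyModel(default=1.0)
model.observe("ATL-JFK", 0.1)
model.observe("ATL-JFK", 0.3)
model.observe("SEA-NRT", 2.0)

pending = ["SEA-NRT", "NEW-KEY", "ATL-JFK"]
ordered = reorder(pending, model)
print(ordered)
```

Shortest-predicted-first is one possible reordering policy; a production system would also need to bound how long a slow request can be deferred.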
Enterprise: Runtime Behavior Diagnosis - Delta Air Lines
Sandip Agarwala, Mohamed Mansour, Karsten Schwan
• Investigation of multiple enterprise architectures
  • Revenue Pipeline, delta.com, DNS
• Path detection in complex systems
• Autonomic workflows
• Monitoring and management in SOA systems (proposed work)
Enterprise: Collaboration with IBM Research
Ling Liu
• Distributed Systems and Software
  • Dynamic Content Dissemination: Architectures and Optimizations
  • Collaborators: Arun Iyengar, Fred Douglis, Isabelle Rouvellou
• Event Streams and Security
  • Sensor Stream Processing and Optimization (e.g., load shedding, load balancing, motion adaptive indexing)
  • Event Stream Mining
  • Collaborators: Philip Yu, Bugra Gedik, Rong Chang
• Service Oriented Computing
  • Secure publish-subscribe systems, Secure Event Dissemination
  • Collaborators: Arun Iyengar, Liang Jie Zhang
Enterprise/HPC: High Productivity Computing
Sudha Yalamanchili with LogicBlox Inc.
• Stream computing programming model
• Kernels expressed in a declarative programming language
• Custom hardware for accelerating data intensive kernels
• Explicit interaction model: non-coherent shared memory
• Focus on applications such as retail forecasting and data analytics
[Figure labels: Memory; Inputs, Outputs; Application Kernel; Run-time Kernel; three nodes, each with CPU, ACC, FIFO, Local Memory, Cache, DMA; Network (e.g., Hypertransport)]
Enterprise/HPC: Databus: Runtime Rule Generation - LogicBlox
Greg Eisenhauer
• Fine-grain retail data analysis (what-if calculations)
• Rule-based declarative language
• Compiles down to interpreted "FactBus" rule objects and variable objects
• Rules are applied top to bottom, with fallback on failure
• Uses DCG (dynamic code generation) for just-in-time compilation
• Initial results: speedups of 3x
• Additional improvements anticipated
[Figure labels: Rules (apply); Write, Test, DCG subset; BinOp, Lookup, Iterator, DB; FactBus Variables (double, int, String, bool)]
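The "apply rules top to bottom, fall back on failure" behavior can be sketched as follows. This is a minimal illustration and not the FactBus interpreter or its DCG back end: the `Rule` class, the `RuleFailed` exception, the `lookup` helper, and the pricing rules are all hypothetical names introduced here.

```python
class RuleFailed(Exception):
    """Raised by a rule that cannot produce a value for the given variables."""


class Rule:
    """Hypothetical rule object: a name plus a function over the variable set."""

    def __init__(self, name, fn):
        self.name = name
        self.fn = fn

    def apply(self, variables):
        return self.fn(variables)


def lookup(variables, name):
    """Fetch a variable, failing the current rule if it is not defined."""
    if name not in variables:
        raise RuleFailed(name)
    return variables[name]


def evaluate(rules, variables):
    """Try rules in order; the first one that succeeds wins, failures fall through."""
    for rule in rules:
        try:
            return rule.name, rule.apply(variables)
        except RuleFailed:
            continue  # fall back to the next rule
    raise RuleFailed("no rule applied")


# Illustrative what-if calculation; rule names and variables are made up.
rules = [
    Rule("promo_revenue", lambda v: lookup(v, "promo") * lookup(v, "units")),
    Rule("list_revenue", lambda v: lookup(v, "price") * lookup(v, "units")),
    Rule("zero", lambda v: 0.0),
]

result = evaluate(rules, {"price": 2.5, "units": 4})
print(result)
```

Here the first rule fails because no `promo` variable is defined, so evaluation falls back to `list_revenue`. An interpreter like this pays for a dispatch and an exception check per rule, which is the kind of overhead just-in-time compilation aims to remove.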
Enterprise/HPC: Scalable Hypervisors - Intel
Karsten Schwan
Enterprise: Power Management in Virtualized Systems
Ripal Nathuji, Karsten Schwan (IBM + Intel)
• Coordinate virtualized system management:
  • Enable VM management independence
  • Decouple virtual and physical resources for management
  • Introduce "soft" scaling for flexible management
• Leverage heterogeneity in:
  • Performance capabilities
  • Power efficiency of resources
  • Power management support
[Figure labels: two platforms, each with Platform HW, VMM with VPM Mechanisms, Dom0 with VPM Channel, and guest VMs (VM 1, VM 2, VM 3) running an Application with a PM Policy on an OS; Heterogeneity-aware Allocation Policy]
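One way to picture a heterogeneity-aware allocation policy is a greedy placement that prefers the most power-efficient host with enough capacity. The sketch below is an assumption-laden illustration, not the policy from this project: the `Host` fields, the abstract "performance units", and the largest-first greedy order are all invented for the example.

```python
from dataclasses import dataclass, field


@dataclass
class Host:
    name: str
    capacity: float          # remaining abstract performance units (hypothetical)
    watts_per_unit: float    # power cost per unit of work; lower is more efficient
    placed: list = field(default_factory=list)


def allocate(vms, hosts):
    """Greedy placement: largest VM first, onto the most efficient host with room.

    Returns the names of VMs that could not be placed anywhere.
    """
    by_efficiency = sorted(hosts, key=lambda h: h.watts_per_unit)
    unplaced = []
    for vm_name, demand in sorted(vms.items(), key=lambda kv: -kv[1]):
        for host in by_efficiency:
            if host.capacity >= demand:
                host.capacity -= demand
                host.placed.append(vm_name)
                break
        else:
            unplaced.append(vm_name)  # no host had enough remaining capacity
    return unplaced


# Illustrative numbers only.
hosts = [Host("old-server", 8.0, 3.0), Host("new-server", 4.0, 1.5)]
vms = {"vm1": 3.0, "vm2": 2.0, "vm3": 1.0}

unplaced = allocate(vms, hosts)
placement = {h.name: h.placed for h in hosts}
print(placement, unplaced)
```

The efficient host fills up first and the less efficient one absorbs the overflow. A real policy would also weigh the power management support of each platform and the cost of migrating VMs.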
Enterprise: Trusted Passages on Virtualized Platforms - Intel/NSF
Mustaque Ahamad, Greg Eisenhauer, Wenke Lee, Karsten Schwan
• Run trusted services across untrusted platforms:
  • Trust models and trust controller mechanisms for evolving node trust
  • Virtual machine monitoring and introspection to support trust controllers
  • Data interception and redirection as remedial measures
[Figure labels: Overlay node1 (Host1) and Overlay node2 (Host2), each with a Service VM holding the Trust Controller and BE, Guest VM1 and Guest VM2 each running an App. with FE, a Hypervisor, and a NIC; the hosts are connected over the network by a Trusted passage]
Embedded/Enterprise: Aristotle Research Group (Mary Jean Harrold)
• Testing Evolving Software (TCS) (with Alex Orso)
  • Problem: Changes require rapid modification and testing for quick release, causing released software to have many defects
  • Research question: How can we test well, to gain confidence in changes before release of the changed software?
  • MaTRIX: Computes conditions test cases must satisfy to test changes well
• Fault Propagation for Safety (Boeing)
  • Problem: Critical avionics systems now use integrated modular avionics, making fault analysis for the entire system difficult
  • Research question: How can we perform fault analysis at the system-model level and make this information accessible to developers?
  • FauPA: Propagates injected faults forward to determine impact; traces faulty components backward to find root cause
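The forward and backward analyses described for FauPA can be illustrated on a toy component graph. This sketch is not FauPA itself: the graph, the component names, and the plain reachability model (a fault affects everything downstream of it) are assumptions chosen to keep the example small.

```python
from collections import deque


def reachable(graph, start):
    """Breadth-first search: every node reachable from start, excluding start."""
    seen = {start}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for nxt in graph.get(node, ()):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    seen.discard(start)
    return seen


def reverse(graph):
    """Flip every edge so that backward tracing becomes forward reachability."""
    rev = {}
    for src, dsts in graph.items():
        for dst in dsts:
            rev.setdefault(dst, []).append(src)
    return rev


# Hypothetical data-flow model: an edge A -> B means B consumes A's output.
flows = {
    "sensor": ["fusion"],
    "gps": ["fusion"],
    "fusion": ["autopilot", "display"],
    "autopilot": ["actuator"],
}

# Forward: which components can a fault injected at "sensor" affect?
impact = reachable(flows, "sensor")

# Backward: which upstream components could explain a faulty "display"?
suspects = reachable(reverse(flows), "display")

print(sorted(impact), sorted(suspects))
```

Pure reachability over-approximates: it reports every component a fault could touch, not the ones it necessarily corrupts. A system-model-level analysis would refine this with information about how each component transforms or masks faulty inputs.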