Results of Single-Event Effects Testing of Advanced Networks

Results of Single-Event Effects Testing of Advanced Networks. S. Buchner, J. Howard, C. Seidleck, P. Marshall, M. Carts, H. Kim, K. LaBel, R. Stattel, C. Rogers and T. Irwin. NASA Electronic Parts and Packaging (NEPP) Program’s Electronic Radiation Characterization (ERC) Project

Presentation Transcript


  1. Results of Single-Event Effects Testing of Advanced Networks S. Buchner, J. Howard, C. Seidleck, P. Marshall, M. Carts, H. Kim, K. LaBel, R. Stattel, C. Rogers and T. Irwin. • NASA Electronic Parts and Packaging (NEPP) Program’s Electronic Radiation Characterization (ERC) Project • NASA Remote Experimentation and Exploration Program • DTRA RHM MAPLD2002, Laurel MD 11th September, 2002. Presented by S. Buchner

  2. Introduction • Future space missions will have to process vast amounts of data (supercomputing levels) in a radiation environment. • The data will have to be transferred between instruments and computers, or between computers, before downlinking. • High-performance networks will be needed, and they will be built from COTS parts that are likely to be sensitive to SEEs. [Figure: an instrument and two computers connected through a switch.]

  3. Introduction • Using protons and heavy ions at accelerators, we have performed Single-Event Effects testing of: • AD8151 Crosspoint switch • Myrinet Crossbar switch • FireWire (IEEE1394) serial bus

  4. Introduction • Ionizing particles caused: • Single-Event Upset (SEU or bit error) • Single-Event Functional Interrupt (SEFI) • Single-Event Latchup (SEL)

  5. AD8151 Digital Crosspoint Switch • Operates at 3.2 Gbps • Low power • 33 Inputs • 17 Outputs • Tested vs: • Data rate • # of paths • Ion LET

  6. AD8151 Digital Crosspoint Switch • Bipolar switch

  7. Myrinet • The network system for a cluster architecture consists of two pieces: • network switching (XBar16) • network interface hardware (Network Interface Card (NIC)) • A prime control processor talks to all the individual nodes of the cluster via the Myrinet network. • Messages are transported across Myrinet as packets. A packet consists of: • Header • Payload • Trailer

  8. Myrinet

  9. IEEE 1394 (FireWire) • Specifications for backplane and cable. • Cable contains 6 wires with a maximum length of 4.5 meters. • Cable minimizes wire harness, provides power, reduces crosstalk. • More than one node can access the bus at a time. • Inexpensive, available, reliable - COTS. • Scalable: 100, 200, 400 Mbps (800, 1600 and 3200 Mbps). • Data transmitted in packets with “Header,” “Data,” and “Checksum.” • Two modes: Isochronous and Asynchronous. • 256 Terabytes of addressable memory-mapped space per node (48 address bits per node, 63 nodes per bus segment and 1024 bus segments). • Plug and play.
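
The address-space figures on this slide can be checked with a little arithmetic. The sketch below uses only the numbers quoted on the slide (48 address bits per node, 63 nodes per bus segment, 1024 bus segments); the constant and function names are illustrative and do not come from the presentation.

```python
# Illustrative names; the numeric values are the ones quoted on the slide.
OFFSET_BITS = 48       # address bits available inside one node
NODES_PER_BUS = 63     # nodes per bus segment
BUS_SEGMENTS = 1024    # bus segments

TIB = 2 ** 40          # bytes in one (binary) terabyte


def bytes_per_node(offset_bits: int = OFFSET_BITS) -> int:
    """Bytes addressable inside a single node."""
    return 2 ** offset_bits


def total_nodes(nodes_per_bus: int = NODES_PER_BUS,
                bus_segments: int = BUS_SEGMENTS) -> int:
    """Number of nodes addressable across all bus segments."""
    return nodes_per_bus * bus_segments


print(bytes_per_node() // TIB)  # 256 (terabytes per node)
print(total_nodes())            # 64512 (nodes in total)
```

With 48 address bits, each node exposes 2^48 bytes, which is where the 256-terabyte figure comes from.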

  10. IEEE 1394 (FireWire) [Figure: example bus topology. Seven nodes with Physical IDs 0 to 6 (a PC, a DV monitor, a digital VCR, a DV camcorder, a digital camera and two fixed disk drives) connected through numbered ports.]

  11. IEEE 1394 (FireWire) • Tested as a function of: • Mode • Ion LET • Layer [Figure: protocol stack beneath the microprocessor. Transaction layer (read, write, lock); link layer (cycle control, packet transmitter, packet receiver); physical layer (arbitration, encode/decode, data resync, connectors, bus initialization, signal levels); serial bus management (bus manager, isochronous resource manager, node controller).]

  12. IEEE 1394 (FireWire) [Figure labels: PHYSICAL, LINK]

  13. General SEE Results • SEEs are remarkably similar in all three network components: • HARD ERRORS • Single-event functional interrupts (SEFIs) requiring reprogramming or rebooting to restart communications. • SOFT ERRORS • Bit errors (AD8151) • SEUs (FireWire) • Lost packets (Myrinet)

  14. AD8151 Cross Point Switch

  15. Results of SEE Testing - AD8151 • Tested with BER tester • Observed: • Single bit errors • Bursts of errors • Loss of synchronization
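
To illustrate how single bit errors can be told apart from bursts, the sketch below compares a transmitted bit pattern with the received one and groups nearby errors into events. It is not the method used by the BER tester in this work: the function names and the `max_gap` grouping threshold are assumptions chosen for illustration, and loss of synchronization is not modeled.

```python
from typing import List, Tuple


def error_positions(sent: List[int], received: List[int]) -> List[int]:
    """Indices at which the received bit differs from the transmitted bit."""
    return [i for i, (s, r) in enumerate(zip(sent, received)) if s != r]


def group_events(positions: List[int], max_gap: int = 8) -> List[List[int]]:
    """Group errors no more than max_gap bits apart into one event.

    max_gap is an assumed threshold, not a value from the presentation.
    """
    events: List[List[int]] = []
    for pos in positions:
        if events and pos - events[-1][-1] <= max_gap:
            events[-1].append(pos)   # close to the previous error: same event
        else:
            events.append([pos])     # far from the previous error: new event
    return events


def classify(events: List[List[int]]) -> Tuple[int, int]:
    """Return (number of single-bit events, number of burst events)."""
    single = sum(1 for e in events if len(e) == 1)
    bursts = sum(1 for e in events if len(e) > 1)
    return single, bursts


# Made-up example: a 64-bit alternating pattern with four injected errors.
sent = [0, 1] * 32
received = list(sent)
for i in (5, 30, 31, 33):
    received[i] ^= 1

positions = error_positions(sent, received)
print(classify(group_events(positions)))   # (1, 1)
print(len(positions) / len(sent))          # bit error ratio: 0.0625
```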

  19. Myrinet Cross Bar Switch

  20. Results of SEE Testing - Myrinet • Myrinet was tested via NICs and in-house software. • A SEFI event occurs when all data packets are lost; recovery requires a power cycle. • Data packets are dropped whenever the checksum is in error. • SEU events appear as either the loss of a single packet or the loss of several packets in succession, after which normal operation recovers without any intervention.
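
The link between a single upset and a lost packet can be sketched as follows. The example builds a packet from a header, a payload and a checksum trailer, then shows that one flipped bit makes the receiver drop the packet. It assumes a CRC-32 trailer (Python's `zlib.crc32`) purely as a stand-in for whatever check the Myrinet hardware performs; the packet layout and function names are hypothetical.

```python
import zlib
from typing import Optional

TRAILER_LEN = 4  # bytes; assumed CRC-32 trailer, not the Myrinet format


def build_packet(header: bytes, payload: bytes) -> bytes:
    """Concatenate header and payload, then append a checksum trailer."""
    body = header + payload
    return body + zlib.crc32(body).to_bytes(TRAILER_LEN, "big")


def receive(packet: bytes) -> Optional[bytes]:
    """Return header+payload if the checksum matches, else None (dropped)."""
    body, trailer = packet[:-TRAILER_LEN], packet[-TRAILER_LEN:]
    if zlib.crc32(body).to_bytes(TRAILER_LEN, "big") != trailer:
        return None
    return body


def flip_bit(data: bytes, bit_index: int) -> bytes:
    """Model a single-event upset by inverting one bit of the packet."""
    corrupted = bytearray(data)
    corrupted[bit_index // 8] ^= 1 << (bit_index % 8)
    return bytes(corrupted)


packet = build_packet(b"\x01\x02", b"science data")
print(receive(packet) is not None)    # True: intact packet is accepted
print(receive(flip_bit(packet, 20)))  # None: corrupted packet is dropped
```

Because the receiver discards the packet rather than delivering bad data, the upset shows up to the test software as a lost packet, which matches the way SEUs are counted on this slide.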

  21. Results of SEE Testing – SEFIs – Myrinet • NIC results (table) show about the same cross section, independent of part hit. • Xbar results (graph) show cross section difference between front and back plane switches.

  22. Results of SEE Testing – SEUs – Myrinet • NIC results (table) show a variety of cross sections for the different parts. • Xbar results (graph) show single packet loss cross section difference between front and back plane switches.
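
The cross sections mentioned in the Myrinet results are conventionally obtained by dividing the number of observed events by the particle fluence delivered to the part. The sketch below shows that calculation; the function name is illustrative and the numbers are made up, not data from these tests.

```python
def cross_section(events: int, fluence: float) -> float:
    """Event cross section in cm^2.

    events  -- number of events (SEFIs, lost packets, ...) counted in a run
    fluence -- particles per cm^2 delivered during that run
    """
    if fluence <= 0:
        raise ValueError("fluence must be positive")
    return events / fluence


# Hypothetical run: 25 events observed for 1e10 particles/cm^2.
example = cross_section(events=25, fluence=1.0e10)
print(f"{example:.2e} cm^2")  # 2.50e-09 cm^2
```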

  23. IEEE 1394 FireWire

  24. Results of SEE Testing – SEFIs – IEEE 1394 (FireWire) • 9 different SEFIs, categorized according to how communications had to be restarted. • Soft errors that did not disrupt communications were also observed.

  26. Results of SEE Testing – SEUs – IEEE 1394 (FireWire) • 9 different SEFIs categorized according to how to restart communications. • Observed soft errors that did not disrupt communications

  27. Summary and Implications • High-performance networks contain COTS parts that are sensitive to SEEs. • Observed SEEs common to all three “networks”: • SEFIs, after which a reboot or reprogramming was needed • SEUs that led to a short dropout of valid data • SEL was observed in the NS FireWire but not in the others. • SEE mitigation steps will be needed for use in space.
