This document provides an overview of the performance and features of the Venus PCI card for Solaris/SPARC platforms. It includes information on gigabit Ethernet, SSL and IPsec acceleration, and the Venus crypto engine.
Understanding Venus Performance (Tentative Update 2003-11-04)
Shih-Hao Hung
Performance and Availability Eng.
Sun Microsystems Inc.
Sun Confidential/Proprietary – Internal Use Only
Overview
• Venus is a PCI card that provides the following functions for Solaris/SPARC platforms:
  • High-performance gigabit Ethernet interface
  • High-speed cryptographic engine
  • SSL acceleration
  • IPsec acceleration
• Our goals are to make sure that:
  • Software (apps, OS, and drivers) uses Venus as efficiently as possible.
  • Venus performs well under mixed workloads.
Venus Gigabit Ethernet
• Venus uses the Cassini chip, which is also used by other Sun Gigabit Ethernet cards such as BGCC and Kuheen.
• One major difference between Venus and other Cassini-based cards: Venus can interrupt only ONE host processor, due to a limitation of the Intel bridge chip on the Venus card.
Venus Gigabit Ethernet TCP Performance Measurement
• The peak throughput of Venus is on par with the other Cassini-based cards (without MDT).
New Gigabit Ethernet Features Proposed for Venus 1.1
• Hardware Checksumming: Should help reduce CPU consumption, but has a bug at this moment (2003-06-05).
• Jumbo Frames (JF)*: Data show that jumbo frames improve IPsec acceleration by ~3X with SysKonnect cards. Support for jumbo frames may be put in Venus 1.1.
• Multi-Data Transfer (MDT)*: MDT is already in Solaris 9 Update 4 for Cassini, saving CPU cycles and improving efficiency (i.e., the Mbps/MHz ratio) by up to 60%. Data show a 5-10% performance gain on SPECweb99 and Netperf with an MDT-enabled Cassini card and driver.
* JF and MDT may not be supported in 1.1.
Venus Crypto Engine
• Venus has two Broadcom 5821 crypto chips. Venus hardware can offload the following crypto operations:
  • Public-key ops: RSA (512-bit, 1024-bit, 2048-bit) and DSA
  • Bulk encryption ops: RC4*, DES, 3DES
  • Hash ops: SHA1 and MD5
• Software crypto is available for fail-over and small tasks.
* RC4 support is disabled in the Venus 1.0 driver.
Our Performance Data
• We have conducted performance testing on the following platforms:
  • 2-way 900 MHz Sun Fire 280R
  • 8-way 900 MHz Sun Fire V880
  • 12-way 1200 MHz Sun Fire 6800
• The per-CPU numbers presented are based on the 900 MHz UltraSPARC III Cu processor.
The Venus Crypto Engine
Venus Crypto Hardware Performance Measurement
Venus Crypto Software Performance Measurement
The Venus SSL Performance
Venus SSL Support (Tentative, check when spec. is final)
Venus SSL Performance
* HW bulk encryption support is disabled by default for S1WS for Venus 1.0.
Venus SSL Additional Performance Issues
• Enabling HW bulk encryption support causes extra overhead for key management operations:
  • SSL handshake performance is reduced by 33% (BugID 4814633).
  • Short-term fix: disable HW bulk encryption by default; offer a mechanism for users to enable the support.
  • RFE# 4753295: Should find a way to reduce the key management overhead.
  • Update (2003-06-06): The gap has shrunk to ~14% with the latest Venus 1.1 software.
• Enabling HW bulk encryption support may limit the SSL throughput:
  • Affects mostly large systems.
  • Customers may choose to disable the support, or buy additional cards.
The Venus IPsec Performance
Solaris IPsec Performance Issues with 3DES
• The stock Solaris 9 (Update 3) IPsec-3DES is slow and does not scale:
  • 3DES code is not optimized.
  • 3DES jobs are done synchronously.
  • Packets are processed sequentially.
  • 28 Mbps on a 2-way 900 MHz E280R; only one CPU is utilized.
Venus IPsec Design Considerations
• Accelerates DES/3DES encryption/decryption via:
  • Asynchronous processing by the KCL2 job scheduler
  • Performance-optimized software crypto
  • Hardware offloading engine
• Must process jobs at Ethernet packet size (1460 bytes), which is much smaller than the SSL chunk size.
  • This is a big constraint for hardware offloading, and a bigger issue for IPsec acceleration than for SSL acceleration.
• Impacted by hardware offloading overhead:
  • Packets < 512 bytes are not offloaded because the overhead is too costly.
  • Lightweight operations such as MD5 and SHA1 hashing benefit less from hardware offloading.
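The small-packet cutoff above can be read as a break-even point between a fixed per-job offload cost and the faster per-byte rate of the hardware. The following is a minimal sketch of that reasoning under a simple linear cost model, which is an assumption and not something the slides state. The function names and all numeric values are hypothetical and do not represent measured Venus figures.

```python
def break_even_bytes(overhead_us: float,
                     sw_bytes_per_us: float,
                     hw_bytes_per_us: float) -> float:
    """Job size above which hardware offload is faster than software.

    Assumed linear model (illustrative only):
        software time = n / sw_bytes_per_us
        hardware time = overhead_us + n / hw_bytes_per_us
    """
    if hw_bytes_per_us <= sw_bytes_per_us:
        # Hardware is no faster per byte, so the fixed overhead is never repaid.
        return float("inf")
    return overhead_us / (1.0 / sw_bytes_per_us - 1.0 / hw_bytes_per_us)


def should_offload(job_bytes: int,
                   overhead_us: float,
                   sw_bytes_per_us: float,
                   hw_bytes_per_us: float) -> bool:
    """True when the job is large enough to repay the fixed offload cost."""
    return job_bytes > break_even_bytes(overhead_us, sw_bytes_per_us, hw_bytes_per_us)


if __name__ == "__main__":
    # Hypothetical parameters, chosen only to exercise the model.
    params = dict(overhead_us=40.0, sw_bytes_per_us=10.0, hw_bytes_per_us=50.0)
    for size in (64, 256, 1460, 8192):
        print(size, should_offload(size, **params))
```

Under this model a cheap operation (a high software rate) pushes the break-even size up, which is consistent with the observation that lightweight hashing benefits less from offloading.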
Venus IPsec Implementations
• Venus accelerates IPsec in one of the following two forms:
  • Out-of-band:
    • Packets are sent to Venus crypto for encryption, and then sent to any NIC for transmission.
    • Packets are received from any NIC, and then sent to Venus crypto for decryption.
  • In-band (pending Solaris 9 Update 5):
    • Packets are sent to Venus crypto for encryption and transmitted via the Venus NIC in one trip.
    • Packets are received by the Venus NIC and decrypted by Venus crypto before entering the host.
• The in-band implementation will best reflect the strength of Venus, but it requires significant changes to the network stack.
Venus Out-of-Band IPsec
• Venus out-of-band IPsec requires minor changes to an existing system:
  • New modules replace the encrdes/encr3des modules.
  • For packets < 512 bytes, swcrypto handles 3DES.
  • For packets >= 512 bytes, KCL handles 3DES:
    • KCL sends jobs to vca for hardware offload when a Venus card is available.
    • KCL sends jobs to its software crypto when hardware offloading is not available.
[Diagram: ipsecesp -> Venus encr3des; pkt < 512 -> swcrypto; pkt >= 512 -> KCL; KCL -> software 3DES (no hardware) or KCL -> vca -> hardware 3DES (hardware ok)]
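The routing rules on this slide can be sketched as a small decision function. This is an illustration of the rules as listed, not the actual module code: the `Path` enum, the function name, and the `threshold` parameter are hypothetical names introduced here, and only the 512-byte cutoff and the swcrypto/KCL/vca roles come from the slide.

```python
from enum import Enum


class Path(Enum):
    SWCRYPTO = "swcrypto"          # small packets: direct software 3DES
    KCL_SOFTWARE = "kcl-software"  # KCL falls back to its own software crypto
    KCL_HARDWARE = "kcl-vca"       # KCL hands the job to vca for hardware 3DES


def choose_3des_path(pkt_len: int, hw_available: bool, threshold: int = 512) -> Path:
    """Pick the 3DES path for one packet, following the rules listed above."""
    if pkt_len < threshold:
        # Below the cutoff, the packet never reaches KCL.
        return Path.SWCRYPTO
    if hw_available:
        return Path.KCL_HARDWARE
    return Path.KCL_SOFTWARE


if __name__ == "__main__":
    for length, hw in [(200, True), (512, True), (1460, False)]:
        print(length, hw, choose_3des_path(length, hw).value)
```

Keeping the cutoff as a parameter reflects that it is a tunable trade-off between offload overhead and hardware speed rather than a fixed property of the design.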
Venus IPsec Performance Benefits
• Accelerates IPsec-3DES throughput:
  • To 105 Mbps on a 1-way 900 MHz E280R.
  • 375% speedup compared to stock S9u3.
• Improves throughput scalability:
  • Asynchronous crypto processing scales throughput to 210 Mbps on an 8-way 900 MHz V880.
  • 750% speedup compared to stock S9u3.
• Reduces IPsec latency:
  • Asynchronous crypto processing improves parallelism and hence reduces the latency of 3DES encryption/decryption.
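The percentages above appear to be throughput ratios against the 28 Mbps stock Solaris 9 Update 3 IPsec-3DES figure quoted earlier in this deck. That reading is an inference from the numbers, not something the slides state, and the function names below are hypothetical. A short check of the arithmetic:

```python
STOCK_MBPS = 28.0  # stock S9u3 IPsec-3DES throughput quoted in this deck


def ratio_percent(accelerated_mbps: float, baseline_mbps: float = STOCK_MBPS) -> float:
    """Accelerated throughput as a percentage of the baseline."""
    return 100.0 * accelerated_mbps / baseline_mbps


def increase_percent(accelerated_mbps: float, baseline_mbps: float = STOCK_MBPS) -> float:
    """Throughput gained over the baseline, as a percentage of the baseline."""
    return 100.0 * (accelerated_mbps - baseline_mbps) / baseline_mbps


if __name__ == "__main__":
    print(ratio_percent(105.0))     # 375.0
    print(ratio_percent(210.0))     # 750.0
    print(increase_percent(105.0))  # 275.0
```

Read this way, "375%" means 3.75 times the stock throughput, which corresponds to a 275% increase over it.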
Venus IPsec TCP Unidirectional RX Throughput
Per-CPU numbers measured on a 900 MHz E280R.
Per-card number measured on a 12-way 1.2 GHz SF6800.
The TX or bi-directional throughput is similar to RX, but is ~15-20% slower. The ongoing FireEngine project may be able to address this issue by making IP MT-hot.
IPsec Latency
• IPsec adds substantial latency, and thus mostly affects:
  • Applications that demand low network latency.
  • The transaction rate of single-threaded applications.
• Venus reduces IPsec latency via fast and asynchronous crypto processing.
  • The graph shows latency reduction by Venus software and hardware.
  • Tuning can be applied through Encr3DesTuning and by unloading the vca module to minimize latency for specific apps.
Note: Encr3DesTuning is set to 256 in this set of data. The default is 512.
Jumbo Frames and Venus IPsec Acceleration
• Venus IPsec acceleration is sensitive to packet size:
  • Significant overhead for regular Ethernet packets (MTU=1500).
  • Overhead is reduced for bigger MTUs (jumbo frames).
• Performance data measured with a SysKonnect 9821 Ethernet card and Venus out-of-band IPsec acceleration show a ~3X performance improvement with jumbo frames.
The Venus Performance under Mixed Workload
Venus Performance Under Mixed Workload
• Possible scenarios:
  • Mixed non-IPsec and IPsec traffic
  • Mixed non-IPsec and SSL traffic
• Would NIC operations interfere with crypto operations?
  • Yes, because both the NIC and the crypto chips share one interrupt line.
  • The NIC can generate interrupts much more rapidly than the crypto chips typically do.
  • BugID: 4799279
Venus Performance Under Mixed Workload (cont.)
• Crypto performance suffers when network traffic is high:
  • 30% to 90% 3DES performance degradation (hurts IPsec)
  • 50% to 80% RSA performance degradation (hurts SSL)
• The ideal (long-term) fix would be to have separate interrupt lines for crypto and NIC.
• A workaround is available:
  • Use rx-intr-pkts and rx-intr-time to limit the interrupt rate from the NIC.
  • However, it reduces NIC performance by up to 30%.
• Still working on bug fixes in 1.1 (2003-06-09).
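To illustrate why the two receive-interrupt tunables lower the NIC interrupt rate, here is a toy model of interrupt coalescing. It assumes the common semantics in which an interrupt is raised once a packet-count limit is reached or a time limit has passed since the first pending packet, whichever comes first. The exact driver semantics and units are assumptions here, the function and parameter names are hypothetical, and the model does not reproduce any measured Venus numbers.

```python
from typing import Iterable


def count_interrupts(arrival_times_us: Iterable[float],
                     rx_intr_pkts: int,
                     rx_intr_time_us: float) -> int:
    """Count receive interrupts for a packet arrival trace (toy model).

    An interrupt fires when rx_intr_pkts packets are pending, or when
    rx_intr_time_us has elapsed since the first pending packet arrived.
    """
    interrupts = 0
    pending = 0
    first_pending_at = 0.0
    for t in arrival_times_us:
        # The timer expired before this packet arrived: flush the batch.
        if pending and t - first_pending_at >= rx_intr_time_us:
            interrupts += 1
            pending = 0
        if pending == 0:
            first_pending_at = t
        pending += 1
        # The packet-count limit was reached: flush the batch.
        if pending >= rx_intr_pkts:
            interrupts += 1
            pending = 0
    if pending:
        interrupts += 1  # the timer eventually fires for the tail
    return interrupts


if __name__ == "__main__":
    trace = [10.0 * i for i in range(100)]  # 100 packets, 10 us apart (made up)
    for pkts, wait in [(1, 0.0), (8, 1000.0), (32, 50.0)]:
        print(pkts, wait, count_interrupts(trace, pkts, wait))
```

Larger limits mean fewer interrupts competing with the crypto chips on the shared line, but each packet waits longer before the host sees it, which is one plausible source of the NIC performance cost noted above.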
Summary
Extra Materials for Technical Discussions
IPsec TCP_RR Latency
Netperf TCP_RR Latency