A Virtual-Machine Approach to Creating Complex NPU Applications in the Blink of an Eye. February 2005.
Empowering Network Processors
• Very-high-level packet processing language
• Virtual machine abstracting NPU details
• Built-in functionality for deep packet processing
Premise: network processors will be a core building block of next-generation networking equipment
• Programmability
• Versatility
• Performance
But the major obstacle is the difficulty of programming network processors, and the code is architecture and model specific.
Value proposition:
• Faster time to market
• Lower development and lifetime costs
• Scalable to new silicon
• Portable to different architectures
• Enable a larger community to use NPUs
[Diagram labels: PPL application; virtual machine; virtualized packet processor; IP Fabrics; application in NPU microcode; network processor; Intel and others; ASICs; merchant silicon; general-purpose processors]

The NPU Catch
What makes NPUs so powerful as solutions to networking-systems design is also what makes their software development a significant challenge.
• Application software needs to manage interconnect, memory overlap, caching, etc.
• C programs still very low level, highly machine dependent
• Massive register resources (in IXP2800, 15,452 software-visible registers not counting local mem and CAMs; 25,948 counting these)
• Parallelism (microengines, hardware threads)
• Multiple memory types
• Small program memory space
• Cipher units, hash, CAMs, rings, signals, etc.
• No OS underneath
• Pages of programming architecture quirks, errata
[Figure: microengine (ME) register diagram. Recoverable labels: special form of the SRAM inst; Local_CSR inst; CAP inst; MSF inst; PCI inst; "run-of-the-mill instructions"; access to the transfer and local CSR registers of any other ME. The register counts printed in the figure cannot be matched to their labels in this text.]

Virtual Machine Approach to NPU Software
Sample PPL source:
    Group1: Policy PATTERNS DATABASE($idslist) FIND(Rr1,0,Fuf)
    Intruderset: Policy ASSOCIATE NUMBER(10000) SEARCHKEYS(IP_SOURCE) TIMEOUT(10000)
    Intruders: Policy RECALL SEARCHKEYS(IP_SOURCE) LINKED(Intruderset)
    Secure1: Policy CRYPTO TRANSFORM(3DES,SHA) TIMEOUT(3000) TUNNEL(10.0.42.32)
    Diversion: Policy PACKET INSERT(PREP,header_size,0)
    Rule EQ(TCP_SYN,1) EQ(TCP_RST,1) DROP   # Protocol anomaly
    Rule EQ(TCP_SYN,1) EQ(TCP_FIN,1) DROP   # Another protocol anomaly
    Rule EQ(IP_SOURCE,MYIPADDR) DROP   # Source spoofed packet
    Rule EQ(IP_SOURCE,public) APPLY(Intruders)
    Rule EQ(ac,0) DROP   # Previously detected intruder
    Rule NE(IP_DEST,MYIPADDR) EQ(ICMP_TYPE,ECHO) DROP   # no pings to the inside
    Rule EQ(IP_PROT,ICMP) EQ(IP_MF,1) DROP   # fragmented ICMP is DoS attack
    Rule SCAN("|0D0A5B52504C5D3030320D0A|") JUMP(found_subseven_trojan)
    Rule EQ(IP_DEST/24,190.10.10.0) SET(R0,192.68.0.0) ADD(R0,IP_DEST/0.0.0.255) SET(IP_DEST,R0)   # Xlate 190.10.10.X to 192.68.0.X
    Rule EQ(IP_DEST/24,boston_gateway) EQ(IP_SOURCE,portland_gateway/24) APPLY(secure1) FORWARD(1)
    Rule EQ(IP_DEST/24,190.10.10.0) APPLY(Group1) FORWARD(2)
    . . .
Applications shown:
• Dynamic peephole firewall
• SIP proxy/offload
• Packetcable layer 7 traffic management
• Content specific filters (e.g., email spam)
• Lawful content listening
• Session Border Controller
• Encrypted content switch
• Content specific DoS attacks
• Two-way encryption gateway
• Layer 7 bandwidth monitoring
• Layer 7 protocol specific firewall
• Dynamic intrusion blocking
• TCP offload
• Intrusion signature scans
• IPSec VPN
• Basic firewall
• Layer 4 load balance
• Layer 7 content switch
• Layer 3,4 DoS attacks
[Diagram labels: PPL compiler; PPL virtual machine; NPU]

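To make the address-translation rule above concrete (the one commented "Xlate 190.10.10.X to 192.68.0.X"), here is a minimal Python sketch of the same arithmetic. It is an illustration only, not PPL and not the virtual machine's implementation. It assumes that `IP_DEST/24` compares the upper 24 bits and that `IP_DEST/0.0.0.255` masks the address with 0.0.0.255; the function name `translate_dest` is hypothetical.

```python
import ipaddress

# Values taken from the rule on the slide.
OLD_NET = ipaddress.ip_network("190.10.10.0/24")
NEW_BASE = int(ipaddress.ip_address("192.68.0.0"))
HOST_MASK = int(ipaddress.ip_address("0.0.0.255"))  # assumed meaning of IP_DEST/0.0.0.255


def translate_dest(ip_dest: str) -> str:
    """Rewrite 190.10.10.X to 192.68.0.X; leave any other address unchanged."""
    addr = ipaddress.ip_address(ip_dest)
    if addr not in OLD_NET:              # EQ(IP_DEST/24,190.10.10.0) is false
        return ip_dest
    r0 = NEW_BASE                        # SET(R0,192.68.0.0)
    r0 += int(addr) & HOST_MASK          # ADD(R0,IP_DEST/0.0.0.255)
    return str(ipaddress.ip_address(r0))  # SET(IP_DEST,R0)
```

Calling `translate_dest("190.10.10.7")` keeps the host byte and swaps the network part, while an address outside 190.10.10.0/24 passes through untouched.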
Two Routes
Route 1: program the NPU directly
• Details the programmer must deal with: DRAM transfer registers, SRAM transfer registers, context arbitration, 16 microengines, 128 hardware threads, thread signals, errata, instruction sequence restrictions, inter-instruction timing, next neighbor registers, 640-word local memories, dispatch loops, scratch rings, A and B register banks, processor synchronization, ALU instructions, aligned accesses only, byte index register, no OS, register scope, register lifetime
• 90% of time spent on underlying tools, devices, details
• 10% of time spent on application value
• Very specific to NPU model and family
Route 2: PPL language on a virtual machine on the NPU
• PPL: a very-high-level functional language to express packet processing
• Virtual machine on NPU fully exploits parallelism while hiding it
• PPL also includes very powerful primitives, e.g.,
  • Scan packet payload
  • Match payload to regular expression
  • Encryption/authentication
  • Manage connections (e.g., TCP, SIP)
  • Manage "superpackets"
  • High-speed multi-pattern matching
• 90% of time spent on application value
• Scalable
• Portable

PPL – a Fundamentally Different Approach
NPU tools
• Tools to help you write and debug microcode, and far removed from the world of packet processing.
• You still need to understand the NPU's microcode environment, create the microcode, debug it, maintain it.
PPL virtual-machine environment
• Application machine. You think about packet processing and express your application in a very-high-level application language.
• R&D focus is on the value-add in the application, not the many many details of the NPU.
[Chart labels: time/$ spent on application value; time/$ spent on underlying tools/devices]
Therefore huge benefits in
• Time to market
• Life cycle software costs
• Number of NPU experts needed
• Scalability to new silicon (up and down)

Comer Bump in the Wire Example
Task: write the data-plane code that examines each IP packet to determine if it is TCP and destined for port 80 (HTTP). Count them. And forward all packets.
Complete PPL program (the only code you write) is
    Define port80counter="Rg20"
    Event(0)
    Rule EQ(IP_PROT,TCP) EQ(L4_DPORT,80) ADD(port80counter,1)
    Rule FORWARD
• A major undertaking if you sit down to attempt this in an assembly-language or C program.
• The closest thing we know about (Agere's FPL) was 76 FPL lines in Agere's submission to Comer's web site, and we found two serious bugs in Agere's code that don't exist in the PPL code:
  • If a packet is a fragment, the Agere code can mistake it for something with a TCP header
  • If a packet's layer 3 or 4 headers are malformed or malicious, behavior is unpredictable

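For comparison, here is a rough Python sketch of what the same task looks like when the header checks are written by hand, including the two checks the slide says the Agere code missed (fragments and malformed headers). This is not the PPL virtual machine's logic, which the slides do not show; the names `is_tcp_port80`, `process`, `make_ipv4`, and `counters` are hypothetical, and the packet is assumed to be a raw IPv4 datagram with no layer-2 header.

```python
import struct

TCP = 6
counters = {"port80": 0}  # stands in for the PPL register behind port80counter


def is_tcp_port80(ip_packet: bytes) -> bool:
    """True only for a well-formed, non-fragment IPv4/TCP packet to port 80."""
    if len(ip_packet) < 20:
        return False                      # too short to hold an IPv4 header
    if ip_packet[0] >> 4 != 4:
        return False                      # not IPv4
    ihl = (ip_packet[0] & 0x0F) * 4       # header length in bytes
    if ihl < 20 or len(ip_packet) < ihl + 4:
        return False                      # malformed header or no room for ports
    if ip_packet[9] != TCP:
        return False
    flags_frag = struct.unpack_from("!H", ip_packet, 6)[0]
    if flags_frag & 0x1FFF:
        return False                      # nonzero fragment offset: no TCP header here
    dport = struct.unpack_from("!H", ip_packet, ihl + 2)[0]
    return dport == 80


def process(ip_packet: bytes) -> bytes:
    """Count TCP/port-80 packets, then forward every packet unchanged."""
    if is_tcp_port80(ip_packet):
        counters["port80"] += 1
    return ip_packet


def make_ipv4(proto: int, dport: int, frag_offset: int = 0) -> bytes:
    """Build a minimal test packet: 20-byte IPv4 header plus 20 bytes of layer 4."""
    header = struct.pack("!BBHHHBBH4s4s", 0x45, 0, 40, 0, frag_offset,
                         64, proto, 0, bytes(4), bytes(4))
    return header + struct.pack("!HH", 12345, dport) + bytes(16)
```

Feeding `process` a TCP packet to port 80, a TCP packet to port 443, a UDP packet to port 80, and a non-first fragment raises the counter only for the first one, and all four are returned for forwarding.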
PPL
• Powerful, easy to use, functional (not procedural) language
• Main elements: rules, policies, events
  • Rule: expression(s) action(s)
  • Event: rules that are processed together
  • Policies: major algorithms and state machines
• Defines strong concurrency, yet hides all parallelism in the NPU
  • All rules are evaluated concurrently. The actions of true rules in an event are processed sequentially.
  • Events are processed concurrently (i.e., rules in separate events are processed concurrently).
  • Multiple instances of the same event also process concurrently.
[Diagram: a PPL program made up of policies and events, each event holding rules; rules apply policies. Event labels: logical ports 0,1; logical ports 4-7; logical port 82; exceptions; start up]

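One way to read the evaluation model above is sketched below in Python. It is an interpretation, not the virtual machine's documented behavior: it assumes "evaluated concurrently" means every rule's expressions see the packet as it was when the event started, and that the actions of the rules found true then run one after another in rule order. `Rule` and `run_event` are hypothetical names.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

Packet = Dict[str, int]


@dataclass
class Rule:
    expressions: List[Callable[[Packet], bool]]  # all must be true for the rule to fire
    actions: List[Callable[[Packet], None]]      # run in order when the rule fires


def run_event(rules: List[Rule], packet: Packet) -> Packet:
    """Evaluate all rules against one snapshot, then apply actions sequentially."""
    snapshot = dict(packet)  # expressions never see another rule's side effects
    fired = [r for r in rules if all(e(snapshot) for e in r.expressions)]
    for rule in fired:
        for action in rule.actions:
            action(packet)
    return packet


# Rule 1 marks port-80 packets; rule 2 would drop marked packets.
rules = [
    Rule([lambda p: p["dport"] == 80], [lambda p: p.update(mark=1)]),
    Rule([lambda p: p["mark"] == 1], [lambda p: p.update(drop=1)]),
]
```

Under this reading, a packet that arrives with `mark` equal to 0 is marked by rule 1 but not dropped by rule 2 in the same event, because rule 2 was evaluated before any action ran. A rule with no expressions always fires, which matches a bare `Rule FORWARD`.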
Example of a Rule
    Rule EQ(IP_DEST/16,iptable(1)) EQ(TCP_SYNONLY,1) APPLY(tcpconn)
Means: if the upper 16 bits of the IP destination address match entry 1 in array iptable, and if the packet is a TCP packet with only the SYN flag set, apply the policy labeled tcpconn.

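The two tests in this rule can be spelled out in a few lines of Python. This is a sketch of the described meaning only. It assumes "only the SYN flag set" refers to the six classic TCP flag bits (FIN, SYN, RST, PSH, ACK, URG), and it models `iptable` as a dictionary keyed by entry number so that no indexing base has to be guessed. All names are hypothetical.

```python
import ipaddress

SYN = 0x02
CLASSIC_FLAGS = 0x3F  # FIN, SYN, RST, PSH, ACK, URG (assumed scope of "only SYN")


def upper_bits_equal(a: str, b: str, bits: int) -> bool:
    """Compare the upper `bits` bits of two IPv4 addresses."""
    shift = 32 - bits
    return (int(ipaddress.IPv4Address(a)) >> shift) == (int(ipaddress.IPv4Address(b)) >> shift)


def syn_only(tcp_flags: int) -> bool:
    """True when SYN is the only classic flag set."""
    return (tcp_flags & CLASSIC_FLAGS) == SYN


def rule_matches(ip_dest: str, is_tcp: bool, tcp_flags: int, iptable: dict) -> bool:
    """EQ(IP_DEST/16,iptable(1)) EQ(TCP_SYNONLY,1): both expressions must hold."""
    return is_tcp and upper_bits_equal(ip_dest, iptable[1], 16) and syn_only(tcp_flags)
```

With `iptable = {1: "10.1.0.0"}`, a plain SYN to 10.1.200.9 matches, while a SYN+ACK to the same address or a SYN to 10.2.0.1 does not.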
Easy and Powerful
• Highly robust – prevents many errors and security holes
• Layer-2 interfaces are built in
  • Ethernet, PoS, ATM, SPI4, CSIX, PCI
• Many powerful packet-processing elements built in, e.g.,
  • Payload scanning (absolute and regular expression)
  • Automatic connection lookup/tracking (e.g., TCP, SIP)
  • Content-addressable tables
  • Rate computation
  • Encryption/authentication
  • High-speed, large database, multipattern matching
  • Header insertion/stripping
  • Management of, and operations on, superpackets
• Interface to non-PPL programs in data-, control-, or mgmt plane

PPL Rules
    Rule expression expression … action action …
[Tables not recovered in this text: expression examples; value examples (used in expressions, actions, policies); action examples]

Complete Example
Application: examine all packets going to TCP port 80 to see if they are a GET HTTP transaction with a URL ending with 'redirect.html' and containing a session cookie. For each that is found, store its IP source address in a table (unless it previously exists in the table). Then forward the packet.
    Define myregex = "re ""GET.*?redirect.html[[:space:]]*?HTTP/1.*?Cookie:"""
    Source_track: Policy ASSOCIATE NUMBER(100000) SEARCHKEYS(IP_SOURCE)
    Event(0)
    Rule EQ(IP_PROT,TCP) EQ(L4_DPORT,80) SCAN(myregex) APPLY(Source_track)
    Rule Forward Stop
This is the complete program, i.e., this is the entirety of what you'd have to write for the data plane of the Intel IXP 2xxx.

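The behavior described for this program can be approximated in Python as follows. This is a sketch under stated assumptions, not a port of the PPL runtime: the POSIX class `[[:space:]]` is rewritten as `\s` because Python's `re` module has no POSIX classes, `.` is assumed to match line breaks (otherwise the pattern could not reach a Cookie header on a later line), and entries are assumed to be skipped silently once the table holds 100000 sources. The names `handle`, `source_track`, and `MAX_ENTRIES` are hypothetical.

```python
import re

# Python rendering of myregex; DOTALL lets '.' cross the CRLF between header lines.
PATTERN = re.compile(rb"GET.*?redirect.html\s*?HTTP/1.*?Cookie:", re.DOTALL)

MAX_ENTRIES = 100_000       # NUMBER(100000) in the Source_track policy
source_track = set()        # stands in for the ASSOCIATE table keyed on IP_SOURCE


def handle(ip_proto: int, dport: int, ip_source: str, payload: bytes) -> str:
    """Record the source of matching HTTP GETs, then forward every packet."""
    if ip_proto == 6 and dport == 80 and PATTERN.search(payload):
        # Store the source only if it is new and the table still has room.
        if ip_source not in source_track and len(source_track) < MAX_ENTRIES:
            source_track.add(ip_source)
    return "FORWARD"
```

A request such as `GET /a/redirect.html HTTP/1.1` followed by a `Cookie:` header adds its source address once; repeated requests from the same source leave the table unchanged, and every packet is forwarded either way.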
PPL DeviceMap Statement
How one describes the hardware to the virtual machine and controls configuration and mappings.
    DeviceMap
    NPU(2850,1400)
        NPU is IXP2850 with clock speed of 1400 MHz
    AVAILABLE_PROCESSORS(1,15)
        Microengines 1-15 are available to the PPL virtual machine (meaning 0 is being reserved for something else)
    PPL_PROCESSORS(ER(10%),AE(70%))
        Follow suggestion of allocating 10% of microengine cycles to Ethernet receive, 70% to PPL action processing, and best use of remaining 20%
    PACKET_MEM(DRAM,128000)
        Allow 128 MB for packet memory in DRAM
    CONNECTIONS_MEM(DRAM,16000)
        Allow 16 MB for connection tables in DRAM
    ARRAY_MAP(SERVLIST,0,ext_$$pdkserv)
        For the array SERVLIST in the PPL program, physically map it to control-plane symbol ext_$$pdkserv
    LINK(0,inout,GE_ON_SPI,0,1518,0,0,0,0,IXF1010,0)
        Define a network interface as logical port 0; it is GigE SPI-4 port 0 and port 0 in MAC IXF1010
    LINK(156,out,PCI)
        Define logical port 156 as an output only port over PXD
    PROG(excep_recorder,CONTROL)
        Define a control-plane interface name that the PPL PROGRAM policy can invoke

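The DeviceMap statement shown here is a sequence of `NAME(args)` clauses, some with nested parentheses. As an illustration of that shape, the Python sketch below splits such a statement into clause names and top-level arguments. The real PPL compiler's grammar is not given in the slides, so this is an assumption-laden reading of the example text, and `parse_devicemap` and `split_top_level` are hypothetical helpers.

```python
from typing import List, Tuple


def split_top_level(args: str) -> List[str]:
    """Split on commas that are not inside nested parentheses."""
    parts, depth, current = [], 0, []
    for ch in args:
        if ch == "," and depth == 0:
            parts.append("".join(current).strip())
            current = []
        else:
            if ch == "(":
                depth += 1
            elif ch == ")":
                depth -= 1
            current.append(ch)
    parts.append("".join(current).strip())
    return parts


def parse_devicemap(text: str) -> List[Tuple[str, List[str]]]:
    """Return (clause name, argument list) pairs in the order they appear."""
    body = text.strip()
    if body.startswith("DeviceMap"):
        body = body[len("DeviceMap"):]
    clauses, i, n = [], 0, len(body)
    while i < n:
        if body[i].isspace():
            i += 1
            continue
        start = i
        while i < n and body[i] != "(" and not body[i].isspace():
            i += 1
        name = body[start:i]
        if i >= n or body[i] != "(":
            raise ValueError(f"expected '(' after {name!r}")
        depth, arg_start = 0, i + 1
        while i < n:                      # scan to the matching close paren
            if body[i] == "(":
                depth += 1
            elif body[i] == ")":
                depth -= 1
                if depth == 0:
                    break
            i += 1
        if depth != 0:
            raise ValueError(f"unbalanced parentheses in {name!r}")
        clauses.append((name, split_top_level(body[arg_start:i])))
        i += 1                            # step past the closing paren
    return clauses
```

Applied to the first clauses of the example, the parser keeps `ER(10%)` and `AE(70%)` together as two arguments of `PPL_PROCESSORS` instead of splitting inside the nested parentheses.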
Interfacing to Outside Programs
[Diagram: the PPL program connected to software on an IA host processor, software on the XScale control plane, and custom or customer NPU microcode, with the Intel Portability Framework and NPF APIs also shown. The arrows are not recoverable from this text; the interface mechanisms are listed below in the order they appear.]
• FORWARD packet
• PROGRAM to invoke XScale program (RPC)
• Share memory
• FORWARD packet
• PROGRAM to invoke remote program
• Share memory
• Enqueue on PPL VM input ring
• Send packet to PPL event
• Send packet to anywhere PPL program can
• Invoke PPL event (RPC)
• Enqueue on a ring
• Share memory
• Share memory
• Enqueue on PPL VM input ring

PPL Summary
• Powerful, easy to use, functional (not procedural) language
• Main elements are rules, policies, events
• Defines strong concurrency, yet hides all parallelism in the NPU
• Highly robust – prevents many errors and security holes
• Many powerful packet-processing elements built in, e.g.,
  • Payload scanning (absolute and regular expression)
  • Automatic connection lookup/tracking (e.g., TCP, SIP)
  • Content-addressable tables
  • Rate computation
  • Encryption/authentication
  • High-speed, large database, multipattern matching
  • Header insertion/stripping
  • Management of, and operations on, superpackets
• Interface to non-PPL programs in data-, control-, or mgmt plane

Complete Software Solution
• Be running in, literally, days
• No need to use Intel SDK, Intel microcode, learn the IXP programming details, etc., unless you want to write low-level microcode
[Diagram components, in the order they appear:]
• PPL debug GUI, PPL compiler, PPL transactor
• Windows or Linux computer
• Customer PPL
• PPL applications, e.g., signature analysis, IPv4/v6 translation, layer 7 content switch, encryption gateway, …
• Customer control plane software
• PPL virtual machine
• Control plane interfaces (i.e., NPF APIs)
• Customer mgmt plane software
• Receivers/transmitters for Ethernet, CSIX, PCI, POS/PPP, …
• Extensions for high-speed multi-pattern searching, IPSec, superpackets, PXD, etc.
• PPL system initialization, PPL debug, logging, stats
• PXD high-speed packet interface
• NPU data-plane microengines
• XScale
• "Pentium"

Translated to Time and Cost
[Chart: Time to Market, in months, until a functional, measurable, live prototype is available. Compares "Develop NPU hardware and data-plane software from scratch" with "Deploy off-the-shelf NPU hardware and PPL for data-plane software". Values are not recoverable from this text.]
[Chart: NPU Software Development Cost, in $ million. Values are not recoverable.]
[Chart: NPU Software Life-Cycle Cost*, in $ million, with a label "Subscription and royalty". Values are not recoverable.]
* includes maintenance, product enhancement, one port to different NPU model