cPredicates*: Software-controlled L7 Router Classification
George Porter, with Christoph Schuba and Randy Katz
UC Berkeley OASIS Summer Retreat 2004
* short for “Classification Predicates”
cPredicates Overview
• What is it?
  • Router building block that enables software-based network services in PNEs* without datapath technology knowledge
• What does it enable?
  • PNE switching decisions based on L7 features
  • HTTP URL, iSCSI command type, XML object
• What is new about this approach?
  • We don’t look at the whole stream
  • cPredicates abstract away NPUs, FPGAs, etc. with a clean interface that supports multiple services
* PNE = Programmable Network Element
Motivation: Horizontally Scaled Systems (HSS)
• Desired policy:
  • HTTP: “GET /images/*” across Web 2 and Web 3
  • XML: “WorkOrder” objects to App 2
  • iSCSI: Logical Unit (LUN) 3 to LUN 4
Network device needs visibility into the application layer
[Figure: multi-tier topology with LB / Firewall, Web 1-3, App 1-3, DB 1-2, and Storage, labeled Tier 0 through Tier 3]
Abstracted View
[Figure: packets entering an LB / Firewall made of a Switch Fabric (Layer 4) and a CPU (Layer 7); labels: HTTP, Web 1-4; XML, SOAP, App 1-3; JDBC, DB 1-2; iSCSI, Storage]
• Switch Fabric: 5-tuple based (L2-L4)
• Packets sent to CPU for L7 processing, but
  • Too expensive to send all packets through CPU
• So instead:
  • CPU installs a 5-tuple map into the switch fabric
  • Rest of flow handled by switch fabric alone
What is the problem with that?
Pipelined HTTP:
  GET /index.html HTTP/1.0 ...
  HTTP/1.0 200 OK
  html data ... ... </html>
  GET /images/top.gif HTTP/1.0 ...
  HTTP/1.0 200 OK
  image data
  GET /images/sidebar.gif HTTP/1.0 ...
  HTTP/1.0 200 OK
  image data
iSCSI:
  Operation: WRITE Block: 13210 LUN: 3 Length: 64
  data
  Operation: READ Block: 5622 LUN: 4 Length: 32
  data
  Operation: WRITE Block: 912
XML:
  <workorder>
    <order number=1> order information </order>
    <order number=2> order information </order>
    <customer> <name>Oski</name> </customer>
  </workorder>
  <customer id=3283>
    <name>UC Berkeley</name>
  </customer
Example of what goes wrong
  GET /index.html HTTP/1.0 ...
  HTTP/1.0 200 OK
  html data ... ... </html>
  GET /images/top.gif HTTP/1.0 ...
  HTTP/1.0 200 OK
  image data
  GET /images/sidebar.gif HTTP/1.0 ...
  HTTP/1.0 200 OK
  image data
Example of what goes wrong
Sub-request #1:
  GET /index.html HTTP/1.0 ...
  HTTP/1.0 200 OK
  html data ... ... </html>
Sub-request #2:
  GET /images/top.gif HTTP/1.0 ...
  HTTP/1.0 200 OK
  image data
Sub-request #3:
  GET /images/sidebar.gif HTTP/1.0 ...
  HTTP/1.0 200 OK
  image data
Example of what goes wrong
Sub-request #1:
  GET /index.html HTTP/1.0 ...
  HTTP/1.0 200 OK
  html data ... ... </html>
Sub-request #2:
  GET /images/top.gif HTTP/1.0 ...
  HTTP/1.0 200 OK
  image data
Sub-request #3:
  GET /images/sidebar.gif HTTP/1.0 ...
  HTTP/1.0 200 OK
  image data
[Figure: Switch Fabric holding the entry <5 tuple> Server 3, and CPU, with the labels “Action: Send to Server 3” and “Action: <none>” set against sub-requests #1 to #3]
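The failure on this slide can be reproduced with a small simulation. This is only a sketch under stated assumptions: the class `BaselineSwitch`, the policy in `cpu_decide` (images to Web 2, everything else to Server 3), and the sample 5-tuple are hypothetical and are not from the talk. It models a fabric that consults the CPU for the first packet of a flow and then forwards on the 5-tuple alone.

```python
from typing import Dict, List, Tuple

# (src ip, src port, dst ip, dst port, protocol)
FiveTuple = Tuple[str, int, str, int, str]


class BaselineSwitch:
    """Hypothetical 5-tuple-only fabric: the CPU sees each flow exactly once."""

    def __init__(self) -> None:
        self.flow_table: Dict[FiveTuple, str] = {}
        self.cpu_seen: List[str] = []

    def cpu_decide(self, payload: str) -> str:
        # Assumed policy for illustration: image requests go to Web 2,
        # everything else goes to Server 3.
        self.cpu_seen.append(payload)
        return "Web 2" if payload.startswith("GET /images/") else "Server 3"

    def forward(self, flow: FiveTuple, payload: str) -> str:
        if flow not in self.flow_table:
            # First packet of the flow: the CPU inspects L7 and installs
            # the 5-tuple map into the switch fabric.
            self.flow_table[flow] = self.cpu_decide(payload)
        # Every later packet is handled by the fabric alone.
        return self.flow_table[flow]


flow = ("10.0.0.1", 40000, "10.0.0.100", 80, "tcp")
requests = [
    "GET /index.html HTTP/1.0",
    "GET /images/top.gif HTTP/1.0",
    "GET /images/sidebar.gif HTTP/1.0",
]
switch = BaselineSwitch()
destinations = [switch.forward(flow, r) for r in requests]
print(destinations)          # all three sub-requests follow the first decision
print(len(switch.cpu_seen))  # the CPU only ever saw sub-request #1
```

Because all three pipelined sub-requests share one 5-tuple, the two image requests inherit the decision made for `/index.html` and the image policy is never applied.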
Problem Statement: Need efficient, selective processing
HTTP:
  GET /index.html HTTP/1.0 ...
  HTTP/1.0 200 OK
  html data ... ... </html>
  GET /images/top.gif HTTP/1.0 ...
  HTTP/1.0 200 OK
  image data
  GET /images/sidebar.gif HTTP/1.0 ...
  HTTP/1.0 200 OK
  image data
iSCSI:
  Operation: WRITE Block: 13210 LUN: 3 Length: 64
  data
  Operation: READ Block: 5622 LUN: 4 Length: 32
  data
  Operation: WRITE Block: 912
XML:
  <workorder>
    <order number=1> order information </order>
    <order number=2> order information </order>
    <customer> <name>Oski</name> </customer>
  </workorder>
  <customer id=3283>
    <name>UC Berkeley</name>
  </customer
New Idea: cPredicates
• Insert a predicate P() with the 5-tuple
  • P() evaluated on each packet
  • Packets that satisfy P() are sent to the CPU
  • Otherwise, handled by switch
• Result: CPU can now selectively process flow
  • Without knowledge of switch’s NPU, FPGA, etc.
• P()s enabled by NPU advances
[Figure: entry (<5 tuple> server i), P() installed in the Switch Fabric by the CPU]
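The dispatch rule on this slide can be sketched as a flow table whose entries carry a predicate next to the destination. The names `FlowEntry`, `PredicateSwitch`, `install`, and `handle` are hypothetical and chosen for illustration; the talk does not define a concrete interface, and real hardware would evaluate P() in the datapath, not in Python.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional, Tuple

FiveTuple = Tuple[str, int, str, int, str]
Predicate = Callable[[bytes], bool]


@dataclass
class FlowEntry:
    server: str           # fast-path destination ("server i" on the slide)
    predicate: Predicate  # P(): True means "send this packet to the CPU"


class PredicateSwitch:
    """Hypothetical fabric whose flow entries are (<5 tuple> server i), P()."""

    def __init__(self) -> None:
        self.table: Dict[FiveTuple, FlowEntry] = {}
        self.cpu_queue: List[bytes] = []

    def install(self, flow: FiveTuple, server: str, predicate: Predicate) -> None:
        self.table[flow] = FlowEntry(server, predicate)

    def handle(self, flow: FiveTuple, packet: bytes) -> str:
        entry: Optional[FlowEntry] = self.table.get(flow)
        # Unknown flows and packets that satisfy P() go to the CPU;
        # everything else stays on the fast path.
        if entry is None or entry.predicate(packet):
            self.cpu_queue.append(packet)
            return "cpu"
        return entry.server


switch = PredicateSwitch()
flow = ("10.0.0.1", 40000, "10.0.0.100", 80, "tcp")
switch.install(flow, "Server 3", lambda pkt: pkt.startswith(b"GET "))

packets = [
    b"GET /index.html HTTP/1.0\r\n",
    b"Host: a\r\n\r\n",
    b"GET /images/top.gif HTTP/1.0\r\n",
    b"Host: a\r\n\r\n",
]
paths = [switch.handle(flow, p) for p in packets]
print(paths)
```

With the predicate installed, each new request line in the flow reaches the CPU while the remaining packets stay in the switch fabric.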
What does P() look like? Which packets go to the CPU?
• P() == true
  • All packets
• P() == false
  • No packets (most common today)
• P() == exact_match(pattern)
  • Any portion of packet matches pattern
• P() == range_match(x <= seqnum <= y)
  • Specific field in packet lies in the range [x, y]
• P() == regexp(pattern)
  • Packets matching regular expression
• Others, depending on HW availability
  • New opportunity to define minimum list of P()s needed
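The predicate kinds listed above can be written as small constructors that each return a function answering "does this packet go to the CPU?". This is a sketch, assuming a predicate receives the packet payload and one numeric field (here the TCP sequence number); the signature and the names `always` and `never` are illustrative choices, not an interface from the talk.

```python
import re
from typing import Callable

# Assumed shape: (payload, sequence number) -> send to CPU?
Predicate = Callable[[bytes, int], bool]


def always() -> Predicate:
    """P() == true: every packet goes to the CPU."""
    return lambda payload, seq: True


def never() -> Predicate:
    """P() == false: no packet goes to the CPU."""
    return lambda payload, seq: False


def exact_match(pattern: bytes) -> Predicate:
    """Any portion of the packet matches the literal pattern."""
    return lambda payload, seq: pattern in payload


def range_match(x: int, y: int) -> Predicate:
    """The chosen field lies in the inclusive range x <= seqnum <= y."""
    return lambda payload, seq: x <= seq <= y


def regexp(pattern: bytes) -> Predicate:
    """The packet matches a regular expression (compiled once)."""
    compiled = re.compile(pattern)
    return lambda payload, seq: compiled.search(payload) is not None


# Usage: each call returns True when the packet should be sent to the CPU.
print(exact_match(b"<workorder>")(b"<workorder><order number=1>", 0))
print(range_match(100, 200)(b"", 150))
print(regexp(rb"GET \S+ HTTP/1\.0")(b"image data", 0))
```

Keeping every predicate behind the same call signature is what lets a service writer swap one for another without knowing how the datapath evaluates it.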
cPredicates enable these protocols
We can now support:
• Pipelined HTTP
  • P() == regexp(GET * HTTP/1.0)
• iSCSI Storage Protocol
  • P() == range_match(x <= tcpseqnum <= y)
• XML object switching
  • P() == exact_match(<workorder>)
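The three per-protocol predicates can be tried against sample packets. A sketch with assumptions: the slide's `GET * HTTP/1.0` reads like a wildcard pattern, so it is hand-translated here into the regular expression `GET \S+ HTTP/1\.0`; the function names and the sequence-number window 1000 to 1047 are made up for the example.

```python
import re
from typing import Callable

Predicate = Callable[[bytes, int], bool]  # (payload, tcp seqnum) -> to CPU?

# Hand-written equivalent of the slide's "GET * HTTP/1.0" (an assumption).
_HTTP_GET = re.compile(rb"GET \S+ HTTP/1\.0")


def pipelined_http(payload: bytes, seq: int) -> bool:
    """regexp: every request line in a pipelined connection hits the CPU."""
    return _HTTP_GET.search(payload) is not None


def iscsi_header_window(x: int, y: int) -> Predicate:
    """range_match: only packets whose TCP seqnum is in [x, y] hit the CPU."""
    def predicate(payload: bytes, seq: int) -> bool:
        return x <= seq <= y
    return predicate


def xml_workorder(payload: bytes, seq: int) -> bool:
    """exact_match: packets containing the literal <workorder> tag hit the CPU."""
    return b"<workorder>" in payload


http_hits = [
    pipelined_http(p, 0)
    for p in (b"GET /index.html HTTP/1.0", b"image data", b"GET /images/top.gif HTTP/1.0")
]
iscsi = iscsi_header_window(1000, 1047)  # hypothetical window for one header
iscsi_hits = [iscsi(b"", s) for s in (952, 1000, 1048)]
xml_hits = [
    xml_workorder(p, 0)
    for p in (b"<workorder><order number=1>", b"<customer id=3283>")
]
print(http_hits, iscsi_hits, xml_hits)
```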
New services with software
• Unlike before, now CPU-based network services can selectively process L7 flows
  • Pipelined HTTP, iSCSI, XML, etc.
• PNE datapaths export a list of predicates
  • Service writers only care about predicates, not underlying technology
  • Clean, abstracted interface can enable new innovations
Storage Example
[Figure: Switch Fabric with HighPrio and LowPrio outputs (2 Mb/s) and a CPU; P() is (seqnum == X); decision: high?]
• Action in switch marks packets as HighPriority or LowPriority
• Only iSCSI headers are sent to the CPU (data goes through the switch)
• CPU updates the mark that the switch sets on subsequent data packets
• iSCSI: < 5% of packets go to CPU
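The header-only processing described above can be sketched as a loop in which the CPU, on each iSCSI header, updates the mark and moves X to the sequence number of the next header. The 48-byte header is the iSCSI Basic Header Segment size; everything else is an assumption for illustration: each header arrives in its own packet, data is cut into 1460-byte packets, packets arrive in order, and LUN 3 is treated as high priority. The class and function names are hypothetical.

```python
from dataclasses import dataclass
from typing import List, Tuple

HEADER_LEN = 48  # iSCSI Basic Header Segment size in bytes
MSS = 1460       # assumed TCP payload size of a full data packet


@dataclass
class Packet:
    seq: int           # TCP sequence number of the first payload byte
    length: int        # payload bytes in this packet
    lun: int = -1      # meaningful only in header packets
    data_len: int = 0  # meaningful only in header packets


def build_stream(pdus: List[Tuple[int, int]]) -> List[Packet]:
    """Turn (lun, data_len) pairs into one header packet plus data packets."""
    packets: List[Packet] = []
    seq = 0
    for lun, data_len in pdus:
        packets.append(Packet(seq, HEADER_LEN, lun, data_len))
        seq += HEADER_LEN
        remaining = data_len
        while remaining > 0:
            chunk = min(MSS, remaining)
            packets.append(Packet(seq, chunk))
            seq += chunk
            remaining -= chunk
    return packets


class StorageSwitch:
    def __init__(self) -> None:
        self.next_header_seq = 0  # X in P(): seqnum == X
        self.mark = "LowPrio"
        self.cpu_packets = 0

    def cpu_process(self, pkt: Packet) -> None:
        # The CPU reads the header, picks the mark (assumed policy:
        # LUN 3 is high priority), and re-arms P() for the next header.
        self.cpu_packets += 1
        self.mark = "HighPrio" if pkt.lun == 3 else "LowPrio"
        self.next_header_seq = pkt.seq + HEADER_LEN + pkt.data_len

    def handle(self, pkt: Packet) -> str:
        if pkt.seq == self.next_header_seq:  # P() satisfied
            self.cpu_process(pkt)
        return self.mark  # the switch marks the packet with the current mark


stream = build_stream([(3, 65536), (4, 32768), (3, 65536)])
switch = StorageSwitch()
marks = [switch.handle(p) for p in stream]
print(switch.cpu_packets, len(stream))
```

In this made-up trace only the three header packets reach the CPU; the exact fraction depends entirely on the assumed transfer sizes and packet size.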
Storage Example
[Figure: File1 through File7]
Summary
• Heterogeneous, horizontally scaled systems call for in-network selective protocol processing
• cPredicates separate data from control in datapath
  • CPU controls which features of L7 protocol it sees
  • Without knowledge of underlying technology
• Enables efficient, software-based processing of whole flows, not just the first part of them
• Enables switching based on app-level features
• Low overhead and latency because of advances in NPU technology
Future Directions
• Develop predicates based on H/W devices I have access to:
  • Nortel 2424, Sun Puma, Sun Nauticus, MIT Click
• Evaluate impact on application performance
  • Focus on video, XML, web services, and HSS
• Recommend new H/W functionality to support interesting predicates
  • Prototype in Click, emulate new H/W architectures
• Recommend canonical set of P()s
Questions?
• Thanks to Christoph Schuba, Mohamed Hefeeda, and Sumantra Kundu