Semantic Multicast • Maximilian Ott • Semandex Networks, Inc. • WINLAB, Rutgers U.
Creating Networks That Know™ • Semantic Multicast is the next evolution • Content: XML Routing • Name: URL Routing • Address: IP Routing
Information Networks • Schema: <Product Return> containing <Customer> (<Name>, <Address>), <Product> (<Type>, <Model>), <Reason> • Customer Service: All <Product Return> • Sales Rep.: My <Customers> AND within 10 miles • Product Manager: My <Product.Model> AND <Reason> == Fault
Tailoring Information to the User’s Needs • XML profiles filter information within the network, meeting need-to-know, bandwidth and device constraints
Scalable Real-time Information Delivery User Interest Profile XML Descriptor Content Router B Content Router A Content Provider
Distributing Profiles • [Figure: users U1 and U2, nodes A and B, profile P_AB; steps 1, 2, 3a, 3b, 4]
Example: Interest Descriptor
<ri:RI ..>
  <ri:And>
    <d:Locale>
      <d:Position>
        <ri:distance within='10' units='mi' latitude='40' longitude='-74.4'/>
      </d:Position>
    </d:Locale>
    <d:Genre>
      <ri:Or>
        <ri:string-match>Jazz</ri:string-match>
        <ri:string-match>Rock</ri:string-match>
      </ri:Or>
    </d:Genre>
  </ri:And>
</ri:RI>
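A minimal sketch of how the interest descriptor above could be evaluated against a piece of content. The function names, the dict layout of the content, the haversine distance, and the substring reading of `string-match` are all assumptions made for illustration; the slides do not specify any of them.

```python
import math

EARTH_RADIUS_MI = 3958.8  # mean Earth radius in miles


def distance_mi(lat1, lon1, lat2, lon2):
    """Great-circle distance in miles (haversine formula)."""
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dphi = p2 - p1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_MI * math.asin(math.sqrt(a))


def matches_interest(content):
    """AND of: position within 10 mi of (40, -74.4); genre contains Jazz OR Rock.

    `content` is a hypothetical dict: {"position": (lat, lon), "genre": str}.
    """
    lat, lon = content["position"]
    near = distance_mi(lat, lon, 40.0, -74.4) <= 10.0
    genre_ok = any(term in content["genre"] for term in ("Jazz", "Rock"))
    return near and genre_ok
```

A jazz event a few miles from the reference point matches; the same event about 69 miles north, or a classical event at the reference point, does not.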
Finding Information & Keeping it Current • Netlink supports distributed search with real-time content updates from relevant sources • [Figure: Spotter ("Need a map"), MD, LiveDocument Repository; steps 1, 2, 3]
Deploying Networks That Know™ • An overlay hierarchy of network servers provides scalable information connectivity • Intranet • Internet • Extranet
Why not IP Multicast? • Need to map multi-dimensional information space into 1-D address range • “Red Mustang, built 1972-75, < $4K” • Requires “carving-up” the information space • Wide channels => receive lots of junk • Narrow channels => publish on multiple channels, potentially receive multiple times • MC address does NOT contain semantic meaning • Cable channel syndrome • Nobody is using it, anyway
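The contrast with address-based channels can be sketched with the slide's own example. Here the interest is a predicate over several dimensions at once, so nothing has to be carved into 1-D channels. The field names (`color`, `model`, `year`, `price`) and the `route` helper are hypothetical and only illustrate the idea.

```python
def red_mustang_profile(item):
    """Profile for: red Mustang, built 1972-75, under $4K."""
    return (
        item.get("color") == "red"
        and item.get("model") == "Mustang"
        and 1972 <= item.get("year", 0) <= 1975
        and item.get("price", float("inf")) < 4000
    )


def route(item, profiles):
    """Return the names of consumers whose profile matches the item."""
    return [name for name, predicate in profiles.items() if predicate(item)]


profiles = {"u1": red_mustang_profile}
listing = {"color": "red", "model": "Mustang", "year": 1973, "price": 3500}
recipients = route(listing, profiles)
```

The publisher sends each item once and each consumer receives it at most once, which is the property the wide-channel and narrow-channel options above cannot both provide.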
Why not here? • Conventional wisdom: can’t work, too many cycles per packet • 3 GHz CPUs and counting • [Figure: CPU cycles/packet vs. packets/sec]
Test Environment for Full Routing Routing Plane XMLParser Filter Producer Consumer SemNativeIn SemNativeOut SemSock SemSock Packets Packets SemDatagram SemDatagram SemDatagram SemDatagram
System Performance • [Figure: Window Size vs. Throughput (Mbps); window sizes 24, 32, 64 and 128; data labels 47.3357, 46.86, 46 and 45.53 Mbps] • Note: Ethernet port needs to handle TWICE the traffic!
What about Security? • No end-to-end secrets shared between producers and consumers • Provides link-by-link validation • Full encryption is too expensive • 0.00581 bytes/us/MHz (99.9% linear) • 26.6 Mbps on target platform (single fan-out) • SHA-1 Signature • Signature only needs to be checked / 257Mbps • Encrypt signature + sequence counter
Security needs CPU cycles • Encrypting small (32-byte) headers with a shared-secret key prevents malicious agents from inserting packets into the data stream • Signing packets with a SHA-1 signature prevents unauthorized changes to the content of the packet (e.g. to change the Content Descriptor) • Overall impact is a drop in throughput from 47 Mbps to 37 Mbps
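A rough sketch of link-by-link validation with a shared secret. The slides describe a SHA-1 signature that is encrypted together with a sequence counter; this example swaps in HMAC-SHA1 from the Python standard library as a stand-in for that combination, so it is not the scheme used in the system. The function names and the 8-byte counter encoding are assumptions.

```python
import hashlib
import hmac
import struct


def sign_packet(link_key: bytes, seq: int, payload: bytes) -> bytes:
    """Return a 20-byte HMAC-SHA1 tag over the sequence counter and payload."""
    msg = struct.pack("!Q", seq) + payload  # 8-byte big-endian counter (assumed)
    return hmac.new(link_key, msg, hashlib.sha1).digest()


def verify_packet(link_key: bytes, seq: int, payload: bytes, tag: bytes) -> bool:
    """Check the tag in constant time; fails on a changed payload or counter."""
    expected = sign_packet(link_key, seq, payload)
    return hmac.compare_digest(expected, tag)
```

Because the key is shared only between the two ends of a link, each router can validate and re-sign a packet without producers and consumers sharing any end-to-end secret.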
Nice numbers, but does it scale? • Processing is per packet with no inter-packet state • Fully parallelizable • Scales with CPU speed? • Investigate individual aspects of the routing procedure • Use sample XML data to generate raw data • Characterize throughput by fitting lines or curves to the data • Combine results into a single formula • Verify formula using production data
Putting it all together
• InputTime = 0.1 PSize^2 + 8.9 PSize + 45.2
• ParseTime = 0.8 CDesc^2 + 49.1 CDesc
• MatchTime = 23.16 CDesc NCsmrs
• OutputTime = 26.5 PSize (0.55 + 0.4 NCsmrs^2)
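The fitted terms above can be written as a small cost model. This sketch assumes that the trailing "2" on PSize, CDesc and NCsmrs is a square, and that the single combined formula is the plain sum of the four terms; neither is confirmed by the slide, and the units of time and size are not stated, so the results are only meaningful relative to each other.

```python
def input_time(psize):
    """Input cost as a function of packet size (PSize)."""
    return 0.1 * psize ** 2 + 8.9 * psize + 45.2


def parse_time(cdesc):
    """Parse cost as a function of content descriptor size (CDesc)."""
    return 0.8 * cdesc ** 2 + 49.1 * cdesc


def match_time(cdesc, ncsmrs):
    """Match cost: linear in descriptor size and number of consumers (NCsmrs)."""
    return 23.16 * cdesc * ncsmrs


def output_time(psize, ncsmrs):
    """Output cost: grows with packet size and the fan-out to consumers."""
    return 26.5 * psize * (0.55 + 0.4 * ncsmrs ** 2)


def total_time(psize, cdesc, ncsmrs):
    """Assumed combination: sum of the four per-packet stages."""
    return (
        input_time(psize)
        + parse_time(cdesc)
        + match_time(cdesc, ncsmrs)
        + output_time(psize, ncsmrs)
    )
```

Under these assumptions the output stage dominates as the number of consumers grows, since it is the only term that is quadratic in NCsmrs.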
Does it work out in the wild?
• Netlink XML Information Appliance
• CANADA: Adapter: CANMaps (Vector/Raster Maps)
• UNITED KINGDOM: Adapter: GRBMaps (Raster Maps)
• USPACOM: GCCS TRACKS, Adapter: OTH_G
• HANSCOM: MIDB Database, Adapter: MIDB
• DAHLGREN, VA: Adapter: AusCOINS
• NIMA: Adapters: USAMaps (Vector/Raster Maps), USA_IPL (Imagery), USAMidb
• AUSTRALIA: COINS, Adapter: AusCOINS
Other Application Scenarios • Knowledge Sharing • Extensions to IM • Media distribution to mobile devices (pro-active caching) • Sensor networks (focus on information aggregates)
Can’t be all Sugar & Spice • Current routing algorithm requires spanning tree • Not very robust • Has self-discovery and self-healing, but with delay • Centralized Management • Is a requirement for current customer set • Needs to be extended to autonomous regions • Tunnel protocol is based on UDP • Nobody likes UDP • Works really badly on bad channels
Information Food Chains • Interpreted data feeds back into the same system, but using a different schema • [Figure label: Sensor]
Autonomous Living • “Global” Information Space • Consumes Information & Requests • Produces Information & Requests • Autonomous Entity • Observes / Affects Environment
Conclusion • Routing based on content descriptor • Symmetric use: distribution and discovery • Scaling properties similar to IP • Network is generic; specialization on the edge • Fully parallelizable, relatively linear • Works, in use, makes money • Needs more applications, more robust topologies
Semantic Multicast Max Ott max@semandex.net http://www.semandex.net/whitepaper/semantic_multicast.pdf