Offloading Multimedia Proxies using Network Processors
A presentation by Øyvind Hvamstad, 19 Nov. 2004

Domain overview
• Distributed media on demand (MoD)
• Standardized protocols (RTSP/RTP)
• Utilizing proxies
  • with network processing units
[Figure labels: IXP1200; OUR FOCUS; stream setup with RTSP; stream transport with RTP]

Multimedia stream characteristics
• High data rates
  • Require much bandwidth and storage space.
  • Depend on codec, quality and length.
• Soft real-time requirements
  • Perceived quality is sensitive to jitter.
• Access patterns
  • Zipf distribution (10% requested 90% of the time)
  • Newly published material tends to be popular.
  • Consumed from start to end or quickly aborted.
  • "Write once, read many"
Example: a one-hour MPEG-2 movie at an average bit rate of 3.5 Mbps takes up 1.6 GB of storage. DivX can reduce this by a factor of 6.5, bringing the size down to 246 MB.
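
The storage figures in the example can be reproduced with a few lines of C. This is only a sketch: the function names are hypothetical, and it assumes decimal units (1 Mbps = 1,000,000 bit/s, 1 GB = 10^9 bytes), which is what the slide's numbers appear to use.

```c
#include <stdint.h>

/* Bytes needed to store `seconds` of media at an average rate of
 * `bits_per_second`. Decimal units are an assumption. */
uint64_t stream_bytes(uint64_t seconds, uint64_t bits_per_second)
{
    return seconds * bits_per_second / 8u;
}

/* Size after a codec shrinks the stream by `factor` (e.g. 6.5).
 * The result is truncated to whole bytes. */
uint64_t compressed_bytes(uint64_t bytes, double factor)
{
    return (uint64_t)((double)bytes / factor);
}
```

Calling `stream_bytes(3600, 3500000)` gives 1,575,000,000 bytes, which rounds to the 1.6 GB on the slide. Dividing the unrounded value by 6.5 gives roughly 242 MB; the slide's 246 MB follows from dividing the rounded 1.6 GB instead.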

The MoD Proxy
• Deployed in client vicinity to:
  • Reduce client startup latency
  • Reduce server load
  • Reduce network load
• Must face the challenges of:
  • Many concurrent clients
  • Possibly high aggregate network load
  • CPU-intensive tasks

MoD proxy tasks
• Caching
• Protocol translation
• Transcoding
• Re-encoding
• Encryption
• General functions
• Access control
• QoS mechanisms
• Traffic engineering
• Forwarding data and requests (OUR FOCUS)

Exactly what?
• Offload application-layer packet forwarding.
  • Free cycles on the host CPU for other tasks.
• Show improvements compared to a traditional architecture.
  • Reduced latency
• Provide a basis for future work in the area.
  • Extensions
    • Caching
    • Zero copy

Design
[Diagram: the proxy is split into a control plane and a data plane.
 Control plane: RTSP server (to/from client), RTSP client (to/from server), Cache control and Session Mgmt.; calls shown: lookup(), insert(), remove(), signal().
 Data plane: RTP client (from server), RTP server (to client) and Cacher; calls shown: lookup(), fast_forward(), fetch(), write_through(async).]
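
The diagram names `lookup()`, `insert()`, `remove()` and `fast_forward()` but does not give their signatures. The sketch below is one plausible reading, not the thesis code: the control plane maintains a session table, and the data plane consults it per RTP packet. The struct layouts, the `session_` prefixes, the port-based key and the table size are all assumptions made for illustration.

```c
#include <stddef.h>
#include <stdint.h>

#define MAX_SESSIONS 64u        /* assumed capacity */

/* One forwarding entry: RTP arriving on proxy_port goes to the client. */
struct session {
    int      in_use;
    uint16_t proxy_port;        /* lookup key (assumption) */
    uint32_t client_addr;       /* IPv4 address, host byte order */
    uint16_t client_port;
};

/* Only the header fields the forwarder rewrites (hypothetical layout). */
struct packet {
    uint32_t dst_addr;
    uint16_t dst_port;
};

static struct session sessions[MAX_SESSIONS];

/* Find the entry for a port, or NULL if there is none (linear scan). */
struct session *session_lookup(uint16_t proxy_port)
{
    size_t i;
    for (i = 0; i < MAX_SESSIONS; i++)
        if (sessions[i].in_use && sessions[i].proxy_port == proxy_port)
            return &sessions[i];
    return NULL;
}

/* Add or update an entry. Returns 0 on success, -1 if the table is full. */
int session_insert(uint16_t proxy_port, uint32_t client_addr,
                   uint16_t client_port)
{
    struct session *s = session_lookup(proxy_port);
    size_t i;
    for (i = 0; s == NULL && i < MAX_SESSIONS; i++)
        if (!sessions[i].in_use)
            s = &sessions[i];
    if (s == NULL)
        return -1;
    s->in_use = 1;
    s->proxy_port = proxy_port;
    s->client_addr = client_addr;
    s->client_port = client_port;
    return 0;
}

/* Drop an entry. Returns 0 if it existed, -1 otherwise. */
int session_remove(uint16_t proxy_port)
{
    struct session *s = session_lookup(proxy_port);
    if (s == NULL)
        return -1;
    s->in_use = 0;
    return 0;
}

/* Data plane: on a hit, rewrite the destination and return 1.
 * On a miss, return 0 so the packet can go to the control plane. */
int fast_forward(struct packet *p)
{
    struct session *s = session_lookup(p->dst_port);
    if (s == NULL)
        return 0;
    p->dst_addr = s->client_addr;
    p->dst_port = s->client_port;
    return 1;
}
```

In this reading the RTSP side would call `session_insert()` once a stream is set up and `session_remove()` at teardown, so the per-packet path never touches RTSP state.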

Prototype implementation
[Diagram: Intel ACEs.
 Control plane (StrongARM; Linux run-time and IXA run-time): RTSP Proxy, StackACE, Ingress coreACE, Classifier/RTP-fwd coreACE, Egress coreACE.
 Data plane (microengines µe 0 and µe 1): Ingress microACE, Classifier/RTP-fwd microACE, Egress microACE.]

Experiments
• Measure the processing overhead during RTP forwarding.
  • Cycle precision
  • Probes at different locations in the code.
  • Minimal probe overhead.
[Diagram labels, probe placement: Ingress microACE, Process microACE, Egress microACE, Probe, Probe]
[Diagram labels, test setup: Switch, Dell GX260, Darwin streaming server, rclient.py, eth0, IXP1200 Proxy]

Results
• Offloading effect
  • 100% of all network traffic is processed by the StrongARM and the microengines.
• Prototype performance
  • Processes every RTP packet using about a tenth of the cycles of a traditional architecture (delay reduced from ~80 µs to ~8 µs at 232 MHz).
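
To relate the quoted delays to cycle counts: at 1 MHz one cycle takes one microsecond, so a delay in microseconds multiplied by the clock in MHz gives cycles. The helper below is a hypothetical illustration of that arithmetic only and is not taken from the measurement code.

```c
#include <stdint.h>

/* Clock cycles that elapse in `microseconds` at a clock of `mhz` MHz. */
uint64_t us_to_cycles(uint64_t microseconds, uint64_t mhz)
{
    return microseconds * mhz;
}
```

With the slide's approximate figures, `us_to_cycles(80, 232)` is 18,560 cycles and `us_to_cycles(8, 232)` is 1,856 cycles, which matches the "about a tenth" claim.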

Extensions
• Write-through caching
  • Make a multicaster by copying packets.
  • Use a lazy-copy strategy to reduce copy operations per packet.
  • Send payload copy to the host.
• Zero-copy path
  • Batched transfer to the host.
  • Scatter-gather DMA to assemble packet payloads in host memory.
  • Large disk requests.

Conclusion
• Prototype relevance
  • Data will always flow through the proxy.
  • The forwarder processes an RTP packet efficiently compared to a traditional architecture.
  • Offloading is thus an orthogonal way to improve MoD proxies.
• Low resource utilization leaves room for extensions.

Other applications
• The idea might be more applicable in other, more real-time areas.
• Online games
  • Proxy holds game state.
  • Might also need to forward real-time data while playing.
    • Live voice communication
    • Other urgent game data
• Video conferences
  • Node that handles overlay multicasting.
  • Real-time data forwarded with low latency.

Linear modulo operator

Intel Assembler macro:

    #macro Mod[out_z, in_x, in_y]
    .local xm
        alu[xm, --, B, in_x]        ; xm = in_x
    Loop#:
        alu[out_z, xm, -, in_y]     ; out_z = xm - in_y
        br<0[End#]                  ; xm < in_y: xm is the remainder
        alu[xm, --, B, out_z]       ; xm = out_z
        alu[--, xm, -, in_y]        ; test xm - in_y
        br>=0[Loop#]                ; still >= in_y: subtract again
    End#:
        alu[out_z, --, B, xm]       ; out_z = xm
    .endlocal
    #endm

ANSI C function:

    /* Remainder by repeated subtraction; expects x >= 0 and y > 0. */
    int mod(int x, int y)
    {
        int z;
    top:
        z = x - y;
        if (z < 0)
            z = x;          /* x < y: x is already the remainder */
        x = z;
        if ((x - y) >= 0)
            goto top;
        return z;
    }

Basic hash function

Intel Assembler macro:

    #macro Hash[out_z, in_x, in_seed]
    .local x start
        immed[start, START]
        alu[x, in_x, -, start]      ; x = in_x - START
        alu_shf[x, --, B, x, >>1]   ; x = x >> 1
        Mod[out_z, x, in_seed]      ; out_z = x mod in_seed
    .endlocal
    #endm

ANSI C function:

    int hash(int x, int seed)
    {
        x = x - START;
        x = x / 2;
        return x % seed;
    }

Incremental checksumming

Intel Assembler macro:

    #macro IncrementCksum[out_newsum, oldsum, old, new]
    .local sum tmp mask
        immed[mask, 0xffff]
        alu[sum, --, ~B, oldsum]            ; sum = ~oldsum
        alu[sum, sum, -, old]               ; remove the old value
        alu[sum, sum, +, new]               ; add the new value
        alu[tmp, sum, AND, mask]
        alu_shf[sum, tmp, +, sum, >>16]     ; fold carries into low 16 bits
        alu[tmp, sum, AND, mask]
        alu_shf[sum, tmp, +, sum, >>16]     ; fold once more
        alu[sum, --, ~B, sum]               ; sum = ~sum
        alu[out_newsum, sum, AND, mask]     ; keep the low 16 bits
    .endlocal
    #endm

C equivalent:

    sum = ~oldsum;
    sum = sum - old;
    sum = sum + new;
    tmp = sum & 0xffff;
    sum = tmp + (sum >> 16);
    tmp = sum & 0xffff;
    sum = tmp + (sum >> 16);
    sum = ~sum;
    newsum = sum & 0xffff;
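
For comparison, here is a self-contained C sketch of the same idea written in the form given by RFC 1624 (equation 3): HC' = ~(~HC + ~m + m'). It is a swapped-in formulation, not a transcription of the macro above. Each operand is truncated to 16 bits before it is added, so the 32-bit accumulator cannot wrap. The function names are hypothetical, and `cksum16` is included only so the incremental result can be checked against a full recomputation.

```c
#include <stddef.h>
#include <stdint.h>

/* Fold a 32-bit one's-complement accumulator down to 16 bits. */
static uint16_t fold16(uint32_t sum)
{
    sum = (sum & 0xffffu) + (sum >> 16);
    sum = (sum & 0xffffu) + (sum >> 16);
    return (uint16_t)sum;
}

/* Full Internet checksum over n 16-bit words (for comparison only). */
uint16_t cksum16(const uint16_t *words, size_t n)
{
    uint32_t sum = 0;
    size_t i;
    for (i = 0; i < n; i++)
        sum = fold16(sum + words[i]);   /* fold each step: no overflow */
    return (uint16_t)~sum;
}

/* Update a checksum after one covered 16-bit word changes from
 * old_word to new_word: HC' = ~(~HC + ~m + m'). */
uint16_t cksum_update(uint16_t oldsum, uint16_t old_word, uint16_t new_word)
{
    uint32_t sum = (uint16_t)~oldsum;
    sum += (uint16_t)~old_word;
    sum += new_word;
    return (uint16_t)~fold16(sum);
}
```

A forwarder that rewrites, say, a destination port could call `cksum_update()` once per changed 16-bit word instead of summing the whole packet again.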

Just can't get enough, huh?
• /hom/~oyvindh/thesis.pdf
• CVS repository
  • :pserver:hic.no/cvs
  • Module: thesis
  • User: anonymous
  • Pwd: <empty>
  • Should be up in a few days
    • Have just moved to a new location