Analysis and Implementation of Multiplexing Techniques in Connection-Oriented Communication Networks. Ph.D. Final Examination, August 8, 2006. Tao Li (tl8g@virginia.edu), Department of Electrical and Computer Engineering, SEAS, University of Virginia.
References • T. Li, D. Logothetis, M. Veeraraghavan, “Analysis of a polling system for telephony traffic with application to wireless LANs,” IEEE Transactions on Wireless Communications, vol. 5, pp. 1284-1293, June 2006. • T. Li, M. Veeraraghavan, “Resource allocation for a polling system with application to wireless LANs,” to be submitted for journal publication. • H. Wang, M. Veeraraghavan, R. Karri, T. Li, “Design of a High-Performance RSVP-TE Signaling Hardware Accelerator,” IEEE Journal on Selected Areas in Communications (JSAC), vol. 23, no. 8, pp. 1588-1595, August 2005. • H. Wang, M. Veeraraghavan, R. Karri, T. Li, “Hardware-Accelerated Implementation of the RSVP-TE Signaling Protocol,” in Proc. of IEEE ICC2004, June 20-24, 2004, Paris, France.
Outline • Background • Problem statement and contributions • Study of a polling system with vacations • Implementation of a signaling control card • Conclusions
Background • Applications have diverse Quality of Service (QoS) requirements (bandwidth, delay, loss, etc.) • Deterministic QoS guarantees: mission-critical control • Statistical QoS guarantees: most audio/video applications • No specific requirements: best-effort applications • Two types of networking technologies • Connectionless (CL): Internet, best-effort type of service • Connection-Oriented (CO): support of QoS • Circuit-switched networks: SONET, WDM, etc. • Packet-switched networks: ATM, MPLS, etc.
Background (more) • Chief characteristics of CO networks • Resources are reserved prior to data transfer in a call admission control (CAC) phase • Resources are left idle during the connection setup phase • Per-connection state is maintained in the control plane • How to reserve resources? – through signaling protocols • RSVP-TE, PNNI, SS7, etc. [Figure: architecture of a CO switch]
Background (more) • In circuit-switched networks • Reserve a dedicated circuit for a connection • In packet-switched networks • Reserve bandwidth, buffer space, etc., for a connection • Data plane: packet classification, policing, scheduling, buffer management • How many resources should be reserved? • Depends on the service model (hard QoS or soft QoS), traffic characteristics (burstiness), buffer size, and scheduling algorithms
Background (more) • Multiplexing techniques in shared-medium based access • Connection-Oriented • Circuit-switched networks: FDMA, TDMA • Packet-switched networks: Polling, scheduling-based access • Connectionless • Random access
Problem statement Our mission: • Study a polling system for QoS provisioning • With application to IEEE 802.11 • Target real-time application: telephony • A data-plane problem • Demonstrate that signaling protocols can, in spite of their complexity, be implemented in hardware • Performance gain in terms of call-handling capacity and message-processing delay • A control-plane problem • Supported by NSF, DOE
Contributions • Study of a polling scheme • CDF of delay in a single queue scenario • Assume a continuous-time Markov Modulated Fluid (MMF) model • Can be used to approximate the CDF of delay in certain multiple-queue cases • Voice capacity and delay bounds (deterministic service) • For the MMF model or a discrete-time Markov ON/OFF model • Allow heterogeneity • Voice capacity (statistical service) • MMF model: results obtained by simulations • Resource allocation (statistical service) • Assume a discrete-time Markov ON/OFF model • Derive approximations for the tradeoff between a service degradation measure (overflow probability or packet loss ratio) and resource allocation
Contributions (more) • Implementation of a signaling control card • Schematic design at a later stage • Power regulation module • Prior work completed by collaborators (Haobo Wang, Liji Wu) • Collaborated with Appli-CAD Inc. for PCB design • Provided a reference design for the 1.25Gbps signal path • Examination of placement and routing • Design or VHDL implementation of some functional modules • Configuration module, PCI interface module, FIFO interface unit, switch-fabric interface unit • Software design (device driver; contributed to a message generator) • Debugging (board and VHDL)
Overview • Background • Problem statement and contributions • Study of a polling system with vacations • Motivation and related work • System model • Analysis with a continuous-time MMF model • Analysis with a discrete-time Markov model • Implementation of a signaling control card • Conclusions
Motivation • Several communication systems simultaneously support CO and CL modes of operation • IEEE 802.11 • Polling and random access • DOCSIS and IEEE 802.16 • Extended Real-Time Variable Rate and Best-Effort services • In CO mode: scheduling-based channel access [Figure: scheduler with downstream and upstream directions]
Motivation (more) • Problem: queue status info is distributed among stations for the upstream direction • Instantaneous queue status is not available to the scheduler • Cannot directly use scheduling algorithms that need arrival times, queue occupancy, or packet size • Continuous exchange of queue status info can be expensive • Wireless bandwidth is scarce
Motivation (more) • Polling emerges as a choice • Serve all queues in a round-robin order • Does not require queue status information • Easy to implement: O(1) time complexity • Trade efficiency for timeliness (hard) • Transmission of a poll signal consumes bandwidth • If the interpoll time is bounded, the delay is also bounded • Suitable for delay-sensitive applications, like telephony • Question: how many calls can be admitted? Or how much resource should be allocated for voice calls?
Related work • Papers on general polling systems • Poisson arrival process; do not consider voice traffic • Papers on QoS provisioning in wired and wireless networks • Do not specifically address the polling scheme considered in our work • Papers on voice support over MAC protocols • Do not specifically address the polling scheme • Papers on voice support over IEEE 802.11 polling mode • Largely simulation-based
System model • Assume a superframe structure • Polling period: supports voice calls • Vacation period: other resource sharing schemes • Partition between polling and vacation: vacation is at least θ×TS • VS: vacation stretch [Figure: superframe of length TS, showing the polling period, a foreshortened polling period, the vacations, and the vacation stretch VS]
System model (more) • Polling order • Round-robin with a restriction: each queue can be served at most once in a polling period • Walk time – Twalk • Time needed for the server to move from one queue to another; models physical and MAC layer overheads • Service discipline – gated-service • Pack all voice packets into one MAC frame when responding to a poll
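The polling rules above can be sketched in a few lines of Python. This is a minimal illustration, not the simulator used in the study: the function name `run_polling_period`, the fixed time `budget` for the polling period, and the rule that polling stops at the first queue that no longer fits are all assumptions made for the example.

```python
def run_polling_period(backlogs_bits, rate_bps, t_walk, budget):
    """Poll queues in round-robin order, at most once each (hypothetical helper).

    backlogs_bits: bits waiting in each queue at its poll instant (gated service:
                   everything waiting is packed into one frame).
    rate_bps:      service rate R in bits per second.
    t_walk:        walk time charged once per polled queue, in seconds.
    budget:        time available for the polling period, in seconds (assumed fixed).
    Returns (service_times, unserved_indices, elapsed_time).
    """
    elapsed = 0.0
    service_times = []
    unserved = []
    for i, bits in enumerate(backlogs_bits):
        service = bits / rate_bps          # an empty queue still costs one walk time
        cost = t_walk + service
        if elapsed + cost > budget:        # assumption: stop at the first queue that does not fit
            unserved = list(range(i, len(backlogs_bits)))
            break
        elapsed += cost
        service_times.append(service)
    return service_times, unserved, elapsed
```

With backlogs of 1000, 0, and 2000 bits, R = 1 Mbps, a 1 ms walk time, and a 4.5 ms budget, the first two queues are served and the third is left for a later superframe.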
Overview • Background • Problem statement and contributions • Study of a polling system with vacations • Motivation and related work • System model • Analysis with a continuous-time MMF model • Source model • Delay analysis in a single queue case • Multiple-queue analysis and simulation • Analysis with a discrete-time Markov model • Implementation of a signaling control card • Conclusions
Source model • Markov Modulated Fluid model • Continuous in time • a and b are transition rates • When ON, a bit stream is created at a constant rate c; when OFF, silence • Average ON time: 352ms • Average OFF time: 650ms • May and Zebo 1968 model • QoS requirements • Stringent in delay • Can tolerate a small loss ratio [Figure: two-state ON/OFF Markov chain with transition rates a and b]
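As a quick check on the source model, the sketch below derives the long-run ON probability from the two mean sojourn times and compares it with a simulated ON fraction. It assumes exponentially distributed ON and OFF periods, which is what a continuous-time Markov chain implies; the function names are illustrative, and the slide does not say which of the two transition rates is labeled a and which is b.

```python
import random

MEAN_ON_S = 0.352    # average ON (talkspurt) time from the slide
MEAN_OFF_S = 0.650   # average OFF (silence) time from the slide


def on_probability(mean_on, mean_off):
    """Long-run fraction of time the source spends in the ON state."""
    return mean_on / (mean_on + mean_off)


def simulate_on_fraction(mean_on, mean_off, horizon_s, seed=0):
    """Simulate the two-state chain and return the observed ON fraction."""
    rng = random.Random(seed)
    t = 0.0
    on_time = 0.0
    state_on = False                      # arbitrary choice: start in OFF
    while t < horizon_s:
        mean = mean_on if state_on else mean_off
        # The exit rate of a state is the reciprocal of its mean sojourn time.
        sojourn = min(rng.expovariate(1.0 / mean), horizon_s - t)
        if state_on:
            on_time += sojourn
        t += sojourn
        state_on = not state_on
    return on_time / horizon_s


p_on = on_probability(MEAN_ON_S, MEAN_OFF_S)   # about 0.351
mean_rate_bps = 64_000 * p_on                  # average rate for a 64 Kbps codec
```

The two exit rates are 1/0.352 and 1/0.650 per second, so a 64 Kbps source averages roughly 22.5 Kbps over time.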
Delay analysis in a single queue case • Delay of interest: DW=DQ+DS • DQ: queueing delay • DS: service time, depends on service rate R and data size • DS=0: empty packet, not of interest • First, compute the PDF of TI given TI=TS+(stretch2 - stretch1) • Assume: stretch1 and stretch2 are i.i.d. random variables with known PDF • Second, compute P{DQ≤q|TI=t, nonempty packet}, and then obtain P{DQ≤q| nonempty packet} by unconditioning
Delay analysis (more) • Third, compute P{Z≤z|DQ=q} and P{DS≤s|DQ=q} • Z: total time spent in the ON state during DQ • Can be solved with a uniformization technique • Z can be linked to DS by DS=Zc/R • c: source rate; R: service rate • Finally, combine the results using DW=DQ+DS • P{DQ≤q| nonempty packet} obtained in the second step • P{DS≤s|DQ=q} obtained in the third step
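The relation DS=Zc/R is easy to check numerically. The sketch below is only a worked example: the function names are illustrative, and the 11 Mbps service rate is an assumed value chosen for the example, not a figure taken from the slides.

```python
def service_time(z_on_s, source_rate_bps, service_rate_bps):
    """DS = Z * c / R: time to transmit the bits generated while the source was ON."""
    return z_on_s * source_rate_bps / service_rate_bps


def total_delay(dq_s, z_on_s, source_rate_bps, service_rate_bps):
    """DW = DQ + DS for one nonempty packet."""
    return dq_s + service_time(z_on_s, source_rate_bps, service_rate_bps)


# Example: 20 ms of ON time at c = 64 Kbps produces 1280 bits.
# At an assumed R = 11 Mbps these take about 116 microseconds to send.
ds = service_time(0.020, 64_000, 11e6)
dw = total_delay(0.015, 0.020, 64_000, 11e6)
```

Because Z can never exceed DQ, the service time is bounded by DQ·c/R, which is small whenever R is much larger than c.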
Delay analysis (more) • [Figure: CDF of DW with TS as a parameter] • Twalk and c are set to 0.23ms and 8.5Kbps, respectively • All numerical results assume the IEEE 802.11b PHY
Multiple-queue case • Deterministic service • Each queue is guaranteed to be polled in a superframe • Number of queues N≤ Np (voice capacity) • Referred to as small-N regime of operation • Statistical service (when N > Np) • Service degradation: not guaranteed to be polled in each superframe; statistical QoS guarantees • Statistical multiplexing gain since N > Np • Referred to as large-N regime of operation
Computation of Np: worst-case analysis • Polls in the kth interval: empty packets • Polls in the (k+1)th interval: maximum-sized packets • Vacation stretch: VSmax • Admission condition • Np can be computed iteratively • Delay bound DWmax,i
Delay in small-N regime of operation • Simulation results: CCDF of DW; θ, codec rate, and Twalk are set to 0.5, 64Kbps, and 0.23ms, respectively • Implication: delay analysis in the single queue case is a fair approximation, given the range of parameter values under consideration
Cost of large-N regime of operation • Simulation: CCDF of delay with N' as a parameter. TS, θ, and codec rate are equal to 30ms, 0.5, and 8.5Kbps, respectively. • Implication of delay spikes: use DWmax as delay threshold, and P{DW>DWmax} as performance measure (Ploss)
Statistical multiplexing • Codec rate, Twalk, and stretch are set to 8.5Kbps, 0.23ms, and VSmax, respectively • Capacity increases with TS: payload size vs. Twalk • Multiplexing gain is small: large Twalk, small codec rate [Figure: simulation results]
Statistical multiplexing • Codec rate, Twalk, and stretch are set to 64Kbps, 0.13ms, and VSmax, respectively • Multiplexing gain is significant: small Twalk, large codec rate • Small Twalk is attainable [Figure: simulation results]
Overview • Background • Problem statement and contributions • Study of a polling system with vacations • Motivation and related work • System model • Analysis with a continuous-time MMF model • Analysis with a discrete-time Markov model • Implementation of a signaling control card • Conclusions
Assume a discrete-time Markov model • Motivation • Voice traffic needs to be packetized for transmission in a packet-switched network • A discrete-time Markov model is more realistic • Tractability in analysis • Extend worst-case analysis for small-N regime of operation to discrete-time Markov model • We derive voice capacity Nl and delay bound Dbound • Details are omitted • Delay performance is studied through simulations
Resource allocation for large-N • Tsrv: the total time spent on N queues in a superframe • Performance criterion: overflow probability, P{Tsrv>x} ≤ ε • The smallest x satisfying this criterion is the amount of time that should be allocated for the polling period, denoted as Tp(ε) • Difficulty in exact analysis of P{Tsrv>x}: correlation • Key approximation: the correlation between DS,i, i=1,2,…,N, is small. Approximate DS,i, i=1,2,…,N, as i.i.d. random variables
Analytical approach • Consider a reference service discipline • Does not incur correlation between DS,i • Perform an exact analysis for this reference service discipline • View the results as approximations for the gated-service discipline [Figure: reference service discipline: serve 1, 2, 3, but not 4]
Analytical approach • Other assumptions: TS=KL; synchronization • First, compute PK(m) for one queue • The probability of m arrivals in K time slots • Using a recursive approach • Then compute the overflow probability • Computational complexity: O(NlogN) with FFT
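The two steps above can be sketched as follows. This is an illustration under stated assumptions, not the derivation from the thesis: each source is taken to be a two-state discrete-time chain that emits one packet per ON slot and starts in its stationary distribution, the N queues are treated as i.i.d., and Tsrv is modeled as N walk times plus a fixed time per packet. The parameter names (`p_on_off`, `p_off_on`, `t_packet`) are hypothetical, and plain nested-loop convolution is used in place of the FFT mentioned on the slide.

```python
def arrivals_pmf(k_slots, p_on_off, p_off_on):
    """P_K(m): probability of m ON slots (packets) in K >= 1 slots, by recursion."""
    pi_on = p_off_on / (p_on_off + p_off_on)     # stationary ON probability
    on = [0.0] * (k_slots + 1)    # on[m]:  m packets so far, current slot ON
    off = [0.0] * (k_slots + 1)   # off[m]: m packets so far, current slot OFF
    on[1], off[0] = pi_on, 1.0 - pi_on
    for _ in range(k_slots - 1):
        new_on = [0.0] * (k_slots + 1)
        new_off = [0.0] * (k_slots + 1)
        for m in range(k_slots + 1):
            new_off[m] = on[m] * p_on_off + off[m] * (1.0 - p_off_on)
            if m >= 1:
                new_on[m] = on[m - 1] * (1.0 - p_on_off) + off[m - 1] * p_off_on
        on, off = new_on, new_off
    return [a + b for a, b in zip(on, off)]


def total_packets_pmf(n_queues, pmf):
    """PMF of the packet total over N i.i.d. queues (direct convolution, not FFT)."""
    total = [1.0]
    for _ in range(n_queues):
        out = [0.0] * (len(total) + len(pmf) - 1)
        for i, a in enumerate(total):
            for j, b in enumerate(pmf):
                out[i + j] += a * b
        total = out
    return total


def overflow_probability(n_queues, pmf, t_walk, t_packet, x):
    """P{Tsrv > x}, with Tsrv = N * t_walk + (total packets) * t_packet (assumed form)."""
    total = total_packets_pmf(n_queues, pmf)
    return sum(p for m, p in enumerate(total) if n_queues * t_walk + m * t_packet > x)


def polling_time(n_queues, pmf, t_walk, t_packet, eps):
    """Smallest attainable x with P{Tsrv > x} <= eps, i.e. Tp(eps) under this model."""
    total = total_packets_pmf(n_queues, pmf)
    tail = 1.0
    for m, p in enumerate(total):
        tail -= p
        if tail <= eps:
            return n_queues * t_walk + m * t_packet
    return n_queues * t_walk + (len(total) - 1) * t_packet
```

For example, `polling_time(20, arrivals_pmf(3, 0.2, 0.1), 0.00023, 0.0005, 1e-3)` returns the allocation for 20 queues at a 0.1% overflow target under these assumed parameters.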
Computation of loss ratio • If the waiting time is too long, the packet is dropped • Define loss ratio as Ploss=E{Nloss}/E{Ntotal} • Nloss: number of lost packets in a superframe • Ntotal: number of created packets in a superframe • Ploss can be linked to the overflow probability • Ploss ≤ P{Tsrv>x}/PON, where PON is the probability of a voice source being in the ON state • This approximation of Ploss is not very accurate
Computation of loss ratio (more) • For the reference service discipline, an exact computation of Ploss is possible • Computational complexity • O(N^2) with direct convolution
Numerical results • Tp: polling period length • TS: 30ms • Dbound is set to TS+L+2ms • L: packetization interval, 10ms • Simulation: assume the gated-service discipline; drop the synchronization assumption; allow clock skew and phase error • The approximation of P{Tsrv>x} is satisfactory • Ploss can better approximate the “actual” loss ratio • Cost: computational complexity • Implication: use P{Tsrv>x} as the QoS measure if computational complexity is a major concern
Overview • Background • Problem statement and contributions • Study of a polling system with vacations • Implementation of a signaling control card • Motivation, Related work, and Solution approach • System architecture, block diagram, and picture • Modules of the signaling control card • Performance • Conclusions
Motivation • Signaling protocols • Characteristics • Complex (parameters, timers, data-table lookups, state information) • Requirement for flexibility • Traditionally implemented in software • Call-handling capacities: 1K calls/second ~ 10K calls/second • Call-setup delay: on the order of hundreds of milliseconds • Sycamore SN16000 switch: per-message processing delay 90ms
Motivation (more) • Problems with software implementation • Call-setup delay impacts utilization • Hard to meet the requirement for high call-handling capacities in future CO networks • Objective: demonstrate that signaling protocols can be implemented in hardware in spite of their complexity • Reduce call-setup delay by at least two to three orders of magnitude • Increase call-handling capacity significantly • Target signaling protocol and switch • RSVP-TE with extensions for GMPLS • SONET switch
Related work • TCP offloading engine • Observation: the overhead of TCP/IP processing overwhelms the server’s CPU • Solution: move TCP/IP processing to dedicated hardware • Software implementations of RSVP-TE • E.g.: Sycamore SN16000 switch with a per-message processing delay of about 90ms
Solution approach • Manage the complexity of signaling protocols • By only supporting basic and most frequently used messages/parameters in hardware and relegating the rest to software • Define a subset of the signaling protocol for hardware implementation (RSVP-TE with extensions for GMPLS) • Four messages related to connection setup and release: Path, Resv, PathTear, and ResvTear • Support all mandatory objects/parameters and optional parameters needed for SONET switch
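The hardware/software split described above amounts to a dispatch on message type. The sketch below shows the idea only: the function name `dispatch` and the returned labels are hypothetical, and a real design would also have to check that every object inside a message belongs to the supported subset.

```python
# Messages handled in hardware, as listed on the slide.
HARDWARE_MESSAGES = {"Path", "Resv", "PathTear", "ResvTear"}


def dispatch(message_type):
    """Decide which plane handles a signaling message (illustrative only)."""
    if message_type in HARDWARE_MESSAGES:
        return "hardware"
    # Everything else is relegated to the software implementation.
    return "software"


for msg in ("Path", "ResvTear", "PathErr", "ResvErr"):
    print(msg, "->", dispatch(msg))
```

Here `PathErr` and `ResvErr` stand in for less frequent RSVP message types that would be handed to software.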
Solution approach (more) • Meet the flexibility requirement • Using a reconfigurable Field Programmable Gate Array (FPGA) • The FPGA can be reloaded with updated versions • Achieve fast data-table lookups and state maintenance by using Ternary Content Addressable Memory (TCAM) • TCAM: a special memory device designed for data-table lookups • Complexity of a lookup operation: one clock cycle
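A software model can make the ternary matching idea concrete. The class below is a sketch with hypothetical names: each entry stores a value and a mask, masked-out bits are "don't care", and the first matching entry wins. A real TCAM compares the key against all entries in parallel, which is why a lookup takes a single clock cycle; the sequential loop here only mimics the result.

```python
class TcamModel:
    """Toy software model of a ternary CAM (illustrative, not a device API)."""

    def __init__(self):
        self.entries = []   # (value, mask, result) in priority order

    def add(self, value, mask, result):
        # Bits where mask is 0 are "don't care" and are cleared from the value.
        self.entries.append((value & mask, mask, result))

    def lookup(self, key):
        # Hardware checks every entry at once; this loop returns the same answer.
        for value, mask, result in self.entries:
            if key & mask == value:
                return result
        return None


tcam = TcamModel()
tcam.add(0x0A010000, 0xFFFF0000, 3)   # 10.1.0.0/16 -> output port 3 (example data)
tcam.add(0x0A000000, 0xFF000000, 1)   # 10.0.0.0/8  -> output port 1 (example data)
```

Placing the longer prefix first gives longest-prefix-match behaviour for a route lookup, since the lowest-index match is returned.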
System architecture • Focus on the signaling control card • Backplane: often proprietary. We assume a PCI bus. • Switch fabric card: assume Vitesse 64x64 STS-12 • Cross-connection rate: STS-1 (51.84Mbps), total bandwidth: about 40Gbps
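The total-bandwidth figure follows from the port count and the SONET rate hierarchy. The short calculation below uses the standard STS-1 line rate of 51.84 Mbps and assumes that all 64 ports run at STS-12; the variable names are illustrative.

```python
STS1_MBPS = 51.84          # standard SONET STS-1 line rate
STS_PER_PORT = 12          # each port carries an STS-12
PORTS = 64                 # 64x64 switch fabric

port_rate_mbps = STS1_MBPS * STS_PER_PORT        # 622.08 Mbps per port
total_gbps = port_rate_mbps * PORTS / 1000.0     # about 39.8 Gbps
sts1_slots = PORTS * STS_PER_PORT                # 768 STS-1 timeslots per side

print(f"{port_rate_mbps:.2f} Mbps per port, {total_gbps:.2f} Gbps total, "
      f"{sts1_slots} STS-1 slots")
```

Cross-connecting at STS-1 granularity therefore means the fabric switches among 768 input and 768 output timeslots.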
Block diagram of implementation [Figure: block diagram showing the optical fiber, Gbit Ethernet module, hardware signaling accelerator, PCI interface module, PCI bus, power regulation module, and configuration module; voltage labels: 5v, 3.3v, 1.5v, 1.8v, 2.5v]
[Figure: top view of the card]
Gigabit Ethernet module • Optical-fiber transceiver: converts between optical signals and differential PECL signals • SerDes: converts between serial PECL signals and parallel TTL signals • Ethernet controller: 8B/10B encoding/decoding, MAC layer operations
Hardware signaling accelerator module • Hardware signaling accelerator core: all major functions, such as message parsing, creating commands for route lookup, state maintenance, and switch-fabric programming • MAC/Switch fabric/FIFO/TCAM_SRAM interface units: data path, control/timing signals • FIFO: temporary storage of unsupported signaling messages • TCAM/SRAM: route lookup and state maintenance operations
PCI interface module • CPU card interface unit: moves messages from the FIFO to host memory space through Direct Memory Access (DMA) • Switch-fabric control unit: transmits programming commands using DMA • Access arbiter: gives the switch-fabric control unit higher priority • Configuration interface unit: facilitates management of the card • PCI core: provides commonly used functions for PCI access
Configuration Module • Enable configuration of MAC address, IP addresses, routing table and other data tables • Initialize the GbE controller, SRAM, and TCAM • Create clock and control signals needed for each device
Performance • Call-handling capacity • 400K calls/second (Hardware signaling accelerator module) • Software-based implementation: 1K~10K calls/second • 250K calls/second, limited by the 1Gbps link rate • Load on the TCAM: about 6% • Processing delay • Per-message processing delay ≤ 2.4 microseconds • Sycamore SN16000 switch: ≈ 90 ms
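A little arithmetic puts these figures in perspective. The calculation below uses only the numbers quoted on the slide; the per-call signaling budget is an inference from the link-limited rate, not a value stated in the talk.

```python
import math

HW_DELAY_S = 2.4e-6        # per-message processing delay, hardware
SW_DELAY_S = 90e-3         # per-message processing delay, Sycamore SN16000
LINK_RATE_BPS = 1e9        # Gigabit Ethernet link
LINK_LIMITED_CALLS = 250e3 # calls/second when the link is the bottleneck

speedup = SW_DELAY_S / HW_DELAY_S                    # about 37,500x
orders_of_magnitude = math.log10(speedup)            # about 4.6
bits_per_call = LINK_RATE_BPS / LINK_LIMITED_CALLS   # 4000 bits of signaling per call
bytes_per_call = bits_per_call / 8                   # 500 bytes

print(f"speedup ~{speedup:.0f}x ({orders_of_magnitude:.1f} orders of magnitude), "
      f"~{bytes_per_call:.0f} bytes of signaling per call on the link")
```

The delay reduction is thus more than four orders of magnitude, comfortably beyond the two to three orders set as the objective.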