Voice in Packets: RTP, RTCP, Header Compression, Playout Algorithms, Terminal Requirements and Implementations. Jani Lakkakorpi S-38.130 Licentiate Course on Telecommunications Technology April 6, 2001. Problems with Voice over IP.
Voice in Packets: RTP, RTCP, Header Compression, Playout Algorithms, Terminal Requirements and Implementations
Jani Lakkakorpi
S-38.130 Licentiate Course on Telecommunications Technology
April 6, 2001
Problems with Voice over IP
• The received packet stream has to be buffered before playout in order to restore the original packet spacing (and possibly the packet order).
• Small VoIP packets carry a large relative overhead.
  • For example: 24 bytes of payload (G.723.1) and 60 bytes of overhead (RTP/UDP/IPv6).
  • Header compression is necessary to reduce delay on slow links.
• Terminal requirements usually grow with the voice compression ratio.
[Figure: sent, received, and synchronized packets on a time axis across the IP cloud; a packet with 24 bytes of payload and 60 bytes of headers.]
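The overhead figures above can be checked with a few lines of arithmetic. This is only a sketch: it assumes the minimum header sizes (40-byte IPv6 base header, 8-byte UDP header, 12-byte RTP fixed header), one G.723.1 frame per packet, and G.723.1's 30 ms frame interval. The function name is illustrative.

```python
# Minimum header sizes in bytes (no IPv6 extension headers, no CSRC entries).
IPV6_HEADER = 40
UDP_HEADER = 8
RTP_HEADER = 12


def packet_overhead(payload_bytes, frame_ms):
    """Return (overhead_bytes, overhead_share, total_kbit_per_s).

    Assumes one voice frame per packet, sent every frame_ms milliseconds.
    """
    overhead = IPV6_HEADER + UDP_HEADER + RTP_HEADER
    total = payload_bytes + overhead
    share = overhead / total
    # Bits per packet divided by milliseconds per packet gives kbit/s.
    kbps = total * 8 / frame_ms
    return overhead, share, kbps


overhead, share, kbps = packet_overhead(payload_bytes=24, frame_ms=30)
print(overhead, round(share * 100, 1), round(kbps, 1))  # 60 71.4 22.4
```

Under these assumptions, about 71% of every packet is header, and a 6.4 kbit/s payload stream occupies 22.4 kbit/s on the wire before any link-layer framing.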
RTP and RTCP
• The Real-time Transport Protocol (RTP) provides end-to-end transport functions for applications that transmit real-time data, such as VoIP.
• RTP does not provide any Quality of Service guarantees; it only carries the information the receiver needs to re-synchronize the received packets.
  • Timestamps and sequence numbers.
• The RTP Control Protocol (RTCP) gives feedback on the quality of data transmission and information about the participants of the session.
• RFC 1889
RTP Header (1)
[Figure: RTP header layout: V | P | X | CC | M | PT | Sequence Number | Timestamp | Synchronization Source (SSRC) Identifier | Contributing Source (CSRC) Identifiers ... | Profile-specific Extensions]
• Version (V, 2 bits)
  • RFC 1889: V=2.
• Padding (P, 1 bit)
  • If P=1, there are padding octets at the end of the payload. The last octet of the padding contains the number of padding octets.
• Extension (X, 1 bit)
  • If X=1, the fixed header is followed by a header extension (RFC 1889).
• CSRC Count (CC, 4 bits)
  • The number of contributing source identifiers.
RTP Header (2)
• Marker (M, 1 bit)
  • Marks significant events such as the first packet of a talkspurt.
• Payload Type (PT, 7 bits)
  • The format of the RTP payload.
• Sequence Number (16 bits)
  • Starts from a random value and is incremented by one for each sent packet.
  • Used by the receiver to detect packet losses and to restore the original packet sequence.
RTP Header (3)
• Timestamp (32 bits)
  • The sampling instant of the first payload octet.
  • The clock frequency is defined for each payload type, and the clock is initialized with a random value.
• SSRC (32 bits)
  • Identifies the synchronization source.
  • Randomly chosen.
• CSRC list (0…15 items, 32 bits each)
  • Identifies the contributing sources for the payload of this packet.
  • Inserted by mixers.
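The field layout described on the three RTP header slides can be illustrated with a small parser. This is a minimal sketch using Python's standard `struct` module; the function name, the dictionary keys, and the sample values are illustrative, and header extensions and padding are not processed.

```python
import struct


def parse_rtp_header(packet: bytes) -> dict:
    """Parse the 12-byte RTP fixed header and the CSRC list (if any)."""
    if len(packet) < 12:
        raise ValueError("RTP fixed header is 12 bytes")
    # Network byte order: two single octets, 16-bit sequence number,
    # 32-bit timestamp, 32-bit SSRC.
    b0, b1, seq, ts, ssrc = struct.unpack("!BBHII", packet[:12])
    cc = b0 & 0x0F
    end = 12 + 4 * cc
    if len(packet) < end:
        raise ValueError("truncated CSRC list")
    csrc = list(struct.unpack("!%dI" % cc, packet[12:end]))
    return {
        "version": b0 >> 6,                 # V, 2 bits
        "padding": bool(b0 & 0x20),         # P, 1 bit
        "extension": bool(b0 & 0x10),       # X, 1 bit
        "csrc_count": cc,                   # CC, 4 bits
        "marker": bool(b1 & 0x80),          # M, 1 bit
        "payload_type": b1 & 0x7F,          # PT, 7 bits
        "sequence_number": seq,
        "timestamp": ts,
        "ssrc": ssrc,
        "csrc": csrc,
    }


# Arbitrary sample: V=2, no padding/extension/CSRC, marker set, PT=0.
sample = struct.pack("!BBHII", 0x80, 0x80, 1000, 160000, 0x12345678) + b"voice"
header = parse_rtp_header(sample)
print(header["version"], header["marker"], header["sequence_number"])
```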
RTCP
• RTCP provides feedback on the quality of data distribution.
• RTCP packet types:
  • Sender Report (SR): transmission and reception statistics from active senders.
  • Receiver Report (RR): reception statistics from participants that are not active senders.
  • Source Description Items (SDES): various parameters describing the source.
  • BYE: sent when a participant leaves the session.
  • APP: application-specific functions.
RTCP Header: Sender Report (1)
[Figure: RTCP Sender Report layout: V | P | RC | PT=SR=200 | Length | SSRC of Sender | NTP Timestamp, Most Significant Word | NTP Timestamp, Least Significant Word | RTP Timestamp | Sender's Packet Count | Sender's Octet Count | SSRC_n | Fraction Lost | Cumulative Number of Packets Lost | Extended Highest Sequence Number Received | Interarrival Jitter | Last SR Timestamp | Delay Since Last SR | … | Profile-specific Extensions]
• Version (V, 2 bits)
  • RFC 1889: V=2.
• Padding (P, 1 bit)
  • If P=1, there are padding octets at the end of the packet.
• Reception Report Count (RC, 5 bits)
  • The number of reception report blocks in this report.
• Packet Type (PT, 8 bits)
  • Sender Report: 200.
• Length (16 bits)
  • The length of this RTCP packet in 32-bit words minus one, including the header and any padding.
• SSRC (32 bits)
  • Synchronization source ID of the originator of this report.
RTCP Header: Sender Report (2), Sender Information Section (only present in SRs)
• NTP Timestamp (64 bits)
  • The wallclock time when this report was sent.
• RTP Timestamp (32 bits)
  • Represents the same time as the NTP timestamp, but in the same units and with the same random offset as the timestamps of RTP packets.
  • May be used for synchronization.
• Sender's Packet Count (32 bits)
  • Counted from the start of transmission until the time this report was generated.
• Sender's Octet Count (32 bits)
  • Only payload octets are included.
RTCP Header: Sender Report (3), Reception Report Blocks
• One block for each source that we have heard from since the last SR/RR.
• SSRC_n (32 bits)
  • Synchronization source ID of the source that we are reporting about.
• Fraction Lost (8 bits)
  • The fraction of RTP packets from SSRC_n lost since the previous SR/RR was sent.
• Cumulative Number of Packets Lost (24 bits)
  • Counted since the beginning of reception.
• Extended Highest Sequence Number Received (32 bits)
  • The highest sequence number received in an RTP packet (low 16 bits) and the corresponding count of sequence number cycles (high 16 bits).
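The loss-related fields of a reception report block can be sketched as follows. The 8-bit fraction is computed as a fixed-point value (lost packets times 256, divided by expected packets), in the manner of the sample code in RFC 1889. The function names, argument names, and the clamping of the cumulative count to a signed 24-bit range are assumptions made for this illustration.

```python
def extended_seq(cycles, highest_seq):
    """Combine the cycle count (high 16 bits) with the highest
    sequence number received (low 16 bits)."""
    return ((cycles & 0xFFFF) << 16) | (highest_seq & 0xFFFF)


def loss_fields(expected_interval, received_interval,
                expected_total, received_total):
    """Return (fraction_lost, cumulative_lost) for one report block.

    *_interval counts cover the time since the previous SR/RR;
    *_total counts cover the whole reception so far.
    """
    lost_interval = expected_interval - received_interval
    if expected_interval <= 0 or lost_interval <= 0:
        fraction = 0
    else:
        # 8-bit fixed-point fraction: lost / expected * 256, truncated.
        fraction = (lost_interval << 8) // expected_interval
    cumulative = expected_total - received_total
    # Assumption: keep the value inside a signed 24-bit field.
    cumulative = max(-0x800000, min(0x7FFFFF, cumulative))
    return fraction, cumulative


# 5 of the last 100 expected packets missing, 12 of 5000 overall.
print(loss_fields(100, 95, 5000, 4988))  # (12, 12), i.e. 12/256 = 4.7 %
print(extended_seq(cycles=1, highest_seq=5))  # 65541
```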
RTCP Header: Sender Report (4), Reception Report Blocks
• Interarrival Jitter (32 bits)
  • An estimate of the variance of the RTP packet interarrival time, measured in timestamp units.
• Last SR Timestamp (LSR, 32 bits)
  • The middle 32 bits of the NTP timestamp from the most recent RTCP sender report issued by SSRC_n. If no sender report has been received yet, this field is set to zero.
• Delay Since Last SR (DLSR, 32 bits)
  • The delay between receiving the last SR from SSRC_n and sending this report. The sender of that SR can combine it with the Last SR Timestamp to compute the round-trip time.
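The jitter and round-trip computations behind these fields can be sketched as below. The jitter update is the running estimator given in RFC 1889 (J = J + (|D| - J)/16, where D is the change in transit time between two packets), and the round-trip time is the report's arrival time minus LSR minus DLSR, all in units of 1/65536 s. Function names and example values are illustrative.

```python
def update_jitter(jitter, transit_prev, transit_now):
    """One step of the interarrival jitter estimator.

    transit = arrival time (in RTP timestamp units) - RTP timestamp.
    """
    d = abs(transit_now - transit_prev)
    return jitter + (d - jitter) / 16.0


def ntp_middle32(seconds, fraction):
    """Middle 32 bits of a 64-bit NTP timestamp: low 16 bits of the
    seconds word and high 16 bits of the fraction word."""
    return ((seconds & 0xFFFF) << 16) | ((fraction >> 16) & 0xFFFF)


def round_trip_time(arrival_mid32, lsr, dlsr):
    """RTT in seconds as seen by the SR sender when the report arrives.

    All three inputs are in units of 1/65536 s; the subtraction wraps
    modulo 2**32 like the fields themselves.
    """
    rtt_units = (arrival_mid32 - lsr - dlsr) & 0xFFFFFFFF
    return rtt_units / 65536.0


# Report arrives at 46864.500 s, LSR = 46853.125 s, DLSR = 5.250 s.
arrival = ntp_middle32(0x0001B710, 0x80000000)
print(round_trip_time(arrival, lsr=0xB7052000, dlsr=0x00054000))  # 6.125
print(update_jitter(0.0, transit_prev=100, transit_now=116))      # 1.0
```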
Playout Algorithms (1)
• In most packet audio applications, the receiving host has to buffer packets in order to compensate for variable network delay.
• Playout delay can be constant or adaptively adjusted.
• Adaptive playout delay can be adjusted either per talkspurt or per packet:
  • In the former approach, the playout delay remains constant throughout the talkspurt and adjustments are made between talkspurts.
  • The latter approach introduces gaps in speech and is therefore not well suited to VoIP.
• There is a tradeoff between packet playout delay and packet playout loss.
  • If a constant playout delay is too short, or an adaptive algorithm reacts slowly to delay "spikes", packets are lost.
Playout Algorithms (2)
• Here we present a simple algorithm for adaptive playout delay adjustment:
• For each received packet (except the first one), the waiting time in the playout buffer is calculated with the following formula:
  $T_{\mathrm{wait},i} = (\mathrm{TimeStamp}_i - \mathrm{TimeStamp}_{i-1}) - (\mathrm{ReceivedAt}_i - \mathrm{PlayAt}_{i-1})$
• If the result is negative, the packet has arrived too late and is discarded. Otherwise, the packet is played out at:
  $\mathrm{PlayAt}_i = \mathrm{ReceivedAt}_i + T_{\mathrm{wait},i}$
• Whenever the playout delay is adjusted, the new value is the maximum of the initial playout delay and the current playout delay minus the minimum $T_{\mathrm{wait}}$ of the latest measurement period.
• The following events trigger the delay adjustment process:
  • If N or more packets among the last M packets (the measurement period) arrive late, the playout delay is adjusted upwards at the next talkspurt.
  • Similarly, if M successive packets have all been received in time, the playout delay is adjusted downwards at the next talkspurt.
Playout Algorithms (3)
• Example:
  • The first packet of the connection arrives at 0 ms. It has a timestamp of 10 ms. The waiting time for the first packet is set to, for example, 30 ms. (We do not assume that the sender and receiver clocks are synchronized.)
  • The second packet of the connection arrives at 35 ms. It has a timestamp of 40 ms. The waiting time is calculated as follows: $T_{\mathrm{wait}} = (40 - 10) - (35 - 30) = 25$ ms.
  • The third packet of the connection arrives at 80 ms. It has a timestamp of 70 ms. The waiting time is calculated as follows: $T_{\mathrm{wait}} = (70 - 40) - (80 - 60) = 10$ ms.
[Figure: timeline of the three packets. Sent packets: TS=10, TS=40, TS=70. Received packets: T=0, T=35, T=80. Synchronized (played out) packets: T=30, T=60, T=90.]
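The waiting-time formula and the three-packet example above can be reproduced with a short sketch. The function name and the data layout are illustrative. One point the slides leave open is which packet serves as the reference after a late packet has been discarded; the sketch assumes it is the last packet that was actually played out.

```python
def schedule_playout(packets, initial_wait):
    """Compute waiting and playout times for packets in arrival order.

    packets: list of (timestamp_ms, received_at_ms) tuples.
    Returns a list of (t_wait, play_at); play_at is None for packets
    that arrived too late and were discarded.
    """
    results = []
    prev_ts = None
    prev_play = None
    for ts, received_at in packets:
        if prev_ts is None:
            # First packet: no reference yet, use the initial waiting time.
            t_wait = initial_wait
        else:
            t_wait = (ts - prev_ts) - (received_at - prev_play)
        if t_wait < 0:
            results.append((t_wait, None))  # too late, discarded
            continue
        play_at = received_at + t_wait
        results.append((t_wait, play_at))
        # Assumption: only played-out packets become the reference.
        prev_ts, prev_play = ts, play_at
    return results


# Packets from the example: (timestamp, arrival time) in ms.
print(schedule_playout([(10, 0), (40, 35), (70, 80)], initial_wait=30))
# [(30, 30), (25, 60), (10, 90)]
```

The playout times 30, 60, and 90 ms restore the 30 ms spacing of the sender's timestamps even though the packets arrived 35 and 45 ms apart.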
Playout Algorithms (4)
• Example (continued):
  • Too many packets have been lost during the last measurement period, so it is time to adjust the delay:
  • Let's assume that the minimum $T_{\mathrm{wait}}$ of the latest measurement period is -5 ms.
  • We subtract this value from the waiting time of the first packet of the next talkspurt: $T_{\mathrm{wait},i} = (\mathrm{TimeStamp}_i - \mathrm{TimeStamp}_{i-1}) - (\mathrm{ReceivedAt}_i - \mathrm{PlayAt}_{i-1}) - (-5\ \mathrm{ms})$.
  • Example values: $T_{\mathrm{wait}} = (3020 - 2000) - (3030 - 2020) - (-5) = 10 + 5 = 15$ ms.
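The adjustment rule and its triggers can be sketched as three small functions. All names are illustrative, and the values of N and M are left as parameters because the slides do not fix them.

```python
def needs_adjustment(recent_waits, n_late, m_window):
    """Check the two triggers over the last m_window waiting times.

    Returns "up" if at least n_late packets were late (negative wait),
    "down" if a full window of packets all arrived in time, else None.
    """
    window = recent_waits[-m_window:]
    late = sum(1 for w in window if w < 0)
    if late >= n_late:
        return "up"
    if len(window) == m_window and late == 0:
        return "down"
    return None


def adjusted_delay(initial_delay, current_delay, min_t_wait):
    """New playout delay: never below the initial delay."""
    return max(initial_delay, current_delay - min_t_wait)


def first_wait_of_talkspurt(ts, prev_ts, received_at, prev_play_at,
                            min_t_wait):
    """Waiting time of the first packet of the next talkspurt, shifted
    by the minimum waiting time of the last measurement period."""
    return (ts - prev_ts) - (received_at - prev_play_at) - min_t_wait


print(needs_adjustment([5, -5, 3, -2], n_late=2, m_window=4))   # up
print(adjusted_delay(initial_delay=30, current_delay=30, min_t_wait=-5))  # 35
print(first_wait_of_talkspurt(3020, 2000, 3030, 2020, -5))      # 15
```

A negative minimum waiting time (late packets) lengthens the delay, while a positive minimum shortens it, but never below the initial playout delay.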
Header Compression
• TCP header compression: RFC 1144.
• RTP header compression: RFC 2508.
  • Basic idea: since the difference between the header fields of successive RTP packets is often constant, it is enough to signal that the second-order difference is zero. The next packet header can then be reconstructed from the previous one by adding the first-order differences.
• Other proposals: ROCCO (Ericsson), ACE (Nokia).
  • These are expected to perform slightly better than the mechanism described in RFC 2508.
RTP/UDP/IP Header Compression (1)
[Figure: IPv4 header fields: Version, IHL, Type of Service, Total Length, Packet ID, Flags, Fragment Offset, Time to Live, Protocol, Header Checksum, Source Address, Destination Address]
• In the IPv4 header, only the total length, packet ID, and header checksum fields typically change.
  • The total length can be excluded (it is provided by the link layer).
  • The header checksum can be dropped, too, provided the link layer offers good error detection.
  • Changes in the packet ID are transmitted. Usually the packet ID is incremented by one for each packet.
• In the IPv6 base header, only the payload length field changes.
RTP/UDP/IP Header Compression (2)
[Figure: UDP header fields: Source Port, Destination Port, Length, Checksum]
• In the UDP header, the port numbers are not likely to change during a VoIP connection.
• The length field is redundant with the IP total length field and the length indicated by the link layer.
• If the source generates UDP checksums, they must be sent intact in order to keep the compression lossless.
RTP/UDP/IP Header Compression (3)
• In most RTP headers, only the sequence number and timestamp change from packet to packet.
• If packets are not lost and they arrive in the correct order, the sequence number is incremented by one for each packet.
• For VoIP packets of constant duration, the timestamp is incremented by the number of sample periods conveyed in each packet.
• One bit in the compressed header is reserved for the marker bit.
  • If the marker bit were treated as a constant field, every change would require sending the full header and the compression would become inefficient.
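The first-order/second-order difference idea from the header compression slides can be illustrated with a toy compressor and decompressor that share a per-stream context. This is a conceptual sketch only: it does not follow the RFC 2508 wire format, the dictionary "messages" stand in for compressed headers, and all names are hypothetical. The example assumes 240 timestamp units per packet (30 ms of 8 kHz audio).

```python
class Context:
    """State shared by compressor and decompressor for one RTP stream."""

    def __init__(self, seq, timestamp, delta_ts, delta_seq=1):
        self.seq = seq
        self.timestamp = timestamp
        self.delta_seq = delta_seq   # first-order difference of seq
        self.delta_ts = delta_ts     # first-order difference of timestamp


def _apply(ctx, msg):
    """Update stored deltas if the message carries new ones, then advance."""
    ctx.delta_seq = msg.get("delta_seq", ctx.delta_seq)
    ctx.delta_ts = msg.get("delta_ts", ctx.delta_ts)
    ctx.seq = (ctx.seq + ctx.delta_seq) & 0xFFFF
    ctx.timestamp = (ctx.timestamp + ctx.delta_ts) & 0xFFFFFFFF


def compress(ctx, seq, timestamp, marker):
    """Send deltas only when they differ from the stored ones, i.e. when
    the second-order difference is non-zero. The marker bit always travels."""
    d_seq = (seq - ctx.seq) & 0xFFFF
    d_ts = (timestamp - ctx.timestamp) & 0xFFFFFFFF
    msg = {"marker": marker}
    if d_seq != ctx.delta_seq:
        msg["delta_seq"] = d_seq
    if d_ts != ctx.delta_ts:
        msg["delta_ts"] = d_ts
    _apply(ctx, msg)
    return msg


def decompress(ctx, msg):
    """Rebuild (seq, timestamp, marker) from the context and the message."""
    _apply(ctx, msg)
    return ctx.seq, ctx.timestamp, msg["marker"]


comp = Context(seq=100, timestamp=8000, delta_ts=240)
decomp = Context(seq=100, timestamp=8000, delta_ts=240)
for packet in [(101, 8240, False), (102, 8480, False), (104, 9440, True)]:
    msg = compress(comp, *packet)
    print(msg, decompress(decomp, msg) == packet)
```

The first two packets need nothing beyond the marker bit. The third, which follows a lost packet and a pause, carries both new deltas, and a further update is needed once regular spacing resumes.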
Some Terminal Requirements and Implementations
• All terminals that support real-time voice must have considerable processing capacity.
• The computational requirements of voice codecs increase with the compression ratio.
• Microsoft NetMeeting is a popular video conferencing tool.
  • Pentium 90 processor or higher, 24 MB of RAM.
• VocalTec Internet Phone Lite is mainly targeted at pure voice connections.
  • Pentium 75 processor or higher.
Conclusions
• The RTP/RTCP protocol suite provides the means for sending packetized voice by introducing timestamps and sequence numbers.
• Playout buffering is needed to re-synchronize the received voice stream.
• The RTP/UDP/IP overhead problem can be solved by efficient header compression.
• Terminals that support real-time interactive voice must have considerable processing power. The computational requirements of voice codecs typically increase with the voice compression ratio.