A Survey of Packet Loss Recovery Techniques for Streaming Audio. C. Perkins, O. Hodson, V. Hardman, University College London, IEEE 1998. Presented by Jason Hester, CS598kn, 6 April 2005.
Outline • Introduction • Multicast Overview • Sender-Based Repair • Receiver-Based Repair • Comparisons • Author Recommendations
Introduction • Creation of IP multicast and Mbone • New class of audio/video conferencing applications • Efficient multi-way communications • Scales from 2 to several thousand participants • Advantage: • Scalability • Disadvantages: • Unusual channel characteristics • Challenge to achieve robustness
Multicast Overview • Intent: scalable and efficient method for datagram transport to multiple users. • Level of indirection between senders and receivers • Send to group address • Listen on group address • Sender does not know set of receivers
Multicast Overview: Indirection • Routing decisions: local • Recovery decisions: local • No need to pass decisions to anyone • Scalability and robustness increased
Multicast Overview: Mbone • Mbone • Portion of internet supporting M-cast • Some dedicated links • More shared links with other traffic • Defined by presence of M-cast routing support • Many attempts to characterize Mbone loss patterns • Varied results • Broad conclusion: some receivers in a conference WILL experience packet loss
Multicast Overview: Mbone (cont.) • Most receivers observe loss of 2-5% • Most loss is due to congestion at routers, so loss correlates with the bandwidth used • Figure: Observed loss rates in a large multicast conference.
Multicast Overview: Jitter • Typical Mcast channel • High latency • High jitter • Jitter Impact • Real-time • Delayed packets may be discarded • Problem compounded for interactive use • A large playout delay may compensate • Trade-off • Quality vs. Interactivity • Local choice • Single session, multiple types of participant • Figure: Observed variation in end-to-end delay as seen by an Mbone audio tool (20 ms timing quantization), for packets sent from U. of Oregon to University College London.
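The playout-delay trade-off above can be illustrated with a minimal sketch. The function name, the packet tuple layout, and the delay values are assumptions made for illustration; they are not taken from the paper or from any Mbone tool.

```python
def schedule_playout(packets, playout_delay_ms):
    """Split packets into played and discarded for a fixed playout delay.

    packets: list of (seq, send_time_ms, arrival_time_ms) tuples
             (hypothetical layout, chosen for this sketch).
    A packet is playable only if it arrives no later than its
    playout deadline, send_time_ms + playout_delay_ms.
    """
    played, discarded = [], []
    for seq, sent, arrived in packets:
        deadline = sent + playout_delay_ms
        if arrived <= deadline:
            played.append(seq)
        else:
            # Arrived after its playout point: treated the same as a loss.
            discarded.append(seq)
    return played, discarded
```

With a short delay the receiver stays interactive but drops late packets; with a long delay it plays everything at the cost of latency. Because the choice is local, two receivers in the same session can pick different values.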
Packet Loss Recovery: Categorization • Two categories: sender-based and receiver-based repair • Both are used together • Sender-based repair recovers most losses • Receiver-based repair cleans up the small gaps that remain • Focus on streaming audio only!
Sender-Based Repair • Definition: unit vs. packet • Unit: interval of audio data • Packet: one or more units, encapsulated for transmission
Sender-Based Repair: FEC • Forward Error Correction: add repair data to stream • Media-independent: independent of contents of the stream • Media-specific: use knowledge of the stream
FEC: Media Independent • Large number of similar schemes • The ith bit of the check (parity) packet is computed from the ith bit of each associated data packet • XOR applied across data packets to create parity packet • 1 parity packet per n-1 data packets • Loss is recoverable if at most 1 of the n packets is lost • Figure: Repair using parity FEC
FEC: Media Independent • Pros • Media independent • Repair is exact replacement • Derivation of correction packet is simple • Cons • Increased delay • Increased bandwidth • Difficult decoder implementation
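The parity scheme described above can be sketched as follows. This is a minimal illustration that assumes all packets in a block have equal length; the function names are hypothetical and do not come from any particular tool.

```python
def xor_bytes(a, b):
    """Bytewise XOR of two equal-length byte strings."""
    return bytes(x ^ y for x, y in zip(a, b))


def make_parity(data_packets):
    """Build one parity packet from n-1 equal-length data packets."""
    parity = bytes(len(data_packets[0]))  # all zeros
    for packet in data_packets:
        parity = xor_bytes(parity, packet)
    return parity


def recover_lost(received, parity):
    """Rebuild the single missing data packet in a block.

    received: list of data packets with None marking a lost packet.
    Raises ValueError unless exactly one data packet is missing,
    since one parity packet can repair only one loss per block.
    """
    missing = [i for i, p in enumerate(received) if p is None]
    if len(missing) != 1:
        raise ValueError("parity can repair exactly one lost packet")
    rebuilt = parity
    for packet in received:
        if packet is not None:
            rebuilt = xor_bytes(rebuilt, packet)
    return rebuilt
```

The repair is an exact copy of the lost packet, but the receiver has to wait for the rest of the block before it can rebuild anything, which is where the added delay comes from.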
FEC: Media Specific • Send each unit of audio in multiple packets • Primary vs. secondary encoding • Difficult choice of encodings (bandwidth vs. complexity) • If the primary encoding is computationally complex but gives good quality at low bandwidth, the secondary can use the same encoding • Figure: Repair using media-specific FEC
FEC: Media Specific (cont.) • Overhead is variable • More overhead = better quality of each repair, but the same number of repairs (unlike media-independent FEC, where the overhead determines how many losses can be repaired) • Pros • Low latency (suitable for interactive apps) • Cons • Repair is not exact
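A rough sketch of the primary/secondary idea: each packet carries the primary encoding of its own unit plus a lower-quality secondary encoding of the previous unit. The coarse quantizer standing in for the secondary codec, the dictionary packet layout, and all names are assumptions made for illustration only.

```python
def secondary_encode(unit, step=16):
    """Stand-in for a low-bandwidth secondary codec: coarse quantization."""
    return [(s // step) * step for s in unit]


def packetize(units):
    """Packet n carries the primary of unit n and the secondary of unit n-1."""
    packets = []
    for n, unit in enumerate(units):
        prev = secondary_encode(units[n - 1]) if n > 0 else None
        packets.append({"seq": n, "primary": unit, "secondary_prev": prev})
    return packets


def receive(packets, lost, n_units):
    """Reconstruct the stream given a set of lost sequence numbers."""
    arrived = {p["seq"]: p for p in packets if p["seq"] not in lost}
    out = []
    for n in range(n_units):
        if n in arrived:
            out.append(("primary", arrived[n]["primary"]))
        elif n + 1 in arrived:
            # Approximate repair from the copy piggybacked on the next packet.
            out.append(("secondary", arrived[n + 1]["secondary_prev"]))
        else:
            out.append(("lost", None))
    return out
```

The repair needs only one extra packet of delay, which is why this approach suits interactive use, but the repaired unit is an approximation rather than an exact copy. A burst that also removes the following packet is not repaired in this single-copy sketch.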
FEC and Congestion Control • FEC is effective, but… • Addition of much repair data will increase congestion and packet loss • Especially large m-cast groups • Varied loss rates among receivers • Solutions: layered encoding of data • Different rates, multiple groups • Receivers join and leave groups in response to congestion • Different QoS based on groups joined
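The receiver-driven join/leave behaviour above might look like the following sketch. The loss thresholds and the one-layer-at-a-time policy are assumptions chosen for illustration; the slides do not specify them.

```python
def adjust_layers(joined, max_layers, loss_rate,
                  drop_above=0.05, add_below=0.01):
    """Return the number of layer groups a receiver should be joined to.

    joined:     layers currently subscribed (at least 1, the base layer)
    loss_rate:  fraction of packets lost in the last measurement interval
    drop_above / add_below: hypothetical thresholds, not from the paper.
    """
    if loss_rate > drop_above and joined > 1:
        return joined - 1  # congestion: leave the highest layer's group
    if loss_rate < add_below and joined < max_layers:
        return joined + 1  # spare capacity: join the next layer's group
    return joined
```

Each receiver runs this decision locally, so receivers behind congested links settle on fewer layers (lower quality) without affecting receivers on better paths.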
Sender-Based Repair: Interleaving • Disperses effects of packet loss • Multiple small gaps vs. large gap • Figure: Interleaving units across multiple packets.
Sender-Based Repair: Interleaving (cont.) • Useful if: • Unit size < packet size • End-to-end delay not important • Mbone audio tools • Packets similar in length to phonemes • Loss of packet is entire phoneme • Now loss is small gap within phoneme • Loss can be mentally “patched over” • Pros • No increase in bandwidth • Cons • Increased latency • Bad for interactive apps
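Interleaving as described above can be sketched with simple list slicing. The function names are hypothetical; for simplicity the sketch assumes the number of units is a multiple of the number of packets in the interleaving block.

```python
def interleave(units, n_packets):
    """Spread consecutive units across n_packets packets.

    Packet j carries units j, j + n_packets, j + 2 * n_packets, ...
    """
    return [units[j::n_packets] for j in range(n_packets)]


def deinterleave(packets, n_units):
    """Restore original unit order; a lost packet is passed as None.

    Units from lost packets come back as None, leaving small isolated gaps.
    """
    n_packets = len(packets)
    out = [None] * n_units
    for j, packet in enumerate(packets):
        if packet is None:
            continue
        for k, unit in enumerate(packet):
            out[j + k * n_packets] = unit
    return out
```

Losing one of four packets that together carry 16 units leaves four one-unit gaps instead of one four-unit gap. The cost is that the sender must buffer a whole block before sending and the receiver must buffer it before playing, hence the added latency.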
Sender-Based Repair: Retransmission • Latency bounds must be considered • Interactive: end-to-end delay must be < 250 ms, no retrans • Not interactive: more flexible, perhaps retrans is an option • Scalable Reliable Multicast (SRM) • Member detects loss • Waits random time, determined by distance from source • Multicasts repair request • Multiple hosts may miss the same packet • Retrans timer set so host closest to failure times out 1st • Other hosts suppress their requests when they see that of the 1st • Any host with the data can reply to a retrans request • Again, timer prevents reply implosion • 1 request, 1 retrans
Sender-Based Repair: Retransmission (cont.) • SRM not suitable for streaming media • No bound on transmission delay • May take an arbitrary amount of time • Retransmission in general may work • Bound # of retrans requests per unit of data • Works when loss rate is low • As loss rate increases, retransmission overhead reaches a crossover point beyond which FEC should be used instead • Large Mbone session observation • Most packets are lost by at least 1 host • With retransmission, nearly every packet would be multicast at least twice!
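The SRM request-suppression timers described above can be sketched as a small simulation. Drawing the delay uniformly from an interval that scales with the host's distance from the source is an assumption for this sketch, as are the constants `c1` and `c2`. The sketch also ignores the time a request takes to reach the other hosts.

```python
import random


def request_timers(distances_ms, rng, c1=1.0, c2=1.0):
    """Draw a random request delay per host, scaled by distance to source.

    distances_ms: {host: estimated delay to the source in ms}
    Each delay is drawn from [c1 * d, (c1 + c2) * d] (assumed form).
    """
    return {host: rng.uniform(c1 * d, (c1 + c2) * d)
            for host, d in distances_ms.items()}


def simulate_request(distances_ms, seed=0):
    """Return (requester, suppressed_hosts) for one lost packet.

    The host whose timer expires first multicasts the repair request;
    every other host sees it and cancels its own pending request.
    """
    rng = random.Random(seed)
    timers = request_timers(distances_ms, rng)
    requester = min(timers, key=timers.get)
    suppressed = [host for host in timers if host != requester]
    return requester, suppressed
```

Because the timer scales with distance, the host nearest the source (and so usually nearest the point of loss) tends to fire first. The same randomized wait is what leaves the total repair time unbounded, which is the objection raised for streaming media.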
Receiver-Based Repair (Error Concealment) • Goal: produce a replacement for a lost packet that is similar to the original • Works because audio signals (esp. speech) have short-term self-similarity • Best for small loss rates (< 15%) and small packets (4-40 ms) • As loss length approaches the length of a phoneme, results get worse
Receiver-Based Repair: Insertion • Insertion: no characteristic of signal used to aid reconstruction • Splicing: no gap left, but timing disrupted • Poor results, but also… • Disrupts adaptive playout buffer • Bad choice • Silence Substitution: silence in gap, maintain timing • Effective for short packets (<4ms) and low loss (<2%) • Very simple, widespread use, despite poor performance
Receiver-Based Repair: Insertion (cont.) • Noise Substitution: background noise in gap, maintain timing • Aids in human phonemic restoration • Subjectively better quality sound • Sender sends “comfort noise” indicator packets? • Repetition: replace lost unit with copy of previous unit • Low complexity, pretty good performance • Fading the repetition enhances subjective performance • Good compromise between simplicity of insertion and complexity of interpolation
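The insertion schemes above can be compared in a single sketch. Units are lists of float samples and `None` marks a lost unit; the fade factor, noise amplitude, and function name are assumptions made for illustration.

```python
import random


def conceal(units, unit_len, method="repetition", fade=0.5, seed=0):
    """Fill lost units (None) using one of the insertion schemes.

    method: "splice", "silence", "noise", or "repetition".
    fade:   gain applied per consecutive lost unit when repeating
            (hypothetical value; 1.0 would mean no fading).
    """
    rng = random.Random(seed)
    out, last_good, run = [], [0.0] * unit_len, 0
    for unit in units:
        if unit is not None:
            out.append(list(unit))
            last_good, run = unit, 0
            continue
        run += 1  # length of the current loss burst
        if method == "splice":
            continue  # leave no gap: stream timing is disrupted
        if method == "silence":
            out.append([0.0] * unit_len)
        elif method == "noise":
            out.append([rng.uniform(-0.01, 0.01) for _ in range(unit_len)])
        elif method == "repetition":
            gain = fade ** run  # fade further with each repeated unit
            out.append([gain * s for s in last_good])
        else:
            raise ValueError("unknown method: " + method)
    return out
```

All methods except splicing preserve the number of units and therefore the stream timing. Repetition with fading costs little more than silence substitution, which matches its description above as a compromise.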
Receiver-Based Repair: Interpolation • Interpolation from surrounding packets • Characteristics of the signal are considered • Waveform Substitution • Uses audio before and (optionally) after loss • Uses templates to locate suitable pitch patterns on either side • Performance better than silence substitution and repetition
Receiver-Based Repair: Interpolation (cont.) • Pitch Waveform Replication • Uses pitch detection algorithm either side of loss • Unvoiced loss: packet repetition • Voiced loss: repeat waveform of appropriate pitch length • Slightly better than waveform substitution • Time Scale Modification • Audio either side of loss “stretched” over gap • Find overlapping vectors of pitch cycles either side • Offset them to cover loss, average where they overlap • Complex, but best performing interpolation method
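A heavily simplified sketch of pitch waveform replication for the voiced case: estimate the pitch period from the audio before the loss with a plain autocorrelation search, then fill the gap by repeating the last pitch cycle. This uses one-sided history only and omits the voiced/unvoiced decision (the unvoiced case would fall back to packet repetition). The lag range and function names are assumptions, not the algorithm from the paper.

```python
def estimate_period(history, min_lag, max_lag):
    """Return the lag (in samples) with the highest mean autocorrelation.

    history must be longer than max_lag samples.
    """
    best_lag, best_score = min_lag, float("-inf")
    for lag in range(min_lag, max_lag + 1):
        total = sum(history[i] * history[i - lag]
                    for i in range(lag, len(history)))
        # Normalize by term count so longer lags are not penalized.
        score = total / (len(history) - lag)
        if score > best_score:
            best_lag, best_score = lag, score
    return best_lag


def fill_gap(history, gap_len, min_lag=20, max_lag=160):
    """Synthesize gap_len samples by repeating the last pitch cycle."""
    period = estimate_period(history, min_lag, max_lag)
    cycle = history[-period:]
    return [cycle[i % period] for i in range(gap_len)]
```

On a perfectly periodic signal the search may return a multiple of the true period, which still yields a correct continuation. Real speech would need a more robust pitch detector and smoothing at the boundaries.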
Receiver-Based Repair: Regeneration • Use knowledge of audio compression algorithm • Derive codec parameters: lost audio can be synthesized • Computationally complex • Interpolation of Transmitted State • If codec based on transform coding or linear prediction, decoder can interpolate between states • Pro: no boundary effects due to changing codec • Con: high processing demand
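Interpolation of transmitted state can be sketched as linear interpolation between the codec parameter vectors of the frames on either side of the loss. Treating the parameters as plain numbers that can be averaged is an assumption made for this sketch, since real codecs may need to interpolate in a transformed domain. The function name is hypothetical.

```python
def interpolate_state(before, after, n_lost):
    """Linearly interpolate codec parameters across n_lost missing frames.

    before, after: parameter vectors of the frames adjacent to the loss.
    Returns a list of n_lost synthesized parameter vectors.
    """
    frames = []
    for k in range(1, n_lost + 1):
        t = k / (n_lost + 1)  # fractional position inside the gap
        frames.append([(1 - t) * b + t * a for b, a in zip(before, after)])
    return frames
```

The synthesized parameters would then be fed to the same decoder as the received frames, which is why this approach avoids boundary effects from switching codecs.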
Receiver-Based Repair: Regeneration (cont.) • Model-Based Recovery • Uses speech on one or both sides of loss • Model created to generate speech over loss • Best with short losses (interleaving) • Small gap means high probability of sides being relevant • Most computationally complex scheme
Error Concealment Comparison • Measurements difficult, subjective in nature. • Complexity increase does not match quality increase • Repetition with fading good compromise
Recommendations: Non-interactive Apps • One-to-many (radio broadcast style) • Quality more important than latency • Bandwidth efficiency important (receivers have varying amounts of bandwidth) • Interleaving with repetition w/fading • Retransmission • Not acceptable for multicast • Heterogeneous receiver set • Many requests for retrans, too much overhead • Acceptable for unicast • FEC • Better than retrans • A single FEC packet can repair different losses at different receivers, with no control traffic overhead
Recommendations: Interactive Apps • IP Telephony • Must minimize end-to-end delay • Sacrifice some quality • Unacceptable (added delay): interleaving, retransmission, media-independent FEC • Acceptable: media-specific FEC • Repair approximate, but satisfactory • Combine with error concealment
Paper Critique • Good, thorough overview for a survey • Well organized • Lacks details on complex cases, though references are included