
Transmit Power Optimization in OFDM Systems

Presentation Transcript


    1. Transmit Power Optimization in OFDM Systems Brian S. Krongold

    2. Outline Introduction & Motivation Part 1: Peak-to-Average Power Ratio (PAPR) Part 2: OFDM Transmit Power Allocation

    3. Typical OFDM Transceiver

    4. Why does the PAPR Problem Occur? Time-domain samples are linear combinations of random variables. If N is large, a central limit theorem effect begins. The time-domain samples become approximately Gaussian distributed, and the tails are our “occasional large peaks”. HPAs essentially consume power in relation to their peak power, not the average power. This creates a HUGE power cost for base stations if an expensive HPA with a large dynamic range is used. NOTICE: This is an ANALOG problem. PAR results for the digital OFDM signal are not truly indicative of the analog PAR.

    5. Peak-to-Average Power Ratio (PAPR) OFDM suffers from high PAPR defined as, Maximum PAPR of an N sub-carrier OFDM signal
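The defining formula did not survive in this transcript. As a hedged illustration, the sketch below uses the standard textbook definition (peak instantaneous power divided by mean power) on an oversampled IDFT, where oversampling approximates the analog peaks. The function names and the 4x oversampling factor are assumptions, not taken from the slides. For unit-modulus (PSK) symbols the ratio cannot exceed N, and the all-in-phase symbol reaches that bound.

```python
import cmath
import math

def time_samples(symbols, oversample=4):
    """Oversampled IDFT: x[t] = sum_k X_k * exp(j*2*pi*f_k*t/M), M = N*oversample."""
    n = len(symbols)
    m = n * oversample
    # Treat the upper half of the bins as negative frequencies so the
    # implicit zero padding sits at the high frequencies (interpolation).
    freqs = [k if k < n // 2 else k - n for k in range(n)]
    return [sum(s * cmath.exp(2j * math.pi * f * t / m) for s, f in zip(symbols, freqs))
            for t in range(m)]

def papr_db(samples):
    """Peak power over mean power of the samples, in dB."""
    powers = [abs(x) ** 2 for x in samples]
    return 10 * math.log10(max(powers) * len(powers) / sum(powers))

if __name__ == "__main__":
    n = 16
    # All subcarriers in phase: the peak reaches N times the mean power,
    # so this prints 10*log10(16), about 12.04 dB.
    print(round(papr_db(time_samples([1] * n)), 2))
```

Scaling the symbols does not change the result, which is why no normalisation factor is applied in `time_samples`.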

    6. PAPR distribution

    7. PAPR Distribution PAPR increases with number of subchannels N PAPR doesn’t depend on the modulation scheme

    8. PAPR Problems Power is one of the important topics in practice Low power consumption - Prolong the battery life, reduce the heat dissipation in the mobile device Power management High PAPR makes OFDM a very poor performer with respect to power It demands the HPA with large backoff It demands the HPA with better efficiency It requires the ADC with large dynamic range It requires the LO with low phase noise level

    9. PAPR Problems OFDM Requires Linear Amplifiers and high dynamic range in the transmitter hardware. Expensive Inefficient - Consumes more power If non-linear amplifiers are used Spectral spread (Out of band radiation) In-band distortion

    10. So How Can I Fix (Alleviate) It????? Many methods to reduce the PAR have been proposed. Each has its own advantages and drawbacks. Certain properties are desirable: No loss in data rate or side information Distortionless (does not malign the data) Low complexity The Tradeoffs of any PAPR reduction method: Complexity Distortion introduced Reduced Bandwidth/Data Rate Increased Average Power

    11. Popular PAPR Reduction Methods Clipping and filtering and non-linear distortion In-band distortion is mostly negligible. But out of band distortion is more serious Multiple signal representation Partial transmit sequences: divide/group into clusters and each of them is done with a smaller IFFT. [Müller and Huber, 97] Selected mapping: It is based on selecting one of the transformed blocks for each data block, which has the lowest PAPR [Bäuml, Fischer and Huber, 96] Interleaving: Data symbols or modulated symbols are interleaved [Jayalath and Tellambura ‘02]

    12. Popular PAPR Reduction Methods Constellation optimization Tone Reservation: inserting anti-peak signals in unused or reserved subcarriers. The objective is to find the time-domain signal to be added into the original time-domain signal such that PAPR is reduced. [Tellado 00] [Krongold and Jones, 03] Tone injection: The basic idea is to increase the constellation size so that each of the points in the original basic constellation can be mapped into several equivalent points in the expanded constellation. [Tellado 00] Active constellation extension: Similar to TI [Krongold and Jones, 03]

    13. Popular PAPR Reduction Methods Coding The idea is to select a codeword with less PAPR. It still is an open problem to construct codes with both low PAPR and large minimum Hamming distance Golay and Reed-Muller codes have shown PAPR reduction properties, but at a significant rate reduction. Receiver-side clipping noise compensation [Tellado]

    14. Clipping and Filtering [Armstrong] Simple approach requiring multiple sequential FFTs. Take the oversampled time-domain signal, clip it digitally, then filter out-of-band leakage. Filtering can cause “regrowth” of peaks, so repeat the process multiple times. Live with in-band clipping noise and its effect on performance. Is some clipping OK? If so, how much?
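The clip-then-filter loop described on this slide can be sketched as follows. This is a minimal illustration, not Armstrong's published implementation: the clipping ratio (1.5 times the RMS level), the 4x oversampling, the three passes, and all function names are assumptions. It uses plain DFT sums from the standard library so that it runs without extra packages; a real system would use FFTs.

```python
import cmath
import math
import random

def to_time(spectrum):
    m = len(spectrum)
    return [sum(X * cmath.exp(2j * math.pi * k * t / m) for k, X in enumerate(spectrum))
            for t in range(m)]

def to_freq(samples):
    m = len(samples)
    return [sum(x * cmath.exp(-2j * math.pi * k * t / m) for t, x in enumerate(samples)) / m
            for k in range(m)]

def pad(symbols, oversample):
    """Place N symbols in the low-frequency bins of an N*oversample spectrum."""
    n, half = len(symbols), len(symbols) // 2
    spectrum = [0j] * (n * oversample)
    spectrum[:half] = symbols[:half]
    spectrum[len(spectrum) - (n - half):] = symbols[half:]
    return spectrum

def clip(samples, ratio):
    """Limit each sample's magnitude to ratio * RMS, keeping its phase."""
    rms = math.sqrt(sum(abs(x) ** 2 for x in samples) / len(samples))
    limit = ratio * rms
    return [x if abs(x) <= limit else x * (limit / abs(x)) for x in samples]

def clip_and_filter(symbols, oversample=4, ratio=1.5, passes=3):
    n, half = len(symbols), len(symbols) // 2
    spectrum = pad(symbols, oversample)
    for _ in range(passes):
        spectrum = to_freq(clip(to_time(spectrum), ratio))
        for k in range(half, len(spectrum) - (n - half)):
            spectrum[k] = 0j          # filter: discard out-of-band leakage
    return spectrum                    # in-band bins now carry clipping noise

def papr_db(samples):
    powers = [abs(x) ** 2 for x in samples]
    return 10 * math.log10(max(powers) * len(powers) / sum(powers))

random.seed(1)
data = [random.choice([1 + 1j, 1 - 1j, -1 + 1j, -1 - 1j]) for _ in range(16)]
before = papr_db(to_time(pad(data, 4)))
after = papr_db(to_time(clip_and_filter(data)))
print(f"PAPR before: {before:.2f} dB, after: {after:.2f} dB")
```

The returned in-band bins differ from the original data symbols; that difference is the in-band clipping noise the slide says one has to live with.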

    15. Multiple Signal Representation Side information

    16. Interleaving - (Data/Symbol permutation) Other Popular MSR Techniques Selected mapping Partial transmit sequences

    17. Selected Mapping Define U distinct, fixed phase vectors of length N. The data vector is multiplied element-wise by each of the U vectors, resulting in U statistically independent alternative OFDM symbols. Each symbol is transformed into the time domain by taking the IDFT. The symbol with the lowest PAR is selected for transmission. No analytical method is known for finding a proper set of vectors.
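A minimal sketch of the selection step, assuming phase vectors drawn at random from {1, -1, j, -j} with the first vector left as all ones (so the unmodified symbol is always a candidate). These choices, the 4x oversampling, and the function names are illustrative assumptions rather than details from the slides.

```python
import cmath
import math
import random

def time_samples(symbols, oversample=4):
    """Oversampled IDFT with the upper bins treated as negative frequencies."""
    n = len(symbols)
    m = n * oversample
    freqs = [k if k < n // 2 else k - n for k in range(n)]
    return [sum(s * cmath.exp(2j * math.pi * f * t / m) for s, f in zip(symbols, freqs))
            for t in range(m)]

def papr_db(samples):
    powers = [abs(x) ** 2 for x in samples]
    return 10 * math.log10(max(powers) * len(powers) / sum(powers))

def selected_mapping(data, phase_vectors):
    """Return (lowest PAPR in dB, index of the phase vector that achieved it)."""
    best = None
    for index, phases in enumerate(phase_vectors):
        candidate = [d * p for d, p in zip(data, phases)]
        value = papr_db(time_samples(candidate))
        if best is None or value < best[0]:
            best = (value, index)
    return best

random.seed(0)
N, U = 32, 8
qpsk = [1 + 1j, 1 - 1j, -1 + 1j, -1 - 1j]
data = [random.choice(qpsk) for _ in range(N)]
vectors = [[1] * N] + [[random.choice([1, -1, 1j, -1j]) for _ in range(N)]
                       for _ in range(U - 1)]
papr, index = selected_mapping(data, vectors)
print(f"original {papr_db(time_samples(data)):.2f} dB, selected {papr:.2f} dB (vector {index})")
```

The returned index is the side information the receiver needs in order to undo the phase rotation.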

    18. Partial Transmit Sequences

    19. Coding Example BPSK, N=4 PAPR of all possible sequences [Jones et al. 94]
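The table from this slide is not preserved in the transcript. The enumeration it refers to is small enough to reproduce approximately: the sketch below evaluates the envelope of every length-4 BPSK sequence on a dense time grid (64 points, an assumed value) and reports the PAPR of each. The sampling grid and names are assumptions, so the printed values approximate the continuous-time figures.

```python
import cmath
import math
from itertools import product

N = 4    # subcarriers, as on the slide
M = 64   # time samples per symbol (assumed grid density)

def papr_db(seq):
    """PAPR in dB of the sampled envelope of sum_k seq[k] * exp(j*2*pi*k*t/M)."""
    peak = max(abs(sum(s * cmath.exp(2j * math.pi * k * t / M)
                       for k, s in enumerate(seq))) ** 2
               for t in range(M))
    return 10 * math.log10(peak / N)   # N unit-modulus tones have mean power N

table = {seq: papr_db(seq) for seq in product((1, -1), repeat=N)}
worst = max(table.values())

for seq, value in sorted(table.items(), key=lambda item: item[1]):
    print(seq, round(value, 2))
```

Only four of the sixteen sequences reach the worst case of 10·log10(4) dB; a code that avoids them lowers the PAPR at the price of a reduced rate, which is the trade-off named on the coding slide.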

    20. Tone Injection [Tellado 99] Increase the constellation size so that each of the points in the original basic constellation can be mapped into several equivalent points in the expanded constellation

    21. Tone Reservation Utilize subchannels (tones) that do not send data in order to make a peak-cancelling signal, thus reducing the PAR and not distorting the data symbols.

    22. Active Constellation Extension Make constellations more flexible: Non-bijective Constellations There are many or infinite points which can be used to transmit Find a good or best representation with PAR as the cost function. Allowable Extensions do NOT change ML decision regions Extensions cannot change minimum distance properties Generally, this means only outside constellation points can be moved Very simple for the case of QAM constellations

    23. Active Constellation Extension Key idea: move constellation points, but Don’t change receiver decision boundaries Maintain or increase margin

    24. ACE Concept: QPSK

    25. ACE Concept: 16-QAM

    26. ACE Algorithm: POCS

    27. ACE Algorithm: POCS Approach

    28. ACE Algorithm: Theory ACE constraints, clips are all convex ACE algorithm = POCS (Projection onto Convex Sets) Guaranteed to converge toward solution if one exists Small clip is a gradient descent in terms of peak reduction ACE algorithm is approximate gradient-project descent
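The POCS view can be sketched as alternating two projections: clip in the time domain, then project each subcarrier back onto its allowable extension region. The sketch below handles QPSK only, where every constellation point is an outer point and may move away from both decision boundaries. The clipping ratio, oversampling factor, pass count, and function names are assumptions; this is an illustration of the idea, not the authors' implementation.

```python
import cmath
import math
import random

def to_time(spectrum):
    m = len(spectrum)
    return [sum(X * cmath.exp(2j * math.pi * k * t / m) for k, X in enumerate(spectrum))
            for t in range(m)]

def to_freq(samples):
    m = len(samples)
    return [sum(x * cmath.exp(-2j * math.pi * k * t / m) for t, x in enumerate(samples)) / m
            for k in range(m)]

def clip(samples, ratio):
    rms = math.sqrt(sum(abs(x) ** 2 for x in samples) / len(samples))
    limit = ratio * rms
    return [x if abs(x) <= limit else x * (limit / abs(x)) for x in samples]

def embed(symbols, oversample=4):
    """Return (zero-padded spectrum, list of data-carrying bin indices)."""
    n, half = len(symbols), len(symbols) // 2
    m = n * oversample
    band = list(range(half)) + list(range(m - (n - half), m))
    spectrum = [0j] * m
    for k, s in zip(band, symbols):
        spectrum[k] = s
    return spectrum, band

def ace_pocs(symbols, oversample=4, ratio=1.5, passes=3):
    spectrum, band = embed(symbols, oversample)
    for _ in range(passes):
        clipped = to_freq(clip(to_time(spectrum), ratio))
        spectrum = [0j] * len(clipped)           # out-of-band bins forced to zero
        for k, s in zip(band, symbols):
            c = clipped[k]
            # Keep a component only if it moved away from the decision
            # boundary; otherwise restore the original value.
            re = c.real if c.real * s.real >= s.real ** 2 else s.real
            im = c.imag if c.imag * s.imag >= s.imag ** 2 else s.imag
            spectrum[k] = complex(re, im)
    return [spectrum[k] for k in band]

def papr_db(samples):
    powers = [abs(x) ** 2 for x in samples]
    return 10 * math.log10(max(powers) * len(powers) / sum(powers))

random.seed(2)
a = 1 / math.sqrt(2)
data = [complex(random.choice([a, -a]), random.choice([a, -a])) for _ in range(16)]
extended = ace_pocs(data)
print(f"before {papr_db(to_time(embed(data)[0])):.2f} dB, "
      f"after {papr_db(to_time(embed(extended)[0])):.2f} dB")
```

Because each component either keeps its original value or moves outward, the receiver's decision regions and minimum distance are untouched; the cost is a somewhat higher average power.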

    29. “Smart” Gradient-Project Enforce constraints on clip signal in FFT domain to guarantee valid extension; IFFT to get $c_{\text{clip}}(n)$ Perform constrained gradient update $x^{(i+1)}(n) = x^{(i)}(n) + \mu\, c_{\text{clip}}^{(i)}(n)$ Select step-size $\mu$ to minimize new peak(s) Essentially converges after 2-3 iterations

    30. ACE Gradient Project Approach

    31. What Should the Gradient Stepsize Be? Can use a preselected stepsize, but convergence will be slower Can determine a stepsize for each ACE application Signals are complex, so it may be difficult to determine an optimal stepsize that minimizes the PAR at each level. Solution: Linearize the optimal stepsize equation with two safe, simple, and intuitive assumptions valid while the PAR has not been reduced a lot already. The assumptions break down after about four ACE iterations, but most gains are achieved within the first two or three iterations!!

    32. Smart Gradient Step-Size Calculation

    33. Example Signal: 11.63 dB PAR

    34. Signal After ACE: 7.42 dB PAR

    35. Extended OFDM Symbol Distribution

    36. Achieving a Target PAR

    37. Comparison of PAR Reduction

    38. Complexity.. Complexity..Complexity..! Multiple signal representation increases the transmitter complexity significantly Tone reservation, tone injection, and ACE require moderate/high computations. What are the low complexity schemes? Clipping and Filtering Clipping +Filtering + FEC Pre-distortion

    39. PAPR Method Extensions to MIMO M transmit antennas Generally, the extension is simple for any PAPR reduction method. Consider the PAPR of M time-domain signals instead of just 1. (Complexity Increase!) We’ll use ACE as an example

    40. ACE for MIMO-OFDM ACE algorithm easily extended to MIMO OFDM systems For diversity and V-BLAST systems, can apply ACE independently to each antenna For general space-frequency system apply ACE constraints to each space-frequency eigenchannel Perform gradient project via joint clipping across all antennas, times

    41. ACE for MIMO OFDM

    42. Simulation Example

    43. PAPR Reduction in OFDMA Downlink OFDMA Although the base station is sending information for many users, it is still one signal. Can still utilize existing techniques, but any side information must be reliable for all users. Uplink OFDMA (was considered by 3GPP) More difficult problem due to the near/far effect and the users possibly (likely) being asynchronous. Tone reservation is one possible solution.

    44. Uplink OFDMA Model Each user has a group of subchannels for data transmission. Could be grouped into 1 or more bands for data transmission. Each band may have a small number of subchannels. Each user occupies only a small fraction of the uplink bandwidth.

    45. Uplink OFDMA Tone Reservation Consider a single user The non-data tones can be used as reserved tones to help reduce the PAR at the mobile transmitter. Problem: these subchannels are used by other users in the uplink. Result: multiple access interference (MAI). Limit MAI with a PSD mask for PAPR reduction signal

    46. Tone Reservation Comparison Originally developed for the DSL case Large # of data tones, smaller # of reserved tones Large amount of energy needed in each reserved tone to obtain good PAR reduction. UL-OFDMA Case: Exact opposite of DSL case! Small # data tones, very large # of reserved tones Small amount of energy needed in each reserved tone to get good PAR reduction.

    47. Example: 16-tone band, N = 512

    48. Proposed PSD Mask

    50. Part 2: Transmit Power Allocation What is resource allocation? What are our resources? Power and data rate allocated to each subchannel Subchannels are a resource when dealing with multiuser OFDM systems. (users allocated particular subchannels) Resource allocation: mathematical problem of optimizing some benefit from the resources available. Optimization problem: objective function (benefit) with constraints (finite resources and performance criteria)

    51. Channel-State Information If the receiver can estimate channel gains/SNRs or channel statistics, the information can be sent back to the transmitter. The transmitter can optimize performance given this additional information. Good subchannels get more power and/or rate, bad subchannels get less power and/or rate. Must maintain certain performance on each subchannel

    52. Important Practical Issues How reliable is the CSI? How often does CSI need to be fed back? How fast does the channel change? Can latency be overcome? Feedback takes up bandwidth. Quantized feedback is more practical. How to do it optimally? Vector quantization? Key Point: knowing that one subchannel is better than another allows the system to gain an advantage.

    53. Resource Allocation Examples Single user systems: Send the most amount of information across channel with a total power constraint. Use the least amount of power to send a fixed rate. Single-user problems are easier and usually “nicer” optimization problems (convexity). Multiuser systems: In addition to the above, a guarantee that each user is given a minimum total rate. Multiuser problems can be non-convex, but can be solved efficiently with convex approximations (which are very good!) Can be difficult in certain situations, such as DSL with interference from crosstalk.

    54. Motivation: Water-Pouring Information theory result for Gaussian channels. Achieve capacity for a given total power constraint. Consider the channel-to-noise ratio (CNR) to be Solution: “Pour water” onto the inverted channel until the power budget (water) is exhausted. Higher CNR portions get more water (power). Lower CNR get little or even no water (power)

    55. Water-Pouring Example

    56. Water-Pouring Example

    57. Water-Pouring Notes Same concept can be applied to a discrete set of parallel channels (our OFDM case!) Problem: water-pouring gives the optimal power allocation for capacity, but does not tell you what the codes needed are. Optimal power allocation is similar, but not exactly the same as a problem with BER constraints. Rates in each channel may have infinite granularity (as opposed to nice integer numbers).
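For the discrete set of parallel channels mentioned here, water-pouring can be sketched in a few lines. The sketch assumes the usual capacity objective, the sum of log2(1 + g_i p_i), with g_i the CNR and p_i the power of subchannel i; these symbols and the function name are assumptions, since the slide formulas are not preserved. The "floor" of each subchannel is 1/g_i, and the water level is raised until the power budget is used up.

```python
def water_filling(cnr, total_power):
    """Return (powers, water_level) for parallel Gaussian subchannels."""
    floors = sorted(1.0 / g for g in cnr)      # inverted channel, best first
    level = floors[0]
    for active in range(len(floors), 0, -1):
        # Water level if exactly the `active` best subchannels are used.
        level = (total_power + sum(floors[:active])) / active
        if level > floors[active - 1]:         # all of them are under water
            break
    return [max(0.0, level - 1.0 / g) for g in cnr], level

powers, level = water_filling([10.0, 1.0, 0.1], total_power=1.0)
print([round(p, 3) for p in powers], round(level, 3))
# [0.95, 0.05, 0.0] 1.05
```

The weakest subchannel receives no power at all, matching the remark that low-CNR portions get little or no water. The resulting powers are real-valued, which is why the slides go on to treat discrete rates separately.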

    58. Single User Point-to-Point Problem Maximize system performance: Example: Send the most amount of information across channel. Example: Use the least amount of power to send a given data rate. Real-time system Computational complexity is important. Depends how often we need to re-optimize the resource allocation. System Resources Power and rate in each subchannel. Possible constraints Maximum total allocated power (Σ of subchannel powers) Target total rate constraint QoS constraints: Bit/Symbol error probability BLOCK-BASED METHOD. ASSUMPTION keeps things simple, but the proposed can work without it as well.

    59. DMT Loading Example

    60. Our Goal Would like to obtain truly optimal solutions with minimal complexity, and therefore avoid: Rounding (many ad hoc approaches have been published) SNR Gap Approximation (infinite QAM approximation) Costly re-computing or sorting subchannel gains (O(N log N)) Suboptimal solutions can be very good, but if an optimal solution is efficient to obtain, then that’s great!

    61. Optimization Problems Rate Maximization Maximize the total rate subject to a power budget. Margin Maximization Minimize the total power to meet a target total rate.

    62. Input Parameters (Sub)Channel-to-noise ratio: Physical parameters from the environment. is the channel power gain in the ith subchannel is the AWGN noise power in the ith subchannel Estimated at the receiver, fed back to the transmitter. Transmitter solves the optimization problem. Signal-to-noise ratio:

    63. Rate-Power Functional Relationship Assume rate is a concave function of power. Scalability: key property resulting in significant computational reduction. The subchannel power-rate operating curves are scaled versions of one another.

    64. Unconstrained Optimization Problem Margin Maximization Goal: Find the λ whose minimum Lagrange cost meets the R_target constraint. A basic approach is: Given a λ, solve for the minimum Lagrange cost, compute the total rate it obtains, and update λ based upon this result. We’re interested in problems with a discrete set of allowable rates (in bits/symbol). We then have a discrete, separable convex resource allocation problem (DSCRAP).

    65. Composite Power-Rate Curve

    66. Unconstrained Optimization Problem Rate Maximization Goal: Find the λ whose minimum Lagrange cost meets the power budget as best as possible. Given a λ, solve for the minimum Lagrange cost, compute the total rate it obtains, and update λ based upon this result. We’re interested in problems with a discrete set of allowable rates (in bits/symbol). We then have a discrete, separable convex resource allocation problem (DSCRAP).

    67. Dual Solution Find rates and powers such that for all subchannels, Then iterate to find optimal lambda

    68. Subchannel Power-Rate Curve

    69. Observations The optimal operating point for each subchannel is where a line of slope is tangent to the rate-power curve. Result is same-slope solution for each user. In the discrete case, derivatives do not exist! Derivatives are generalized to differentials to obtain the operating points.

    70. Discrete Subchannel Allocation

    71. Non-Uniqueness of Allocation

    72. Allocation Boundaries (Lookup Table)

    73. Computational Complexity Requires each subchannel to have all possible powers and its own individually computed lookup table, requiring O(ND) computations. When channel conditions change, an entirely new set of powers and lookup tables needs to be re-computed. We can do better than this.

    74. Finding Optimal Lambda Finding an optimal λ* is achieved with a bisection method. Given an upper and a lower λ, choose a test λ value between them (then update the boundary). A slower converging bisection method actually requires less overall computation!!! Obtain new λ by averaging the upper and lower λ

    75. Bisection Method for Optimal λ* Start initially with and such that Update the Lagrange multiplier: If , then replace the “low” allocation If , then replace the “high” allocation

    76. Bisection Method

    77. Bisection Properties Guaranteed improvement at each iteration Fast convergence Even with bad initial “guesses”, typically requires about log2(DN) iterations. Easily known when the optimal dual solution has been reached. Iteration gives no improvement. Offers significant computational advantage in terms of subchannel convergence.
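A plain averaging bisection for the margin-maximization problem (minimum power for a target rate) can be sketched as below. This is the basic approach only, not the lookup-table method the slides develop. The `REQUIRED_SNR` table holds made-up placeholder values for the SNR needed to carry 0 to 6 bits at some fixed error rate, and the scaling property is modelled by dividing that table by each subchannel's CNR; all names are hypothetical.

```python
# Hypothetical SNR (linear) needed for 0..6 bits/symbol at a fixed QoS.
REQUIRED_SNR = [0.0, 5.0, 10.0, 22.0, 45.0, 95.0, 190.0]

def best_bits(cnr, lam):
    """Rate minimising the Lagrange cost power - lam*bits on one subchannel."""
    return min(range(len(REQUIRED_SNR)),
               key=lambda b: (REQUIRED_SNR[b] / cnr - lam * b, b))

def allocate(cnrs, lam):
    bits = [best_bits(g, lam) for g in cnrs]
    power = sum(REQUIRED_SNR[b] / g for b, g in zip(bits, cnrs))
    return bits, power

def margin_max(cnrs, target_bits, iterations=60):
    """Bisect on lam until the total rate just meets target_bits."""
    top = len(REQUIRED_SNR) - 1
    if target_bits > top * len(cnrs):
        raise ValueError("target rate exceeds the maximum loadable rate")
    steps = [REQUIRED_SNR[b + 1] - REQUIRED_SNR[b] for b in range(top)]
    low, high = 0.0, 2.0 * max(steps) / min(cnrs)   # high loads every subchannel fully
    for _ in range(iterations):
        lam = 0.5 * (low + high)                    # average of the two bounds
        bits, _ = allocate(cnrs, lam)
        if sum(bits) >= target_bits:
            high = lam
        else:
            low = lam
    return allocate(cnrs, high) + (high,)

bits, power, lam = margin_max([20.0, 10.0, 5.0, 1.0], target_bits=8)
print(bits, round(power, 3), round(lam, 3))
```

Because the rates are discrete, the total rate jumps as λ crosses a threshold, so the returned allocation may carry more bits than requested. Whatever rate it does carry is achieved with the least possible total power for the given table.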

    78. Example: CNR

    79. Example: Low λ Power vs. Iteration

    80. Example: Optimal Rate Allocated

    81. Example: Final Power Allocated

    82. Extensions to Difficult Problems Multi-User Total power constraint or target rate constraint MIMO Total power constraint (per antenna?) Fading channels Outage constraints? Limited Feedback For practical use. Given codebook, how to choose best feedback? How to choose the best codebook?

    83. Fading Channels Can possibly model as a composite fading channel (slow and fast fading, which are uncorrelated) Depending on nature of slow fading, could have subchannel variation in average SNR. Simple extension to this scenario.

    84. Fading Phenomena

    85. Slow fading tracked/predicted, equalised Fast fading produces stochastic SNR Average error probability The effect of the PDF (e.g. Rayleigh fading) can be incorporated. Composite Fading

    86. MIMO-OFDM Loading Each Tx antenna has its own power amplifier (PA). Problem: Output is limited by the PA’s backoff level If the average power is too high, nonlinear PAPR effects (inherent in OFDM) may exceed a tolerance level. i.e., cannot have disproportionately large antenna power(s) Result: Power-constrained antennas in the optimization problem. Practical constraint that helps ensure clean transmission

    87. Notation Denote a spatial subchannel pair by (l,n) Spatial domain powers/rates denoted with ~

    88. New Rate-Max Optimization Problem Per-antenna power constraints Objective function in spatial domain while power constraints in antenna domain.

    89. Is the New Problem Convex???? YES!! To show how, we rewrite the problem in terms of only spatial powers and rates. The nth spatio-subchannel power vector is is the power gain transformation matrix whose entries can easily be shown to be:

    90. Rewritten New Optimization Problem Spatial powers are a convex function of spatial rates. Convex constraints due to non-negative weights

    91. Unconstrained Optimization Problem For a given , easy to determine rates/powers. Need to find an optimal solving original constraints.

    92. Minimizing the Lagrange Constraints Basic loading algorithm is (1) Choose a (2) Evaluate it in spatial domain (3) Check total powers in antenna domain (4) Update and reiterate

    93. Approaches to Find Optimal λ Difficulty: multi-dimensional Lagrange search Ellipsoid method can be used to find optimal solution. Can be a costly approach with large LT A simple branch-and-bound can be used to get very close to the optimal solution with LT = 2 Don’t expect many transmit antennas, so this can keep things practical.

    94. Multiuser OFDM Power Allocation The number of free variables is now K times larger. Is the problem more difficult? Slightly with a single total power or rate constraint Much more so with individual power or rate constraints (multidimensional Lagrange space) Let’s start simple

    95. WSRmax Discrete Optimization Problem

    96. Optimization Problem Notes Non-convex in general due to exclusive subchannel constraint Let’s instead solve a convex approximation! Implicitly invoking the QoS constraint Power for a given Rate is fixed by the QoS constraint and CNR. Have written , but left off QoS and CNR for notational simplicity. Thus, considering only valid pairs of We refer to these pairs as Operating Points

    97. Initial Works Problem was solved using the dual decomposition in the continuous case by Seong, Mohseni, & Cioffi (2006) Used capacity instead of rate which made computing the solution easier. An ergodic WSRmax approach was proposed in Wong and Evans (2007) for the continuous and discrete case. Power is limited on average, not instantaneously Shows improvement only at low SNRs Somewhat expensive/complicated approach. We are working on a simpler way to approach this problem.

    98. Discrete Approach Very practical as it deals with a discrete set of rates. No suboptimal rounding from continuous solutions Very computationally efficient: fully exploits the structure of the optimization problem. No sorting of subchannel gains required. Guaranteed optimality (up to a very small worst case duality gap). Virtually all other approaches use SNR-gap and other approximations (including rounding of continuous solution). Flexibility: Can incorporate fading, multiple services and QoS constraints

    99. Lagrangian and Dual Problem

    100. Dual Problem Search to find the optimal Lagrange multiplier. For any given test λ, must maximize the Lagrangian to obtain g(λ). Can compare the total power to the constraint to see how close we are.

    101. Computing g(λ) Maximizing the Lagrangian is an unconstrained optimization problem. Subchannels do not affect one another and become separable (optimize them individually) Hence, maximize the Lagrangian contribution from each subchannel, then sum them. Efficient approach is detailed in the paper.

    102. Simplifying g(λ)

    103. Solving for g(λ)

    104. Observations Essentially, this is computing K single-user OFDM bit loadings, followed by choosing the best user on each subchannel. This may need to be repeated many times to reach an optimal λ* that minimizes g(λ) Would like an efficient approach that minimizes computation and searching.
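The observation above (K single-user loadings, then pick the best user per subchannel) can be sketched for one test value of λ as follows. The formulas on the preceding slides are not preserved, so this assumes a weighted-sum-rate Lagrangian of the form Σ (w_k·bits − λ·power) + λ·P_max. The `REQUIRED_SNR` table contains made-up placeholder values, and every name here is hypothetical.

```python
# Hypothetical SNR (linear) needed for 0..6 bits/symbol at a fixed QoS.
REQUIRED_SNR = [0.0, 5.0, 10.0, 22.0, 45.0, 95.0, 190.0]

def best_point(weight, cnr, lam):
    """Best (value, bits, power) operating point for one user on one subchannel."""
    return max((weight * b - lam * REQUIRED_SNR[b] / cnr, b, REQUIRED_SNR[b] / cnr)
               for b in range(len(REQUIRED_SNR)))

def dual_function(weights, cnr, lam, power_budget):
    """Evaluate g(lam); cnr[k][n] is the CNR of user k on subchannel n."""
    total = lam * power_budget
    assignment = []
    for n in range(len(cnr[0])):
        # One single-user loading per user, then keep the best user.
        value, bits, power, user = max(
            best_point(weights[k], cnr[k][n], lam) + (k,) for k in range(len(weights)))
        total += value
        assignment.append((user, bits, power))
    return total, assignment

g, assignment = dual_function([1.0, 1.0], [[50.0, 1.0], [1.0, 50.0]],
                              lam=0.1, power_budget=10.0)
print(round(g, 3), assignment)
```

An outer search, for example a bisection on λ, would then compare the summed power in `assignment` with the budget and adjust λ accordingly.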

    105. Min Power Allocation Problem Minimum user rate constraints Creates a complicated optimization problem. Solving for dual solution is difficult as it requires a K-dimensional Lagrange search The ellipsoid method can be used, but is very slow to converge.

    106. Instantaneous Minimum Sum-Power Problem

    107. Lagrangian and Dual Problem

    108. “Simple” Simulation OFDMA system with 8 users, 128 subchannels Channels for each user generated independently from 8-tap Rayleigh distribution with exponentially-decaying power profile. Discrete rates available {0,1,2,3,4,5,6} bits/symbol. QoS constraint: 10^-3 symbol error probability Rate constraint: 48 bits/symbol for each user

    109. Convergence of Ellipsoid Method O(NK) complexity per iteration Since instantaneous, must be redone whenever the channels change resulting in massive complexity.

    110. Motivation: Ergodic Loading Previously I discussed the instantaneous loading problem Optimize with respect to current channel conditions. Utilize all resources at given time. Problem: Time-Varying Channels Constantly need to re-optimize as the instantaneous channel gains change over time. May be impractical depending upon mobility. Alternative: Ergodic Loading Problem is relaxed by replacing instantaneous powers and rates with ergodic ones. Satisfy rate or power constraint on the average. Maximize the average rate in Rate-Max Problem.

    111. Ergodic Rate-Max Problem QoS constraints are enforced on a symbol-by-symbol basis Discrete-set of rates available as opposed to continuous loading. Ergodic approach exploits temporal dimension E.g., don’t waste power when the channel is poor. E.g., use a lot of power when the channel is very good.

    112. Dual Problem and Solution (RateMax) Lagrangian Dual Problem

    113. Ergodic Optimal Solution Key Point: Characterized by an optimal λ* Solution valid as long as channel statistics remain the same. Much slower time-scale than instantaneous channel Optimization complexity is lower At each time-instant, λ* is used to do an easy lookup table loading based on instantaneous channel gains See Krongold (2000) for more details. Effectiveness depends on channel coherence time and latency in obtaining CSI.

    114. Previous Work Wong and Evans (2006) proposed ergodic discrete loading for weighted-sum-rate multiuser OFDM loading. Similar to single-user link in that the Lagrange search is one dimensional. Efficient approach based on order statistics to reduce multi-user computational load. Easily altered to single-user point-to-point case. Batch-based approach.

    115. Adaptive Ergodic Loading Estimate ergodic total rate and power using empirical averaging. Move from a batch-mode approach to an online-adaptive approach. Update λ to converge to the optimal λ* and/or track the underlying channel statistics

    116. Algorithm Lagrange update: How to update λ Previous powers based on previous λ so we might mix powers from different λ’s Can re-calibrate using the previous M symbols at each λ update. Expensive! Will use for comparison purposes.

    117. Parameters ? : controls the averaging window. Will not reach the truly optimal ergodic solution unless we average over all time. ? : controls λ convergence They do in fact affect each other.

    118. Simulations Rate-Max optimization problem, N=64 subchannels Channel: 8-tap Rayleigh with exponentially-decaying power profile Pmax is such that avg. received SNR = 12.35 dB. Each iteration (symbol) has an independent channel realization. Discrete rates: {0,1,2,3,4,5,6} bits/symbol. QoS: 10^-3 symbol error rate. Optimal ergodic rate is 180.67 bits/symbol, and is 179.83 in the instantaneous case.

    119. Average Power vs. Iteration #

    120. Average Rate vs. Iteration #

    121. 1/λ vs. Iteration #

    122. Thoughts and Conclusions Simple online adaptive approach for OFDM loading Computational complexity spread evenly over time and less complex than instantaneous loading. Adaptation and calibration memory are adjustable to balance convergence/tracking with computational complexity. Lambda is less sensitive in the ergodic case than the instantaneous case. Relatively easy to be very close to the optimal solution Easily extendable to the multi-user case with a weighted-sum-rate maximization.
