1. Transmit Power Optimization in OFDM Systems Brian S. Krongold
2. Outline Introduction & Motivation
Part 1: Peak-to-Average Power Ratio (PAPR)
Part 2: OFDM Transmit Power Allocation
3. Typical OFDM Transceiver
4. Why does the PAPR Problem Occur? Time-domain samples are linear combinations of random variables. If N is large, a central limit theorem effect begins.
The time-domain samples become approximately Gaussian distributed, and the tails are our “occasional large peaks”.
HPAs essentially consume power in relation to their peak power, not their average power.
Creates a HUGE power cost for base stations if an expensive HPA with a large dynamic range is used.
NOTICE: This is an ANALOG problem. PAR results for the digital OFDM signal are not truly indicative of the analog PAR.
5. Peak-to-Average Power Ratio (PAPR) OFDM suffers from high PAPR defined as,
Maximum PAPR of an N sub-carrier OFDM signal
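As a concrete illustration of the definition, the PAPR of a sampled OFDM symbol is the peak instantaneous power divided by the mean power, PAPR = max |x(n)|^2 / E[|x(n)|^2]. The sketch below is not from the slides: the helper names `ofdm_time` and `papr_db` and the oversampling factor of 4 are assumptions made for illustration. It also checks the standard worst case, where N equal-magnitude symbols add coherently and the PAPR equals N.

```python
import cmath
import math

def ofdm_time(symbols, oversample=4):
    """Synthesize one OFDM symbol on an oversampled time grid (naive IDFT)."""
    n = len(symbols)
    m = n * oversample
    return [sum(symbols[k] * cmath.exp(2j * math.pi * k * t / m) for k in range(n))
            for t in range(m)]

def papr_db(samples):
    """Peak power divided by mean power, expressed in dB."""
    powers = [abs(v) ** 2 for v in samples]
    return 10 * math.log10(max(powers) / (sum(powers) / len(powers)))

n = 16
worst_case = papr_db(ofdm_time([1 + 0j] * n))  # all subcarriers in phase
print(worst_case, 10 * math.log10(n))          # both values should agree
```

Oversampling is used because, as noted earlier, the Nyquist-rate digital PAR tends to understate the analog PAR.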
6. PAPR distribution
7. PAPR Distribution PAPR increases with number of subchannels N
PAPR doesn’t depend on the modulation scheme
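The dependence on N can be made concrete with a common closed-form approximation that is not stated on the slide, so treat it as an assumption: if the N Nyquist-rate samples are modelled as independent complex Gaussian variables, then Pr(PAPR > z) ≈ 1 - (1 - e^(-z))^N. The function name `papr_ccdf` is hypothetical.

```python
import math

def papr_ccdf(threshold_db, num_subchannels):
    """Approximate probability that the PAPR exceeds threshold_db."""
    z = 10 ** (threshold_db / 10)  # threshold as a linear power ratio
    return 1 - (1 - math.exp(-z)) ** num_subchannels

for n in (64, 256, 1024):
    print(n, papr_ccdf(10.0, n))
```

Under this model the probability of exceeding a fixed threshold grows with N, and the expression contains no modulation parameter, which matches the two bullets above.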
8. PAPR Problems Power is one of the important topics in practice
Low power consumption - Prolong the battery life, reduce the heat dissipation in the mobile device
Power management
High PAPR makes OFDM a very poor performer with respect to power
It demands the HPA with large backoff
It demands the HPA with better efficiency
It requires the ADC with large dynamic range
It requires the LO with low phase noise level
9. PAPR Problems OFDM Requires
Linear Amplifiers and high dynamic range in the transmitter hardware.
Expensive
Inefficient - Consumes more power
If non-linear amplifiers are used
Spectral spread (Out of band radiation)
In-band distortion
10. So How Can I Fix (Alleviate) It????? Many methods to reduce the PAR have been proposed. Each has its own advantages and drawbacks.
Certain properties are desirable
No loss in data rate or side information
Distortionless (does not malign the data)
Low complexity
The Tradeoffs of any PAPR reduction method
Complexity
Distortion introduced
Reduced Bandwidth/Data Rate
Increased Average Power
11. Popular PAPR Reduction Methods Clipping and filtering and non-linear distortion
In-band distortion is mostly negligible. But out of band distortion is more serious
Multiple signal representation
Partial transmit sequences: subcarriers are divided/grouped into clusters, and each cluster is processed with a smaller IFFT. [Müller and Huber, 97]
Selected mapping: several transformed versions of each data block are generated, and the one with the lowest PAPR is selected for transmission. [Bäuml, Fischer and Huber, 96]
Interleaving: Data symbols or modulated symbols are interleaved [Jayalath and Tellambura ‘02]
12. Popular PAPR Reduction Methods Constellation optimization
Tone Reservation: inserting anti-peak signals in unused or reserved subcarriers. The objective is to find the time-domain signal to be added into the original time-domain signal such that PAPR is reduced.[Tellado 00] [Krongold and Jones, 03]
Tone injection: The basic idea is to increase the constellation size so that each of the points in the original basic constellation can be mapped into several equivalent points in the expanded constellation. [Tellado 00]
Active constellation extension: Similar to TI [Krongold and Jones, 03]
13. Popular PAPR Reduction Methods Coding
The idea is to select codewords with low PAPR. It is still an open problem to construct codes with both low PAPR and large minimum Hamming distance.
Golay and Reed-Muller codes have shown PAPR reduction properties, but at a significant rate reduction.
Receiver-side clipping noise compensation [Tellado]
14. Clipping and Filtering [Armstrong] Simple approach requiring multiple sequential FFTs.
Using oversampled time-domain signal, clip it digitally, then filter out-of-band leakage.
Filtering can cause “regrowth” of peaks, so repeat process multiple times.
Live with in-band clipping noise and effect on performance.
Is some clipping OK? If so, how much?
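The clip-then-filter loop can be sketched as follows. This is a minimal illustration under stated assumptions, not Armstrong's exact procedure: the function names, the clipping ratio of 1.5 times the RMS level, the oversampling factor of 4, and the naive O(N^2) DFT are all choices made here for self-containment.

```python
import cmath
import math

def dft(values, sign):
    """Naive DFT (sign=-1) or unnormalized inverse DFT (sign=+1)."""
    n = len(values)
    return [sum(values[t] * cmath.exp(sign * 2j * math.pi * k * t / n) for t in range(n))
            for k in range(n)]

def clip_and_filter(symbols, oversample=4, clip_ratio=1.5, iterations=3):
    n = len(symbols)
    m = n * oversample
    half = n // 2
    # Zero-pad the middle of the spectrum so the IDFT yields an oversampled signal.
    spectrum = list(symbols[:half]) + [0j] * (m - n) + list(symbols[half:])
    for _ in range(iterations):
        x = [v / m for v in dft(spectrum, +1)]
        rms = math.sqrt(sum(abs(v) ** 2 for v in x) / m)
        limit = clip_ratio * rms
        # Clip the magnitude digitally while keeping the phase.
        x = [v if abs(v) <= limit else v * (limit / abs(v)) for v in x]
        spectrum = dft(x, -1)
        # Filter: discard everything that leaked into the out-of-band bins.
        for k in range(half, m - (n - half)):
            spectrum[k] = 0j
    # The in-band symbols now carry the clipping noise we have to live with.
    return spectrum[:half] + spectrum[m - (n - half):]
```

Each pass costs one IDFT and one DFT, which is the "multiple sequential FFTs" cost mentioned above; the returned symbols differ from the input, which is the in-band distortion.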
15. Multiple Signal Representation
Side information
16. Interleaving - (Data/Symbol permutation) Other Popular MSR Techniques
Selected mapping
Partial transmit sequences
17. Selected Mapping Define U distinct, fixed phase vectors, each of length N.
The data vector is multiplied element-wise by each of the U vectors, resulting in U statistically independent alternative OFDM symbols.
Each symbol is transformed into time domain by taking IDFT.
The symbol with lowest PAR is selected for the transmission.
No analytical method exists for finding a proper set of vectors
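The selected-mapping steps can be sketched as below. Everything named here is an assumption for illustration: the phase vectors are drawn from {1, -1, j, -j} with a seeded generator (so that a receiver could regenerate them), candidate 0 is the unmodified block, and the PAPR is measured at the Nyquist rate.

```python
import cmath
import math
import random

def papr(symbols):
    """Linear PAPR of the Nyquist-sampled IDFT of `symbols`."""
    n = len(symbols)
    powers = [abs(sum(symbols[k] * cmath.exp(2j * math.pi * k * t / n) for k in range(n))) ** 2
              for t in range(n)]
    return max(powers) / (sum(powers) / n)

def selected_mapping(data, num_candidates=8, seed=0):
    rng = random.Random(seed)
    n = len(data)
    best = None
    for index in range(num_candidates):
        if index == 0:
            phases = [1] * n  # candidate 0 is the unmodified block
        else:
            phases = [rng.choice([1, -1, 1j, -1j]) for _ in range(n)]
        candidate = [d * p for d, p in zip(data, phases)]
        value = papr(candidate)
        if best is None or value < best[0]:
            best = (value, index, candidate)
    # (PAPR, index to send as side information, block to transmit)
    return best
```

The returned index is the side information the receiver needs in order to undo the phase rotation.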
18. Partial Transmit Sequences
19. Coding Example BPSK, N=4
PAPR of all possible sequences [Jones et al. 94]
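A table of this kind can be regenerated by brute force. The sketch below enumerates all 16 BPSK sequences for N = 4 and computes each PAPR on an oversampled grid; the oversampling factor of 8 and the names `papr_db` and `table` are assumptions, so the numbers need not match the cited table digit for digit.

```python
import cmath
import math
from itertools import product

def papr_db(symbols, oversample=8):
    """PAPR in dB of the oversampled multicarrier signal built from `symbols`."""
    n = len(symbols)
    m = n * oversample
    powers = [abs(sum(symbols[k] * cmath.exp(2j * math.pi * k * t / m) for k in range(n))) ** 2
              for t in range(m)]
    return 10 * math.log10(max(powers) / (sum(powers) / m))

table = {bits: papr_db(bits) for bits in product((1, -1), repeat=4)}
for bits, value in sorted(table.items(), key=lambda item: item[1]):
    print(bits, round(value, 2))
```

A block code for PAPR reduction would transmit only sequences from the low end of this list, which is where the rate loss comes from.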
20. Tone Injection [Tellado 99] Increase the constellation size so that each of the points in the original basic constellation can be mapped into several equivalent points in the expanded constellation
21. Tone Reservation Utilize subchannels (tones) that do not send data in order to make a peak-cancelling signal, thus reducing the PAR and not distorting the data symbols.
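One simple way to build such a peak-cancelling signal is to clip the time-domain signal, transform the clipped-off part to the frequency domain, and keep only its components on the reserved tones. The sketch below follows that idea; it is an illustrative assumption rather than the specific algorithm of the cited works, and the names, clipping ratio, step size, and Nyquist-rate sampling are all hypothetical choices.

```python
import cmath
import math

def dft(values, sign):
    """Naive DFT (sign=-1) or unnormalized inverse DFT (sign=+1)."""
    n = len(values)
    return [sum(values[t] * cmath.exp(sign * 2j * math.pi * k * t / n) for t in range(n))
            for k in range(n)]

def tone_reservation(symbols, reserved, clip_ratio=1.5, iterations=5, step=1.0):
    n = len(symbols)
    spectrum = list(symbols)
    for _ in range(iterations):
        x = [v / n for v in dft(spectrum, +1)]
        limit = clip_ratio * math.sqrt(sum(abs(v) ** 2 for v in x) / n)
        # Clipped-off part: adding it to x would bring every peak down to the limit.
        correction = [v * (limit / abs(v)) - v if abs(v) > limit else 0j for v in x]
        corr_spectrum = dft(correction, -1)
        for k in reserved:  # only the reserved tones are allowed to change
            spectrum[k] += step * corr_spectrum[k]
    return spectrum
```

Because only reserved tones are updated, the data symbols are returned untouched, which is the "not distorting the data symbols" property stated above.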
22. Active Constellation Extension Make constellations more flexible: Non-bijective Constellations
There are many or infinite points which can be used to transmit
Find a good or best representation with PAR as the cost function.
Allowable Extensions do NOT change ML decision regions
Extensions cannot change minimum distance properties
Generally, this means only outside constellation points can be moved
Very simple for the case of QAM constellations
23. Active Constellation Extension Key idea: move constellation points, but
Don’t change receiver decision boundaries
Maintain or increase margin
24. ACE Concept: QPSK
25. ACE Concept: 16-QAM
26. ACE Algorithm: POCS
27. ACE Algorithm: POCS Approach
28. ACE Algorithm: Theory ACE constraints, clips are all convex
ACE algorithm = POCS (Projection onto Convex Sets)
Guaranteed to converge toward solution if one exists
Small clip is a gradient descent in terms of peak reduction
ACE algorithm is approximate gradient-project descent
29. “Smart” Gradient-Project Enforce constraints on clip signal in FFT domain to guarantee valid extension; IFFT to get c_clip(n)
Perform constrained gradient update
$x^{(i+1)}(n) = x^{(i)}(n) + \mu\, c_{\mathrm{clip}}^{(i)}(n)$
Select step-size μ to minimize new peak(s)
Essentially converges after 2-3 iterations
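The clip, constrain, and update cycle can be sketched for QPSK, where a valid extension only moves the real and imaginary parts of a symbol further from the axes. This is a simplified illustration under stated assumptions: it uses a fixed step size instead of the "smart" step-size selection, Nyquist-rate sampling, a hypothetical clipping ratio, and it applies the update in the frequency domain, which is equivalent to the time-domain update by linearity.

```python
import cmath
import math

def dft(values, sign):
    """Naive DFT (sign=-1) or unnormalized inverse DFT (sign=+1)."""
    n = len(values)
    return [sum(values[t] * cmath.exp(sign * 2j * math.pi * k * t / n) for t in range(n))
            for k in range(n)]

def ace_qpsk(symbols, clip_ratio=1.4, iterations=3, step=1.0):
    n = len(symbols)
    current = list(symbols)
    for _ in range(iterations):
        x = [v / n for v in dft(current, +1)]
        limit = clip_ratio * math.sqrt(sum(abs(v) ** 2 for v in x) / n)
        clip = [v * (limit / abs(v)) - v if abs(v) > limit else 0j for v in x]
        clip_spectrum = dft(clip, -1)
        for k in range(n):
            # Keep only the parts of the clip signal that push the point outward,
            # away from the QPSK decision boundaries (the real and imaginary axes).
            c = clip_spectrum[k]
            re = c.real if c.real * symbols[k].real > 0 else 0.0
            im = c.imag if c.imag * symbols[k].imag > 0 else 0.0
            current[k] += step * complex(re, im)
    return current
```

With a positive step, every returned symbol stays in its original quadrant and no component moves closer to a decision boundary, so the margin is maintained or increased.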
30. ACE Gradient Project Approach
31. What Should the Gradient Stepsize Be? Can use a preselected stepsize, but convergence will be slower
Can determine a stepsize for each ACE application
Signals are complex, so it may be difficult to determine an optimal stepsize that minimizes the PAR at each level.
Solution: Linearize the optimal stepsize equation with two safe, simple, and intuitive assumptions
valid while the PAR has not been reduced a lot already.
Assumption breaks down after about four ACE iterations, but most gains are achieved within the first two or three iterations!!
32. Smart Gradient Step-Size Calculation
33. Example Signal: 11.63 dB PAR
34. Signal After ACE: 7.42 dB PAR
35. Extended OFDM Symbol Distribution
36. Achieving a Target PAR
37. Comparison of PAR Reduction
38. Complexity.. Complexity..Complexity..! Multiple signal representation increases the transmitter complexity significantly
Tone reservation, tone injection, and ACE require moderate/high computations.
What are the low complexity schemes?
Clipping and Filtering
Clipping +Filtering + FEC
Pre-distortion
39. PAPR Method Extensions to MIMO M transmit antennas
Generally, the extension is simple for any PAPR reduction method.
Consider the PAPR of M time-domain signals instead of just 1. (Complexity Increase!)
We’ll use ACE as an example
40. ACE for MIMO-OFDM ACE algorithm easily extended to MIMO OFDM systems
For diversity and V-BLAST systems, can apply ACE independently to each antenna
For general space-frequency system
apply ACE constraints to each space-frequency eigenchannel
Perform gradient project via joint clipping across all antennas, times
41. ACE for MIMO OFDM
42. Simulation Example
43. PAPR Reduction in OFDMA Downlink OFDMA
Although the base station is sending information for many users, it is still one signal.
Can still utilize existing techniques, but any side information must be reliable for all users.
Uplink OFDMA (was considered by 3GPP)
More difficult problem due to the near/far effect and the likelihood that users are asynchronous.
Tone reservation is one possible solution.
44. Uplink OFDMA Model Each user has a group of subchannels for data transmission.
Could be grouped into 1 or more bands for data transmission.
Each band may have a small number of subchannels.
Each user occupies only a small fraction of the uplink bandwidth.
45. Uplink OFDMA Tone Reservation Consider a single user
The non-data tones can be used as reserved tones to help reduce the PAR at the mobile transmitter.
Problem: these subchannels are used by other users in the uplink.
Result: multiple access interference (MAI).
Limit MAI with a PSD mask for PAPR reduction signal
46. Tone Reservation Comparison Originally developed for the DSL case
Large # of data tones, smaller # of reserved tones
Large amount of energy needed in each reserved tone to obtain good PAR reduction.
UL-OFDMA Case: Exact opposite of DSL case!
Small # data tones, very large # of reserved tones
Small amount of energy needed in each reserved tone to get good PAR reduction.
47. Example: 16-tone band, N = 512
48. Proposed PSD Mask
50. Part 2: Transmit Power Allocation What is resource allocation? What are our resources?
Power and data rate allocated to each subchannel
Subchannels are a resource when dealing with multiuser OFDM systems. (users allocated particular subchannels)
Resource allocation: mathematical problem of optimizing some benefit from the resources available.
Optimization problem: objective function (benefit) with constraints (finite resources and performance criteria)
51. Channel-State Information If the receiver can estimate channel gains/SNRs or channel statistics, the information can be sent back to the transmitter.
The transmitter can optimize performance given this additional information.
Good subchannels get more power and/or rate, bad subchannels get less power and/or rate.
Must maintain certain performance on each subchannel
52. Important Practical Issues How reliable is the CSI?
How often does CSI need to be fed back?
How fast does the channel change?
Can latency be overcome?
Feedback takes up bandwidth.
Quantized feedback is more practical. How to do it optimally? Vector quantization?
Key Point: knowing that one subchannel is better than another allows the system to gain an advantage.
53. Resource Allocation Examples Single user systems:
Send the maximum amount of information across the channel under a total power constraint.
Use the least amount of power to send a fixed rate.
Single-user problems are easier and usually “nicer” optimization problems (convexity).
Multiuser systems:
In addition to the above, a guarantee that each user is given a minimum total rate.
Multiuser problems can be non-convex, but can be solved efficiently with convex approximations (which are very good!)
Can be difficult in certain situations, such as DSL with interference from crosstalk.
54. Motivation: Water-Pouring Information theory result for Gaussian channels.
Achieve capacity for a given total power constraint.
Consider the channel-to-noise ratio (CNR) to be
Solution: “Pour water” onto the inverted channel until the power budget (water) is exhausted.
Higher CNR portions get more water (power). Lower CNR get little or even no water (power)
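For a discrete set of parallel subchannels, water-pouring reduces to finding a water level such that each subchannel receives power max(0, level - 1/CNR) and the powers sum to the budget. The sketch below is a minimal illustration; the function name `water_filling` and the example values are assumptions.

```python
def water_filling(cnr, total_power):
    """Return per-subchannel powers and the water level for a total power budget."""
    order = sorted(range(len(cnr)), key=lambda i: cnr[i], reverse=True)
    active = len(order)
    level = 0.0
    while active > 0:
        inverse_sum = sum(1.0 / cnr[i] for i in order[:active])
        level = (total_power + inverse_sum) / active
        # The weakest active subchannel must still sit below the water level.
        if level - 1.0 / cnr[order[active - 1]] > 0:
            break
        active -= 1  # otherwise drop it and recompute the level
    powers = [0.0] * len(cnr)
    for i in order[:active]:
        powers[i] = level - 1.0 / cnr[i]
    return powers, level

powers, level = water_filling([1.0, 0.5, 0.1], total_power=2.0)
print(powers, level)
```

In this example the inverted channel has floor heights 1, 2, and 10; the weakest subchannel stays dry, and the other two share the budget under a common water level of 2.5.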
55. Water-Pouring Example
56. Water-Pouring Example
57. Water-Pouring Notes Same concept can be applied to a discrete set of parallel channels (our OFDM case!)
Problem: water-pouring gives the optimal power allocation for capacity, but does not tell you which codes are needed.
The optimal power allocation is similar to, but not exactly the same as, that of a problem with BER constraints.
Rates in each channel may have infinite granularity (as opposed to nice integer numbers).
58. Single User Point-to-Point Problem Maximize system performance:
Example: Send the most amount of information across channel.
Example: Use the least amount of power to send a given data rate.
Real-time system
Computational complexity is important. Depends how often we need to re-optimize the resource allocation.
System Resources
Power and rate in each subchannel.
Possible constraints
Maximum total allocated power (sum of subchannel powers)
Target total rate constraint
QoS constraints: Bit/Symbol error probability
BLOCK-BASED METHOD.
ASSUMPTION keeps things simple, but the proposed approach can work without it as well.
59. DMT Loading Example
60. Our Goal Would like to obtain truly optimal solutions with minimal complexity, and therefore avoid:
Rounding (many ad hoc approaches have been published)
SNR Gap Approximation (infinite QAM approximation)
Costly re-computing or sorting subchannel gains (O(N log N))
Suboptimal solutions can be very good, but if an optimal solution is efficient to obtain, then that’s great!
61. Optimization Problems Rate Maximization
Maximize the total rate subject to a power budget.
Margin Maximization
Minimize the total power to meet a target total rate.
62. Input Parameters (Sub)Channel-to-noise ratio:
Physical parameters from the environment.
is the channel power gain in the ith subchannel
is the AWGN noise power in the ith subchannel
Estimated at the receiver, fed back to the transmitter.
Transmitter solves the optimization problem.
Signal-to-noise ratio:
63. Rate-Power Functional Relationship
Assume rate is a concave function of power.
Scalability: key property resulting in significant computational reduction.
The subchannel power-rate operating curves are scaled versions of one another.
64. Unconstrained Optimization Problem Margin Maximization
Goal: Find the λ whose minimum Lagrange cost meets the Rtarget constraint. A basic approach is:
Given a λ, solve for the minimum Lagrange cost, compute the total rate it obtains, and update λ based upon this result.
We’re interested in problems with a discrete set of allowable rates (in bits/symbol).
We then have a discrete, separable convex resource allocation problem (DSCRAP).
65. Composite Power-Rate Curve
66. Unconstrained Optimization Problem Rate Maximization
Goal: Find the λ whose minimum Lagrange cost meets the power budget as best as possible.
Given a λ, solve for the minimum Lagrange cost, compute the total rate it obtains, and update λ based upon this result.
We’re interested in problems with a discrete set of allowable rates (in bits/symbol).
We then have a discrete, separable convex resource allocation problem (DSCRAP).
67. Dual Solution Find rates and powers such that for all subchannels,
Then iterate to find optimal lambda
68. Subchannel Power-Rate Curve
69. Observations The optimal operating point for each subchannel is where a line of slope
is tangent to the rate-power curve.
Result is same-slope solution for each user.
In the discrete case, derivatives do not exist!
Derivatives are generalized to differentials to obtain the operating points.
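In the discrete case, the tangent condition becomes a comparison of increments: keep adding bits while the extra power needed for the next bit, priced at λ, is still worth one bit of rate. The sketch below assumes a per-subchannel Lagrangian of the form rate - λ · power (the slides do not show the sign convention) and a hypothetical table `SNR_REQ` of the SNR required for each rate at the target error probability; the numbers are invented for illustration and chosen so that the increments grow with the rate.

```python
# Hypothetical linear SNR required for 0..6 bits/symbol at the target error rate.
SNR_REQ = [0.0, 8.0, 20.0, 50.0, 110.0, 230.0, 480.0]

def best_rate(cnr, lam):
    """Largest rate whose incremental power cost, priced at lam, is worth one bit."""
    rate = 0
    while rate + 1 < len(SNR_REQ):
        extra_power = (SNR_REQ[rate + 1] - SNR_REQ[rate]) / cnr
        if lam * extra_power > 1.0:
            break
        rate += 1
    return rate

print(best_rate(cnr=100.0, lam=1.0))
```

The decision depends on the subchannel only through the ratio of λ to the CNR, so one table of increments serves every subchannel. When λ times the increment equals exactly one, two rates tie, and this sketch resolves the tie toward the higher rate.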
70. Discrete Subchannel Allocation
71. Non-Uniqueness of Allocation
72. Allocation Boundaries (Lookup Table)
73. Computational Complexity Requires each subchannel to have all possible powers and its own individually computed lookup table, requiring O(ND) computations.
When channel conditions change, an entirely new set of powers and lookup tables needs to be re-computed.
We can do better than this.
74. Finding Optimal Lambda Finding an optimal λ* is achieved with a bisection method.
Given an upper and a lower λ, choose a test λ value between them (then update the boundary).
A slower converging bisection method actually requires less overall computation!!!
Obtain the new λ by averaging the upper and lower λ
75. Bisection Method for Optimal λ* Start initially with and such that
Update the Lagrange multiplier:
If , then replace the “low” allocation
If , then replace the “high” allocation
76. Bisection Method
77. Bisection Properties Guaranteed improvement at each iteration
Fast convergence
Even with bad initial “guesses”, typically requires about log2(DN) iterations.
Easily known when the optimal dual solution has been reached.
Iteration gives no improvement.
Offers significant computational advantage in terms of subchannel convergence.
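The averaging form of the bisection can be sketched for the rate-maximization problem as follows. The assumptions are the same as in any such illustration and are not taken from the slides: a Lagrangian of the form rate - λ · power, a hypothetical required-SNR table `SNR_REQ`, an initial upper λ large enough that the all-zero allocation results, and a fixed iteration count in place of the "no improvement" stopping rule.

```python
# Hypothetical linear SNR required for 0..6 bits/symbol at the target error rate.
SNR_REQ = [0.0, 8.0, 20.0, 50.0, 110.0, 230.0, 480.0]

def load(cnrs, lam):
    """Per-subchannel discrete loading for a given lambda; returns rates and total power."""
    rates = []
    for cnr in cnrs:
        b = 0
        while b + 1 < len(SNR_REQ) and lam * (SNR_REQ[b + 1] - SNR_REQ[b]) / cnr <= 1.0:
            b += 1
        rates.append(b)
    power = sum(SNR_REQ[b] / cnr for b, cnr in zip(rates, cnrs))
    return rates, power

def rate_max_bisection(cnrs, power_budget, lam_low=1e-6, lam_high=1e6, iterations=60):
    best = load(cnrs, lam_high)  # large lambda: very little power, assumed feasible
    for _ in range(iterations):
        lam = 0.5 * (lam_low + lam_high)  # average the upper and lower lambda
        rates, power = load(cnrs, lam)
        if power <= power_budget:
            lam_high, best = lam, (rates, power)  # feasible: tighten from above
        else:
            lam_low = lam                         # over budget: tighten from below
    return best, lam_high

(rates, power), lam = rate_max_bisection([100.0, 50.0, 10.0, 1.0], power_budget=3.0)
print(rates, power, lam)
```

In this example the allocation jumps from 2.9 to 6.5 units of power as λ crosses one breakpoint, so the budget of 3.0 cannot be met exactly; the search settles on the feasible side, which is the sense in which the budget is met "as best as possible".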
78. Example: CNR
79. Example: Low λ Power vs. Iteration
80. Example: Optimal Rate Allocated
81. Example: Final Power Allocated
82. Extensions to Difficult Problems Multi-User
Total power constraint or target rate constraint
MIMO
Total power constraint (per antenna?)
Fading channels
Outage constraints?
Limited Feedback
For practical use.
Given codebook, how to choose best feedback?
How to choose the best codebook?
83. Fading Channels Can possibly model as a composite fading channel (slow and fast fading, which are uncorrelated)
Depending on nature of slow fading, could have subchannel variation in average SNR.
Simple extension to this scenario.
84. Fading Phenomena
85. Slow fading tracked/predicted, equalised
Fast fading produces stochastic SNR
Average error probability
The effect of the PDF (e.g. Rayleigh fading) can be incorporated. Composite Fading
86. MIMO-OFDM Loading Each Tx antenna has its own power amplifier (PA).
Problem: Output is limited by the PA’s backoff level
If the average power is too high, nonlinear PAPR effects (inherent in OFDM) may exceed a tolerance level.
i.e., cannot have disproportionately large antenna power(s)
Result: Power-constrained antennas in the optimization problem.
Practical constraint that helps ensure clean transmission
87. Notation Denote a spatial subchannel pair by (l,n)
Spatial domain powers/rates denoted with ~
88. New Rate-Max Optimization Problem Per-antenna power constraints
Objective function in spatial domain while power constraints in antenna domain.
89. Is the New Problem Convex???? YES!! To show how, we rewrite the problem in terms of only spatial powers and rates.
The nth spatio-subchannel power vector is
is the power gain transformation matrix whose entries can easily be shown to be:
90. Rewritten New Optimization Problem Spatial powers are a convex function of spatial rates.
Convex constraints due to non-negative weights
91. Unconstrained Optimization Problem
For a given set of Lagrange multipliers, it is easy to determine the rates/powers.
Need to find the optimal multipliers that satisfy the original constraints.
92. Minimizing the Lagrange Constraints Basic loading algorithm is
(1) Choose a set of Lagrange multipliers
(2) Evaluate it in spatial domain
(3) Check total powers in antenna domain
(4) Update the multipliers and reiterate
93. Approaches to Find Optimal Lagrange Multipliers Difficulty: multi-dimensional Lagrange search
Ellipsoid method can be used to find optimal solution.
Can be a costly approach with large LT
A simple branch-and-bound can be used to get very close to the optimal solution with LT = 2
Don’t expect many transmit antennas, so this can keep things practical.
94. Multiuser OFDM Power Allocation The number of free variables is now K times larger. Is the problem more difficult?
Slightly with a single total power or rate constraint
Much more so with individual power or rate constraints (multidimensional Lagrange space)
Let’s start simple
95. WSRmax Discrete Optimization Problem
96. Optimization Problem Notes Non-convex in general due to exclusive subchannel constraint
Let’s instead solve a convex approximation!
Implicitly invoking the QoS constraint
Power for a given Rate is fixed by the QoS constraint and CNR.
Have written , but left off QoS and CNR for notational simplicity.
Thus, considering only valid pairs of
We refer to these pairs as Operating Points
97. Initial Works Problem was solved using the dual decomposition in the continuous case by Seong, Mohseni, & Cioffi (2006)
Used capacity instead of rate which made computing the solution easier.
An ergodic WSRmax approach was proposed in Wong and Evans (2007) for the continuous and discrete case.
Power is limited on average, not instantaneously
Shows improvement only at low SNRs
Somewhat expensive/complicated approach.
We are working on a simpler way to approach this problem.
98. Discrete Approach Very practical as it deals with a discrete set of rates. No suboptimal rounding from continuous solutions
Very computationally efficient: fully exploits the structure of the optimization problem.
No sorting of subchannel gains required.
Guaranteed optimality (up to a very small worst case duality gap).
Virtually all other approaches use SNR-gap and other approximations (including rounding of continuous solution).
Flexibility: Can incorporate fading, multiple services and QoS constraints
99. Lagrangian and Dual Problem
100. Dual Problem Search to find the optimal Lagrange multiplier.
For any given test λ, must maximize the Lagrangian to obtain g(λ)
Can compare the total power to the constraint to see how close we are.
101. Computing g(λ) Maximizing the Lagrangian is an unconstrained optimization problem.
Subchannels do not affect one another and become separable (optimize them individually)
Hence, maximize the Lagrangian contribution from each subchannel, then sum them.
Efficient approach is detailed in the paper.
102. Simplifying g(λ)
103. Solving for g(λ)
104. Observations Essentially, this is computing K single-user OFDM bit loadings, followed by choosing the best user on each subchannel.
This may need to be repeated many times to reach an optimal λ* that minimizes g(λ)
Would like an efficient approach that minimizes computation and searching.
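The observation that g(λ) amounts to K single-user loadings followed by a best-user choice on each subchannel can be sketched as below. The assumptions are stated plainly: the Lagrangian is taken as the weighted sum rate minus λ times the total power plus λ times the power budget, the required-SNR table `SNR_REQ` is hypothetical, and the per-user search is a brute-force scan rather than the efficient approach referred to above.

```python
# Hypothetical linear SNR required for 0..6 bits/symbol at the target error rate.
SNR_REQ = [0.0, 8.0, 20.0, 50.0, 110.0, 230.0, 480.0]

def best_point(weight, cnr, lam):
    """Best operating point of one user on one subchannel for a given lambda."""
    best_rate, best_value = 0, 0.0
    for rate in range(1, len(SNR_REQ)):
        value = weight * rate - lam * SNR_REQ[rate] / cnr
        if value > best_value:
            best_rate, best_value = rate, value
    return best_rate, best_value

def evaluate_dual(weights, cnr, lam, power_budget):
    """cnr[k][n] is the CNR of user k on subchannel n."""
    num_users, num_subchannels = len(cnr), len(cnr[0])
    assignment, total_power, dual = [], 0.0, lam * power_budget
    for n in range(num_subchannels):
        winner, rate, value = None, 0, 0.0
        for k in range(num_users):
            r, v = best_point(weights[k], cnr[k][n], lam)
            if v > value:  # exclusive assignment: one user per subchannel
                winner, rate, value = k, r, v
        assignment.append((winner, rate))
        dual += value
        if winner is not None:
            total_power += SNR_REQ[rate] / cnr[winner][n]
    return assignment, total_power, dual

print(evaluate_dual([1.0, 1.0], [[100.0, 1.0], [1.0, 100.0]], lam=1.0, power_budget=3.0))
```

Comparing the returned total power with the budget tells the outer search which way to move λ.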
105. Min Power Allocation Problem Minimum user rate constraints
Creates a complicated optimization problem.
Solving for the dual solution is difficult, as it requires searching a K-dimensional Lagrange space
The ellipsoid method can be used, but is very slow to converge.
106. Instantaneous Minimum Sum-Power Problem
107. Lagrangian and Dual Problem
108. “Simple” Simulation OFDMA system with 8 users, 128 subchannels
Channels for each user generated independently from 8-tap Rayleigh distribution with exponentially-decaying power profile.
Discrete rates available {0,1,2,3,4,5,6} bits/symbol.
QoS constraint: 10^-3 symbol error probability
Rate constraint: 48 bits/symbol for each user
109. Convergence of Ellipsoid Method O(NK) complexity per iteration
Since instantaneous, must be redone whenever the channels change resulting in massive complexity.
110. Motivation: Ergodic Loading Previously, I discussed the instantaneous loading problem
Optimize with respect to current channel conditions.
Utilize all resources at given time.
Problem: Time-Varying Channels
Constantly need to re-optimize as the instantaneous channel gains change over time.
May be impractical depending upon mobility.
Alternative: Ergodic Loading
Problem is relaxed by replacing instantaneous powers and rates with ergodic ones.
Satisfy rate or power constraint on the average.
Maximize the average rate in Rate-Max Problem.
111. Ergodic Rate-Max Problem
QoS constraints are enforced on a symbol-by-symbol basis
Discrete-set of rates available as opposed to continuous loading.
Ergodic approach exploits temporal dimension
E.g., don’t waste power when the channel is poor.
E.g., use a lot of power when the channel is very good.
112. Dual Problem and Solution (RateMax) Lagrangian
Dual Problem
113. Ergodic Optimal Solution Key Point: Characterized by an optimal λ*
Solution valid as long as channel statistics remain the same.
Much slower time-scale than instantaneous channel
Optimization complexity is lower
At each time-instant, λ* is used to do an easy lookup table loading based on instantaneous channel gains
See Krongold (2000) for more details.
Effectiveness depends on channel coherence time and latency in obtaining CSI.
114. Previous Work
Wong and Evans (2006) proposed ergodic discrete loading for weighted-sum-rate multiuser OFDM loading.
Similar to single-user link in that the Lagrange search is one dimensional.
Efficient approach based on order statistics to reduce multi-user computational load.
Easily altered to single-user point-to-point case.
Batch-based approach.
115. Adaptive Ergodic Loading Estimate ergodic total rate and power using empirical averaging.
Move from a batch-mode approach to an online-adaptive approach.
Update λ to converge to the optimal λ* and/or track the underlying channel statistics
116. Algorithm Lagrange update:
How to update ?
Previous powers are based on previous λ values, so we might mix powers from different λ's
Can re-calibrate using the previous M symbols at each λ update.
Expensive! Will use for comparison purposes.
117. Parameters ? : controls the averaging window.
Will not reach the truly optimal ergodic solution unless we average over all time.
? : controls the convergence of λ
They do in fact affect each other.
118. Simulations Rate-Max optimization problem, N=64 subchannels
Channel: 8-tap Rayleigh with exponentially-decaying power profile
Pmax is such that avg. received SNR = 12.35 dB.
Each iteration (symbol) has an independent channel realization.
Discrete rates: {0,1,2,3,4,5,6} bits/symbol.
QoS: 10^-3 symbol error rate.
Optimal ergodic rate is 180.67 bits/symbol, and is 179.83 in the instantaneous case.
119. Average Power vs. Iteration #
120. Average Rate vs. Iteration #
121. 1/λ vs. Iteration #
122. Thoughts and Conclusions Simple online adaptive approach for OFDM loading
Computational complexity spread evenly over time and less complex than instantaneous loading.
Adaptation and calibration memory are adjustable to balance convergence/tracking with computational complexity.
Lambda is less sensitive in the ergodic case than the instantaneous case.
Relatively easy to be very close to the optimal solution
Easily extendable to the multi-user case with a weighted-sum-rate maximization.