EE 551/451, Fall, 2006 Communication Systems

EE 551/451, Fall, 2006 Communication Systems
Zhu Han
Department of Electrical and Computer Engineering
Class 15
Oct. 10th, 2006

Outline
• Homework
• Exam format
• Second half schedule
  • Chapter 7
  • Chapter 16
  • Chapter 8
  • Chapter 9
  • Standards
• Estimation and detection (this class): chapter 14, not required
  • Estimation theory, methods, and examples
  • Detection theory, methods, and examples
• Information theory (next Tuesday): chapter 15, not required
EE 541/451 Fall 2006

Estimation Theory
• Consider a linear process $y = Hq + n$, where $y$ is the observed data, $q$ is the transmitted information, and $n$ is additive noise.
• If $q$ is known and $H$ is unknown, estimation is the problem of finding the statistically optimal $H$, given $y$, $q$, and knowledge of the noise properties.
• If $H$ is known, detection is the problem of finding the most likely transmitted information $q$, given $y$, $H$, and knowledge of the noise properties.
• In a practical system, the two steps are carried out iteratively: estimate the channel to track its changes, then detect the transmitted data.

Different Approaches for Estimation
• Minimum variance unbiased estimators
• Subspace estimators
• Least squares: has no statistical basis
• Maximum likelihood: uses knowledge of the noise PDF
• Maximum a posteriori: uses prior information about $q$

Least Squares Estimator
• Least squares: $q_{LS} = \arg\min_q \|y - Hq\|^2$
• Natural estimator: we want the solution to match the observation
• Does not use any information about the noise
• There is a simple closed-form solution (a.k.a. the pseudo-inverse): $q_{LS} = (H^T H)^{-1} H^T y$
• What if we know something about the noise?
• Say we know $\Pr(n)$…
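The closed-form least-squares solution can be checked numerically. The following is a minimal sketch, not part of the lecture: the matrix size, the "true" q, the noise level, and the random seed are all assumptions chosen for illustration. It compares the normal-equation formula with NumPy's `np.linalg.lstsq`.

```python
import numpy as np

rng = np.random.default_rng(0)          # fixed seed so the run is repeatable

# Assumed toy problem: 50 observations, 3 unknowns.
H = rng.normal(size=(50, 3))            # known system matrix
q_true = np.array([1.0, -2.0, 0.5])     # hypothetical transmitted information
n = 0.1 * rng.normal(size=50)           # additive noise
y = H @ q_true + n                      # linear model y = Hq + n

# Normal equations: q_LS = (H^T H)^{-1} H^T y, solved without forming the inverse.
q_ls = np.linalg.solve(H.T @ H, H.T @ y)

# Same estimate from NumPy's least-squares routine (numerically more robust).
q_lstsq, *_ = np.linalg.lstsq(H, y, rcond=None)

print("normal equations:", q_ls)
print("np.linalg.lstsq :", q_lstsq)
```

Solving the normal equations directly is fine for a small, well-conditioned H; for ill-conditioned problems `np.linalg.lstsq` is usually the safer choice.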

Maximum Likelihood Estimator
• Simple idea: we want to maximize $\Pr(y|q)$
• $\Pr(y|q)$, viewed as a function of $q$, is called the likelihood function
• Can write $\Pr(n) = e^{-L(n)}$ with $n = y - Hq$, so $\Pr(n) = \Pr(y|q) = e^{-L(y, q)}$
• If $n$ is white Gaussian, $\Pr(n) = e^{-\|n\|^2 / 2\sigma^2}$ (up to a normalizing constant) and $L(y, q) = \|y - Hq\|^2 / 2\sigma^2$
• $q_{ML} = \arg\max \Pr(y|q) = \arg\min L(y, q)$
• $q_{ML} = \arg\min \|y - Hq\|^2 / 2\sigma^2$
• This is the same as least squares!
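The step from the white Gaussian likelihood to least squares can be written out with the normalizing constant included. This is a sketch under the assumption that the noise has N real-valued samples, each i.i.d. with zero mean and variance σ²; N is a symbol introduced here, not one used on the slide.

```latex
\begin{align*}
\Pr(y \mid q) &= (2\pi\sigma^2)^{-N/2} \exp\!\left(-\frac{\|y - Hq\|^2}{2\sigma^2}\right) \\
-\ln \Pr(y \mid q) &= \frac{N}{2}\ln(2\pi\sigma^2) + \frac{\|y - Hq\|^2}{2\sigma^2} \\
q_{ML} &= \arg\max_q \Pr(y \mid q) = \arg\min_q \|y - Hq\|^2 = q_{LS}
\end{align*}
```

The first term of the negative log-likelihood does not depend on q, and the factor 1/(2σ²) is a positive constant, so neither changes the minimizer.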

Maximum Likelihood Estimator
• But what if the noise is jointly Gaussian with covariance matrix $C$?
• Recall $C \triangleq E(nn^T)$. Then $\Pr(n) = e^{-\frac{1}{2} n^T C^{-1} n}$ (up to a normalizing constant)
• $L(y, q) = \frac{1}{2} (y - Hq)^T C^{-1} (y - Hq)$
• $q_{ML} = \arg\min \frac{1}{2} (y - Hq)^T C^{-1} (y - Hq)$
• This also has a closed-form solution: $q_{ML} = (H^T C^{-1} H)^{-1} H^T C^{-1} y$
• If $n$ is not Gaussian at all, ML estimators become complicated and nonlinear
• Fortunately, in most channels the noise is usually Gaussian
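The closed-form ML estimate for correlated or unequal-variance Gaussian noise can be tried on a small example. The sketch below is illustrative only: the problem size, the diagonal covariance (half the samples clean, half noisy), and the seed are assumptions. It computes the ML estimate alongside ordinary least squares, which ignores C.

```python
import numpy as np

rng = np.random.default_rng(1)

# Assumed toy problem: 200 observations, 2 unknowns, unequal noise variances.
N = 200
H = rng.normal(size=(N, 2))
q_true = np.array([2.0, -1.0])                        # hypothetical parameter vector
noise_var = np.where(np.arange(N) < 100, 0.01, 4.0)   # first half clean, second half noisy
C = np.diag(noise_var)                                # noise covariance C = E[n n^T]
n = rng.normal(size=N) * np.sqrt(noise_var)
y = H @ q_true + n

# ML estimate for Gaussian noise with covariance C:
# q_ML = (H^T C^{-1} H)^{-1} H^T C^{-1} y
Ci_H = np.linalg.solve(C, H)                          # C^{-1} H without an explicit inverse
q_ml = np.linalg.solve(H.T @ Ci_H, Ci_H.T @ y)        # uses the symmetry of C

# Ordinary least squares ignores C and weights all samples equally.
q_ls = np.linalg.solve(H.T @ H, H.T @ y)

print("ML (uses C):", q_ml)
print("LS (no C)  :", q_ls)
```

Because the ML estimate down-weights the noisy samples, one would expect it to land closer to the true parameters than plain least squares on most runs, though a single run proves nothing.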

Estimation example - Denoising
• Suppose we have a noisy signal $y$ and wish to obtain the noiseless signal $x$, where $y = x + n$
• Can we use estimation theory to find $x$?
• Try: $H = I$, $q = x$ in the linear model
• Both LS and ML estimators simply give $x = y$!
• So we need a more powerful model
• Suppose $x$ can be approximated by a polynomial, i.e. a mixture of the first $p$ powers of $r$: $x = \sum_{i=0}^{p} a_i r^i$

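Example - Denoising
$$
\underbrace{\begin{bmatrix} y_1 \\ y_2 \\ \vdots \\ y_n \end{bmatrix}}_{y}
=
\underbrace{\begin{bmatrix} 1 & r_1^1 & \cdots & r_1^p \\ 1 & r_2^1 & \cdots & r_2^p \\ \vdots & & & \vdots \\ 1 & r_n^1 & \cdots & r_n^p \end{bmatrix}}_{H}
\underbrace{\begin{bmatrix} a_0 \\ a_1 \\ \vdots \\ a_p \end{bmatrix}}_{q}
+
\begin{bmatrix} n_1 \\ n_2 \\ \vdots \\ n_n \end{bmatrix}
$$
• Least squares estimate: $q_{LS} = (H^T H)^{-1} H^T y$
• $x = \sum_{i=0}^{p} a_i r^i$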
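The polynomial denoising model can be sketched in a few lines. The sample locations, the cubic coefficients, the noise level, and the seed below are assumptions made for illustration. The code builds H as a Vandermonde matrix, fits the coefficients by least squares, and reconstructs the denoised signal.

```python
import numpy as np

rng = np.random.default_rng(2)

# Assumed setup: a cubic sampled at 100 points, then corrupted by noise.
r = np.linspace(-1.0, 1.0, 100)                 # sample locations r_1..r_n
a_true = np.array([0.5, -1.0, 0.0, 2.0])        # hypothetical coefficients a_0..a_3
p = len(a_true) - 1
H = np.vander(r, p + 1, increasing=True)        # columns 1, r, r^2, ..., r^p
x = H @ a_true                                  # noiseless signal
y = x + 0.2 * rng.normal(size=r.size)           # noisy observation y = x + n

# Least-squares estimate of the coefficients, then the denoised signal.
a_ls, *_ = np.linalg.lstsq(H, y, rcond=None)
x_hat = H @ a_ls

rms_noisy = np.sqrt(np.mean((y - x) ** 2))
rms_denoised = np.sqrt(np.mean((x_hat - x) ** 2))
print(f"RMS error before: {rms_noisy:.3f}, after: {rms_denoised:.3f}")
```

The fit projects the noise onto a (p + 1)-dimensional subspace, so the error after fitting cannot exceed the error before it when the true signal really is a polynomial of degree p. If the polynomial model is wrong, the fit also introduces a modeling error that this sketch does not show.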

Maximum a Posteriori (MAP) Estimate
• This is an example of using prior information about the signal
• Priors are generally expressed in the form of a PDF $\Pr(x)$
• Once the likelihood $L(x)$ and the prior are known, we have complete statistical knowledge
• LS/ML are suboptimal in the presence of a prior
• MAP (a.k.a. Bayesian) estimates are optimal
• Bayes' theorem: $\Pr(x|y) = \dfrac{\Pr(y|x)\,\Pr(x)}{\Pr(y)}$, where $\Pr(x|y)$ is the posterior, $\Pr(y|x)$ is the likelihood, and $\Pr(x)$ is the prior

Maximum a Posteriori (Bayesian) Estimate
• Consider the class of linear systems $y = Hx + n$
• Bayesian methods maximize the posterior probability: $\Pr(x|y) \propto \Pr(y|x)\,\Pr(x)$
• Likelihood function: $\Pr(y|x) = \exp(-\|y - Hx\|^2)$
• Prior PDF: $\Pr(x) = \exp(-G(x))$
• Non-Bayesian: maximize only the likelihood, $x_{est} = \arg\min \|y - Hx\|^2$
• Bayesian: $x_{est} = \arg\min \|y - Hx\|^2 + G(x)$, where $G(x)$ is obtained from the prior distribution of $x$
• If $G(x) = \|\Gamma x\|^2$ for some matrix $\Gamma$, this is Tikhonov regularization
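For the quadratic (Tikhonov) prior, the Bayesian estimate has the closed form x = (HᵀH + ΓᵀΓ)⁻¹Hᵀy, where Γ is the regularization matrix (named `Gamma` in the code). The sketch below is illustrative: the nearly collinear H, the choice Γ = √0.1·I, the noise level, and the seed are all assumptions. It contrasts the regularized estimate with plain least squares on an ill-conditioned problem.

```python
import numpy as np

rng = np.random.default_rng(3)

# Assumed ill-conditioned problem: two nearly identical columns in H.
N = 30
c = rng.normal(size=N)
H = np.column_stack([c, c + 1e-4 * rng.normal(size=N)])
x_true = np.array([1.0, 1.0])                   # hypothetical signal
y = H @ x_true + 0.1 * rng.normal(size=N)

# Non-Bayesian (ML / least squares): arg min ||y - Hx||^2
x_ls, *_ = np.linalg.lstsq(H, y, rcond=None)

# Bayesian with G(x) = ||Gamma x||^2: arg min ||y - Hx||^2 + ||Gamma x||^2
# Closed form: x = (H^T H + Gamma^T Gamma)^{-1} H^T y
Gamma = np.sqrt(0.1) * np.eye(2)                # assumed regularizer (lambda = 0.1)
x_map = np.linalg.solve(H.T @ H + Gamma.T @ Gamma, H.T @ y)

print("least squares:", x_ls)
print("Tikhonov/MAP :", x_map)
```

With Γ proportional to the identity this is ridge regression: every component of the solution is shrunk toward zero, which suppresses the noise amplification that least squares suffers along the poorly observed direction.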

Expectation and Maximization (EM)
• The expectation-maximization (EM) algorithm alternates between an expectation (E) step, which computes the expected likelihood by treating the latent variables as if they were observed, and a maximization (M) step, which computes maximum likelihood estimates of the parameters by maximizing the expected likelihood found in the E step. The parameters found in the M step are then used to begin another E step, and the process is repeated.
• E-step: estimate the unobserved event (e.g. which Gaussian generated each sample), conditioned on the observation, using the parameter values from the last maximization step.
• M-step: maximize the expected log-likelihood of the joint event.
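The E and M steps can be made concrete with a one-dimensional mixture of two Gaussians, where the hidden event is which Gaussian produced each sample. This is a minimal sketch under assumed data (true means -2 and 3, unit variances, 30/70 mixing), assumed initial guesses, and a fixed iteration count. It is not a general-purpose implementation.

```python
import numpy as np

rng = np.random.default_rng(4)

# Assumed data: samples from two Gaussians; which one produced each sample is hidden.
data = np.concatenate([rng.normal(-2.0, 1.0, 300), rng.normal(3.0, 1.0, 700)])

# Initial guesses for the mixture weights, means, and variances (assumptions).
w = np.array([0.5, 0.5])
mu = np.array([-1.0, 1.0])
var = np.array([1.0, 1.0])

for _ in range(100):
    # E-step: posterior probability that each Gaussian produced each sample,
    # using the parameters from the previous M-step.
    pdf = np.exp(-(data[:, None] - mu) ** 2 / (2 * var)) / np.sqrt(2 * np.pi * var)
    resp = w * pdf
    resp /= resp.sum(axis=1, keepdims=True)

    # M-step: re-estimate the parameters by maximizing the expected log-likelihood.
    nk = resp.sum(axis=0)
    w = nk / data.size
    mu = (resp * data[:, None]).sum(axis=0) / nk
    var = (resp * (data[:, None] - mu) ** 2).sum(axis=0) / nk

print("weights:", w, "means:", mu, "variances:", var)
```

A fixed number of iterations keeps the sketch short; a practical version would stop when the log-likelihood stops improving and would guard against a component collapsing onto a single sample.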

Minimum-variance unbiased estimator
• Biased and unbiased estimators
• An unbiased estimator of the parameters whose variance is minimized for all values of the parameters.
• The Cramér-Rao Lower Bound (CRLB) sets a lower bound on the variance of any unbiased estimator.
• A biased estimator might perform better than an unbiased estimator in terms of variance.
• Subspace methods
  • MUSIC
  • ESPRIT
  • Widely used in radar
  • Helicopter and weapon detection (from features)
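A standard textbook case shows what the CRLB means in practice: when estimating the mean of N i.i.d. Gaussian samples with known variance σ², the bound is σ²/N, and the sample mean is an unbiased estimator that attains it. The Monte Carlo sketch below uses assumed values for σ, N, the true mean, and the number of trials.

```python
import numpy as np

rng = np.random.default_rng(5)

sigma = 2.0        # assumed known noise standard deviation
N = 25             # samples per experiment
trials = 20000     # Monte Carlo repetitions
true_mean = 1.0    # hypothetical parameter to estimate

samples = rng.normal(true_mean, sigma, size=(trials, N))
estimates = samples.mean(axis=1)          # sample mean: an unbiased estimator

crlb = sigma**2 / N                       # CRLB for the mean of N i.i.d. Gaussian samples
print("empirical bias    :", estimates.mean() - true_mean)
print("empirical variance:", estimates.var())
print("CRLB              :", crlb)
```

The empirical variance should sit close to the bound, with the remaining gap due to the finite number of trials.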

What is Detection
• Deciding whether, and when, an event occurs
• a.k.a. decision theory, hypothesis testing
• Presence/absence of a signal
• Radar
• Received signal is 0 or 1
• Stock goes high or not
• Criminal is convicted or set free
• Measures whether a statistically significant change has occurred or not

Detection
• “Spot the Money”

Hypothesis Testing with Matched Filter
• Let the signal be $y(t)$ and the model be $h(t)$
• Hypothesis testing:
  $H_0$: $y(t) = n(t)$ (no signal)
  $H_1$: $y(t) = h(t) + n(t)$ (signal)
• The optimal decision is given by the likelihood ratio test (Neyman-Pearson theorem): select $H_1$ if $L(y) = \Pr(y|H_1)/\Pr(y|H_0) > \gamma$; otherwise select $H_0$
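For white Gaussian noise the likelihood ratio test reduces to correlating the received samples with the known model and comparing against a threshold. The sketch below assumes a discrete-time setting, a sinusoidal template, unit noise variance, and a threshold of 1 on the likelihood ratio; none of these values come from the lecture.

```python
import numpy as np

rng = np.random.default_rng(6)

# Assumed discrete-time setup: known template h, white Gaussian noise.
h = np.sin(2 * np.pi * np.arange(64) / 16)      # hypothetical signal model h(t)
sigma = 1.0                                     # noise standard deviation
gamma = 1.0                                     # assumed threshold on the likelihood ratio

def log_likelihood_ratio(y):
    """ln[Pr(y|H1)/Pr(y|H0)] for y = h + n versus y = n, with n ~ N(0, sigma^2 I)."""
    return (y @ h - 0.5 * h @ h) / sigma**2

def decide(y):
    """Return 1 (select H1) if the likelihood ratio exceeds gamma, else 0 (select H0)."""
    return int(log_likelihood_ratio(y) > np.log(gamma))

trials = 2000
noise_only = rng.normal(0.0, sigma, size=(trials, h.size))             # H0 trials
signal_plus_noise = h + rng.normal(0.0, sigma, size=(trials, h.size))  # H1 trials

false_alarm_rate = np.mean([decide(y) for y in noise_only])
detection_rate = np.mean([decide(y) for y in signal_plus_noise])
print(f"P(false alarm) = {false_alarm_rate:.3f}, P(detection) = {detection_rate:.3f}")
```

Working with the log of the ratio avoids numerical overflow and does not change the decision, since the logarithm is monotonic. Raising `gamma` trades a lower false-alarm rate for a lower detection rate.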

Signal detection paradigm
• Signal trials
• Noise trials

Signal Detection

Receiver operating characteristic (ROC) curve

Matched Filters
[Block diagram: the pulse signal g(t) plus noise w(t) gives x(t), which passes through the matched filter h(t) to produce y(t), sampled at t = T to give y(T)]
• Optimal linear filter for maximizing the signal-to-noise ratio (SNR) at the sampling time in the presence of additive stochastic noise
• Given a transmitter pulse shape $g(t)$ of duration $T$, the matched filter is given by $h_{opt}(t) = k\,g^*(T - t)$ for any nonzero constant $k$
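In discrete time the matched filter is the time-reversed, conjugated pulse, and sampling its output at t = T gives the correlation of the received signal with the pulse. The sketch below assumes a ramp-shaped pulse of 32 samples, a gain of k = 1, white noise of standard deviation 0.5, and a fixed seed.

```python
import numpy as np

rng = np.random.default_rng(7)

# Assumed discrete-time pulse g of duration T samples (a real-valued ramp here).
T = 32
g = np.linspace(0.0, 1.0, T)

# Matched filter h_opt(t) = k g*(T - t); k = 1 is an arbitrary choice of gain.
k = 1.0
h_opt = k * np.conj(g[::-1])

# Received signal: the pulse buried in white noise w(t).
x = g + 0.5 * rng.normal(size=T)

# Filter output y(t); sampling at t = T corresponds to index T - 1.
y = np.convolve(x, h_opt)
y_T = y[T - 1]

print("filter output at t = T:", y_T)
print("pulse energy          :", np.sum(np.abs(g) ** 2))
```

Without noise, the sample at index `T - 1` equals k times the pulse energy, which is the peak of the pulse's autocorrelation. With noise, `y_T` is that value plus a zero-mean noise term.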

Questions?
