Review of Probability
Probability Theory:
• Many techniques in speech processing require the manipulation of probabilities and statistics.
• The two principal application areas we will encounter are:
• Statistical pattern recognition.
• Modeling of linear systems.
Events:
• It is customary to refer to the probability of an event.
• An event is a certain set of possible outcomes of an experiment or trial.
• Outcomes are assumed to be mutually exclusive and, taken together, to cover all possibilities.
Axioms of Probability:
• To any event A we can assign a number, P(A), which satisfies the following axioms:
• P(A) ≥ 0.
• P(S) = 1.
• If A and B are mutually exclusive, then P(A+B) = P(A) + P(B).
• The number P(A) is called the probability of A.
Axioms of Probability (some consequences):
• Some immediate consequences:
• If Ā is the complement of A, then P(Ā) = 1 − P(A).
• P(∅), the probability of the impossible event, is 0.
• P(A) ≤ 1.
• If two events A and B are not mutually exclusive, we can show that P(A+B) = P(A) + P(B) − P(AB).
Conditional Probability:
• The conditional probability of an event A, given that event B has occurred, is defined as:
• P(A|B) = P(AB) / P(B).
• We can infer P(B|A) by means of Bayes’ theorem:
• P(B|A) = P(A|B) P(B) / P(A).
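A minimal numeric sketch of conditional probability and Bayes’ theorem. The event probabilities below are hypothetical values chosen only for illustration; they are not from the slides.

```python
# Hypothetical probabilities for events A and B.
P_B = 0.01             # P(B): prior probability of event B
P_A_given_B = 0.95     # P(A|B)
P_A_given_notB = 0.10  # P(A|~B)

# Total probability: P(A) = P(A|B)P(B) + P(A|~B)P(~B)
P_A = P_A_given_B * P_B + P_A_given_notB * (1 - P_B)

# Bayes' theorem: P(B|A) = P(A|B)P(B) / P(A)
P_B_given_A = P_A_given_B * P_B / P_A
print(f"P(A)   = {P_A:.4f}")
print(f"P(B|A) = {P_B_given_A:.4f}")
```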
Independence:
• If events A and B have nothing to do with each other, they are said to be independent.
• Two events are independent if P(AB) = P(A)P(B).
• From the definition of conditional probability, this means P(A|B) = P(A) and P(B|A) = P(B).
Independence:
• Three events A, B and C are independent only if:
• P(AB) = P(A)P(B), P(AC) = P(A)P(C), P(BC) = P(B)P(C), and
• P(ABC) = P(A)P(B)P(C).
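A small sketch of pairwise independence by exhaustive enumeration. The two-dice experiment and the events chosen (first die even, sum equal to 7) are illustrative, not from the slides.

```python
from itertools import product

# Enumerate all equally likely outcomes of two fair dice.
outcomes = list(product(range(1, 7), repeat=2))
N = len(outcomes)

A = {(d1, d2) for d1, d2 in outcomes if d1 % 2 == 0}   # first die is even
B = {(d1, d2) for d1, d2 in outcomes if d1 + d2 == 7}  # sum equals 7

def prob(event):
    return len(event) / N

# P(AB) equals P(A)P(B), so A and B are independent.
print(prob(A & B), prob(A) * prob(B))
```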
Random Variables:
• A random variable is a number chosen at random as the outcome of an experiment.
• Random variables may be real or complex and may be discrete or continuous.
• In speech processing, the random variables we encounter are most often real and discrete.
• We can characterize a random variable by its probability distribution or by its probability density function (pdf).
Random Variables (distribution function):
• The distribution function for a random variable y is the probability that y does not exceed some value u:
• F(u) = P(y ≤ u),
• and F(−∞) = 0, F(+∞) = 1.
Random Variables (probability density function):
• The probability density function is the derivative of the distribution:
• f(u) = dF(u)/du,
• and, conversely, F(u) = ∫_{−∞}^{u} f(v) dv, so that ∫_{−∞}^{+∞} f(v) dv = 1.
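A numerical sketch of the pdf/distribution relationship, using a unit Gaussian as the illustrative density: integrating the pdf gives the distribution function, and differentiating the distribution recovers the pdf.

```python
import numpy as np

# Unit Gaussian pdf on a fine grid.
u = np.linspace(-5, 5, 2001)
pdf = np.exp(-u**2 / 2) / np.sqrt(2 * np.pi)

# Distribution function F(u) by numerical integration of the pdf.
F = np.cumsum(pdf) * (u[1] - u[0])

# Differentiating F numerically recovers the pdf (up to discretization error).
pdf_from_F = np.gradient(F, u)
print(np.max(np.abs(pdf_from_F - pdf)))  # small
print(F[0], F[-1])                       # approximately 0 and 1
```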
Random Variables (expected value):
• We can also characterize a random variable by its statistics.
• The expected value of g(x) is written E{g(x)} or <g(x)> and defined as:
• Continuous random variable: E{g(x)} = ∫_{−∞}^{+∞} g(x) f(x) dx.
• Discrete random variable: E{g(x)} = Σ_x g(x) p(x).
Random Variables (moments):
• The statistics of greatest interest are the moments of x.
• The kth moment of x is the expected value of x^k.
• For a discrete random variable: E{x^k} = Σ_x x^k p(x).
Random Variables (mean & variance):
• The first moment, x̄ = E{x}, is the mean of x.
• Continuous: x̄ = ∫_{−∞}^{+∞} x f(x) dx.
• Discrete: x̄ = Σ_x x p(x).
• The second central moment, also known as the variance of p(x), is given by σ² = E{(x − x̄)²} = E{x²} − x̄².
Random Variables (estimating statistics):
• To estimate the statistics of a random variable, we repeat the experiment which generates the variable a large number of times.
• If the experiment is run N times, then each value x will occur approximately Np(x) times; thus
• E{g(x)} = Σ_x g(x) p(x) ≈ (1/N) Σ_{i=1}^{N} g(x_i), the average of g over the N observed outcomes.
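A short sketch of estimating moments by repeating an experiment many times. The fair six-sided die is an illustrative choice; its true mean is 3.5, second moment 91/6 ≈ 15.17, and variance 35/12 ≈ 2.92.

```python
import numpy as np

rng = np.random.default_rng(0)
N = 100_000
x = rng.integers(1, 7, size=N)   # N rolls of a fair die

print(x.mean())        # ~3.5, estimate of the first moment (mean)
print(np.mean(x**2))   # ~15.17, estimate of the second moment
print(x.var())         # ~2.92, estimate of the variance
```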
Random Variables (uniform density):
• A random variable has a uniform density on the interval (a, b) if:
• f(x) = 1/(b − a) for a ≤ x ≤ b, and f(x) = 0 otherwise.
Random Variables (Gaussian density):
• The Gaussian, or normal, density function is given by:
• f(x) = (1/(σ√(2π))) exp(−(x − μ)² / (2σ²)),
• where μ is the mean and σ² the variance.
Random Variables (…Gaussian density):
• The distribution function of a normal variable is:
• F(u) = ∫_{−∞}^{u} (1/(σ√(2π))) exp(−(x − μ)² / (2σ²)) dx.
• If we define the error function as erf(z) = (2/√π) ∫_{0}^{z} exp(−t²) dt,
• thus F(u) = ½ [1 + erf((u − μ)/(σ√2))].
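A sketch of the normal distribution function written through the error function, cross-checked against a Monte Carlo estimate of P(x ≤ u). The values of μ and σ are illustrative.

```python
import numpy as np
from scipy.special import erf

mu, sigma = 1.0, 2.0

def normal_cdf(u):
    # F(u) = 1/2 * [1 + erf((u - mu) / (sigma * sqrt(2)))]
    return 0.5 * (1.0 + erf((u - mu) / (sigma * np.sqrt(2.0))))

# Empirical check: fraction of Gaussian samples not exceeding u.
rng = np.random.default_rng(0)
x = rng.normal(mu, sigma, size=200_000)
for u in (-1.0, 1.0, 3.0):
    print(u, normal_cdf(u), np.mean(x <= u))
```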
Two Random Variables:
• If two random variables x and y are to be considered together, they can be described in terms of their joint probability density f(x, y) or, for discrete variables, p(x, y).
• Two random variables are independent if f(x, y) = f(x) f(y) (or p(x, y) = p(x) p(y) in the discrete case).
Two Random Variables (…continued):
• Given a function g(x, y), its expected value is defined as:
• Continuous: E{g(x, y)} = ∫∫ g(x, y) f(x, y) dx dy.
• Discrete: E{g(x, y)} = Σ_x Σ_y g(x, y) p(x, y).
• And the joint moment for two discrete random variables is: m_jk = E{x^j y^k} = Σ_x Σ_y x^j y^k p(x, y).
Two Random Variables (…continued):
• Moments are estimated in practice by averaging repeated measurements:
• E{xy} ≈ (1/N) Σ_{i=1}^{N} x_i y_i.
• A measure of the dependence of two random variables is their correlation; the correlation of two variables is their joint second moment, E{xy}.
Two Random Variables (…continued):
• The joint second central moment of x, y is their covariance: λ_xy = E{(x − x̄)(y − ȳ)} = E{xy} − x̄ȳ.
• If x and y are independent then their covariance is zero.
• The correlation coefficient of x and y is their covariance normalized by their standard deviations: ρ_xy = λ_xy / (σ_x σ_y).
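A sketch of estimating covariance and the correlation coefficient from repeated measurements. The two variables are made dependent on purpose (y = x + noise); the parameters are illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.normal(0.0, 1.0, size=50_000)
y = x + rng.normal(0.0, 0.5, size=50_000)

cov = np.mean((x - x.mean()) * (y - y.mean()))       # joint second central moment
rho = cov / (x.std() * y.std())                      # correlation coefficient
print(cov, rho)
print(np.cov(x, y)[0, 1], np.corrcoef(x, y)[0, 1])   # same quantities via library calls
```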
Two Random Variables (…Gaussian random variables):
• Two random variables x and y are jointly Gaussian if their density function is:
• f(x, y) = (1 / (2π σ_x σ_y √(1 − ρ²))) exp{ −[ (x − x̄)²/σ_x² − 2ρ(x − x̄)(y − ȳ)/(σ_x σ_y) + (y − ȳ)²/σ_y² ] / (2(1 − ρ²)) },
• where ρ is the correlation coefficient of x and y, x̄ and ȳ are the means, and σ_x², σ_y² the variances.
Two Random Variables (…Sum of Random Variables):
• The expected value of the sum of two random variables is: E{x + y} = E{x} + E{y}.
• This is true whether x and y are independent or not.
• And also we have: E{ax + by} = a E{x} + b E{y} for any constants a and b.
Two Random Variables (…Sum of Random Variables):
• The variance of the sum of two independent random variables is: σ²_{x+y} = σ_x² + σ_y².
• If two random variables are independent, the probability density of their sum is the convolution of the densities of the individual variables:
• Continuous: f_{x+y}(z) = ∫_{−∞}^{+∞} f_x(u) f_y(z − u) du.
• Discrete: p_{x+y}(z) = Σ_u p_x(u) p_y(z − u).
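A discrete sketch of the convolution rule: the pmf of the sum of two independent fair dice (an illustrative choice) is the convolution of the two individual pmfs.

```python
import numpy as np

p_die = np.full(6, 1 / 6)            # pmf of one die over the values 1..6
p_sum = np.convolve(p_die, p_die)    # pmf of the sum over the values 2..12

for value, prob in zip(range(2, 13), p_sum):
    print(value, round(prob, 4))
print(p_sum.sum())                   # 1.0
```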
Central Limit Theorem:
• Central Limit Theorem (informal paraphrase): if many independent random variables are summed, the probability density function (pdf) of the sum tends toward the Gaussian density, no matter what their individual densities are.
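A small simulation sketch of the central limit theorem: sums of independent uniform variables (an illustrative choice of individual density) behave approximately like a Gaussian.

```python
import numpy as np

rng = np.random.default_rng(0)
n_terms, n_trials = 30, 100_000
s = rng.uniform(0.0, 1.0, size=(n_trials, n_terms)).sum(axis=1)

# Standardize the sums and compare P(z <= 1) with the Gaussian value ~0.8413.
z = (s - s.mean()) / s.std()
print(np.mean(z <= 1.0))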
Multivariate Normal Density:
• The normal density function can be generalized to any number of random variables.
• Let X be the random vector X = [x_1, x_2, …, x_n]^T, with mean vector μ = E{X}.
• Then f(X) = (1 / ((2π)^{n/2} |R|^{1/2})) exp{ −½ (X − μ)^T R^{-1} (X − μ) },
• where R = E{(X − μ)(X − μ)^T}.
• The matrix R is the covariance matrix of X (R is positive-definite).
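A sketch of drawing a Gaussian random vector with a chosen covariance matrix and verifying that the sample statistics recover the mean vector and covariance matrix. The particular μ and R below are illustrative (R is symmetric and positive-definite).

```python
import numpy as np

rng = np.random.default_rng(0)
mu = np.array([1.0, -2.0, 0.5])
R = np.array([[2.0, 0.6, 0.0],
              [0.6, 1.0, 0.3],
              [0.0, 0.3, 0.5]])

X = rng.multivariate_normal(mu, R, size=100_000)
print(X.mean(axis=0))                # ~mu
print(np.cov(X, rowvar=False))       # ~R
```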
Random Functions:
• A random function is one arising as the outcome of an experiment.
• Random functions need not necessarily be functions of time, but in all cases of interest to us they will be.
• A discrete stochastic process is characterized by joint probability densities of the form p(x(t_1), x(t_2), …, x(t_n)).
Random Functions:
• If the individual values of the random signal are independent, then p(x(t_1), x(t_2), …, x(t_n)) = p(x(t_1)) p(x(t_2)) ⋯ p(x(t_n)).
• If these individual probability densities are all the same, then we have a sequence of independent, identically distributed (i.i.d.) samples.
mean & autocorrelation
• The mean is the expected value of x(t): μ(t) = E{x(t)}.
• The autocorrelation function is the expected value of the product: R(t_1, t_2) = E{x(t_1) x(t_2)}.
ensemble & time average
• Mean and autocorrelation can be determined in two ways:
• The experiment can be repeated many times and the average taken over all these functions. Such an average is called an ensemble average.
• Take any one of these functions as being representative of the ensemble and find the average from a number of samples of this one function. This is called a time average.
ergodicity & stationarity
• If the time average and ensemble average of a random function are the same, it is said to be ergodic.
• A random function is said to be stationary if its statistics do not change as a function of time.
• Any ergodic function is also stationary.
ergodicity & stationarity
• For a stationary signal we have: E{x(t)} = x̄, a constant,
• where x̄ does not depend on t.
• And the autocorrelation function is: R(τ) = E{x(t) x(t + τ)}, a function only of the lag τ.
ergodicity & stationarity
• When x(t) is ergodic, its mean and autocorrelation are:
• x̄ = lim_{T→∞} (1/2T) ∫_{−T}^{T} x(t) dt,
• R(τ) = lim_{T→∞} (1/2T) ∫_{−T}^{T} x(t) x(t + τ) dt.
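A discrete-time sketch of time-average estimates of the mean and autocorrelation from a single realization of an ergodic signal. The first-order autoregressive model used to generate the signal is an illustrative choice, not from the slides.

```python
import numpy as np

rng = np.random.default_rng(0)
N, a = 200_000, 0.8
w = rng.normal(0.0, 1.0, size=N)

# Generate one realization of x(n) = a*x(n-1) + w(n).
x = np.zeros(N)
for n in range(1, N):
    x[n] = a * x[n - 1] + w[n]

print(x.mean())                                   # time-average mean, ~0

def autocorr(sig, lag):
    # Time-average estimate of R(lag).
    return np.mean(sig[:len(sig) - lag] * sig[lag:])

for lag in range(4):
    print(lag, autocorr(x, lag))                  # ~ a**lag / (1 - a**2)
```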
cross-correlation
• The cross-correlation of two ergodic random functions is:
• R_xy(τ) = E{x(t) y(t + τ)} = lim_{T→∞} (1/2T) ∫_{−T}^{T} x(t) y(t + τ) dt.
• The subscript xy indicates a cross-correlation.
Random Functions (power & cross-spectral density):
• The Fourier transform of R(τ) (the autocorrelation function of an ergodic random function) is called the power spectral density of x(t):
• S(ω) = ∫_{−∞}^{+∞} R(τ) e^{−jωτ} dτ.
• The cross-spectral density of two ergodic random functions is:
• S_xy(ω) = ∫_{−∞}^{+∞} R_xy(τ) e^{−jωτ} dτ.
Random Functions (…power density):
• For an ergodic signal x(t), R(τ) can be written as the time average:
• R(τ) = lim_{T→∞} (1/2T) ∫_{−T}^{T} x(t) x(t + τ) dt.
• Then from elementary Fourier transform properties,
• S(ω) = lim_{T→∞} (1/2T) |X_T(ω)|², where X_T(ω) is the Fourier transform of x(t) restricted to (−T, T).
Random Functions (White Noise):
• If all values of a random signal are uncorrelated, R(τ) = σ² δ(τ).
• Then this random function is called white noise.
• The power spectrum of white noise is constant: S(ω) = σ² for all ω.
• White noise is a mixture of all frequencies.
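A sketch of both white-noise properties at once: the estimated autocorrelation is impulse-like (nonzero only at lag 0), and the averaged periodogram is approximately flat at the level σ². The segment length, segment count, and σ² = 2 are illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)
sigma2 = 2.0
x = rng.normal(0.0, np.sqrt(sigma2), size=(500, 256))   # 500 noise segments

# Autocorrelation at lags 0..3: ~sigma2 at lag 0, ~0 elsewhere.
flat = x.ravel()
for lag in range(4):
    print(lag, np.mean(flat[:flat.size - lag] * flat[lag:]))

# Averaged periodogram: approximately sigma2 at every frequency bin.
S = np.mean(np.abs(np.fft.rfft(x, axis=1))**2 / x.shape[1], axis=0)
print(S.min(), S.max())   # both close to sigma2 = 2
```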
Random Signals in Linear Systems:
• Let T[·] represent the linear operation; then y(n) = T[x(n)].
• Given a system with impulse response h(n), y(n) = Σ_k h(k) x(n − k).
• A stationary signal applied to a linear system yields a stationary output, with S_yy(ω) = |H(ω)|² S_xx(ω), where H(ω) is the system’s frequency response.
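A closing sketch of a stationary random signal passed through a linear system: white noise is filtered by y(n) = Σ_k h(k) x(n − k), and the averaged output spectrum is compared with |H(ω)|² S_xx(ω). The 3-tap impulse response h is an illustrative choice.

```python
import numpy as np
from scipy.signal import lfilter

rng = np.random.default_rng(0)
h = np.array([0.5, 1.0, 0.25])   # illustrative FIR impulse response h(n)
sigma2 = 1.0                     # input is white noise, so S_xx(w) = sigma2

x = rng.normal(0.0, np.sqrt(sigma2), size=(2000, 512))
y = lfilter(h, [1.0], x, axis=1)                 # y(n) = sum_k h(k) x(n-k)

# Averaged output periodogram vs. the theoretical |H(w)|^2 * sigma2.
S_yy = np.mean(np.abs(np.fft.rfft(y, axis=1))**2 / y.shape[1], axis=0)
H = np.fft.rfft(h, n=y.shape[1])
print(np.allclose(S_yy, sigma2 * np.abs(H)**2, rtol=0.2))   # True, within estimation error
```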