
  1. Personal Information
  Name: Li, Zongge 李宗葛
  Group: Automation
  Office: 505 Computer Building
  Tel: 65642071 (o), 65250749 (h)
  Fax: 65642071
  Email: zgli@fudan.edu.cn

  2. Part I Speech Representation, Models and Analysis This part covers chapters 1 through 6. It is common to all kinds of speech applications.

  3. Chapter 1 Fundamentals of Digital Speech Processing and Probability Theory (1)
  • 1.1 Discrete-Time Signals and Systems
  • 1.2 Transform Representation of Signals and Systems
  • 1.2.1 The Z-Transform
  • 1.2.2 The Fourier Transform
  • 1.2.3 The Discrete Fourier Transform
  • 1.3 Fundamentals of Digital Filters
  • 1.3.1 FIR Systems
  • 1.3.2 IIR Systems
  • 1.4 Sampling
  • 1.4.1 The Sampling Theorem
  • 1.4.2 Decimation and Interpolation of Sampled Waveforms

  4. Fundamentals of Digital Speech Processing and Probability Theory (2)
  • 1.5 Basics of Probability Theory
  • 1.5.1 Probability of Events
  • 1.5.2 Random Variables and Their Distributions
  • 1.5.3 Mean and Variance
  • 1.5.4 Bayes' Theorem
  • 1.5.5 Covariance and Correlation
  • 1.5.6 Random Vectors and Multivariate Distributions
  • 1.6 Basics of Information Theory
  • 1.6.1 Entropy
  • 1.6.2 Conditional Entropy
  • 1.6.3 Source and Channel Coding Theorems
  • 1.7 Basics of Stochastic Processes
  • 1.7.1 Stochastic Processes and Their Distributions
  • 1.7.2 Numerical Characteristics of Stochastic Processes
  • 1.7.3 Stationary Stochastic Processes
  • 1.8 Problems

  5. 1.1 Discrete-Time Signals and Systems
  • The original speech signal is a continuous-time function xa(t)
  • After sampling we have the discrete sequence x(n) = xa(nT)
  • Signal processing involves the transformation of a signal into a desired form
  • Single-input/single-output system: y(n) = T[x(n)]
  • Single-input/multiple-output system: y(n) = T[x(n)], where the output y(n) is a vector
  • For a linear shift-invariant system:
  T[a1 x1(n) + a2 x2(n)] = a1 T[x1(n)] + a2 T[x2(n)]  (linearity)
  y(n - n0) = T[x(n - n0)]  (shift invariance)
  y(n) = Σ_{k=0}^{n} x(k) h(n-k) = Σ_{k=0}^{n} x(n-k) h(k) = x(n)*h(n)  (convolution with the impulse response h(n))
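
To make the convolution sum concrete, here is a minimal numpy sketch (not part of the original slides) that convolves an input with an impulse response and checks the linearity property:

```python
import numpy as np

# Impulse response of a simple LSI system (3-point moving average)
h = np.array([1/3, 1/3, 1/3])
x = np.array([1.0, 2.0, 3.0, 4.0])   # input sequence x(n)

y = np.convolve(x, h)                # y(n) = x(n) * h(n), the convolution sum

# Linearity: T[a1*x1 + a2*x2] == a1*T[x1] + a2*T[x2]
x1, x2, a1, a2 = x, x[::-1], 2.0, -1.0
lhs = np.convolve(a1 * x1 + a2 * x2, h)
rhs = a1 * np.convolve(x1, h) + a2 * np.convolve(x2, h)
assert np.allclose(lhs, rhs)
```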

  6. 1.2 Transform Representation of Signals and Systems
  • 1.2.1 The Z-Transform of a sequence x(n):
  X(z) = Σ_{n=-∞}^{∞} x(n) z^{-n},   x(n) = (1/2πj) ∮ X(z) z^{n-1} dz
  • 1.2.2 The Fourier Transform of a discrete-time signal:
  X(e^{jω}) = Σ_{n=-∞}^{∞} x(n) e^{-jωn},   x(n) = (1/2π) ∫_{-π}^{π} X(e^{jω}) e^{jωn} dω
  • 1.2.3 The Discrete Fourier Transform of a finite-length sequence x(n):
  X(k) = Σ_{n=0}^{N-1} x(n) e^{-j2πkn/N},   k = 0, 1, …, N-1
  x(n) = (1/N) Σ_{k=0}^{N-1} X(k) e^{j2πkn/N},   n = 0, 1, …, N-1
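
As an illustrative check (assuming numpy), the DFT pair above can be computed directly from the definitions and compared against numpy's FFT:

```python
import numpy as np

N = 8
x = np.random.randn(N)
n = np.arange(N)

# Forward DFT: X(k) = sum_n x(n) e^{-j 2 pi k n / N}
X = np.array([np.sum(x * np.exp(-2j * np.pi * k * n / N)) for k in range(N)])
assert np.allclose(X, np.fft.fft(x))

# Inverse DFT: x(n) = (1/N) sum_k X(k) e^{+j 2 pi k n / N}
k = np.arange(N)
x_rec = np.array([np.sum(X * np.exp(2j * np.pi * k * m / N)) / N for m in range(N)])
assert np.allclose(x_rec, x)
```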

  7. 1.3 Fundamentals of Digital Filters (1)
  • A digital filter is a discrete-time linear shift-invariant system for which
  Y(z) = H(z) X(z), where H(z) is called the system function
  • H(e^{jω}) is called the frequency response; it is complex-valued:
  H(e^{jω}) = H_r(e^{jω}) + j H_i(e^{jω}),  or
  H(e^{jω}) = |H(e^{jω})| exp{j arg[H(e^{jω})]}
  • The inverse Fourier transform of H(e^{jω}) is the impulse response:
  h(n) = (1/2π) ∫_{-π}^{π} H(e^{jω}) e^{jωn} dω
  • The input and output of a filter satisfy the difference equation:
  y(n) − Σ_{k=1}^{N} a_k y(n−k) = Σ_{r=0}^{M} b_r x(n−r)
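
A hedged sketch (assuming scipy is available) of evaluating a filter's frequency response and applying its difference equation; the coefficients b = [1, 1], a = [1, -0.5] are an arbitrary example, not from the slides:

```python
import numpy as np
from scipy.signal import freqz, lfilter

# y(n) - 0.5 y(n-1) = x(n) + x(n-1)  ->  b = [1, 1], a = [1, -0.5]
b, a = [1.0, 1.0], [1.0, -0.5]

w, H = freqz(b, a, worN=512)                # H(e^{jw}) sampled on [0, pi)
magnitude, phase = np.abs(H), np.angle(H)   # |H| and arg[H]

# Filtering realizes Y(z) = H(z) X(z) in the time domain
x = np.random.randn(100)
y = lfilter(b, a, x)
```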

  8. Fundamentals of Digital Filters (2)
  • 1.3.1 FIR (Finite Impulse Response) Systems
  • All a_k are 0: y(n) = Σ_{r=0}^{M} b_r x(n−r)
  • so h(n) = b_n for 0 ≤ n ≤ M, and 0 otherwise
  • FIR systems have no nonzero poles, only zeros
  • They can have exactly linear phase (with symmetric h(n))
  • 1.3.2 IIR (Infinite Impulse Response) Systems
  • Not all a_k are 0: y(n) = Σ_{k=1}^{N} a_k y(n−k) + Σ_{r=0}^{M} b_r x(n−r)
  • They have both poles and zeros, and an infinite-duration impulse response
  • An IIR filter is usually more efficient to implement than an FIR filter of comparable selectivity
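
A minimal sketch contrasting the two structures; the loops follow the difference equations above with illustrative coefficients (in practice scipy.signal.lfilter does this job):

```python
import numpy as np

def fir(b, x):
    """FIR: y(n) = sum_r b[r] x(n-r); the impulse response is h(n) = b[n]."""
    y = np.zeros(len(x))
    for n in range(len(x)):
        y[n] = sum(b[r] * x[n - r] for r in range(len(b)) if n - r >= 0)
    return y

def iir(b, a, x):
    """IIR (slide convention): y(n) = sum_k a[k] y(n-k) + sum_r b[r] x(n-r)."""
    y = np.zeros(len(x))
    for n in range(len(x)):
        y[n] = sum(b[r] * x[n - r] for r in range(len(b)) if n - r >= 0)
        y[n] += sum(a[k] * y[n - k] for k in range(1, len(a)) if n - k >= 0)
    return y

x = np.random.randn(16)
assert np.allclose(fir([0.5, 0.5], x), iir([0.5, 0.5], [1.0], x))  # no feedback -> FIR
```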

  9. 1.4 Sampling
  • x(n) = xa(nT), −∞ < n < ∞, n an integer
  • T is called the sampling period
  • 1.4.1 The Sampling Theorem (Shannon)
  • If a signal xa(t) has a bandlimited Fourier transform Xa(jΩ), i.e.
  Xa(jΩ) = 0 for |Ω| ≥ 2πF_N (F_N is the Nyquist frequency),
  then xa(t) can be reconstructed exactly from x(n) provided 1/T ≥ 2F_N
  • 1.4.2 Decimation and Interpolation of Sampled Waveforms
  • New sampling period T' = MT (reduced sampling rate):
  the decimated sequence is y(n) = xa(nT') = xa(nMT) = x(Mn)
  • New sampling period T' = T/L (increased sampling rate):
  the new sequence is y(n) = xa(nT') = xa(nT/L) = x(n/L) when n is a multiple of L; the remaining samples are filled in by interpolation
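
A short sketch of rate changing, assuming scipy; resample_poly applies the anti-aliasing/anti-imaging filtering that naive sample dropping omits:

```python
import numpy as np
from scipy.signal import resample_poly

fs = 8000
t = np.arange(0, 0.01, 1 / fs)
x = np.sin(2 * np.pi * 440 * t)          # x(n) = xa(nT) with T = 1/8000 s

y_naive = x[::2]                         # plain decimation: y(n) = x(2n), T' = 2T
y_down = resample_poly(x, up=1, down=2)  # decimation with anti-aliasing filter
y_up = resample_poly(x, up=3, down=1)    # interpolation to T' = T/3
```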

  10. 1.5 Basics of Probability Theory (1)
  • 1.5.1 Probability of an event A (an outcome of a trial):
  P(A) = lim N_A / N_s, where N_s is the total number of trials and N_A is the number of times event A occurred; the limit exists when the number of trials is large enough
  P(AB) = P(B|A) P(A) = P(A|B) P(B)
  P(A1 A2 … An) = P(An | A1 … A_{n-1}) … P(A2 | A1) P(A1)
  • 1.5.2 Random variables and their distributions:
  f_X(x) = P(X = x) is the probability function of a discrete random variable X
  F(x) = P(X ≤ x) = ∫_{-∞}^{x} f(t) dt, where f(x) is the probability density function (continuous variable) and F is the distribution function
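
A tiny Monte Carlo sketch of the frequency interpretation P(A) ≈ N_A / N_s, using a simulated die roll as the illustrative experiment:

```python
import numpy as np

rng = np.random.default_rng(0)
Ns = 100_000                       # total number of trials
rolls = rng.integers(1, 7, Ns)     # fair six-sided die

NA = np.sum(rolls == 6)            # times event A = "roll a six" occurred
print(NA / Ns)                     # approaches 1/6 as Ns grows
```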

  11. Basics of Probability Theory (2)
  • 1.5.3 Mean and Variance:
  E(X) = ∫_{-∞}^{∞} x f(x) dx  (expectation, or mean)
  Var(X) = ∫_{-∞}^{∞} (x − E(X))² f(x) dx  (variance)
  • 1.5.4 Bayes' Theorem:
  P(X) = Σ_{i=1}^{M} P(X|C_i) P(C_i)
  P(C_i|X) P(X) = P(X|C_i) P(C_i), so
  P(C_i|X) = P(X|C_i) P(C_i) / P(X)
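
A numeric sketch of Bayes' theorem for two classes; the prior and likelihood values are invented for illustration:

```python
import numpy as np

priors = np.array([0.7, 0.3])          # P(C1), P(C2), assumed
likelihoods = np.array([0.2, 0.9])     # P(X|C1), P(X|C2) for one observation X

evidence = np.sum(likelihoods * priors)        # P(X) = sum_i P(X|Ci) P(Ci)
posteriors = likelihoods * priors / evidence   # P(Ci|X) = P(X|Ci) P(Ci) / P(X)
print(posteriors, posteriors.sum())            # posteriors sum to 1
```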

  12. Basics of Probability Theory (3)
  • 1.5.5 Covariance and Correlation
  • Covariance: Cov(X, Y) = E[(X − μ_x)(Y − μ_y)], where X and Y are random variables with a joint distribution, E(X) = μ_x, E(Y) = μ_y, Var(X) = σ_x², Var(Y) = σ_y²
  • Correlation coefficient of X and Y: ρ_xy = Cov(X, Y)/(σ_x σ_y), −1 ≤ ρ_xy ≤ 1
  • 1.5.6 Random Vectors and Multivariate Distributions
  • X = (X1, …, Xn), f_X(x1, …, xn) = P(X1 = x1, …, Xn = xn)
  • Vector form: X = [X1 X2 … Xn]', E[X] = [E[X1] E[X2] … E[Xn]]'
  • The covariance matrix Cov(X) is the n×n matrix whose (i, j) entry is Cov(X_i, X_j):
  Cov(X) = [ Cov(X1, X1) … Cov(X1, Xn); … ; Cov(Xn, X1) … Cov(Xn, Xn) ]
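
A short numpy sketch of the covariance matrix and correlation coefficients; np.cov treats each row as one variable:

```python
import numpy as np

rng = np.random.default_rng(1)
X = rng.standard_normal((3, 1000))   # n = 3 variables, 1000 samples each

C = np.cov(X)                        # 3x3 matrix with C[i, j] = Cov(Xi, Xj)
sigma = np.sqrt(np.diag(C))          # standard deviations
rho = C / np.outer(sigma, sigma)     # correlation coefficients, all in [-1, 1]
```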

  13. 1.6 Basics of Information Theory (1)
  • 1.6.1 Entropy
  • An outcome x_i carries the amount of information I(x_i) = log(1/P(x_i))
  • An information source S has the average amount of information
  H(S) = Σ_i P(x_i) I(x_i) = Σ_i P(x_i) log(1/P(x_i)) = E[−log P(x_i)]
  • This is the entropy of the information source S; H(S) ≥ 0
  • 1.6.2 Conditional Entropy
  • When X = (x1, x2, …, xs) are the input symbols of a channel and Y = (y1, y2, …, yl) are the output symbols, the information channel is defined by M_ij = P(y_j | x_i)
  • Before transmission, the uncertainty about X is H(X); once y_j is received, the uncertainty about X is reduced to H(X | Y = y_j)
  • H(X, Y) = H(X) + H(Y|X)
  • H(X1, …, Xn) = H(Xn | X1, …, X_{n-1}) + … + H(X2 | X1) + H(X1)
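
A minimal entropy computation (base-2 logarithm, so H(S) comes out in bits):

```python
import numpy as np

def entropy(p):
    """H(S) = sum_i p_i log2(1/p_i); zero-probability symbols contribute 0."""
    p = np.asarray(p, dtype=float)
    p = p[p > 0]
    return float(np.sum(p * np.log2(1.0 / p)))

print(entropy([0.5, 0.5]))     # 1.0 bit: fair coin
print(entropy([1.0]))          # 0.0 bits: no uncertainty
print(entropy([0.25] * 4))     # 2.0 bits
```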

  14. Basics of Information Theory (2)
  • 1.6.3 Source and channel coding theorems and mutual information
  • Shannon's source coding theorem says that a source cannot be coded with fewer bits per symbol, on average, than its entropy.
  • I(X;Y) = H(X) − H(X|Y) is called the mutual information
  • I(X;Y) = I(Y;X) = E[log( P(X,Y) / (P(X)P(Y)) )]
  • 0 ≤ I(X;Y) ≤ min[H(X), H(Y)]
  • The channel capacity is C = max I(X;Y), maximized over the input distribution P(X)
  • Shannon's channel coding theorem says that for a given channel there exists a code that permits error-free transmission across the channel, provided R ≤ C, where R is the rate of the communication system
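
An illustrative sketch computing I(X;Y) = E[log2(P(X,Y) / (P(X)P(Y)))] from a small joint distribution; the joint table is made up:

```python
import numpy as np

# Illustrative joint distribution P(X, Y); rows index x, columns index y
Pxy = np.array([[0.4, 0.1],
                [0.1, 0.4]])
Px, Py = Pxy.sum(axis=1), Pxy.sum(axis=0)   # marginals

mask = Pxy > 0                              # skip zero-probability cells
I = np.sum(Pxy[mask] * np.log2(Pxy[mask] / np.outer(Px, Py)[mask]))
print(I)   # satisfies 0 <= I(X;Y) <= min(H(X), H(Y))
```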

  15. 1.7 Basics of Stochastic Processes (1)
  • 1.7.1 Stochastic Processes and Their Distributions
  • ξ(t) is a stochastic process (a random function of time); ξ(t1) is a random variable
  • One-dimensional probability distribution function: F1(x1, t1) = P[ξ(t1) ≤ x1]
  • One-dimensional probability density function: f1(x1, t1) = ∂F1(x1, t1) / ∂x1
  • Extension to n dimensions: Fn(x1, x2, …, xn; t1, t2, …, tn) and fn(x1, x2, …, xn; t1, t2, …, tn)

  16. Basics of Stochastic Processes (2)
  • 1.7.2 Numerical Characteristics of ξ(t)
  • The mathematical expectation of ξ(t) at time t: a(t) = E{ξ(t)} = ∫_{-∞}^{∞} x f1(x; t) dx
  • The variance of ξ(t) at time t: σ²(t) = E{[ξ(t) − a(t)]²} = ∫_{-∞}^{∞} x² f1(x; t) dx − [a(t)]²
  • The correlation function of ξ(t): R(t1, t2) = E{ξ(t1) ξ(t2)} = ∫_{-∞}^{∞} ∫_{-∞}^{∞} x1 x2 f2(x1, x2; t1, t2) dx1 dx2
  • 1.7.3 Stationary Stochastic Processes
  • Definition: ξ(t) is stationary if, for any n and h,
  fn(x1, x2, …, xn; t1, t2, …, tn) = fn(x1, x2, …, xn; t1+h, t2+h, …, tn+h)
  • For such a process R(t1, t2) = R(t2 − t1) = R(τ)
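
For a stationary process R(τ) depends only on the lag, so it can be estimated by time averaging over a single realization; a hedged sketch with white noise as the illustrative process:

```python
import numpy as np

rng = np.random.default_rng(2)
xi = rng.standard_normal(10_000)   # one realization of a stationary white process

def autocorr(x, max_lag):
    """Estimate R(tau) = E[xi(t) xi(t + tau)] by averaging over time."""
    N = len(x)
    return np.array([np.mean(x[:N - tau] * x[tau:]) for tau in range(max_lag + 1)])

print(autocorr(xi, 5))   # roughly [1, 0, 0, 0, 0, 0] for unit-variance white noise
```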

  17. 1.8 Problems (1)
  • 1.8.1 Use any software tool to record a speech data file (.wav) with an 8 kHz sampling rate and 16-bit precision. Display it on the screen and play it back as audio.
  • 1.8.2 A device (or algorithm) has the output y(n) = x(n) − αx(n−1), where x(n) is the input and α ≈ 0.95 to 1.0. Discuss the main function of the device.
  • 1.8.3 Write a program to plot |H(ω)| versus ω, using dB as the unit for |H(ω)|.
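
As a hedged starting point for problems 1.8.2 and 1.8.3 (assuming matplotlib): the device has frequency response H(e^{jω}) = 1 − α e^{−jω}, and plotting |H(ω)| in dB makes its behavior visible:

```python
import numpy as np
import matplotlib.pyplot as plt

alpha = 0.95
w = np.linspace(0, np.pi, 512)
H = 1 - alpha * np.exp(-1j * w)          # H(e^{jw}) of y(n) = x(n) - alpha x(n-1)

plt.plot(w, 20 * np.log10(np.abs(H)))    # |H(w)| in dB
plt.xlabel("omega (rad)")
plt.ylabel("|H(omega)| (dB)")
plt.show()
```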

  18. Problems (2)
  • 1.8.4 Find the z-transform and the Fourier transform of each of the following sequences:
  (1) Rectangular window: w1(n) = 1 for 0 ≤ n ≤ N−1, 0 otherwise
  (2) Hamming window: w2(n) = 0.54 − 0.46 cos[2πn/(N−1)] for 0 ≤ n ≤ N−1, 0 otherwise
  (3) Hanning window: w3(n) = 0.5{1 − cos[2πn/(N−1)]} for 0 ≤ n ≤ N−1, 0 otherwise
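
A sketch for experimenting with problem 1.8.4: generating the three windows (numpy's built-in hamming and hanning match the formulas above) and approximating their Fourier transform magnitudes with a zero-padded FFT:

```python
import numpy as np

N = 64
n = np.arange(N)
w1 = np.ones(N)                                        # rectangular
w2 = 0.54 - 0.46 * np.cos(2 * np.pi * n / (N - 1))     # Hamming
w3 = 0.5 * (1 - np.cos(2 * np.pi * n / (N - 1)))       # Hanning
assert np.allclose(w2, np.hamming(N)) and np.allclose(w3, np.hanning(N))

# Zero-padded FFT approximates each window's Fourier transform magnitude (dB)
W = [20 * np.log10(np.abs(np.fft.fft(w, 1024)) + 1e-12) for w in (w1, w2, w3)]
```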
