
Investigating visual prosody using articulography

Johan Frid, Malin Svensson Lundmark, Gilbert Ambrazaitis (Linné), Susanne Schötz and David House (KTH). DHN 2019, March 6, 2019.


Presentation Transcript


  1. Investigating visual prosody using articulography • Johan Frid, Malin Svensson Lundmark, Gilbert Ambrazaitis (Linné), Susanne Schötz and David House (KTH) • DHN 2019, March 6, 2019

  2. Self check: are we DHN? • From the Call:

  3. Background: EMA (ElectroMagnetic Articulography) • Measurement and recording of movements in 3D • EM field and sensors attached to tongue, jaw, lips, head • Up to 16 sensors simultaneously • Sampling frequency 1250 Hz (AG501) / 200 Hz (AG500) • Captures quick movements • LU Humanities Lab

  4. Video

  5. 2D view of movements during speech through time (head and articulators) • Sensors shown: nose, ear, lips (2 sensors), tongue (3 sensors), jaw

  6. Earlier project: MUMOP/Swe-Clarin • Newsreaders • Audiovisual recordings • Ambrazaitis & House (2017) • Head movements occur more often in the second part of a news event • To some extent dependent on information structure • Initial clause is the theme of the news • Frid, Ambrazaitis, Svensson-Lundmark & House (2017) • Machine-learning-based detection of head movements

  7. Head movements using analysis of video recordings • Face detection using OpenCV • Position of face determined in each frame • Black square is the detected face, white dot reflects head movements • But also general body movements

  8. Current project: PROGEST • The production of prosodic prominence: integrating bodily and articulatory gestures • House, VR 2017-02140 • Multimodal prominence • Interplay between • Verbal prosody (rhythm, intonation, intensity) • Visual prosody (gestures, head and face movements)

  9. Use EMA for head movements? • Compared to video • Movement data in 3D • Better time resolution: EMA has 1250/200 measurements/s • Better precision: direct registration instead of post-processing • Better sync between sound and movement • Can also measure tongue movements • Disadvantages • has to be done on-line • rather complicated procedure • 2 studies on head movement and its interplay with prosody • reuse of old (’found’) data

  10. ’sagittal’ angle (=head nod)

  11. Angle of head nod • Sensors shown: nose, ear, lips (2 sensors), tongue (3 sensors), jaw

  12. Angle + how it changes (velocity) • Panels: velocity, angle, waveform • Interpretation: positive mean velocity means the head is tilted upwards; negative mean velocity means the head is tilted downwards
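The angle and velocity measures on slides 10-12 can be sketched as follows. This is a minimal illustration, not the authors' code: it assumes the sagittal angle is taken from the line between the ear and nose sensors, that each sensor position is given as (front-back, up-down) coordinates with "up" positive, and that velocity is a simple first difference. The function names are hypothetical.

```python
import math

def sagittal_angle(nose, ear):
    """Angle in radians of the ear-to-nose line in the sagittal plane.

    nose, ear: (front_back, up_down) sensor coordinates (assumed layout).
    The angle grows when the nose rises relative to the ear.
    """
    d_front = nose[0] - ear[0]
    d_up = nose[1] - ear[1]
    return math.atan2(d_up, d_front)

def angular_velocity(angles, sample_rate_hz):
    """First-difference angular velocity in rad/s between successive samples."""
    return [(b - a) * sample_rate_hz for a, b in zip(angles, angles[1:])]

def mean(values):
    """Arithmetic mean of a non-empty list."""
    return sum(values) / len(values)

# Toy trajectory (made-up values): the nose sensor rises while the ear stays put.
ear_track = [(0.0, 0.0)] * 3
nose_track = [(100.0, 0.0), (100.0, 1.0), (100.0, 2.0)]
angles = [sagittal_angle(n, e) for n, e in zip(nose_track, ear_track)]
velocities = angular_velocity(angles, sample_rate_hz=200)
print(mean(velocities) > 0)  # upward tilt gives a positive mean velocity
```

With this sign convention a positive mean velocity corresponds to an upward tilt and a negative one to a downward tilt, matching the interpretation on slide 12. Head nods are small, so no angle unwrapping is attempted here.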

  13. Study 1: material from VOKART (Schötz) • Dialectal variation, mainly in vowel articulation • 12 sensors • 29 speakers • 9 Stockholm (Stm), 10 Gothenburg (Gbg), 10 Malmö (Mmö) • X men, Y women • Ages 20-63 • 3-4 reps of two read sentences

  14. Data • 3-4 readings of sentences • 1) Mobiltelefonen är nittiotalets stora fluga, både bland företagare och privatpersoner. The mobile phone is the big hit of the nineties, both among business people and private persons. • 2) Flyget, tåget och bilbranschen tävlar om lönsamhet och folkets gunst. Airlines, train companies and the automobile industry are competing for profitability and people's appreciation. • Possible prosodic boundaries • S1: Possible phrase boundary after fluga • S2: starts with list intonation • Can we see any head movements associated with these? • Exclusions because of bad or no sound, incomplete sentences • 86 examples of 1), 80 examples of 2) • Semi-automatic segmentation into words by means of forced alignment in Praat
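Once the recordings are segmented into words, a per-word mean of the angular velocity can be computed by averaging the samples that fall inside each word interval. The sketch below assumes the alignment has already been exported as (word, start, end) tuples in seconds and that the velocity track starts at time 0; the function name, the interval format and the toy values are all hypothetical.

```python
def mean_velocity_per_word(velocity, sample_rate_hz, intervals):
    """Average a velocity track over word intervals.

    velocity: list of angular velocity samples (rad/s), first sample at t = 0.
    intervals: list of (word, start_s, end_s) from a forced alignment.
    Returns a list of (word, mean_velocity); mean is None for empty intervals.
    """
    result = []
    for word, start_s, end_s in intervals:
        first = int(round(start_s * sample_rate_hz))
        last = int(round(end_s * sample_rate_hz))
        samples = velocity[first:last]
        if samples:
            result.append((word, sum(samples) / len(samples)))
        else:
            result.append((word, None))
    return result

# Toy input (made-up values): upward movement in the first word, downward in the second.
velocity = [0.1] * 100 + [-0.1] * 100
intervals = [("word_a", 0.0, 0.5), ("word_b", 0.5, 1.0)]
for word, mav in mean_velocity_per_word(velocity, 200, intervals):
    print(word, mav)
```

Averaging per word gives one value per word token, which is the unit compared across speakers and repetitions in the per-sentence plots.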

  15. Sentence 1: mean angular velocity per word, all speakers + repetitions (n=86)

  16. Sentence 2: mean velocity, all speakers + repetitions (n=80)

  17. Linear mixed effects • Response: mav (mean angular velocity), fixed effect: word, random effect: speaker, likelihood ratio tests • fluga – både • word affected mav (χ²(1) = 8.5201, p = 0.003512), lowering it by about 0.077 rad/s ± 0.017 (standard errors) • stora – fluga • word affected mav (χ²(1) = 8.4946, p = 0.003562), increasing it by about 0.077 rad/s ± 0.017 (standard errors) • Mobiltelefonen – nittiotalets • word affected mav (χ²(1) = 5.8811, p = 0.0153), lowering it by about 0.043 rad/s ± 0.012 (standard errors) • flyget – tåget • word affected mav (χ²(1) = 3.913, p = 0.04792), lowering it by about 0.043 rad/s ± 0.017 (standard errors) • tävlar – om • word affected mav (χ²(1) = 4.3803, p = 0.03636), lowering it by about 0.032 rad/s ± 0.012 (standard errors)
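The likelihood ratio tests above compare a model with the word effect against one without it, and each reports a χ² statistic with 1 degree of freedom. As a minimal sketch of that last step only (not of the mixed-model fitting, and not the authors' code), the p-value for 1 degree of freedom can be obtained from the complementary error function. The function names are hypothetical.

```python
import math

def lrt_statistic(loglik_null, loglik_full):
    """Likelihood ratio statistic: twice the gain in log-likelihood."""
    return 2.0 * (loglik_full - loglik_null)

def lrt_p_value_df1(chi2_stat):
    """Upper-tail probability of a chi-square distribution with 1 df.

    For 1 df, P(X > x) = erfc(sqrt(x / 2)).
    """
    return math.erfc(math.sqrt(chi2_stat / 2.0))

# Statistics taken from the slide; the p-values should come out close to those reported.
for label, stat in [("fluga - både", 8.5201), ("flyget - tåget", 3.913)]:
    print(label, round(lrt_p_value_df1(stat), 6))
```

This closed form holds only for 1 degree of freedom, which is the case for every comparison on the slide; other degrees of freedom would need a general chi-square survival function.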

  18. Study 2: PhD project by Malin Svensson Lundmark • 18 speakers, South Swedish dialect, ages 23-75 • 8 target words • Varied by word accent and vowel length • Embedded in QA pairs like • Where did grandpa leave mom? Grandpa left mom with the doctor. • (in order to avoid ‘big’ accent on target word) • 8 reps/word, misreadings etc. removed, in total 1092 tokens

  19. (work in progress…) • Exploratory: Can we see any head movement patterns in these? • GAM analysis • (Generalized Additive Modeling) • non-linear regression method • identify general patterns over dynamically varying data • See Wieling 2018

  20. Nose sensor, up-down, GAM model of effect of location 123

  21. Word accent + vowel length

  22. Conclusions • Tendencies for • upward movement before phrase boundary • downward movement after phrase boundary • nodding pattern synced to vowel • differences depending on word accent and vowel length
