This presentation describes an algorithm, written in MATLAB, that automates the tracing of articulators in cineradiographic (x-ray) images of the vocal tract, with a focus on the tongue. The algorithm combines techniques from the EdgeTrak System, the Chan-Vese algorithm, and snake (active contour) methods. In the reported runs the snake energy changed by roughly 97% to 101% between the initial and final contours, and the approach could be extended to other articulators. Future work includes applying the model to consecutive frames and using the intensity-based external energy from EdgeTrak.
Image Processing Algorithm for Speech Acoustics • Erin Plasse • Advisors: Professor Hanson, Professor Rudko
Introduction • Experiment conducted in the 1960s by Kenneth Stevens and Dr. Sven Öhman in Sweden • Used cineradiography (x-ray film) to take lateral images of the vocal tract • 31 utterances and 2 sentences were recorded • Analyzed how the articulators move over time • 45 frames/second
Image Processing • Perkell (1969) used manual methods to make tracings of the images
Typical Tracing • Manual tracing, Perkell (1969)
Typical Analysis • Manual measures, Perkell (1969)
Goals & Parameters • Design an algorithm in MATLAB to automate the tracings using edge detection methods • Trace specific articulators, such as the lips, velum, epiglottis, and hyoid bone • Results should be similar to the original tracings • Perkell (1969) analyzed only 13 utterances • Obtain tracings for the 20 utterances not analyzed by Perkell (1969) • Manual extraction is time-consuming • Curves should be smooth and continuous
Design Alternatives • Snakes: Active Contour Models • MATLAB script written by Eric Debreuve • Chan-Vese Region-Based Segmentation Algorithm • MATLAB script written by Shawn Lankton • EdgeTrak System for ultrasound images • VIMS Lab, University of Delaware • Customize one of the above to create a design suited to the data
Snakes: Active Contour Models • Michael Kass • Snake: energy-minimizing spline guided by external forces • Image forces pull it toward lines and edges • MATLAB code written by Eric Debreuve • The code only worked with binary images
Chan-Vese Algorithm • Region-based segmentation • Uses homogeneity of intensity in a region as the constraint • Only applicable to closed contours • Uses an initial mask region • MATLAB script written by Shawn Lankton
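To make the "homogeneity of intensity" constraint concrete, here is a minimal Python sketch of the region-fitting term that Chan-Vese style methods minimize. It is an illustration only, not Shawn Lankton's MATLAB script: the function name, the tiny test image, and the two masks are hypothetical, and the contour length and area penalties of the full model are left out.

```python
def chan_vese_fitting_energy(image, mask):
    """Squared deviation from the mean intensity, summed inside and outside the mask."""
    inside, outside = [], []
    for image_row, mask_row in zip(image, mask):
        for value, selected in zip(image_row, mask_row):
            (inside if selected else outside).append(value)

    def spread(values):
        # An empty region contributes nothing to the energy.
        if not values:
            return 0.0
        mean = sum(values) / len(values)
        return sum((v - mean) ** 2 for v in values)

    return spread(inside) + spread(outside)


# Hypothetical 4x4 image: a bright 2x2 object on a dark background.
image = [
    [10, 10, 10, 10],
    [10, 90, 90, 10],
    [10, 90, 90, 10],
    [10, 10, 10, 10],
]
aligned_mask = [[0, 0, 0, 0], [0, 1, 1, 0], [0, 1, 1, 0], [0, 0, 0, 0]]
shifted_mask = [[1, 1, 0, 0], [1, 1, 0, 0], [0, 0, 0, 0], [0, 0, 0, 0]]

aligned_energy = chan_vese_fitting_energy(image, aligned_mask)
shifted_energy = chan_vese_fitting_energy(image, shifted_mask)
```

The mask that matches the bright object leaves both regions perfectly homogeneous, so its energy is lower than that of the shifted mask. Because the term is defined over an inside and an outside region, it presupposes a closed contour, which is the limitation noted on this slide.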
EdgeTrak System • Li, Kambhamettu, Stone • Uses gradient image forces and intensity information in local regions • Energy definition for snakes: $E_{\text{total}} = \alpha E_{\text{int}} + \beta E_{\text{ext}}$ • Energy band gap • External energy is redefined for EdgeTrak as: $E'_{\text{ext}}(v_i) = E_{\text{band}}(v_i) \cdot E_{\text{ext}}(v_i)$ • Not effective for closed contours • Good for tracking the tongue in noisy images with high-contrast unrelated edges
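The two energy definitions on this slide can be sketched in a few lines of Python. This is a minimal illustration, not the EdgeTrak or project code: the function names and the per-element energy values are hypothetical, summing over snake elements is an assumption, and the weights 0.2 and 0.8 are the Alpha and Beta values reported on the results slide.

```python
def modified_external_energy(e_band, e_ext):
    """Scale each element's external energy by its band energy."""
    return [band * ext for band, ext in zip(e_band, e_ext)]


def total_energy(e_int, e_ext, alpha, beta):
    """Weighted sum of internal and external energy over all snake elements."""
    return sum(alpha * i + beta * e for i, e in zip(e_int, e_ext))


# Hypothetical per-element energies for a three-element snake.
e_int = [0.5, 0.25, 0.5]      # smoothness cost of each element
e_ext = [-1.0, -2.0, -1.0]    # more negative means a stronger image edge
e_band = [1.0, 0.5, 0.0]      # band weighting from local intensity regions

e_ext_banded = modified_external_energy(e_band, e_ext)
snake_energy = total_energy(e_int, e_ext_banded, alpha=0.2, beta=0.8)
```

An element with a band energy of zero contributes no external energy at all, which is how the band term can suppress strong edges that are unrelated to the tongue.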
Energy Minimization Band • The main contribution of the EdgeTrak method; it uses the intensity of the regions around the snake • Energy band regions are found around each snake element • Find the mean intensity difference between the regions • Find the new external energy using the band energy • Minimize the total energy using dynamic programming
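The mean intensity difference step can be sketched as follows. This is a simplified Python illustration under stated assumptions: the contour is treated as roughly horizontal, so the two regions are rectangular bands directly above and below a snake element, and the function name, band sizes, and test image are hypothetical. How EdgeTrak maps this difference to a band energy is not given in the slides, so the sketch stops at the difference itself.

```python
def band_mean_difference(image, row, col, half_width=1, band_height=2):
    """Mean intensity of the band below (row, col) minus the band above it."""
    n_rows, n_cols = len(image), len(image[0])
    col_lo = max(0, col - half_width)
    col_hi = min(n_cols - 1, col + half_width)

    def mean_over(row_start, row_stop):
        # Clip the band to the image so elements near the border still work.
        values = [
            image[r][c]
            for r in range(max(0, row_start), min(n_rows, row_stop))
            for c in range(col_lo, col_hi + 1)
        ]
        return sum(values) / len(values) if values else 0.0

    above = mean_over(row - band_height, row)
    below = mean_over(row + 1, row + 1 + band_height)
    return below - above


# Hypothetical 6x5 image: dark upper half, bright lower half.
image = [[20] * 5 for _ in range(3)] + [[200] * 5 for _ in range(3)]

on_boundary = band_mean_difference(image, row=2, col=2)
off_boundary = band_mean_difference(image, row=4, col=2)
```

The difference is largest when the element sits on the boundary between the dark and bright regions and shrinks as the element moves away from it, which is the property a band energy can exploit.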
The Final Design • Combined methods from the EdgeTrak System, Chan-Vese, and snakes • Implemented in MATLAB • Used only the image gradient to find edges • The tongue is the articulator of focus
MATLAB Code • User picks 5 points • 33 snake elements are found using spline interpolation • Computes the internal and external energy of the initial snake elements • Computes the internal and external energies of the points surrounding each initial point • Finds the surrounding point with the lowest energy; this becomes the new point • The new contour is graphed
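The steps above can be sketched in Python as a rough analogue of the MATLAB procedure, not a port of it. Several choices are assumptions: a Catmull-Rom spline with 8 samples per segment stands in for the unspecified spline interpolation (it happens to yield 33 points from 5), the energy function is a toy with an "edge" along y = 10, and all function names, the click coordinates, and the one-pixel search radius are hypothetical.

```python
def catmull_rom_resample(points, samples_per_segment=8):
    """Interpolate a smooth open curve through `points` (Catmull-Rom spline)."""
    # Repeat the end points so the curve starts and ends on the clicked points.
    padded = [points[0]] + list(points) + [points[-1]]
    curve = []
    for k in range(1, len(padded) - 2):
        p0, p1, p2, p3 = padded[k - 1], padded[k], padded[k + 1], padded[k + 2]
        for s in range(samples_per_segment):
            t = s / samples_per_segment
            curve.append(tuple(
                0.5 * (2 * b + (c - a) * t
                       + (2 * a - 5 * b + 4 * c - d) * t ** 2
                       + (-a + 3 * b - 3 * c + d) * t ** 3)
                for a, b, c, d in zip(p0, p1, p2, p3)
            ))
    curve.append(tuple(float(v) for v in points[-1]))
    return curve


def toy_energy(i, point, snake, alpha=0.2, beta=0.8):
    """Hypothetical energy: smoothness term plus a toy edge along y = 10."""
    x, y = point
    prev_pt = snake[max(i - 1, 0)]
    next_pt = snake[min(i + 1, len(snake) - 1)]
    mid_x = (prev_pt[0] + next_pt[0]) / 2
    mid_y = (prev_pt[1] + next_pt[1]) / 2
    e_int = (x - mid_x) ** 2 + (y - mid_y) ** 2  # stay close to the neighbours
    e_ext = (y - 10.0) ** 2                      # lowest on the toy edge
    return alpha * e_int + beta * e_ext


def greedy_step(snake, energy_at, radius=1):
    """Move each element to the lowest-energy point in its neighbourhood."""
    new_snake = []
    for i, (x, y) in enumerate(snake):
        candidates = [
            (x + dx, y + dy)
            for dx in range(-radius, radius + 1)
            for dy in range(-radius, radius + 1)
        ]
        new_snake.append(min(candidates, key=lambda p: energy_at(i, p, snake)))
    return new_snake


def distance_to_edge(curve):
    """Total vertical distance of the curve from the toy edge at y = 10."""
    return sum(abs(y - 10.0) for _, y in curve)


clicks = [(0, 14), (10, 15), (20, 14), (30, 13), (40, 14)]  # 5 user-picked points
snake = catmull_rom_resample(clicks)                        # 33 snake elements
stepped = greedy_step(snake, toy_energy)                    # one update pass
```

In a real run the toy energy would be replaced by one computed from the image gradient, and `greedy_step` would be repeated until the contour stops moving. Each element is updated against the previous contour, so one pass does not guarantee that the total energy decreases.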
MATLAB Code Demo • %Edge_trak_demo • %Coded by Erin Plasse
Final Results • Alpha = 0.2, Beta = 0.8, Delta = 5 • Energy of original snake = -96.9553 • Energy of new snake = 1.2244 • Percent change in snake energy = 101.2629
E_snake_orig = -21.8775 • E_snake_new = -0.5480 • Percent_change_Snake_energy = 97.495
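The slides do not state how the percent change is computed. The Python sketch below uses a formula inferred from the reported numbers, the change in energy relative to the magnitude of the original energy; the function name is hypothetical, and the formula is an assumption that happens to reproduce both reported values to within rounding.

```python
def percent_change(e_orig, e_new):
    """Change in snake energy as a percentage of the original magnitude."""
    return (e_new - e_orig) / abs(e_orig) * 100.0


# Energies reported on the two results slides.
first_run = percent_change(-96.9553, 1.2244)
second_run = percent_change(-21.8775, -0.5480)
```

A value above 100 in the first run follows from the energy changing sign, from a negative original energy to a small positive one.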
Initial Points • Final Points
Future Work • Apply the contour model to a sequence of consecutive frames • Trace more articulators • Use the intensity-based method for external energy described in the EdgeTrak system
References • Perkell, J. S. Physiology of Speech Production: Results and Implications of a Quantitative Cineradiographic Study. Cambridge, MA: The MIT Press, 1969. • Stevens, K. and Öhman, S. "Cineradiographic studies of speech: procedures and objectives." J. Acoust. Soc. Am., vol. 35, p. 1889, 1963. • Kass, M., Witkin, A., and Terzopoulos, D. "Snakes: Active Contour Models." Int. J. Comput. Vis., vol. 1, pp. 321-331, 1988. • Chan, T. F. and Vese, L. A. "Active Contours Without Edges." IEEE Trans. Image Processing, vol. 10, pp. 266-277, 2001. • Li, M., Kambhamettu, C., and Stone, M. "Automatic Contour Tracking in Ultrasound Images." 2004.