Explore how machine learning models (Naive Bayesian, Nearest Neighbor) and stochastic OT compare to human learners in the development of L2 sound perception. European Spanish learners of Dutch vowels at different proficiency levels are analyzed in terms of their perceptual space and the distances between vowels. Models and results for beginning and advanced learners are discussed, with stochastic OT resembling the human data most closely.
Explaining L2 perceptual development: Machine learning vs. computational Stochastic OT vs. human learners
Paola Escudero, Jelle Kastelein & Klara Weiand
University of Amsterdam
Introduction • Comparison of models of L2 sound perception development • Based on part of the human data presented in yesterday's talk • Classical machine learning: Naive Bayesian, Nearest Neighbor • Stochastic OT: linguistic theory
Listeners • 23 European Spanish learners of Dutch • 22 native Dutch adults • Different proficiency levels according to the EU measure of language proficiency
Analysis • We measured the listeners’ perceptual space, i.e. the distances between the F1 & F2 values which they categorized as the 12 Dutch vowels • We first computed the mean and variation of the perception of each vowel ➝ ellipses • Then we calculated the distances between the mean perception of the Dutch central vowel /ø/ and the mean perceptions of the other 11 vowels (a sketch of this computation follows below) • Here we present the variation and distances for the corner vowels /a/, /i/ and /u/ and the central vowel /ø/; statistics are performed on the 11 between-vowel distances
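To make the mean-and-distance computation concrete, here is a minimal Python sketch; the (F1, F2) values, the vowel keys, and the `responses` structure are invented for illustration rather than taken from the study's data.

```python
import numpy as np

# Hypothetical data: for each Dutch vowel, the (F1, F2) values in Hz of
# the stimuli that a listener categorized as that vowel.
responses = {
    "a": np.array([[700.0, 1300.0], [690.0, 1250.0], [710.0, 1280.0]]),
    "i": np.array([[280.0, 2200.0], [300.0, 2150.0], [290.0, 2250.0]]),
    "u": np.array([[310.0, 700.0], [290.0, 730.0], [305.0, 690.0]]),
    "ø": np.array([[440.0, 1500.0], [450.0, 1480.0], [430.0, 1520.0]]),
    # ... the remaining Dutch vowels would be included here
}

# Mean perception (centroid) of each vowel in the F1-F2 plane.
means = {v: pts.mean(axis=0) for v, pts in responses.items()}

# Variation of each vowel (per-dimension SD), which defines the ellipses.
spreads = {v: pts.std(axis=0) for v, pts in responses.items()}

# Euclidean distance from the central vowel /ø/ to every other vowel mean.
center = means["ø"]
distances = {v: float(np.linalg.norm(m - center))
             for v, m in means.items() if v != "ø"}
print(distances)
```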
Explaining L2 perception • Three different learning algorithms • Different levels of abstraction from the training input • Process: • Model a native listener of Spanish • Beginning learner of Dutch: map the "native speaker" responses onto the Dutch vowel space • Advanced learner: train the native-speaker model with native Dutch data
Nearest Neighbor • "Lazy learner" • Training: save the examples in Euclidean space • Classification: assign the class most frequent among the nearest neighbors • No abstraction from the data
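As a hedged illustration of such a lazy learner, here is a minimal k-nearest-neighbor sketch; the function name, the value of k, and the toy data are mine, not the authors':

```python
import numpy as np
from collections import Counter

def nn_classify(train_points, train_labels, query, k=1):
    # Training is just storing the examples; classification finds the k
    # nearest stored (F1, F2) points in Euclidean space and takes a
    # majority vote over their vowel labels.
    dists = np.linalg.norm(np.asarray(train_points) - query, axis=1)
    nearest = np.argsort(dists)[:k]
    votes = Counter(train_labels[i] for i in nearest)
    return votes.most_common(1)[0][0]

# Illustrative call: two stored tokens, one query point.
print(nn_classify([[280.0, 2200.0], [700.0, 1300.0]], ["i", "a"],
                  np.array([650.0, 1250.0])))  # -> "a"
```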
Naive Bayesian • Statistical model • Assumption: the class of a data point can be inferred from its attributes, which are treated as independent given the class (classic example: identifying fruits from their attributes) • Training: observe how often each class appears and which attribute values correspond to which class • Classification: maximize the vowel-class probability given the attributes • The training data is abstracted into a stochastic model
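A minimal Gaussian Naive Bayes sketch, assuming the two attributes are F1 and F2, each modeled as an independent Gaussian per vowel class; the class names, data, and API below are invented for illustration, not the study's implementation:

```python
import numpy as np

class GaussianNaiveBayes:
    # Training abstracts the data into a stochastic model: a prior per
    # class and one Gaussian per attribute (F1, F2) per class.
    def fit(self, X, y):
        X, y = np.asarray(X, dtype=float), np.asarray(y)
        self.classes = sorted(set(y))
        self.priors = {c: float(np.mean(y == c)) for c in self.classes}
        self.means = {c: X[y == c].mean(axis=0) for c in self.classes}
        self.vars = {c: X[y == c].var(axis=0) + 1e-9 for c in self.classes}
        return self

    def predict(self, x):
        # Classification maximizes P(class) * prod_i P(attribute_i | class),
        # computed in log space for numerical stability.
        def log_posterior(c):
            m, v = self.means[c], self.vars[c]
            log_lik = -0.5 * np.sum(np.log(2 * np.pi * v) + (x - m) ** 2 / v)
            return np.log(self.priors[c]) + log_lik
        return max(self.classes, key=log_posterior)

# Illustrative: two tokens per class, then classify a new point.
model = GaussianNaiveBayes().fit(
    [[280, 2200], [300, 2150], [700, 1300], [690, 1250]],
    ["i", "i", "a", "a"])
print(model.predict(np.array([710.0, 1290.0])))  # -> "a"
```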
Stochastic OT • Computational linguistic framework • Training: constraint rankings are adapted according to the training data, so that the correct class incurs the least serious constraint violations • Classification: select the candidate class with the least serious violations • More abstract than the previous two: no explicit probabilities, but constraint rankings which reflect them
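A minimal sketch of how stochastic OT evaluation and error-driven reranking could look, in the spirit of Boersma's Gradual Learning Algorithm; the constraint names, violation profiles, noise level, and plasticity below are invented for illustration, not the grammar used in the study:

```python
import random

# Hypothetical mini-grammar: each candidate vowel category has a
# violation profile (constraint name -> number of violations).
violations = {
    "i": {"*HIGH-F1": 0, "*LOW-F1": 1},
    "a": {"*HIGH-F1": 1, "*LOW-F1": 0},
}

rankings = {"*HIGH-F1": 100.0, "*LOW-F1": 100.0}  # continuous ranking values
NOISE, PLASTICITY = 2.0, 0.1

def evaluate(cands):
    # Stochastic evaluation: add Gaussian noise to each ranking value,
    # order the constraints, and pick the candidate whose violations of
    # the highest-ranked constraints are least serious.
    noisy = {c: rankings[c] + random.gauss(0, NOISE) for c in rankings}
    order = sorted(noisy, key=noisy.get, reverse=True)  # highest first
    def profile(cand):
        return tuple(violations[cand].get(c, 0) for c in order)
    return min(cands, key=profile)

def learn(cands, correct):
    # One error-driven update: demote constraints that penalize the
    # correct form more than the erroneous guess, promote the reverse.
    guess = evaluate(cands)
    if guess == correct:
        return
    for c in rankings:
        diff = violations[correct].get(c, 0) - violations[guess].get(c, 0)
        if diff > 0:
            rankings[c] -= PLASTICITY
        elif diff < 0:
            rankings[c] += PLASTICITY

# One learning step on an invented datum whose correct category is /i/.
learn(["i", "a"], correct="i")
print(rankings)
```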
Human vs. simulated data [Figure legend: Human: solid red line; OT: solid black line; Naive Bayes: dashed line; Nearest Neighbor: dotted line]
Results • Naive Bayesian is significantly different from the human data (Wilcoxon Matched Pairs Signed Ranks test) • No significant difference between humans and either Nearest Neighbor or stochastic OT
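For concreteness, the matched-pairs comparison could be run with SciPy's Wilcoxon signed-rank test on the 11 paired distances; the numbers below are invented, not the study's data:

```python
from scipy.stats import wilcoxon

# Hypothetical paired data: distance from /ø/ to each of the other 11
# vowels, once for the human listeners and once for one model.
human_dist = [2.1, 1.8, 2.5, 0.9, 1.4, 1.1, 2.0, 1.6, 0.7, 1.9, 1.3]
model_dist = [2.0, 1.9, 2.2, 1.0, 1.5, 1.2, 1.8, 1.7, 0.8, 2.1, 1.2]

# Matched-pairs signed-ranks test on the 11 distance pairs.
stat, p = wilcoxon(human_dist, model_dist)
print(f"W = {stat}, p = {p:.3f}")
```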
Results • No significant difference between humans and any of the models
Results • Nearest Neighbor differs significantly from the humans • No significant difference between humans and either Naive Bayes or stochastic OT
Conclusion • The most abstract model, stochastic OT, gives the best results: it resembles the human data in all simulations • The distance measure helps to quantify the differences between vowels
Acknowledgements • Netherlands Organization for Scientific Research • Research assistants: Jeannette Elsenburg, Annemarieke Samason, Titia Benders, Marieke Gerrits • Email: escudero@uva.nl, kweiand@science.uva.nl