Phonotactic Learning with Distributional Representations

Abstract
This dissertation explores the possibility that the phonological grammar manipulates phone representations based on learned distributional class memberships rather than on substantive linguistic features. In doing so, this work makes three primary contributions. First, I propose three novel algorithms for learning a phonological class system from the distributional statistics of a language, all of which are based on partitioning graph representations of phone distributions. Second, I propose a new method for fitting Maximum Entropy phonotactic grammars, MaxEntGrams, which offers theoretical complexity improvements over the widely adopted approach taken by Hayes and Wilson [2008]. Third, I present a series of computational experiments that fit MaxEntGram models built on top of learned phonological class systems to English, Polish, and Korean, and that evaluate the extent to which the resulting grammars predict existing experimental results on sonority projection. The results of these experiments suggest that the models with learned class systems predict human-like sonority projection behavior as well as the standard approach using traditional linguistic feature specification in both English and Korean, and better than the traditional approach in Polish. This success is attributed, in part, to the fact that the combination of phonological class learning and MaxEntGrams eliminates the need for constraint-induction heuristics. Altogether, none of the tested cases provide evidence that phonotactic models built using traditional, substantive linguistic feature systems predict human behavior better than models that make use of distributionally defined phone representations.
Type
dissertation
Date
2022-09
License
http://creativecommons.org/licenses/by/4.0/