Author ORCID Identifier
https://orcid.org/0000-0002-7182-4002
Access Type
Open Access Dissertation
Document Type
dissertation
Degree Name
Doctor of Philosophy (PhD)
Degree Program
Linguistics
Year Degree Awarded
2022
Month Degree Awarded
September
First Advisor
Joe Pater
Second Advisor
Gaja Jarosz
Third Advisor
Cameron Musco
Subject Categories
Computational Linguistics | Phonetics and Phonology
Abstract
This dissertation explores the possibility that the phonological grammar manipulates phone representations based on learned distributional class memberships rather than on substantive linguistic features. In doing so, this work makes three primary contributions. First, I propose three novel algorithms for learning a phonological class system from the distributional statistics of a language, all of which are based on partitioning graph representations of phone distributions. Second, I propose a new method for fitting Maximum Entropy phonotactic grammars, MaxEntGrams, which offers theoretical complexity improvements over the widely adopted approach taken by Hayes and Wilson [2008]. Third, I present a series of computational experiments that fit MaxEntGram models built on top of learned phonological class systems to English, Polish, and Korean and evaluate the extent to which the resulting grammars predict existing experimental results on sonority projection. The results of these computational experiments suggest that the models with learned class systems predict human-like sonority projection behavior as well as the standard approach using traditional linguistic feature specification in both English and Korean, and better than the traditional approach in Polish. This success is attributed, in part, to the fact that the combination of phonological class learning and MaxEntGrams eliminates the need for constraint-induction heuristics. Altogether, none of the tested cases provide evidence that phonotactic models built using traditional, substantive linguistic feature systems predict human behavior better than models that make use of distributionally defined phone representations.
DOI
https://doi.org/10.7275/30683304
Recommended Citation
Nelson, Max A., "Phonotactic Learning with Distributional Representations" (2022). Doctoral Dissertations. 2703.
https://doi.org/10.7275/30683304
https://scholarworks.umass.edu/dissertations_2/2703
Creative Commons License
This work is licensed under a Creative Commons Attribution 4.0 License.