Author ORCID Identifier

https://orcid.org/0000-0001-9217-2130

Access Type

Open Access Dissertation

Document Type

Dissertation

Degree Name

Doctor of Philosophy (PhD)

Degree Program

Linguistics

Year Degree Awarded

2021

Month Degree Awarded

May

First Advisor

Gaja Jarosz

Second Advisor

Joe Pater

Subject Categories

Computational Linguistics | Phonetics and Phonology

Abstract

This dissertation tests whether sequence-to-sequence neural networks can simulate human phonological learning and generalization in a number of artificial language experiments. These experiments and simulations are organized into three chapters: one on opaque interactions, one on computational complexity in phonology, and one on reduplication. The first chapter focuses on two previously proposed biases involving process interactions: a bias for transparent patterns and a bias for patterns that maximally utilize all of the processes in a language. The second chapter examines harmony patterns of varying complexity to test whether Formal Language Theory and the sequence-to-sequence network both correctly predict which kinds of patterns humans learn most easily. Finally, the third chapter investigates reduplication, a pattern that copies all or part of a word. These simulations focus on the model's ability to generalize reduplication to novel words and compare the results to past experiments with human participants. The conclusions drawn from these three chapters suggest that the kind of language-specific representations and explicit biases used in past models are not necessary to capture human behavior in the relevant experiments. Instead, the network's ability to capture these behaviors is attributed to two characteristics of its architecture: its recurrent connections, which provide a limited memory through time, and its division into two separate mechanisms (an encoder and a decoder), which requires forms to be processed into an intermediate representation.
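
For readers unfamiliar with the architecture the abstract describes, the sketch below illustrates a generic sequence-to-sequence model in PyTorch. It is not the dissertation's actual model; the symbol inventory size, hidden size, and choice of GRU cells are illustrative assumptions. It shows the two properties the abstract highlights: recurrent connections that carry a limited memory through time, and an encoder-decoder split that forces input forms through an intermediate representation.

import torch
import torch.nn as nn

class Seq2Seq(nn.Module):
    """A minimal encoder-decoder sketch (illustrative, not the author's model)."""

    def __init__(self, n_symbols=30, hidden=64):
        super().__init__()
        self.embed = nn.Embedding(n_symbols, hidden)
        # Recurrent (GRU) connections give the model a limited memory through time.
        self.encoder = nn.GRU(hidden, hidden, batch_first=True)
        self.decoder = nn.GRU(hidden, hidden, batch_first=True)
        self.out = nn.Linear(hidden, n_symbols)

    def forward(self, src, tgt):
        # The encoder compresses the input form into an intermediate
        # representation (its final hidden state)...
        _, state = self.encoder(self.embed(src))
        # ...which the decoder then unpacks into the output form.
        dec_out, _ = self.decoder(self.embed(tgt), state)
        return self.out(dec_out)  # scores over the symbol inventory

model = Seq2Seq()
src = torch.randint(0, 30, (1, 5))  # e.g. an input (underlying) form
tgt = torch.randint(0, 30, (1, 5))  # e.g. a target (surface) form
logits = model(src, tgt)            # shape (1, 5, 30)

In a phonological learning simulation of this general kind, each symbol index would stand for a segment, and the network would be trained to map input strings to output strings (for example, a base form to its reduplicated form); generalization is then tested on novel strings held out from training.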

DOI

https://doi.org/10.7275/jchq-cm44
