Control models of natural language parsing

Thomas F Kalt, University of Massachusetts Amherst

Abstract

Most recent statistical parsers fall into one of two groups. The larger group consists of parsers that are based on some variation of a probabilistic context-free grammar, use joint probability models, and use tabular methods to find the most probable parse. Parsers in the second group are based on probabilistic push-down automata, use conditional probability models, and use some form of state-space search to find the most probable parse. This thesis is a study of natural language parsing as a control problem. This view leads to parsers of the second group. We show that search can be done very efficiently for such parsers. The control approach leads to a particular interpretation of the history-based parsing tradition, in which history is equated with state. The corresponding probability model is called a Markov parsing model, which can be used both for syntactic disambiguation and for search. The resulting parsers are simple, fast, have excellent coverage, and are reasonably accurate. Using treebanks (collections of text that experts have annotated with syntactic structure), we learn controllers for parsers that can be applied with little or no search. We call these greedy or nearly-greedy policies. Thus we are studying parsers that are constrained to operate efficiently.
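
As a rough illustration of the greedy policy idea described in the abstract, the following minimal sketch runs a shift-reduce parser in which the state is the stack plus the remaining input, a conditional model scores the legal actions in that state, and the most probable action is always taken with no search. It is not the dissertation's implementation. The names `REDUCE_MODEL`, `action_probs`, and `greedy_parse`, the three-rule probability table, and the example sentence are all hypothetical, and the probabilities are hand-written rather than learned from a treebank.

```python
from typing import Dict, List, Optional, Tuple

# Hypothetical toy model: P(reduce | labels of the top two stack items),
# together with the label of the resulting constituent.
# A real controller would be estimated from treebank data.
REDUCE_MODEL = {
    ("DT", "NN"): ("NP", 0.9),
    ("VB", "NP"): ("VP", 0.8),
    ("NP", "VP"): ("S", 0.9),
}


def action_probs(stack: List[tuple], buffer: List[tuple]) -> Dict[str, float]:
    """Conditional distribution over the legal actions in the current state."""
    probs: Dict[str, float] = {}
    can_shift = bool(buffer)
    key = (stack[-2][0], stack[-1][0]) if len(stack) >= 2 else None
    if key in REDUCE_MODEL:
        p_reduce = REDUCE_MODEL[key][1]
        # If nothing is left to shift, reduce is the only legal action.
        probs["reduce"] = p_reduce if can_shift else 1.0
        if can_shift:
            probs["shift"] = 1.0 - p_reduce
    elif can_shift:
        probs["shift"] = 1.0
    return probs


def greedy_parse(tagged: List[Tuple[str, str]]) -> Optional[tuple]:
    """Parse (word, tag) pairs by always taking the most probable action."""
    stack: List[tuple] = []
    buffer = [(tag, word) for word, tag in tagged]  # leaves are (tag, word)
    while buffer or len(stack) > 1:
        probs = action_probs(stack, buffer)
        if not probs:
            return None  # dead end: a purely greedy policy cannot backtrack
        action = max(probs, key=probs.get)
        if action == "shift":
            stack.append(buffer.pop(0))
        else:
            right = stack.pop()
            left = stack.pop()
            label = REDUCE_MODEL[(left[0], right[0])][0]
            stack.append((label, left, right))
    return stack[0] if stack else None


sentence = [("the", "DT"), ("dog", "NN"), ("chased", "VB"), ("a", "DT"), ("cat", "NN")]
tree = greedy_parse(sentence)
print(tree)
```

Because each decision depends only on the current state, the parser takes one pass over the input with a bounded amount of work per step. A nearly-greedy variant could keep a small number of alternative states instead of one, at the cost of some search.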

Subject Area

Computer science

Recommended Citation

Kalt, Thomas F, "Control models of natural language parsing" (2005). Doctoral Dissertations Available from Proquest. AAI3193914.
https://scholarworks.umass.edu/dissertations/AAI3193914
