Off-campus UMass Amherst users: To download campus access dissertations, please use the following link to log into our proxy server with your UMass Amherst user name and password.

Non-UMass Amherst users: Please talk to your librarian about requesting this dissertation through interlibrary loan.

Dissertations that have an embargo placed on them will not be available to anyone until the embargo expires.

Author ORCID Identifier


Open Access Dissertation

Document Type


Degree Name

Doctor of Philosophy (PhD)

Degree Program

Computer Science

Year Degree Awarded


Month Degree Awarded


First Advisor

Andrew McCallum

Subject Categories

Artificial Intelligence and Robotics


Making complex decisions in areas like science, government policy, finance, and clinical treatments all require integrating and reasoning over disparate data sources. While some decisions can be made from a single source of information, others require considering multiple pieces of evidence and how they relate to one another. Knowledge graphs (KGs) provide a natural approach for addressing this type of problem: they can serve as long-term stores of abstracted knowledge organized around concepts and their relationships, and can be populated from heterogeneous sources including databases and text. KGs can facilitate higher level reasoning, influence the interpretation of new data, and serve as a scaffolding for knowledge that enhances the acquisition of new information. A symbolic graph over a fixed, human-defined schema encoding facts about entities and their relations is the predominant method for representing knowledge, but this approach is brittle, lacks specificity, and is inevitably highly incomplete. On the other extreme, recent work on purely text-based knowledge models lack abstractions necessary for complex reasoning. In this thesis I will present work incorporating neural models, rich structured ontologies, and unstructured raw text for representing knowledge. I will first discuss my work enhancing universal schema, a method for learning a latent schema over both existing structured resources and unstructured free text, embedding them jointly within a shared semantic space. Next, I inject additional hierarchical structure into the embedding space of concepts, resulting in more efficient statistical sharing among related concepts and improved accuracy in both fine-grained entity typing and linking. I then present initial work representing knowledge in context, including a single model for extracting all entities and long-range relations simultaneously over full paragraphs while jointly linking these entities to a KG. I will conclude by discussing possible future directions for representing knowledge in context.