Author ORCID Identifier


Open Access Dissertation

Document Type


Degree Name

Doctor of Philosophy (PhD)

Degree Program

Computer Science

Year Degree Awarded


Month Degree Awarded


First Advisor

Andrew McCallum

Subject Categories

Artificial Intelligence and Robotics | Data Science


Commonsense knowledge is critical to achieving artificial general intelligence. This shared background knowledge is implicit in all human communication, facilitating efficient information exchange and understanding. But commonsense research is hampered by the immense quantity of such knowledge, which makes an explicit categorization impossible. Furthermore, a plumber could repair a sink in a kitchen or a bathroom, indicating that common sense reveals a probable assumption rather than a definitive answer. To align with these fundamental properties of commonsense, we want to not only model but also evaluate such knowledge in a human-like way, using abstractions and probabilistic principles. Traditional combinatorial probabilistic models, e.g., probabilistic graphical models, are limited in their ability to model large-scale probability distributions containing thousands or even millions of commonsensical events. On the other hand, although embedding-based representation learning has the advantage of generalizing to large combinations of events, it struggles to produce consistent probabilities under different styles of queries. Combining the benefits of both sides, we introduce probabilistic box embeddings, which represent joint probability distributions on a learned latent space of geometric embeddings. Box embeddings make it possible to handle queries with intersections, unions, and negations in a way similar to Venn diagram reasoning, a kind of reasoning that remains difficult even for large language models. Meanwhile, existing evaluations do not reflect the probabilistic nature of commonsense knowledge. The popular multiple-choice evaluation style often misleads us into believing that commonsense is solved. To fill this gap, we propose a method of retrieving commonsense-related question-answer distributions from human annotators, as well as a novel method of generative evaluation. We apply these approaches in two new commonsense datasets.
Finally, we draw a connection between the state-of-the-art NLP models, large language models, and their ability to perform commonsense reasoning tasks. According to previous work, large language models make inconsistent predictions when given different input texts for plausible commonsense situations. We intend to evaluate their performance using more rigorous probabilistic measurements.
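
The Venn-diagram style of reasoning mentioned in the abstract can be illustrated with a minimal sketch. It is not the dissertation's implementation: it assumes hard, axis-aligned boxes inside the unit hypercube, where the probability of an event is simply the volume of its box, whereas trained box embeddings typically use learned coordinates and smoothed volumes. The `Box` class, the helper functions, and the `kitchen` and `sink` coordinates are all hypothetical and chosen by hand for illustration.

```python
from dataclasses import dataclass
from math import prod
from typing import Tuple


@dataclass(frozen=True)
class Box:
    """Hypothetical axis-aligned box inside the unit hypercube [0, 1]^d."""
    lower: Tuple[float, ...]
    upper: Tuple[float, ...]

    def volume(self) -> float:
        # A box that is empty along any axis (upper <= lower) has volume 0.
        return prod(max(0.0, u - l) for l, u in zip(self.lower, self.upper))

    def intersect(self, other: "Box") -> "Box":
        # The intersection of two boxes is again a box (possibly empty).
        return Box(
            tuple(max(a, b) for a, b in zip(self.lower, other.lower)),
            tuple(min(a, b) for a, b in zip(self.upper, other.upper)),
        )


def p(a: Box) -> float:
    """Marginal probability: volume of the box (assumption of this sketch)."""
    return a.volume()


def p_and(a: Box, b: Box) -> float:
    """Joint probability: volume of the intersection."""
    return a.intersect(b).volume()


def p_or(a: Box, b: Box) -> float:
    """Union via inclusion-exclusion, as in a Venn diagram."""
    return p(a) + p(b) - p_and(a, b)


def p_not(a: Box) -> float:
    """Negation: the complement within the unit hypercube."""
    return 1.0 - p(a)


def p_given(a: Box, b: Box) -> float:
    """Conditional probability P(a | b); undefined when b has zero volume."""
    if p(b) == 0.0:
        raise ValueError("conditioning event has zero probability")
    return p_and(a, b) / p(b)


# Hand-picked, illustrative coordinates (not learned from data).
kitchen = Box((0.0, 0.0), (0.5, 0.5))
sink = Box((0.25, 0.25), (0.75, 0.75))

print(p(kitchen), p_and(kitchen, sink), p_or(kitchen, sink))
print(p_not(sink), p_given(kitchen, sink))
```

Because every query is answered from the same pair of boxes, the marginal, joint, union, negation, and conditional values are consistent with one another by construction, which is the property the abstract contrasts with point-embedding approaches.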


Creative Commons License

This work is licensed under a Creative Commons Attribution 4.0 License.