Open Access Dissertation
Doctor of Philosophy (PhD)
Year Degree Awarded
Month Degree Awarded
Artificial Intelligence and Robotics
Many real-world applications deal directly with aggregate data. In this work, we study Learning with Aggregate Data from several perspectives and try to address the associated combinatorial challenges.
First, we study the problem of learning in Collective Graphical Models (CGMs), where only noisy aggregate observations are available. Inference in CGMs is NP-hard, so we propose an approximate inference algorithm. Solving the inference problem allows us to build large-scale bird migration models and models of human mobility under the differential privacy setting.
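To make the notion of "noisy aggregate observations" concrete, the following is a minimal sketch of how such data might arise. It assumes individuals move independently along a small Markov chain and that only per-step state counts are released, optionally perturbed with Gaussian noise. The function name `simulate_counts`, the transition matrix, and the noise model are all hypothetical illustrations, not taken from the dissertation.

```python
import random

def simulate_counts(transition, n_individuals=100, n_steps=5, noise_std=0.0, seed=0):
    """Simulate individuals on a Markov chain and return only per-step state counts.

    The individual trajectories are discarded; an aggregate-data learner
    would see nothing but the (possibly noisy) counts returned here.
    """
    rng = random.Random(seed)
    n_states = len(transition)
    states = [0] * n_individuals  # assumption: everyone starts in state 0
    history = []
    for _ in range(n_steps):
        counts = [0] * n_states
        for s in states:
            counts[s] += 1
        if noise_std > 0:
            # Hypothetical noise model: independent Gaussian noise per count.
            counts = [c + rng.gauss(0.0, noise_std) for c in counts]
        history.append(counts)
        # Each individual moves according to the row of its current state.
        states = [rng.choices(range(n_states), weights=transition[s])[0] for s in states]
    return history

# Illustrative 3-state chain (values are made up for this sketch).
T = [[0.8, 0.2, 0.0],
     [0.1, 0.8, 0.1],
     [0.0, 0.2, 0.8]]
clean = simulate_counts(T, n_individuals=50, n_steps=4)
noisy = simulate_counts(T, n_individuals=50, n_steps=4, noise_std=2.0)
```

Without noise, every row of counts sums to the population size. Recovering the transition structure from such counts alone is the kind of inference problem the paragraph above describes as hard.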
Second, we consider problems in which we are given bags of instances and bag-level aggregate supervision. Specifically, we study the US presidential election and try to build a model of the voting preferences of individuals or demographic groups. The data consists of characteristic individuals from the US Census as well as voting tallies for each voting precinct. We propose a fully probabilistic Learning with Label Proportions (LLPs) model with exact inference to build an instance-level model.
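The bag-level supervision setting can be illustrated with a small sketch. It is not the fully probabilistic model with exact inference mentioned above; it swaps in a much simpler proportion-matching baseline, assuming a logistic instance-level model trained by gradient descent so that the mean predicted probability in each bag matches the observed proportion. All names (`fit_llp`, `bag_loss`, `make_bags`) and the synthetic data are hypothetical.

```python
import math
import random

def predict_proba(w, x):
    """Instance-level probability under a logistic model."""
    z = sum(wi * xi for wi, xi in zip(w, x))
    return 1.0 / (1.0 + math.exp(-z))

def bag_loss(w, bags, proportions):
    """Mean over bags of 0.5 * (predicted proportion - observed proportion)^2."""
    total = 0.0
    for bag, p in zip(bags, proportions):
        q_bar = sum(predict_proba(w, x) for x in bag) / len(bag)
        total += 0.5 * (q_bar - p) ** 2
    return total / len(bags)

def fit_llp(bags, proportions, lr=0.5, steps=500):
    """Gradient descent on bag_loss; only bag-level proportions are used."""
    dim = len(bags[0][0])
    w = [0.0] * dim
    for _ in range(steps):
        grad = [0.0] * dim
        for bag, p in zip(bags, proportions):
            probs = [predict_proba(w, x) for x in bag]
            err = sum(probs) / len(bag) - p  # bag-level residual
            for x, q in zip(bag, probs):
                scale = err * q * (1.0 - q) / len(bag)
                for j in range(dim):
                    grad[j] += scale * x[j]
        w = [wj - lr * gj / len(bags) for wj, gj in zip(w, grad)]
    return w

def make_bags(n_bags=8, bag_size=40, seed=0):
    """Synthetic bags (think precincts) with noise-free proportions from a made-up model."""
    rng = random.Random(seed)
    w_true = [2.0, -1.0, 0.5]  # hypothetical ground-truth weights
    bags, proportions = [], []
    for _ in range(n_bags):
        shift = rng.uniform(-1.0, 1.0)  # each bag has its own feature distribution
        bag = [[rng.uniform(-1, 1) + shift, rng.uniform(-1, 1) - shift, 1.0]
               for _ in range(bag_size)]
        bags.append(bag)
        proportions.append(sum(predict_proba(w_true, x) for x in bag) / bag_size)
    return bags, proportions

bags, proportions = make_bags()
w = fit_llp(bags, proportions)
```

The fitted `w` gives instance-level predictions via `predict_proba(w, x)` even though no instance-level labels were ever observed, which is the defining feature of the LLP setting.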
Third, we study distribution regression. Its problem setting is similar to that of LLPs, but it builds bag-level models. We experimentally evaluated different algorithms on three tasks and identified key factors in the problem setting that affect the choice of algorithm.
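The contrast with the instance-level setting can be sketched as follows. The abstract does not name the algorithms that were evaluated, so this is only one generic baseline, assumed for illustration: each bag is summarized by the mean of its instances (a simple mean embedding), and a new bag's label is predicted by Gaussian-kernel smoothing over the training bags. The names `embed_bag` and `predict_bag`, the bandwidth, and the toy data are hypothetical.

```python
import math

def embed_bag(bag):
    """Mean embedding: represent a bag by the average of its instance vectors."""
    n = len(bag)
    return [sum(x[j] for x in bag) / n for j in range(len(bag[0]))]

def gaussian_kernel(u, v, bandwidth):
    """Gaussian similarity between two bag embeddings."""
    sq = sum((a - b) ** 2 for a, b in zip(u, v))
    return math.exp(-sq / (2.0 * bandwidth ** 2))

def predict_bag(train_bags, train_labels, query_bag, bandwidth=0.5):
    """Predict a bag-level label as a kernel-weighted average of training labels."""
    q = embed_bag(query_bag)
    weights = [gaussian_kernel(embed_bag(b), q, bandwidth) for b in train_bags]
    total = sum(weights)  # can underflow to 0 if the query is far from every training bag
    return sum(w * y for w, y in zip(weights, train_labels)) / total

# Toy data: four bags of 1-D instances centred at 0, 1, 2, 3, with label 2 * centre.
train_bags = [[[k - 0.5], [k], [k + 0.5]] for k in (0.0, 1.0, 2.0, 3.0)]
train_labels = [0.0, 2.0, 4.0, 6.0]

# A query bag centred at 1.5 sits midway between the second and third bags.
estimate = predict_bag(train_bags, train_labels, [[1.0], [1.5], [2.0]])
```

Here the model maps a whole bag to a single prediction and never assigns labels to individual instances, which is the sense in which distribution regression is bag-level.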
SUN, TAO, "Learning with Aggregate Data" (2019). Doctoral Dissertations. 1483.