
Author ORCID Identifier

https://orcid.org/0000-0001-7193-9670

Document Type

Open Access Dissertation

Degree Name

Doctor of Philosophy (PhD)

Degree Program

Computer Science

Year Degree Awarded

2020

Month Degree Awarded

May

First Advisor

Benjamin M. Marlin

Subject Categories

Artificial Intelligence and Robotics

Abstract

Irregularly-sampled time series are characterized by non-uniform time intervals between successive measurements. Such time series occur naturally in application areas including climate science, ecology, biology, and medicine. Irregular sampling poses a significant challenge for modeling because there can be substantial uncertainty about the values of the underlying temporal processes. Moreover, different time series are not necessarily synchronized or of the same length, which makes them difficult to handle with standard machine learning methods that assume fixed-dimensional data spaces.

The goal of this thesis is to develop scalable probabilistic tools for modeling a large collection of irregularly-sampled time series defined over a common time interval. We first introduce an uncertainty-aware kernel framework based on a Gaussian process (GP) representation of the time series and then demonstrate how to significantly scale up the model by linearizing the kernel with various acceleration techniques.
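The GP representation above maps each irregularly-sampled series onto a common reference grid via its posterior mean and covariance, so that series with different time stamps become comparable. A minimal NumPy sketch of that step (the kernel, length-scale, and grid size here are illustrative assumptions, not the thesis's exact choices):

```python
import numpy as np

def rbf(x1, x2, length_scale=0.2, variance=1.0):
    # Squared-exponential covariance between two sets of time points.
    d = x1[:, None] - x2[None, :]
    return variance * np.exp(-0.5 * (d / length_scale) ** 2)

def gp_posterior(t_obs, y_obs, t_ref, noise=1e-2):
    # Posterior mean and covariance of a zero-mean GP, conditioned on
    # irregularly-sampled observations, evaluated on a fixed reference grid.
    K = rbf(t_obs, t_obs) + noise * np.eye(len(t_obs))
    Ks = rbf(t_ref, t_obs)
    Kss = rbf(t_ref, t_ref)
    mu = Ks @ np.linalg.solve(K, y_obs)
    cov = Kss - Ks @ np.linalg.solve(K, Ks.T)
    return mu, cov

# Two series with different, non-aligned time stamps mapped to one grid.
t_ref = np.linspace(0.0, 1.0, 20)
t1 = np.array([0.05, 0.30, 0.70, 0.95])
y1 = np.array([0.10, 0.90, 0.20, -0.40])
mu1, cov1 = gp_posterior(t1, y1, t_ref)
```

The posterior covariance `cov1` is what makes the downstream kernel "uncertainty-aware": sparsely observed regions of a series contribute with appropriately inflated variance rather than a single imputed value.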

To further reduce the computational overhead of the GP representation and improve the expressiveness of the model, we propose a generalized uncertainty-aware framework that integrates a posterior GP sampler with arbitrary black-box models including neural networks. We propose a linear time and linear space sampling algorithm and show how to efficiently train the entire framework end-to-end.
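The generalized framework decouples the GP from the predictor: samples from the posterior GP are pushed through an arbitrary black-box model and averaged. The thesis develops a linear time and linear space sampler; the sketch below instead uses a naive Cholesky-based sampler (cubic in the grid size) purely to illustrate the interface, with made-up posterior moments and a stand-in model:

```python
import numpy as np

# Hypothetical reference grid and GP posterior moments (illustrative values).
t_ref = np.linspace(0.0, 1.0, 10)
mu = np.sin(2 * np.pi * t_ref)
d = t_ref[:, None] - t_ref[None, :]
cov = 0.1 * np.exp(-0.5 * (d / 0.2) ** 2)

def sample_posterior(mu, cov, n_samples, rng):
    # Draw joint samples z ~ N(mu, cov) via a jittered Cholesky factor;
    # one O(n^3) factorization, then O(n^2) per sample.
    L = np.linalg.cholesky(cov + 1e-8 * np.eye(len(mu)))
    eps = rng.standard_normal((n_samples, len(mu)))
    return mu + eps @ L.T

def mc_predict(model, mu, cov, n_samples=64, seed=0):
    # Monte Carlo prediction: push posterior samples through a black-box
    # model and average, propagating the GP's uncertainty downstream.
    rng = np.random.default_rng(seed)
    zs = sample_posterior(mu, cov, n_samples, rng)
    return np.mean([model(z) for z in zs], axis=0)

# Stand-in "black-box model"; in the thesis this is e.g. a neural network
# trained end-to-end through the (reparameterized) sampler.
model = lambda z: np.tanh(z).mean()
pred = mc_predict(model, mu, cov)
```

Because the samples are a deterministic function of `mu`, `cov`, and standard normal noise, gradients can flow through the sampler into both the model and the GP parameters, which is what enables end-to-end training.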

To better model uncertainty by pooling information across an entire dataset, we reframe our task as a missing data problem that aims to learn the distribution of the latent temporal process. We first study the missing data problem in a simplified setting where the data are defined on a finite-dimensional space, and introduce a model based on generative adversarial networks for learning from incomplete data. To relax the finite-dimensional constraint, we propose a unified encoder-decoder framework that can be trained as a density model or as an implicit generative model. We finally introduce a specific architecture within this framework to efficiently represent and learn from irregularly-sampled continuous time series.
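In the finite-dimensional missing-data setting, a common ingredient (assumed here for illustration, not the thesis's exact architecture) is to feed the encoder a zero-filled value vector together with its observation mask, and to score reconstructions only on observed entries:

```python
import numpy as np

def encode_incomplete(x, mask):
    # Represent an incomplete vector as zero-filled values concatenated with
    # the observation mask, so the encoder can distinguish "missing" from "0".
    return np.concatenate([np.where(mask > 0, x, 0.0), mask.astype(x.dtype)])

def masked_mse(x_hat, x, mask):
    # Reconstruction loss over observed entries only; unobserved entries
    # contribute no error (and hence no gradient signal during training).
    return float(np.sum(mask * (x_hat - x) ** 2) / max(mask.sum(), 1))

x = np.array([1.0, -2.0, 0.5, 3.0])
mask = np.array([1.0, 0.0, 1.0, 0.0])    # entries 1 and 3 are unobserved
z_in = encode_incomplete(x, mask)        # length 8: values then mask
loss = masked_mse(np.zeros(4), x, mask)  # penalizes only observed entries
```

The same values-plus-mask idea extends to the continuous-time case, where the "mask" is the set of observation times itself rather than a fixed-length binary vector.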

Creative Commons License

Creative Commons Attribution 4.0 License
This work is licensed under a Creative Commons Attribution 4.0 License.
