Dissertations that have an embargo placed on them will not be available to anyone until the embargo expires.

Author ORCID Identifier


Open Access Dissertation

Document Type


Degree Name

Doctor of Philosophy (PhD)

Degree Program

Electrical and Computer Engineering

Year Degree Awarded


Month Degree Awarded


First Advisor

Lixin Gao

Subject Categories

Databases and Information Systems | Data Science | Theory and Algorithms


Data analytics examines raw data to extract insights, trends, and patterns. With the growth of big data and cloud storage, data volumes have increased dramatically in recent years, which poses new challenges for big data analytics algorithms and techniques. Data also comes in many formats, such as relational databases, text, audio, and image/video data. Building a single framework or algorithm that handles all of these formats is difficult, so each format still needs its own refined and scalable algorithms. In this dissertation, we explore scalable processing for three types of data: relational databases, graph databases, and video data.

First, as data volumes grow, businesses, governments, and other institutions need to generate insight from their data. The relationships in the data matter more than the individual data points alone. To leverage these relationships and to scale more easily, organizations use graph databases, which store relationship information as a first-class entity. We analyze large network management databases from Cisco and propose a comprehensive algorithm that transfers structured data held in relational databases into graph data organized in a knowledge graph database. We model this problem as field and record matching in the relational database, and we develop a matching technique for large network management databases that uses instance-level matching for effective data matching.

Second, we study a fundamental component of graph analytics: graph queries, which must answer user queries efficiently and effectively. State-of-the-art graph query approaches mainly focus on inferring node labels and neighborhood structures through path propagation. However, entity links in the real world also contain rich hierarchical inheritance relations. For example, a vulnerability in one version of a product is likely to be inherited from an older version. Exploiting hierarchical inheritance can potentially improve the quality of query results. We incorporate this dimension, hierarchical inheritance relations, to improve state-of-the-art graph query approaches.

Third, we investigate video analytics on video streams. Video analytics detects and tracks objects in video streams and has many applications, including traffic control, security monitoring, and event analysis. It involves selecting the best configuration of frame rate and resolution to achieve a target accuracy in real time. State-of-the-art switching approaches adjust configurations by profiling video clips over a large configuration space, which consumes substantial compute resources. We propose an approach that adapts the configuration by analyzing past video analytics results instead of profiling candidate configurations. Our approach adopts a lower resolution or frame rate when objects move slowly and a higher one when they move fast. We train a model that automatically selects the best configuration. We evaluate our approach on two real-world video analytics applications, traffic tracking and pose estimation, and obtain superior performance compared to state-of-the-art switching methods.

Finally, as video analytics is increasingly offered as a cloud service that operates on private data, privacy concerns have drawn researchers' attention. Fully homomorphic encryption (FHE) is one of the promising ways to preserve privacy and is an active topic in both academia and industry. Recent work on privacy-preserving deep learning has demonstrated the feasibility of image classification with up to 20 deep neural network layers. In our final work, we explore the feasibility of applying FHE to encrypted frames for the pose estimation application of video analytics, using more deep neural network layers. We develop a privacy-preserving pose estimation system based on the FHE SEAL library on a CPU server. It demonstrates the potential and feasibility of privacy-preserving video analytics.
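The configuration-adaptation idea from the video analytics part (a lower resolution or frame rate when objects move slowly, a higher one when they move fast) can be illustrated with a minimal sketch. This is not the dissertation's method: the abstract states that a trained model selects the configuration, whereas the sketch below substitutes fixed thresholds to keep the example short. The names `Config`, `LADDER`, `THRESHOLDS`, `mean_speed`, and `choose_config`, as well as all numeric values, are hypothetical.

```python
from dataclasses import dataclass
from math import hypot


@dataclass(frozen=True)
class Config:
    """A candidate (resolution, frame rate) pair; values are illustrative."""
    height: int  # frame height in pixels
    fps: int     # frames analyzed per second


# Hypothetical configuration ladder, ordered from cheapest to most expensive.
LADDER = [Config(360, 5), Config(540, 10), Config(720, 15), Config(1080, 30)]

# Hypothetical speed thresholds, in frame heights per second.
# THRESHOLDS[i] is the largest speed for which LADDER[i] is considered enough.
THRESHOLDS = [0.05, 0.15, 0.40]


def mean_speed(tracks, fps, height):
    """Average object speed from past results, normalized by frame height.

    tracks maps an object id to its list of (x, y) box centers in pixels,
    one entry per analyzed frame.
    """
    speeds = []
    for centers in tracks.values():
        for (x0, y0), (x1, y1) in zip(centers, centers[1:]):
            # Pixels moved per frame -> frame heights moved per second.
            speeds.append(hypot(x1 - x0, y1 - y0) * fps / height)
    return sum(speeds) / len(speeds) if speeds else 0.0


def choose_config(tracks, current):
    """Return the cheapest ladder entry whose threshold covers the speed."""
    speed = mean_speed(tracks, current.fps, current.height)
    for threshold, config in zip(THRESHOLDS, LADDER):
        if speed <= threshold:
            return config
    # Faster than every threshold: fall back to the most expensive entry.
    return LADDER[-1]
```

The point of the sketch is that the decision uses only past analytics output (object positions from frames already processed), so no candidate configuration has to be profiled on the video itself. In a learned variant, the threshold lookup would be replaced by a model that maps such motion features to a configuration.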


Creative Commons License

This work is licensed under a Creative Commons Attribution 4.0 License.

Available for download on Sunday, November 13, 2022