A mixed-initiative planning approach to exploratory data analysis

Robert A St. Amant, University of Massachusetts Amherst

Abstract

Exploratory data analysis (EDA) has come to play an increasingly important role in statistical analysis. Modern computer-based statistics packages contain a rich set of operations, suitable for almost any EDA application. One can fit lines and higher order functions to relationships, identify and describe clusters, transform and reduce data to meet the specific requirements of a domain, among many other possibilities, in seeking to understand patterns in data. Unfortunately, EDA can be difficult. Conventional statistics packages offer the user hundreds of operations, which must often be combined in lengthy sequences to produce useful results. In addition, the application of these operations often depends on the user's knowledge of what the data mean. In other words, EDA is too large a problem for a human analyst to solve alone, but complete automation of the process is not feasible either, because domain-specific knowledge is required. This dissertation describes an assistant for intelligent data exploration called AIDE. AIDE is mixed-initiative, autonomously pursuing its own goals, but always allowing the user to review and possibly override its decisions. AIDE's design as a knowledge-based planning system allows it to detect and evaluate suggestive features in the data, identify appropriate strategies for extracting the patterns, apply the strategies incrementally, and combine the results into a coherent whole. An experimental evaluation compared the performance of human subjects analyzing data with and without AIDE's assistance. Although the subjects worked with AIDE for only a couple of hours each, it clearly influenced the efficiency and coherence of their explorations. Analysis of the experimental results turned up suggestive evidence that AIDE facilitates data analysis primarily by helping users navigate through the space of relations among variables.

This research provides a novel look at automated support for data analysis. Conventional systems tend to take over the task completely, or rely on the user for every step of the analysis. AIDE's mixed-initiative planning approach provides an alternative in which control changes hands flexibly between the user and the system. This arrangement capitalizes on the strengths of both: the system takes over low-level search and statistical computations, while the user remains responsible for strategic, knowledgeable guidance of the process.

Subject Area

Computer science; Statistics

Recommended Citation

St. Amant, Robert A, "A mixed-initiative planning approach to exploratory data analysis" (1996). Doctoral Dissertations Available from Proquest. AAI9709657.
https://scholarworks.umass.edu/dissertations/AAI9709657
