Accepting the inevitable: The role of failure recovery in the design of planners

Adele E Howe, University of Massachusetts Amherst

Abstract

Failures are inevitable for many types of computer systems. Designers who seek to limit the frequency and impact of failures can adopt two basic approaches: automated failure recovery and failure debugging. This dissertation describes how automated failure recovery can both repair failures and expedite debugging of an autonomous planner, the Phoenix planner, in a dynamic environment.

The automated failure recovery component applies general methods to recover from failures detected by the Phoenix planner. Its design was developed by first constructing a model of the expected cost of failure recovery and then evaluating whether design changes lower that expected cost. The model was also used to derive a control strategy for choosing among a set of recovery methods. The expected cost model, its assumptions, and the failure recovery component were tested in three experiments. The first determined performance baselines for failure recovery, the second evaluated the performance of the control strategy, and the third compared an initial set of recovery methods with a new set augmented to lower the expected cost of recovery. As predicted, the control strategy and the augmented set of recovery methods improved performance by reducing the overall cost of recovery and increasing the overall recovery rate.

Failure recovery analysis (FRA) is a procedure for analyzing execution traces of failure recovery to discover how the planner's actions might be causing failures. The procedure involves statistically analyzing execution traces for dependencies between actions and failures, mapping those dependencies to plan structures, and explaining how the structures might produce the observed dependencies.
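As a rough illustration of the statistical step just described, the sketch below tests whether a given recovery action and a given subsequent failure co-occur more often than chance would suggest. The abstract does not specify the test used, so the Pearson chi-square statistic on a 2x2 contingency table is an assumption here, and the trace format, the action names (`replan`, `retry`), and the failure names (`timeout`, `blocked`) are hypothetical.

```python
def chi_square_2x2(pairs, action, failure):
    """Pearson chi-square statistic for the 2x2 table
    (action taken or not) x (failure followed or not).

    `pairs` is a list of (recovery_action, next_failure) tuples taken
    from an execution trace. This format is a hypothetical simplification.
    """
    n = len(pairs)
    a = sum(1 for act, fail in pairs if act == action and fail == failure)
    b = sum(1 for act, fail in pairs if act == action and fail != failure)
    c = sum(1 for act, fail in pairs if act != action and fail == failure)
    d = n - a - b - c
    row1, row2 = a + b, c + d
    col1, col2 = a + c, b + d
    # A zero margin means the table carries no evidence either way.
    if 0 in (row1, row2, col1, col2):
        return 0.0
    return n * (a * d - b * c) ** 2 / (row1 * row2 * col1 * col2)


# Hypothetical trace: "replan" is usually followed by "timeout".
trace = (
    [("replan", "timeout")] * 8
    + [("replan", "blocked")] * 2
    + [("retry", "timeout")] * 2
    + [("retry", "blocked")] * 8
)

score = chi_square_2x2(trace, "replan", "timeout")
# With 1 degree of freedom, a statistic above 3.84 rejects
# independence at the 0.05 level.
print(f"chi-square = {score:.2f}, dependent = {score > 3.84}")
```

A dependency flagged this way only points to where to look. Mapping it to plan structures and explaining it remain separate steps of the procedure.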
FRA is a partially automated procedure that designers can apply to identify cases in which the plan library may be causing its own failures (i.e., bugs) and to implement and evaluate modifications to the plan library intended to eliminate those bugs. FRA is demonstrated by analyzing some of the data from the three experiments testing failure recovery.

This research shows how failure recovery can help a planner repair failures due to both unexpected events and plan bugs, and how it can help planner designers determine how plan failures depend on the planner's actions and evaluate design changes over time. The primary contributions of the thesis are an empirical methodology for designing and evaluating failure recovery in planning and a tool for debugging a plan library.
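To make the idea of an expected-cost control strategy concrete, here is a minimal sketch of one simple model. It is not the model from the dissertation. It assumes that recovery methods are tried one after another until one succeeds, that each method has a fixed cost and an independent probability of success, and that the method names and numbers are hypothetical. Under those assumptions, trying methods in increasing order of cost divided by success probability minimizes the expected cost.

```python
def expected_cost(order):
    """Expected total cost of trying methods in sequence until one succeeds.

    `order` is a list of (cost, p_success) pairs. Independence between
    methods is an assumption of this sketch.
    """
    total = 0.0
    p_reach = 1.0  # probability that all earlier methods failed
    for cost, p_success in order:
        total += p_reach * cost
        p_reach *= 1.0 - p_success
    return total


def best_order(methods):
    """Sort method names by cost / p_success, lowest ratio first.

    `methods` maps a name to (cost, p_success) with p_success > 0.
    """
    return sorted(methods, key=lambda name: methods[name][0] / methods[name][1])


# Hypothetical recovery methods: name -> (cost, probability of success).
methods = {
    "replan": (10.0, 0.9),
    "retry": (1.0, 0.2),
    "local_patch": (4.0, 0.5),
}

order = best_order(methods)
cost = expected_cost([methods[name] for name in order])
print(order, round(cost, 2))
```

In this simple model, adding a cheap method with a moderate success rate lowers the expected cost even when an expensive, reliable method is already available, because the expensive method is then reached less often.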

Subject Area

Computer science|Artificial intelligence

Recommended Citation

Howe, Adele E, "Accepting the inevitable: The role of failure recovery in the design of planners" (1993). Doctoral Dissertations Available from Proquest. AAI9316663.
https://scholarworks.umass.edu/dissertations/AAI9316663
