Static redundancy allocation is inappropriate in hard realtime systems that operate in variable and dynamic environments, (e.g., radar tracking, avionics). Adaptive Fault Tolerance (AFT) can assure adequate reliability of critical modules, under temporal and resources constraints, by allocating just as much redundancy to less critical modules as can be afforded, thus gracefully reducing their resource requirement. In this paper, we propose a mechanism for supporting adaptive fault tolerance in a real-time system. Adaptation is achieved by choosing a suitable redundancy strategy for a dynamically arriving computation to assure required reliability and to maximize the potential for fault tolerance while ensuring that deadlines are met. The proposed approach is evaluated using a real-life workload simulating radar tracking software in AWACS early warning aircraft. The results demonstrate that our technique outperforms static fault tolerance strategies in terms of tasks meeting their timing constraints. Further, we show that the gain in this timing-centric performance metric does not reduce the fault tolerance of the executing tasks below a predefined minimum level. Overall, the evaluation indicates that the proposed ideas result in a system that dynamically provides QOS guarantees along the fault-tolerance dimension.
González, Oscar; Shrikumar, H.; Stankovic, John A.; and Ramamritham, Krithi, "Adaptive Fault Tolerance and Graceful Degradation Under Dynamic Hard Real-time Scheduling" (1997). Computer Science Department Faculty Publication Series. 188.
Retrieved from https://scholarworks.umass.edu/cs_faculty_pubs/188