Off-campus UMass Amherst users: To download dissertations, please use the following link to log into our proxy server with your UMass Amherst user name and password.

Non-UMass Amherst users, please click the view more button below to purchase a copy of this dissertation from Proquest.

(Some titles may also be available free of charge in our Open Access Dissertation Collection, so please check there first.)

Networking issues in distributed real -time systems

Vijaya Ramaraju Lakamraju, University of Massachusetts Amherst

Abstract

Networking involves every aspect in the design of the network infrastructure from the selection/synthesis of the interconnection topology to what communication protocols it should use and how it should be deployed and maintained. A large body of literature is available on these issues. We attempt to further increase this body of literature by looking at two specific issues: the synthesis of networks that satisfy multiple properties and the design of fault tolerant communication services for high-speed networks. Synthesizing networks that satisfy multiple requirements, such as high reliability, low diameter, good embeddability etc., is a difficult problem to which there has been no completely satisfactory solution. Our approach to the problem involves a simple filtration process that takes as input a large number of randomly generated graphs. By using multiple filters, one for each requirement and arranging them such that one feeds the other, the final output consists of a short-list of networks that the designer can choose from. Our experimental results show that this approach is both practical and powerful. Perhaps our biggest achievement here is that we show how this seemingly simple approach can generate networks that are serious competitors to several traditional well-known networks. We further highlight the practical applicability of these networks by considering how they can be effectively used in a packaging environment. The interconnection network can have a dominant effect on the reliability of a distributed system. While existing network softwares have been optimized for performance, they have not been able to deal with network failures effectively. We have developed a light-weight fault detection and recovery technique that provides coverage for almost all network interface failures. The detection is based on software watchdog timers and the recovery is based on delta-logging. We have implemented the schemes as a fault tolerance layer over Myrinet, a commercially available networking technology. The implementation showed that a fault detection time of 1 ms and a complete recovery time of around 0.5 second can be achieved with a performance impact of less than 10%. The effectiveness of our fault tolerance schemes was evaluated using a versatile performance and recovery analysis tool called RAPIDS.

Subject Area

Electrical engineering|Computer science

Recommended Citation

Lakamraju, Vijaya Ramaraju, "Networking issues in distributed real -time systems" (2002). Doctoral Dissertations Available from Proquest. AAI3056251.
https://scholarworks.umass.edu/dissertations/AAI3056251

Share

COinS