Masters Theses
Date of Award
8-1997
Degree Type
Thesis
Degree Name
Master of Arts
Major
Political Science
Major Professor
Michael Rhodes Fitzgerald
Committee Members
David Feldman, Michael Gant
Abstract
Checkpointing enables users of distributed systems to support job swapping, process migration, and fault tolerance. While checkpointers typically provide job swapping and process migration with reasonable overhead, the overhead for fault tolerance is often too high. This cost is not inherent in the act of checkpointing; it stems from how the checkpoints are placed on stable storage.
This thesis explores two placement strategies for checkpointing in distributed systems: Single Processor Fault Tolerance and Reed-Solomon coding. Both strategies are adaptations of RAID techniques [16, 41] for checkpointing systems, and aim to improve performance at the expense of fault coverage. We detail an implementation of these strategies in MIST, a checkpointer for PVM, and present performance results of these and standard checkpoint placement strategies. We conclude that both strategies can improve the performance of checkpointing, and should be employed by users who desire improved performance over wholesale failure coverage.
Recommended Citation
Parham, Georgiana Paige, "Technological policy implementation: the politics of expertise and responsible governance." Master's Thesis, University of Tennessee, 1997.
https://trace.tennessee.edu/utk_gradthes/10674