Masters Theses
Date of Award
5-1997
Degree Type
Thesis
Degree Name
Master of Science
Major
Computer Science
Major Professor
James S. Plank
Committee Members
Brad Vander Zanden
Abstract
As the choice of parallel platforms shifts from dedicated parallel machines to networks of workstations, the need for program fault-tolerance has never been greater. Checkpointing is the only means to provide programs with fault-tolerance in general-purpose computing environments. Checkpointing usually involves saving program states to disk. However, in parallel environments, stable storage becomes a bottleneck that prevents efficient checkpointing. Presented in this thesis are algorithms to provide parallel programs with fault-tolerance without relying on stable storage. An implementation of these algorithms was created and compared with the traditional disk-based algorithms. Results show that diskless checkpointing is a viable option to provide efficient fault-tolerance with low overhead.
Recommended Citation
Puening, Michael A., "Diskless checkpointing." Master's Thesis, University of Tennessee, 1997.
https://trace.tennessee.edu/utk_gradthes/10681