A Coarse-Grain Parallel Implementation of the Block Tridiagonal Divide and Conquer Algorithm for Symmetric Eigenproblems.
Date of Award
Master of Science
Robert C. Ward
Michael W. Berry, Jian Huang
Cuppen’s divide and conquer technique for symmetric tridiagonal eigenproblems, along with Gu and Eisenstat’s modification for improvement of the eigenvector computation, has yielded a stable, efficient, and widely-used algorithm. This algorithm has now been extended to a larger class of matrices, namely symmetric block tridiagonal eigenproblems. The Block Tridiagonal Divide and Conquer algorithm has shown several characteristics that make it suitable for a number of applications, such as the Self-Consistent-Field procedure in quantum chemistry.
This thesis discusses the steps taken to implement a coarse-grain parallel version of the Block Tridiagonal Divide and Conquer algorithm, suitable for a parallel supercomputer or a cluster of machines. The parallel version relies on components of the ScaLAPACK parallel linear algebra library and follows the same model as the serial code, including the implementation of full deflation.
A modest speedup (2 to 3) was achieved using a few processors (4 and 16). Increasing the number of processors from 4 to 16 produced only slightly better speedup. This implementation was not competitive with the standard ScaLAPACK symmetric eigenvalue routine. Analysis shows that the distribution scheme chosen for the eigenvector storage requires n x O(p)2 function calls to the ScaLAPACK matrix multiplication routine, where n is the matrix size and p is the number of blocks. The matrix multiplications are responsible for the majority of the computational cost; therefore, the associated overhead needs to be reduced in order to make this implementation more competitive.
Day, Robert M., "A Coarse-Grain Parallel Implementation of the Block Tridiagonal Divide and Conquer Algorithm for Symmetric Eigenproblems.. " Master's Thesis, University of Tennessee, 2003.