
Responsible Recklessness: Why High Performance Computing Must Embrace Randomized Numerical Linear Algebra

Date Issued
December 1, 2025
Author(s)
Melnichenko, Max
Advisor(s)
Jack Dongarra
Additional Advisor(s)
Piotr Luszczek, Michael Berry, Hartwig Anzt, James Demmel, Michael Mahoney
Permanent URI
https://trace.tennessee.edu/handle/20.500.14382/22074
Abstract

Randomized Numerical Linear Algebra (RandNLA) is a relatively young branch of Numerical Linear Algebra (NLA). It leverages randomization to develop algorithms that solve classical linear algebra problems with performance superior to deterministic NLA schemes. Nevertheless, RandNLA methods have yet to gain broad adoption within mainstream NLA software libraries and, by extension, within the High-Performance Computing (HPC) toolkit aimed at combating the challenges posed by the end of Moore's law. This dissertation argues that the HPC community is missing a powerful opportunity by not embracing RandNLA methods. As part of this work, I propose and analyze three RandNLA methods spanning two NLA problem classes: full matrix factorization through QR decomposition with column pivoting (QRCP), and low-rank factorization via the Singular Value Decomposition (SVD). The core contribution of the projects I present lies in my development of high-performance implementations for each method, integrated into the open-source RandLAPACK library.

First, I present a novel QRCP algorithm for tall matrices. I provide a comprehensive analysis of the method in exact arithmetic and a broad characterization of its behavior in finite-precision arithmetic. Experiments with the algorithm's implementation demonstrate order-of-magnitude speedups over LAPACK's standard QRCP function.

Next, I present an algorithmic framework that enables the design of practical QRCP algorithms for general matrices through user-controlled choices of the core subroutines. I provide a comprehensive overview of how to navigate these choices on modern hardware platforms, with detailed descriptions of alternative methods for both CPUs and GPUs. CPU experiments demonstrate performance improvements of up to two orders of magnitude over LAPACK's standard QRCP routine. GPU experiments suggest that my method attains approximately 65% of the performance of cuSOLVER's unpivoted QR factorization.

Finally, I discuss a high-performance implementation of a partial SVD algorithm based on the Krylov subspace method. The algorithm is tested against randomized SVD and a deterministic SVD based on the implicitly restarted Lanczos method. Results on a dense synthetic dataset and a real-world sparse model-reduction problem show the superiority of my implementation.

Subjects
RandNLA
HPC
Partial SVD
QRCP
Krylov Subspace Methods
GPU

Disciplines
Computer Sciences
Numerical Analysis and Scientific Computing
Other Computer Sciences
Software Engineering
Degree
Doctor of Philosophy
Major
Computer Science
Comments

Keywords: Randomized Numerical Linear Algebra, RandNLA, High-Performance Computing, Partial SVD, Krylov Subspace Methods, QRCP, RandLAPACK, GPU.

File(s)
Name: Melnichenko_PhD_Dissertation.pdf
Size: 34.93 MB
Format: Adobe PDF
Checksum (MD5): 99e189ba2176daa85466ee792c826b8f
